Re: [RFC PATCH 3/3] hw/block/nvme: add nvme_inject_state HMP command
From: Minwoo Im
Subject: Re: [RFC PATCH 3/3] hw/block/nvme: add nvme_inject_state HMP command
Date: Thu, 11 Feb 2021 12:38:48 +0900
User-agent: Mutt/1.11.4 (2019-03-13)
On 21-02-11 12:00:11, Keith Busch wrote:
> On Thu, Feb 11, 2021 at 04:52:52AM +0900, Minwoo Im wrote:
> > The nvme_inject_state command puts a controller into a given state.
> > Through the Human Monitor Protocol (HMP), users can put the controller
> > into one of the following states:
> >
> > normal: Normal state (no injection)
> > cmd-interrupted: Commands will be interrupted internally
> >
> > This patch is just a first step toward issuing dynamic commands from
> > the HMP to the QEMU NVMe device model. If the "cmd-interrupted" state
> > is given, the controller will return all CQ entries with the Command
> > Interrupted status code.
> >
> > Usage:
> > -device nvme,id=nvme0,....
> >
> > (qemu) nvme_inject_state nvme0 cmd-interrupted
> >
> > <All the commands will be interrupted internally>
> >
> > (qemu) nvme_inject_state nvme0 normal
> >
> > This feature is needed to test the command retry feature of the Linux
> > kernel NVMe driver.
>
> Once the user sets the injected state, all commands return that status
> until the user injects the normal state, so the CRD time is meaningless
> here. If we're really going this route, the state needs to return to
> normal on its own.
That would also be fine with me.
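As a rough sketch of what "returning to normal on its own" could look like, the injected state could carry a command budget and fall back to normal once that budget is used up. Everything below is hypothetical and not taken from the patch: the struct, the function names, and the choice of a command count rather than a timer are assumptions, and the status values are assumed to follow the NVMe status code encoding (0x21 for Command Interrupted).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed status code values (generic command status, no phase bit). */
#define SC_SUCCESS          0x0000
#define SC_CMD_INTERRUPTED  0x0021

/* Hypothetical injectable states, named after the two in the patch. */
enum inject_state {
    INJECT_NORMAL,
    INJECT_CMD_INTERRUPTED,
};

/* Hypothetical per-controller injection context. */
struct inject_ctx {
    enum inject_state state;
    uint32_t remaining;   /* commands left before reverting to normal */
};

/* Inject `state` for the next `nr_cmds` commands; 0 means no injection. */
static void inject_set(struct inject_ctx *ctx, enum inject_state state,
                       uint32_t nr_cmds)
{
    assert(ctx != NULL);
    ctx->state = nr_cmds ? state : INJECT_NORMAL;
    ctx->remaining = nr_cmds;
}

/* Status for the next completion; reverts to normal when the budget ends. */
static uint16_t inject_next_status(struct inject_ctx *ctx)
{
    assert(ctx != NULL);
    if (ctx->state == INJECT_NORMAL) {
        return SC_SUCCESS;
    }
    if (--ctx->remaining == 0) {
        ctx->state = INJECT_NORMAL;
    }
    return SC_CMD_INTERRUPTED;
}
```

With a budget of one command, the first attempt is interrupted and the retry succeeds, which is the case where the CRD delay becomes observable.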
> But I would prefer to see advanced retry tied to real errors that can be
> retried, like if we got an EBUSY or EAGAIN errno or something like that.
I have seen a thread [1] about ACRE. Forgive me if I misunderstood it or
missed something that came after it. It looks like the CRD field in the
CQE can be set for any NVMe error status, which means it *may* depend on
the device state. This patch just introduces a temporary internal error
state in the controller by returning the Command Interrupted status.
I think, at this stage, we can go with errors that occur in the middle of
the AIO (nvme_aio_err()) for advanced retry. Shouldn't AIO errors be
retryable, and aren't they supposed to be retried?
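To make the errno-based idea concrete, here is a minimal sketch of mapping an AIO completion code to an NVMe status with a CRD index. It is not the QEMU nvme_aio_err() implementation. The function name, the policy of treating only EBUSY and EAGAIN as retryable, and the use of Internal Error plus DNR for everything else are assumptions. The bit positions (CRD in bits 12:11, DNR in bit 14 of the status without the phase bit) are assumed to match the layout used by the Linux driver's status masks.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed status encoding (status field without the phase bit). */
#define SC_SUCCESS          0x0000
#define SC_INTERNAL         0x0006
#define SC_CMD_INTERRUPTED  0x0021
#define SC_CRD_SHIFT        11      /* CRD occupies bits 12:11 */
#define SC_DNR              0x4000  /* Do Not Retry */

/*
 * Hypothetical mapping from an AIO result to an NVMe status.
 * `err` may be a positive or negative errno; `crd` is the CRD index (0..3).
 */
static uint16_t status_from_errno(int err, unsigned crd)
{
    assert(crd <= 3);
    if (err < 0) {
        err = -err;   /* block-layer callbacks usually pass -errno */
    }
    switch (err) {
    case 0:
        return SC_SUCCESS;
    case EBUSY:
    case EAGAIN:
        /* Transient condition: ask the host to retry after the CRD delay. */
        return (uint16_t)(SC_CMD_INTERRUPTED | (crd << SC_CRD_SHIFT));
    default:
        /* Anything else is treated as non-retryable in this sketch. */
        return SC_INTERNAL | SC_DNR;
    }
}
```

For example, `status_from_errno(-EAGAIN, 1)` yields Command Interrupted with CRD index 1, while `status_from_errno(-EIO, 1)` yields a status with DNR set.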
> The interface you found to implement this is very interesting though.
Thanks, I just wanted to suggest a scheme for injecting something into a
running NVMe device model for various kinds of testing.
[1] https://www.spinics.net/lists/dm-devel/msg42165.html