qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v1 0/6]: block: Add I/O status support


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v1 0/6]: block: Add I/O status support
Date: Fri, 23 Sep 2011 10:55:39 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)

Kevin Wolf <address@hidden> writes:

> Am 19.09.2011 16:09, schrieb Luiz Capitulino:
>> On Mon, 19 Sep 2011 15:40:06 +0200
>> Kevin Wolf <address@hidden> wrote:
>> 
>>> Am 01.09.2011 20:37, schrieb Luiz Capitulino:
>>>> This series adds support to the block layer to keep track of devices'
>>>> I/O status. That information is also made available in QMP and HMP.
>>>>
>>>> The goal here is to allow management applications that miss the
>>>> BLOCK_IO_ERROR event to able to query the VM to determine if any device has
>>>> caused the VM to stop and which device caused it.
>>>>
>>>> Please, note that this series depends on the following series:
>>>>
>>>>  o [PATCH v3 0/8]: Introduce the RunState type
>>>>  o http://lists.gnu.org/archive/html/qemu-devel/2011-09/msg00118.html
>>>>
>>>> And to be able to properly test it you'll also need:
>>>>
>>>>  o [PATCH 0/3] qcow2/coroutine fixes
>>>>  o http://lists.gnu.org/archive/html/qemu-devel/2011-09/msg00074.html
>>>>
>>>> Here's an HMP example:
>>>>
>>>>   (qemu) info status 
>>>>   VM status: paused (io-error)
>>>>   (qemu) info block
>>>>   ide0-hd0: removable=0 io-status=ok file=disks/test2.img ro=0 drv=qcow2 
>>>> encrypted=0
>>>>   ide0-hd1: removable=0 io-status=nospace file=/dev/vg_doriath/kvmtest 
>>>> ro=0 drv=qcow2 encrypted=0
>>>>   ide1-cd0: removable=1 locked=0 io-status=ok [not inserted]
>>>>   floppy0: removable=1 locked=0 [not inserted]
>>>>   sd0: removable=1 locked=0 [not inserted]
>>>>
>>>> The "info status" command shows that the VM is stopped due to an I/O error.
>>>> By issuing "info block" it's possible to determine that the 'ide0-hd1' 
>>>> device
>>>> caused the error, which turns out to be due to no space.
>>>
>>> Looks like I didn't reply here yet?
>> 
>> No, you didn't.
>> 
>>> I still don't quite like that the devices are involved, but their part
>>> is minimal and it makes the implementation much easier, so for me that's
>>> acceptable.
>> 
>> Suggestions on better ways of implementing this are welcome! :)
>
> I don't have one. :-)
>
> Surely it would be possible to do everything in block.c, but I see that
> this would make things much more complicated (would need to get an AIO
> callback added to the chain that can save the error code). It's not
> worth the trouble.

The block layer necessarily detects errors in a huge number of places.

It also has a rich and complex interface to device models, with error
reporting in many places.

virtio-blk, ide and scsi-disk each have a single error handler to handle
block layer errors.

For better or worse, these are the only block layer error "chokepoints"
we have now, and therefore the only easy places to set BDS I/O status.
But they're still wrong.

The block layer shouldn't require device models to tell it what it
already knows.

Now, I don't want to block your series just because it adds this wart.
But the wart should be documented in the code.

Also document that any device model implementing werror=stop|enospc or
rerror=stop must update I/O status.  Extra points if you find a way to
enforce it instead.

Apropos enforcing.  Currently, -drive accepts any werror and rerror
action with if={ide,virtio,scsi,none}.  We rely on device models not
implementing an action to check and fail during initialization.
scsi-generic and the floppy devices do.  All the other qdevified devices
don't, and that's broken.

Are werror and rerror really host state?



reply via email to

[Prev in Thread] Current Thread [Next in Thread]