qemu-devel

Re: [Qemu-devel] [PATCH 13/47] block: introduce block job error


From: Kevin Wolf
Subject: Re: [Qemu-devel] [PATCH 13/47] block: introduce block job error
Date: Mon, 06 Aug 2012 12:45:33 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120605 Thunderbird/13.0

On 06.08.2012 11:44, Paolo Bonzini wrote:
> On 06/08/2012 11:29, Kevin Wolf wrote:
>>>>>> Block jobs aren't really different from guests in that respect. Maybe
>>>>>> the BB needs a second iostatus field that must explicitly be reset, and
>>>>>> the old one keeps doing the stupid thing for compatibility's sake.
>>>>
>>>> Or the iostatus for the target can just reside in the BlockJob... :)
>> That wouldn't fix the problem in more than a single instance...
> 
> Even if you have problems in more than one device, you can still fix
> them one at a time.
> 
> I think that we're fine with the current information.  In the long term
> we will add the failing blockdev name to the blockjob iostatus.

I think you misunderstood. What I was trying to say is that with the
same reasoning we'd need a field that doesn't automatically reset its
status on 'cont' not only for block jobs, but also for regular guest disks.

If you try fixing the problem by adding a field in BlockJob, it may well
be fixed for block jobs, but you still need to add it in the generic
place later so that regular disks are covered as well.
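To make the idea concrete, here is a minimal sketch of a second status
field in the generic place, assuming a simplified device structure. All
names in it (IoStatus, Device, device_cont, device_reset_iostatus) are
hypothetical and only illustrate the behaviour; they are not the actual
QEMU structures or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only, not QEMU's real API. */
typedef enum {
    IOSTATUS_OK,
    IOSTATUS_NOSPACE,
    IOSTATUS_FAILED,
} IoStatus;

typedef struct {
    IoStatus iostatus;        /* old field: cleared automatically on 'cont' */
    IoStatus sticky_iostatus; /* second field: cleared only on explicit reset */
} Device;

/* Record an I/O error in both fields. */
static void device_set_error(Device *dev, IoStatus status)
{
    assert(dev != NULL);
    dev->iostatus = status;
    dev->sticky_iostatus = status;
}

/* 'cont' keeps the compatible behaviour: only the old field is reset. */
static void device_cont(Device *dev)
{
    assert(dev != NULL);
    dev->iostatus = IOSTATUS_OK;
}

/* The second field survives 'cont' until someone acknowledges the error. */
static void device_reset_iostatus(Device *dev)
{
    assert(dev != NULL);
    dev->sticky_iostatus = IOSTATUS_OK;
}
```

Because the field lives in the per-device structure in this sketch, the
same code path would cover a block job target and a regular guest disk.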

>>>> As much as I hate to invoke shortcuts, management may proceed without
>>>> human help only in the ENOSPC case, and ENOSPC can only happen on the
>>>> target.  Humans usually look at dmesg to find the source.
>> dmesg doesn't contain information about corrupted qcow2 images, Sheepdog
>> error codes from the server, etc.
> 
> True, guest dmesg doesn't help for block jobs.  (But network glitches
> could be in dmesg or other sources of monitoring information).

Guest or host dmesg? I thought we were talking about the host.

> But for block jobs your margins are small because you cannot take the VM
> offline.  So if you get an EIO you can just do three things: first,
> retry and see if it goes away (transient glitch); second, throw away the
> target and see if it goes away; third, raise white flag and ask the VM
> admin to cooperate, because the problem is likely in the source.

Yes, and the admin wants to know what happened, i.e. an accurate iostatus.
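For reference, the policy quoted above could be sketched roughly as
follows, assuming management tracks how many recovery steps it has
already tried for a given error. The RecoveryAction enum and the
next_action function are hypothetical names, and the ENOSPC action
(free up space on the target and resume) is an assumption about what
"proceed without human help" would mean in practice.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical recovery actions for a failed block job. */
typedef enum {
    ACTION_FREE_SPACE_AND_RESUME, /* assumed automatic handling of ENOSPC */
    ACTION_RETRY,                 /* first: maybe a transient glitch */
    ACTION_RECREATE_TARGET,       /* second: throw away the target */
    ACTION_ESCALATE,              /* third: ask the VM admin to cooperate */
} RecoveryAction;

/*
 * err:     errno value reported for the job
 * attempt: number of recovery steps already tried for this error (from 0)
 */
static RecoveryAction next_action(int err, int attempt)
{
    assert(attempt >= 0);
    if (err == ENOSPC) {
        /* The only case management may handle without human help. */
        return ACTION_FREE_SPACE_AND_RESUME;
    }
    if (attempt == 0) {
        return ACTION_RETRY;
    }
    if (attempt == 1) {
        return ACTION_RECREATE_TARGET;
    }
    /* Problem is likely in the source; a human needs accurate status. */
    return ACTION_ESCALATE;
}
```

Whatever the exact steps, the last branch is where an accurate iostatus
matters most, since it is all the admin has to go on.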

Kevin


