Re: [Qemu-devel] [PATCH] block: Fix race in gluster_finish_aiocb


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH] block: Fix race in gluster_finish_aiocb
Date: Thu, 22 Aug 2013 15:27:35 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130805 Thunderbird/17.0.8

On 22/08/2013 15:25, Bharata B Rao wrote:
> On Thu, Aug 22, 2013 at 01:15:59PM +0200, Paolo Bonzini wrote:
>> On 22/08/2013 12:28, Bharata B Rao wrote:
>>> On Thu, Aug 22, 2013 at 12:00:48PM +0200, Paolo Bonzini wrote:
>>>> On 22/08/2013 11:55, Bharata B Rao wrote:
>>>>> This was the first approach I had. I used to abort when writes to the pipe
>>>>> failed. But concerns were raised about handling the failures gracefully,
>>>>> and hence we ended up doing all that error handling: completing the aio
>>>>> with -EIO, closing the pipe and making the disk inaccessible.
>>>>>
>>>>>>> Under what circumstances could it happen?
>>>>> Not very sure, I haven't seen that happening. I had to manually inject
>>>>> faults to test this error path and verify the graceful recovery.
>>>>
>>>> Looking at write(2), it looks like it is impossible:
>>>>
>>>>        EAGAIN or EWOULDBLOCK
>>>>                can't happen, blocking file descriptor
>>>>
>>>>        EBADF, EPIPE
>>>>                shouldn't happen since the device is drained before
>>>>                calling qemu_gluster_close.
>>>>
>>>>        EDESTADDRREQ, EDQUOT, EFBIG, EIO, ENOSPC
>>>>                cannot happen for pipes
>>>>
>>>>        EFAULT
>>>>                abort would be fine
>>>
>>> In the case where we have separate system and data disks, if an error
>>> (EFAULT) happens for the data disk, don't we want to keep the VM up by
>>> gracefully disabling IO to the data disk?
>>
>> EFAULT means the buffer address is invalid; an I/O error would be EIO, but...
>>
>>> I remember this was one of the motivations to
>>> handle this failure.
>>
>> ... this write is on the pipe, not on a disk.
> 
> Right. Failure to complete the write on the pipe means that IO done to the
> disk didn't complete and hence to the VM it is essentially a disk IO failure.

The question is, can the write to the pipe actually fail?  Not just "not in
practice": going by the documented errors, it seems to me that it simply
cannot.
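
To make the argument concrete, here is a minimal sketch of what the
"just abort" variant could look like.  It is not the actual gluster driver
code: notify_completion() and read_completion() are hypothetical names, and
the EINTR retry is my own addition (write(2) documents it, but it is not in
the list quoted above).  The assumption is a blocking pipe whose read end
stays open until all requests are drained.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper, not QEMU's actual code: write one completion
 * token to a blocking pipe.  EINTR is retried; any other failure is
 * treated as a programming error and aborts.
 */
static void notify_completion(int wfd, const void *token, size_t len)
{
    const char *p = token;

    assert(token != NULL);
    while (len > 0) {
        ssize_t n = write(wfd, p, len);
        if (n < 0) {
            if (errno == EINTR) {
                continue;   /* interrupted before any byte was written */
            }
            abort();        /* EFAULT, EBADF, EPIPE: a bug in the caller */
        }
        p += n;
        len -= (size_t)n;
    }
}

/* Read back one token of the same size; returns 0 on success, -1 otherwise. */
static int read_completion(int rfd, void *token, size_t len)
{
    char *p = token;

    while (len > 0) {
        ssize_t n = read(rfd, p, len);
        if (n < 0 && errno == EINTR) {
            continue;       /* interrupted, try again */
        }
        if (n <= 0) {
            return -1;      /* error or unexpected end of file */
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

With this shape there is no recovery path left that could race with the
reader: the writer either delivers the whole token or the process stops.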

> That's the reason we return -EIO and make the disk inaccessible when this
> failure happens.
> 
> My question was whether it is OK to abort the VM when IO to one of the disks
> fails?

Absolutely not, but here the code seems dead to me.

Paolo

> But if you think it is not worth handling such errors, then maybe we can drop
> this elaborate and race-prone error recovery and just abort.
> 
> Regards,
> Bharata.
> 



