qemu-devel
From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH] vl.c/exit: pause cpus before closing block devices
Date: Fri, 4 Aug 2017 10:58:51 +0100

On Mon, Jul 17, 2017 at 5:43 PM, John Snow <address@hidden> wrote:
> On 07/17/2017 06:26 AM, Dr. David Alan Gilbert wrote:
>> * Stefan Hajnoczi (address@hidden) wrote:
>>> On Thu, Jul 13, 2017 at 08:01:16PM +0100, Dr. David Alan Gilbert (git) wrote:
>>>> From: "Dr. David Alan Gilbert" <address@hidden>
>>>>
>>>> There's a rare exit seg if the guest is accessing
>>>> IO during exit.
>>>> It's always hitting the atomic_inc(&bs->in_flight) with a NULL
>>>> bs. This was added recently in 99723548  but I don't see it
>>>> as the cause.
>>>>
>>>> Flip vl.c around so we pause the cpus before closing the block devices,
>>>> that way we shouldn't have anything trying to access them when
>>>> they're gone.
>>>>
>>>> This was originally Red Hat bz 
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1451015
>>>>
>>>> Signed-off-by: Dr. David Alan Gilbert <address@hidden>
>>>> Reported-by: Cong Li <address@hidden>
>>>>
>>>> --
>>>> This is a very rare race, I'll leave it running in a loop to see if
>>>> we hit anything else and to check this really fixes it.
>>>>
>>>> I do worry if there are other cases that can trigger this - e.g.
>>>> hot-unplug or ejecting a CD.
>>>>
>>>> ---
>>>>  vl.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> Reviewed-by: Stefan Hajnoczi <address@hidden>
>>
>> Thanks;  and the test I left running seems solid - ~12k runs
>> over the weekend with no seg.
>>
>> Dave
>>
>> --
>> Dr. David Alan Gilbert / address@hidden / Manchester, UK
>>
>
> the root cause of this bug is related to this as well:
> https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg02945.html
>
> From commit 99723548 we started assuming (incorrectly?) that blk_
> functions always WILL have an attached BDS, but this is not always true,
> for instance, flushing the cache from an empty CDROM.
>
> Paolo, can we move the flight counter increment outside of the
> block-backend layer, is that safe?

I think the bdrv_inc_in_flight(blk_bs(blk)) call needs to be fixed
regardless of the throttling timer issue discussed below.  The
BlockBackend (BB) cannot assume that the BDS graph is non-empty:
blk_bs() can return NULL, for example for an empty CD-ROM drive.
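
To make the point concrete, here is a minimal sketch of the kind of guard
I mean.  It is not a patch against QEMU: the Sketch* types and sketch_*
functions are hypothetical stand-ins for BlockBackend, BlockDriverState,
blk_bs() and the in-flight helpers, and the counter is a plain integer
rather than an atomic one.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types: NOT QEMU's real definitions, only the fields needed here. */
typedef struct SketchBDS {
    unsigned in_flight;   /* number of requests currently in progress */
} SketchBDS;

typedef struct SketchBackend {
    SketchBDS *root;      /* NULL when nothing is attached (e.g. empty CD-ROM) */
} SketchBackend;

/* Analogue of blk_bs(): may legitimately return NULL. */
static SketchBDS *sketch_blk_bs(const SketchBackend *blk)
{
    return blk->root;
}

/*
 * Guarded increment. Returns 1 if a counter was bumped, 0 if the backend
 * has no BDS attached and there was nothing to count against.
 */
static int sketch_inc_in_flight(SketchBackend *blk)
{
    SketchBDS *bs = sketch_blk_bs(blk);

    if (bs == NULL) {
        return 0;         /* empty backend: do not dereference */
    }
    bs->in_flight++;      /* real code would use an atomic increment */
    return 1;
}

/* Matching guarded decrement; a no-op on an empty backend. */
static void sketch_dec_in_flight(SketchBackend *blk)
{
    SketchBDS *bs = sketch_blk_bs(blk);

    if (bs == NULL) {
        return;
    }
    assert(bs->in_flight > 0);
    bs->in_flight--;
}
```

Whether the right fix is a NULL check like this or moving the counter out
of the block-backend layer, as John asks, is a separate question; the
sketch only shows that the empty case has to be handled somewhere.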

Stefan


