qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH] qed: fix bdrv_qed_drain


From: Kevin Wolf
Subject: Re: [Qemu-block] [PATCH] qed: fix bdrv_qed_drain
Date: Tue, 8 Mar 2016 10:58:47 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

Am 07.03.2016 um 21:56 hat Stefan Hajnoczi geschrieben:
> On Mon, Mar 07, 2016 at 05:57:41PM +0100, Kevin Wolf wrote:
> > Am 23.02.2016 um 14:54 hat Paolo Bonzini geschrieben:
> > > 
> > > 
> > > On 23/02/2016 13:49, Fam Zheng wrote:
> > > > On Tue, 02/23 11:43, Paolo Bonzini wrote:
> > > >>
> > > >>
> > > >> On 23/02/2016 06:57, Fam Zheng wrote:
> > > >>>>>> +        qed_cancel_need_check_timer(s);
> > > >>>>>> +        qed_need_check_timer_cb(s);
> > > >>>>>> +    }
> > > >>>>>
> > > >>>>> What if an allocating write is queued (the else branch case)? Its 
> > > >>>>> completion
> > > >>>>> will be in bdrv_drain and it could arm the need_check_timer which 
> > > >>>>> is wrong.
> > > >>>>>
> > > >>>>> We need to drain the allocating_write_reqs queue before checking 
> > > >>>>> the timer.
> > > >>>>
> > > >>>> You're right, but how?  That's what bdrv_drain(bs) does, it's a
> > > >>>> chicken-and-egg problem.
> > > >>>
> > > >>> Maybe use an aio_poll loop before the if?
> > > >>
> > > >> That would not change the fact that you're reimplementing bdrv_drain
> > > >> inside bdrv_qed_drain.
> > > > 
> > > > But it fulfills the contract of .bdrv_drain. This is the easy way, the 
> > > > hard way
> > > > would be iterating through the allocating_write_reqs list and process 
> > > > reqs one
> > > > by one synchronously, which still involves aio_poll indirectly.
> > > 
> > > The easy way would be better then.
> > > 
> > > Stefan, any second opinion?
> > 
> > What's the status here? It would be good to have qed not segfaulting all
> > the time.
> 
> I think the timer concept itself is troublesome.  A radical approach but
> limited to changing QED itself is to drop the timer and instead keep a
> timestamp for the last allocating write request.  Next time a write
> request (allocating or rewriting) is about to complete, do the flush and
> clear the need check flag as part of the write request (if 5 seconds
> have passed since the timestamp).

Isn't that kind of backwards, though, because it's very likely that
after some new activity a second write request will come in soon and
mark the image dirty again?

Apart from that, doesn't it fail to provide the intended effect, that
the image doesn't stay dirty for a long time and a repair isn't usually
needed after a crash?

I think rather than doing this, I'd just remove the mechanism altogether
and instead mark the image clean when the guest flushes the disk cache.

Kevin

> Because the need check flag clearing is part of the write request
> completion, it will integrate properly with drain.
> 
> Stefan


Attachment: pgpbOdriWurSe.pgp
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]