Re: [Qemu-devel] [PATCH RFC 0/5] disk deadlines
From: Fam Zheng
Subject: Re: [Qemu-devel] [PATCH RFC 0/5] disk deadlines
Date: Tue, 8 Sep 2015 21:20:35 +0800
User-agent: Mutt/1.5.23 (2014-03-12)

On Tue, 09/08 12:49, Kevin Wolf wrote:
> On 08.09.2015 at 12:20, Fam Zheng wrote:
> > On Tue, 09/08 12:11, Kevin Wolf wrote:
> > > On 08.09.2015 at 11:20, Fam Zheng wrote:
> > > > [Cc'ing address@hidden]
> > > >
> > > > On Tue, 09/08 11:00, Denis V. Lunev wrote:
> > > > > To avoid such a situation this patchset introduces a per-drive option
> > > > > "disk-deadlines=on|off" which is unset by default.
> > > >
> > > > The general idea sounds very nice. Thanks!
> > > >
> > > > Should we allow user configuration on the timeout? If so, the option should be
> > > > something like "timeout-seconds=0,1,2...". Also I think we could use werror
> > > > and rerror to control the handling policy (whether to ignore/report/stop on
> > > > timeout).
> > >
> > > Yes, I think the timeout needs to be configurable. However, the only
> > > action that makes sense is stop. Everything else would be unsafe because
> > > the running request could still complete at a later point.
> >
> > What if the timeout happens on a quorum child? The management can replace it
> > transparently without stopping the VM.
>
> This is getting tricky...
>
> I'll try this: We need to attribute timed out requests to a specific BDS.
> A user of a BlockBackend can run if all of its (recursive) children
> don't have timed out requests. So if the only thing that is blocked is a
> BDS used for an NBD server, but it isn't used by the guest, the guest
> can keep running. The same way, after removing a bad quorum child, the
> guest can be continued again.
>
> Somehow we must make sure that timeouts are propagated through the BDS
> tree (do we need parent notifiers?), and that at the same time the
> quorum BDS's timeout status is updated when the bad child is removed.

IIUC the implementation in this series already handles this cleanly with an RBT
data structure, without messing with the BDS tree hierarchy.

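
To make the rule quoted above concrete (a user can run only if none of its
recursive children has a timed out request), here is a minimal sketch in C.
The Node type, its fields, and both helpers are hypothetical stand-ins for
illustration; they are not QEMU's BlockDriverState, and this is not how the
series tracks requests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a BDS node; not QEMU's BlockDriverState. */
typedef struct Node {
    int timed_out_reqs;        /* requests past their deadline on this node */
    struct Node *children[4];  /* fixed fan-out keeps the sketch short */
    size_t nb_children;
} Node;

/* A user may run only if no node in its subtree has a timed out request. */
static bool subtree_can_run(const Node *n)
{
    if (n->timed_out_reqs > 0) {
        return false;
    }
    for (size_t i = 0; i < n->nb_children; i++) {
        if (!subtree_can_run(n->children[i])) {
            return false;
        }
    }
    return true;
}

/* Detach child i (e.g. a bad quorum child); later checks no longer see it. */
static void remove_child(Node *parent, size_t i)
{
    assert(i < parent->nb_children);
    parent->children[i] = parent->children[--parent->nb_children];
}
```

With a quorum-like parent holding one healthy and one stuck child,
subtree_can_run() is false until remove_child() drops the stuck one. Walking
the tree on every check avoids parent notifiers at the cost of a traversal.
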
>
> The trickier part might actually be to remove a BDS from quorum while a
> request is still in flight. The traditional approach is bdrv_drain(),
> but that won't work here. We want to remove the child while quorum still
> has a request pending on it.

I think the point here is to avoid accessing a dangling pointer, which shouldn't
be too hard with BDS reference counting.

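
A rough sketch of the reference counting idea, assuming each in-flight request
takes its own reference on the node it targets. All names here (Node, Request,
node_ref, node_unref, request_start, request_complete) are hypothetical; in
QEMU the corresponding role is played by bdrv_ref()/bdrv_unref(). The
destroyed flag exists only so the sketch can observe when the node is freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted node; not QEMU's BlockDriverState. */
typedef struct Node {
    int refcnt;
    int *destroyed;  /* set to 1 when the node is freed, for observation */
} Node;

/* Create a node with one reference, owned by the caller (the parent). */
static Node *node_new(int *destroyed)
{
    Node *n = calloc(1, sizeof(*n));
    assert(n != NULL);
    n->refcnt = 1;
    n->destroyed = destroyed;
    return n;
}

static void node_ref(Node *n)
{
    n->refcnt++;
}

/* Drop one reference; free the node when the last one goes away. */
static void node_unref(Node *n)
{
    assert(n->refcnt > 0);
    if (--n->refcnt == 0) {
        *n->destroyed = 1;
        free(n);
    }
}

/* Hypothetical in-flight request that pins its target node. */
typedef struct Request {
    Node *node;
} Request;

static void request_start(Request *r, Node *n)
{
    node_ref(n);   /* the request keeps the node alive until completion */
    r->node = n;
}

static void request_complete(Request *r)
{
    node_unref(r->node);
    r->node = NULL;
}
```

If the parent drops its reference while a request is pending, the node stays
allocated until request_complete() runs, so a late completion never touches
freed memory. This only covers the lifetime question; how quorum itself stops
waiting for the stuck request is a separate problem.
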
Fam