From: Kevin Wolf
Subject: Re: [Qemu-devel] [RFC][PATCH v2] block: add write threshold reporting for block devices
Date: Thu, 20 Nov 2014 12:34:28 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On 20.11.2014 at 12:04, Stefan Hajnoczi wrote:
> On Thu, Nov 20, 2014 at 11:30:53AM +0100, Kevin Wolf wrote:
> > On 17.11.2014 at 17:49, Stefan Hajnoczi wrote:
> > > On Fri, Nov 07, 2014 at 02:12:13PM +0100, Francesco Romani wrote:
> > > > +void bdrv_set_usage_threshold(BlockDriverState *bs, int64_t threshold_bytes)
> > > > +{
> > > > +    BlockDriverState *target_bs = bs;
> > > > +    if (bs->file) {
> > > > +        target_bs = bs->file;
> > > > +    }
> > > 
> > > Hmm...I think now I understand why you are trying to use bs->file.  This
> > > is an attempt to make image formats work with the threshold.
> > > 
> > > Unfortunately the BlockDriverState topology can be more complicated than
> > > just 1 level.
> > > 
> > > If we hardcode a strategy to traverse bs->file then it will work in most
> > > cases:
> > > 
> > >   while (bs->file) {
> > >       bs = bs->file;
> > >   }
> > > 
> > > But there are cases like VMDK extent files where a BlockDriverState
> > > actually has multiple children.
> > > 
> > > One way to solve this is to require that the management tool tells QEMU
> > > which exact BlockDriverState node the threshold applies to.  Then QEMU
> > > doesn't need any hardcoded policy.  But I'm not sure how realistic that
> > > is at the moment (whether management tools are using node names for each
> > > node yet), so it may be best to hardcode the bs->file traversal that
> > > I've suggested.
> > > 
> > > Kevin: Do you agree?
> > 
> > I have a feeling that we would regret this in the long run because it
> > would allow only one special case of a general problem (watching a BDS).
> > This means that we'll get inconsistent APIs.
> > 
> > We're "only" talking about an optimisation here, even though a very
> > useful one, so I wouldn't easily make compromises here. We should
> > probably insist on using the node-name. Management tools need new code
> > anyway to make use of the new functionality, so they can implement
> > node-name support as well while they're at it.
> 
> Using node-name is the best thing to do.
> 
> My concern is just whether libvirt and other management tools are
> actually using node-name yet.

I don't think so. They also don't use blockdev-add yet.

But that's not a reason for us to add hacks that allow libvirt and other
management tools to avoid the proper APIs even in the future. They just
need to add support for node-names if they want to use new qemu features.
New features require support for new infrastructure; I think that's fair.

If they feel that representing complete BDS graphs in their code is too
much work for now, they can still keep temporary hacks with hardcoded
assumptions in their management code (like setting file.node-name and
ignoring other setups). At least they would be temporary hacks there; if
we did them in qemu, they would be a permanent API.

Kevin
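
Editorial sketch, not part of the original message and not QEMU code: the
difference between the hardcoded bs->file traversal and an explicit node-name
lookup can be illustrated with a toy graph. Everything here (ToyNode,
toy_follow_file, toy_find_by_name, toy_demo, and the node names) is
hypothetical and only loosely models a BlockDriverState. The VMDK-like layout,
with a descriptor reached through ->file and two extents as additional
children, is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_CHILDREN 4

/* Hypothetical stand-in for a BlockDriverState; not QEMU's real struct. */
typedef struct ToyNode {
    const char *node_name;                      /* NULL if the node is unnamed */
    struct ToyNode *file;                       /* the single "file" child, if any */
    struct ToyNode *children[TOY_MAX_CHILDREN]; /* all children, including file */
    size_t num_children;
} ToyNode;

/* Hardcoded policy: keep following ->file until there is none. */
static ToyNode *toy_follow_file(ToyNode *bs)
{
    while (bs->file) {
        bs = bs->file;
    }
    return bs;
}

/* Explicit policy: depth-first search for the node the caller named. */
static ToyNode *toy_find_by_name(ToyNode *bs, const char *name)
{
    if (bs->node_name && strcmp(bs->node_name, name) == 0) {
        return bs;
    }
    for (size_t i = 0; i < bs->num_children; i++) {
        ToyNode *found = toy_find_by_name(bs->children[i], name);
        if (found) {
            return found;
        }
    }
    return NULL;
}

/* Builds a VMDK-like node with several children and checks both policies. */
int toy_demo(void)
{
    ToyNode extent0 = { .node_name = "extent0" };
    ToyNode extent1 = { .node_name = "extent1" };
    ToyNode desc = { .node_name = "desc-file" };
    ToyNode vmdk = {
        .node_name = "vmdk-top",
        .file = &desc,
        .children = { &desc, &extent0, &extent1 },
        .num_children = 3,
    };

    /* The ->file walk can only ever end at the descriptor. */
    assert(toy_follow_file(&vmdk) == &desc);

    /* A name lookup reaches any node the caller chose to name. */
    assert(toy_find_by_name(&vmdk, "extent1") == &extent1);
    assert(toy_find_by_name(&vmdk, "missing") == NULL);
    return 0;
}
```

In this toy model the ->file walk never reaches the extents, while the name
lookup leaves the choice of node to the caller. A management tool that names
only the file child would get the same result as the hardcoded walk, but the
assumption would live in the tool rather than in the API.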
