Re: [Qemu-devel] [PATCH] block: allow write-threshold on device name


From: Kevin Wolf
Date: Wed, 10 Jun 2015 15:43:06 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

Am 10.06.2015 um 15:07 hat Eric Blake geschrieben:
> On 06/10/2015 01:57 AM, Kevin Wolf wrote:
> 
> >>
> >> The statistic I'm interested in is the allocation of the block device
> >> (the host offset, aka wr_highest_offset 72482304 above), and NOT the
> >> usage pattern of the guest (the qcow2 protocol, wr_highest_offset
> >> 9129332224).  But bdrv_lookup_bs() finds the qcow2 protocol layer,
> >> rather than the intended backing file layer; likewise, query-block is
> >> only reporting write_threshold for the protocol layer.
> >>
> >> I'm wondering if, when a device name is given rather than a node name,
> >> it is safe to blindly follow the active layer down to its lowest member
> >> (or error out if there is more than one lower member, as in quorum),
> >> as that is the statistic that libvirt and upper layers really want ("am
> >> I about to exceed the allocation of my underlying storage?").  Likewise,
> >> on reporting, it is more useful to know the threshold of the backing
> >> layer if the qcow2 protocol layer does not have a threshold.  I'm
> >> playing with that idea before submitting a v2.
> > 
> > That is indeed what you need in your specific use case. However, qemu
> > shouldn't try to guess what management tools really want. It should
> > provide a clean interface that allows management tools to express
> > what they want themselves.
> 
> Well, I think that means I need to bite the bullet and teach libvirt to
> use node names before it can take advantage of this feature; at which
> point this idea of allowing a threshold on device name is no longer
> important.

Yes, I think libvirt needs to learn about node-names for this. And I'm
sure that very soon you'll find more uses for them anyway, so I prefer
doing the right thing now instead of adding a short-term hack for each
individual case in order to avoid the need for node-names.
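
To make "use node names" concrete, here is a minimal sketch of the QMP
command a management tool could send once it has named the protocol-layer
node itself. The command and argument names (block-set-write-threshold,
node-name, write-threshold) are quoted from memory of the QMP schema and
should be checked against the qemu version in use; the node name
"proto-node0" and the 64 MiB threshold are made-up values for illustration.

```python
import json


def build_write_threshold_cmd(node_name, threshold_bytes):
    """Build a QMP command dict that sets a write threshold on one node.

    The node is addressed explicitly by node name; nothing is guessed
    from a device name.
    """
    if not node_name:
        raise ValueError("node_name must be a non-empty string")
    if threshold_bytes < 0:
        raise ValueError("threshold_bytes must be non-negative")
    return {
        "execute": "block-set-write-threshold",
        "arguments": {
            "node-name": node_name,
            "write-threshold": threshold_bytes,
        },
    }


# Hypothetical node name and threshold, for illustration only.
cmd = build_write_threshold_cmd("proto-node0", 64 * 1024 * 1024)
wire = json.dumps(cmd)  # the JSON text that would go over the QMP socket
print(wire)
```

The point of the sketch is only that the caller names the exact node it
means; which node that is remains the management tool's decision.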

> > The cleanest interface that I can think of is that you access exactly
> > the node whose name you specified. If we do any magic like going down
> > the chain (which chain? What do you do with things like quorum in the
> > path?), we make the interface inconsistent and if anyone really wants to
> > know the highest offset that the guest accessed on its virtual disk, it
> > wouldn't even be possible any more because we said that that's not what
> > a management tool is interested in.
> 
> My problem here is that libvirt tracks only a single <disk>, but that
> disk has two potential node names that need tracking (both the qcow2
> protocol, and the underlying file).

I'm afraid this sounds as if you hadn't fully understood the consequences
of the blockdev work yet (even though I'm sure you do know them in
theory). Make it s/two/n/ and we'll talk about it.

Even today you can put filters like blkdebug/blkverify between the
format and the protocol to have more than two layers in the bs->file
chain. And you can use VMDK or Quorum to have many children instead of
just bs->file and bs->backing_hd.

These are setups that libvirt will increasingly need to understand.
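
A rough sketch of why "follow the active layer down to the lowest member"
is not well defined in general. This is a toy model in Python, not qemu
code: the Node class, the node names and the helper are all hypothetical,
and a real graph would also have to distinguish bs->file from
bs->backing_hd.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy stand-in for a block graph node (not a qemu structure)."""
    name: str
    driver: str
    children: List["Node"] = field(default_factory=list)


def follow_single_chain(node):
    """Walk down while each node has exactly one child.

    Returns the name of the lowest node, or raises ValueError as soon as
    a node has more than one child and the choice would be a guess.
    """
    while node.children:
        if len(node.children) != 1:
            raise ValueError(
                "%s (%s) has %d children; no unique lowest node"
                % (node.name, node.driver, len(node.children))
            )
        node = node.children[0]
    return node.name


# Three layers: format -> filter -> protocol (s/two/n/ in practice).
chain = Node("fmt0", "qcow2", [Node("dbg0", "blkdebug", [Node("file0", "file")])])

# Quorum with three children: there is no single "lowest member".
quorum = Node("q0", "quorum", [Node("c%d" % i, "file") for i in range(3)])

print(follow_single_chain(chain))
try:
    follow_single_chain(quorum)
except ValueError as err:
    print("ambiguous:", err)
```

With the linear chain the walk happens to end at the protocol node, but
with quorum it has to give up, which is why addressing the wanted node by
name is the more consistent interface.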

> Furthermore, operations like
> snapshot creation, drive-mirror, and active block commit can change what
> the active layer is, and thus need another node name.
> 
> It would really make life easier if qemu could auto-assign node names
> (so that EVERY node has a name without libvirt having to invent two
> names per qcow2 file), and then give libvirt an easy way to query the
> node names in use (query-block should make it obvious what the full
> node-name tree is, so that libvirt can then pick out the node name it is
> interested in).

If you think we should pick up Jeff's patches for autogeneration of node
names again, I'm in favour of that. I think a few more reasons came up
recently why this would be useful.

> > Let's stay away from such magic, as much as we can. libvirt can just
> > specify a node-name for the protocol layer and use that.
> 
> Okay, I'll probably abandon this patch, then, but still work on
> something to make node names easier for libvirt to integrate with.

Hm, okay. I would find it nice to accept device names wherever a
node name is expected (just for aesthetic reasons), but I see that it
would probably remain unused, so abandoning the patch is okay with me.

Kevin
