Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature

qemu-block

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_field

From:	Markus Armbruster
Subject:	Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields()
Date:	Thu, 03 Mar 2016 20:08:43 +0100
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

John Snow <address@hidden> writes:

> On 03/02/2016 04:31 AM, Kevin Wolf wrote:
>> Am 01.03.2016 um 17:51 hat John Snow geschrieben:
>>>
>>>
>>> On 03/01/2016 04:15 AM, Kevin Wolf wrote:
>>>> Am 01.03.2016 um 00:23 hat John Snow geschrieben:
>>>>>
>>>>>
>>>>> On 02/29/2016 09:36 AM, Kevin Wolf wrote:
>>>>>>
>>>>>> * Dirty bitmaps:
>>>>>>
>>>>>>   We're currently trying, and if I'm not mistaken failing, to move dirty
>>>>>>   bitmaps to the top. The (back then one) bitmap was first added to the
>>>>>>   list in Paolo's commit a9fc4408, with the following commit message:
>>>>>>
>>>>>>     While these should not be in use at the time a transaction is
>>>>>>     started, a command in the prepare phase of a transaction might have
>>>>>>     added them, so they need to be brought over.
>>>>>>
>>>>>>   At that point, there was no transactionable command that did this in
>>>>>>   the prepare phase. Today we have mirror and backup, but op blockers
>>>>>>   should prevent them from being mixed with snapshots in a single
>>>>>>   transaction, so I can't see how this change had any effect.
>>>>>>
>>>>>>   The reason why I think we're failing to move dirty bitmaps to the top
>>>>>>   today is that we're moving the head of the list to a different object
>>>>>>   without updating the prev link in the first element, so in any case
>>>>>>   it's buggy today.
>>>>>>
>>>>>>   I really would like to keep bitmaps on the BDS where they are, but
>>>>>>   unfortunately, we also have user-defined bitmaps by now, and if we
>>>>>>   change whether they stick with the top level, that's a change that is
>>>>>>   visible on the QMP interface.
>>>>>>
>>>>>>   On the other hand, the QMP interface clearly describes bitmaps as
>>>>>>   belonging to a node rather than a BB (you can use node-name, even with
>>>>>>   no BB attached), so moving them could be considered a bug, even if
>>>>>>   it is the existing behaviour.
>>>>>>
>>>>>
>>>>> Rule of thumb: follow the node such that the (node, name) pair given at
>>>>> creation time continues to work to identify the bitmap.

Yet another ignorant / forgetful question: what is a (node, name) pair?
Since it's "given at creation time", I guess it must be something in
QMP.

>>>> That's an option, though not how most (or probably all) QMP commands
>>>> work today. Currently, we have two types of commands:
>>>>
>>>> * BlockBackend level
>>>>   These accept only device names. Specifying a node-name is an error.
>>>>
>>>> * BlockDriverState level
>>>>   The preferred addressing for these is node-names, but they accept
>>>>   device names as aliases for the corresponding root node.
>>>>
>>>> Adding a third category that accepts both as well, but has a semantic
>>>> difference between device name and node-name might end up confusing. I'm
>>>> not totally against it if we think that it's the best way to represent
>>>> things that can be on either level, but maybe we should then deprecate
>>>> using device names as an alias for commands on the BDS level.
>>>>
>>>
>>> How about thinking of it like this: There is only one behavior for the
>>> bitmap creation command. The (node, name) pair used to create the bitmap
>>> will remain a valid address for the bitmap for the duration of its
>>> existence.
>>>
>>> In this way, it doesn't "do something different" for a device vs a node
>>> name. It always does the same thing. Of course, internally, we know a
>>> slightly more nuanced story, but that's just implementation detail.
>> 
>> The logic makes sense and I'm not saying that there is any inherent
>> inconsistence in it. The problem I see is in comparison with other,
>> existing commands that use a different logic (which in itself is
>> consistent as well, but both don't go well together).
>> 
>
> OK, understood.
>
> (I can't help but feel like this might be the only correct way to do
> bitmaps, though.)

I can't help but feel like letting device names name their roots was a
mistake.

It's simple enough if it's really just a funny way to name the root.

But semantic creep has set in, and we're now grappling with necessity
(real or imagined) to move things when the root changes.  That was never
*my* intention.  Or if it was, I was an shortsighted fool ;)

>>>>> For bitmaps created using device names this means floating to the top so
>>>>> that e.g. (drive0, bitmap0) continues to work and address what you think
>>>>> it does.
>>>>>
>>>>> For intermediary nodes or nodes explicitly referenced by their node name
>>>>> instead of their device name, this means it should stick to the node.
>>>>>
>>>>> All of the BlockDirtyBitmap commands use block_dirty_bitmap_lookup for
>>>>> the resolution, which identifies the BDS with bdrv_lookup_bs(node, node,
>>>>> NULL) and then tries to find the bitmap on that singular BDS with
>>>>> bdrv_find_dirty_bitmap(bs, name).
>>>>>
>>>>> For bitmaps created with device names, this means that if the bitmaps
>>>>> don't float up, we actually lose addressability to the bitmap.
>>>>>
>>>>> This may imply a new `bool float` property to be filled in at name
>>>>> resolution time.
>>>>
>>>> There are no floating BDS properties, nor is there, generally speaking,
>>>> a single top towards which they could be floating. Or to be precise,
>>>> there is one currently, but getting rid of that exactly the goal of my
>>>> work, because it is a precondition for allowing multiple BBs on a single
>>>> BDS.
>>>>
>>>
>>> Yeah, understood. In the case that a bitmap needs to float upwards, it
>>> needs to float towards the BB it was named after. I think it's possible
>>> to eliminate that ambiguity. If we follow the "We must be able to access
>>> the bitmap using the (node, name) pair" rule there's only one option for
>>> which node the bitmap floats to -- the root node under the BB it was
>>> created against.
>>>
>>> I think you are against storing pointers to parents in children nodes,
>>> so perhaps we can cache the name of the target the bitmap was created
>>> against and use that information to know if the bitmap needs to migrate
>>> to a new node or not -- and if so, where.
>> 
>> I'm not really only against pointers to parents (though these bring
>> additional problems), but really against anything that allows children
>> to know their parents. We want to change the parents without the
>> children having to care about it.
>> 
>> If something "floats on top", the right design seems to me to implement
>> it in the BlockBackend layer rather than implementing it in the BDS and
>> trying to adjust when the graph changes.
>> 
>
> Yeah, you're right. If we specify a BB name, we should just attach it to
> the BB and be done with it. No tomfoolery.

Re tomfoolery: automatically moving stuff from one BDS to another when
the user reconfigures the graph is a tar pit.  You need to decide what
the "right" move is, and that may involve guessing the user's intent.
There's just too much the user might want to do.

> This does differ from our current approach, which always resolves to a
> node. However, I think it does not change the semantics of the interface
> at all -- (device, name) pairs continue to work as expected. Where we
> attach it internally remains a bit of an implementation detail.
>
> Block query would need to be updated to print bitmaps attached to BBs,
> though, but there should be no client making programmatic use of the
> query function just yet. (At least, libvirt doesn't.)

Just to make sure I'm still with you: are you proposing to support
bitmaps attaching to both BDSes and BBs?

>>>> What this floating really means is that a function will be implemented
>>>> at BlockBackend level instead of BlockDriverState level.
>>>>
>>>> So what I can offer is forbidding to use device names for now,
>>>> restricting dirty bitmaps to the node level. And then you can later add
>>>> some BB level bitmap code to allow device names again.
>>>>
>>>
>>> Not entirely comfortable with "later," unless it's later in the same
>>> pull request. A little bit of breakage might be ok, but an outright
>>> regression might be hard to swallow.
>> 
>> Hm, fair enough. Changing dirty bitmaps to work on both layers is
>> probably out of scope for my series. Doesn't really matter though,
>> because...

Well, you fortuitously broke bitmaps moving to the top "in time for
2.5".  Nobody noticed.  I think that gives you license to break whatever
was broken then differently now, and then some.

>>>>>>   I can imagine use cases for both ways, so the interface that would
>>>>>>   make the most sense to me is to generally keep BDSes at their node,
>>>>>>   and to provide a QMP command to move them to a different one.
>>>>>>
>>>>>>   With compatibility in mind, this seems to be a reall tough one,
>>>>>>   though.
>>>>>
>>>>> I am sure that exceptions, as always, exist. If this breaks any behavior
>>>>> I am more than willing to say that this is simply a necessary bugfix and
>>>>> document accordingly. We do not even have any libvirt support yet, and
>>>>> half of the feature is still being actively concocted right now.
>>>>>
>>>>> I think we have some right to change behavior for this feature still.
>>>>
>>>> That's good news.
>>>>
>>>> Kevin
>>>>
>>>
>>> Post-script:
>>>
>>> It also occurs to me that as the BB graph becomes more fluid that the
>>> "bitmap name is unique per BDS" rule we have may become somewhat
>>> inadequate -- what if a bitmap is floated to another node where there is
>>> a conflict? should the whole operation fail? what if it is too late to fail?
>>>
>>> Maybe we need to enforce /globally/ unique bitmap names for user-created
>>> bitmaps to make sure this kind of tomfoolery doesn't occur...
>>>
>>> ...Well, which might then cause problems with .qcow2 persistence if two
>>> different qcow2s use the same bitmap names and are later used within the
>>> same QEMU instance.
>>>
>>> Maybe there needs to be a little bit of namespace mangling going on for
>>> that feature.
>> 
>> ...here you're really removing my doubts. You probably didn't intend it
>> this way, but what you describe is a very clear sign that letting
>> bitmaps float on the top is a bug and we need to get rid of it.
>> 
>> Seriously, if you need back references from BDS to BB, watch graph
>> modifications above the node, automatically creating new names for
>> bitmaps while taking a snapshot and similar things, then you're doing
>> something wrong. I don't even want to start thinking about persistent
>> bitmaps floating towards the top.
>> 
>
> Fair.
>
>> So I think what we need to do is simple: Keep the bitmaps at the node
>> for which they were created. For the cases where we must change what a
>> bitmap tracks, we can provide a QMP command to move it. If any name
>> change or anything like that is involved, it needs to be explicitly
>> requested by the management tool and not automatically be done by qemu
>> magic.

Makes sense to me.

>> Eric, does this look reasonable for libvirt?
>> 
>> Kevin
>> 
>
> I'll have to think about that -- The way we've been thinking about it is
> to assume that bitmaps created with a device name always "stick to the
> BB," which we've fuzzily implemented as always being attached to the
> root node under said BB. We have not been paying close attention to what
> happens when you provide a node name because we do not have any use
> cases for that yet.
>
> If we introduce the concept that -- if the user is not vigilant -- the
> bitmap may accidentally wind up on some intermediary node, we'll have to
> re-audit our work so far. I don't think this is the right way to go.

Well, when a user meddles with the graph, he better be vigilant.  I
figure the explicit graph-meddling commands will be arcane enough to
deter casual use.

I'm more scared of innocent-looking high-level operations that
reconfigure the graph under the hood.  If the user mixes them with his
own graph meddling, and doesn't fully understand what they do, vigilance
won't save him.

> I think if we want to say something like:
>
> - If the user provides a BB name, the bitmap attaches to the *BB*,
> - If the user provides a node name, the bitmap attaches to the node (and
> will always stay on THAT node. QMP commands to move the bitmap in this
> scenario are appropriate. As long as the node we attach to always keeps
> the name it had when we attached to it, I'm happy.

Now you confused me.  Example: user attaches bitmap to the BDS / node
with node name "Alice".  There's a QMP command to move the bitmap to
another node.  What does it move?  The bitmap?  The bitmap and the node
name?

>                                                    Any action (if any
> exist) that would change the node name need to move the bitmap to
> wherever the name went

Do we want to let users change a node's name?

>                        -- or just refuse to work on nodes with bitmaps
> until the user moves them first. either way.)
>
> that's sufficient to make me happy.
>
> The question is the suitability of the current command.

Ignorant me again: what's the current command?

> Is there a concrete argument against making bitmap-add work in this way?
> Do we *need* a new command, one for BBs and one for BDSs? We already
> overloaded the existing command to work with either -- what's the
> benefit to splitting it out now?
>
> --js

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields(), Kevin Wolf, 2016/03/01
- Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields(), John Snow, 2016/03/01
  - Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields(), Kevin Wolf, 2016/03/02
    - Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields(), Eric Blake, 2016/03/02
    - Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields(), Kevin Wolf, 2016/03/02
    - Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields(), John Snow, 2016/03/02
    - Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields(), Markus Armbruster <=
    - Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields(), John Snow, 2016/03/03
    - Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields(), Markus Armbruster, 2016/03/04
    - Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields(), Kevin Wolf, 2016/03/04
    - Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields(), John Snow, 2016/03/04
    - Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields(), Eric Blake, 2016/03/04
    - Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields(), Kevin Wolf, 2016/03/07
    - Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields(), Kevin Wolf, 2016/03/04
    - Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields(), Markus Armbruster, 2016/03/04
    - Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields(), John Snow, 2016/03/04
    - Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields(), Max Reitz, 2016/03/05

Prev by Date: Re: [Qemu-block] [Qemu-devel] [PATCH RFC v2 1/2] Add param Error** to msi_init() & modify the callers
Next by Date: Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields()
Previous by thread: Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields()
Next by thread: Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields()
Index(es):
- Date
- Thread