qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_field


From: John Snow
Subject: Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields()
Date: Tue, 1 Mar 2016 11:51:35 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0


On 03/01/2016 04:15 AM, Kevin Wolf wrote:
> Am 01.03.2016 um 00:23 hat John Snow geschrieben:
>>
>>
>> On 02/29/2016 09:36 AM, Kevin Wolf wrote:
>>> Hi all,
>>>
>>> I'm currently trying to get rid of bdrv_move_feature_fields(), so we can
>>> finally have more than one BB per BDS. Generally the way to do this is
>>> to move features from BDS and block.c to BB and block-backend.c.
>>> However, for two of the features I'm not sure about this:
>>>
>>> * Copy on Read:
>>>
>>>   When Jeff introduced bdrv_append() in commit 8802d1fd, the CoR flag
>>>   was already moved to the new top level when taking a snapshot. Does
>>>   anyone remember why it works like that? It doesn't seem to make a lot
>>>   of sense to me.
>>>
>>>   The use case for manually enabled CoR is to avoid reading data twice
>>>   from a slow remote image, so we want to save it to a local overlay,
>>>   say an ISO image accessed via HTTP to a local qcow2 overlay. When
>>>   taking a snapshot, we end up with a backing chain like this:
>>>
>>>       http <- local.qcow2 <- snap_overlay.qcow2
>>>
>>>   There is no point in performing copy on read from local.qcow2 into
>>>   snap_overlay.qcow2, we just want to keep copying data from the remote
>>>   source into local.qcow2.
>>>
>>>   Possible caveat: We would be writing to a backing file, but that's
>>>   similar to what some block jobs do, so if we design our op blockers to
>>>   cover this case, it should be fine.
>>>
>>>   I'm actually pretty sure that simply removing COR from the list, and
>>>   therefore changing the behaviour to not move it to the top any more,
>>>   is the right thing to do and could be considered a bug fix.
>>>
>>> * Dirty bitmaps:
>>>
>>>   We're currently trying, and if I'm not mistaken failing, to move dirty
>>>   bitmaps to the top. The (back then one) bitmap was first added to the
>>>   list in Paolo's commit a9fc4408, with the following commit message:
>>>
>>>     While these should not be in use at the time a transaction is
>>>     started, a command in the prepare phase of a transaction might have
>>>     added them, so they need to be brought over.
>>>
>>>   At that point, there was no transactionable command that did this in
>>>   the prepare phase. Today we have mirror and backup, but op blockers
>>>   should prevent them from being mixed with snapshots in a single
>>>   transaction, so I can't see how this change had any effect.
>>>
>>>   The reason why I think we're failing to move dirty bitmaps to the top
>>>   today is that we're moving the head of the list to a different object
>>>   without updating the prev link in the first element, so in any case
>>>   it's buggy today.
>>>
>>>   I really would like to keep bitmaps on the BDS where they are, but
>>>   unfortunately, we also have user-defined bitmaps by now, and if we
>>>   change whether they stick with the top level, that's a change that is
>>>   visible on the QMP interface.
>>>
>>>   On the other hand, the QMP interface clearly describes bitmaps as
>>>   belonging to a node rather than a BB (you can use node-name, even with
>>>   no BB attached), so moving them could be considered a bug, even if
>>>   it is the existing behaviour.
>>>
>>
>> Rule of thumb: follow the node such that the (node, name) pair given at
>> creation time continues to work to identify the bitmap.
> 
> That's an option, though not how most (or probably all) QMP commands
> work today. Currently, we have two types of commands:
> 
> * BlockBackend level
>   These accept only device names. Specifying a node-name is an error.
> 
> * BlockDriverState level
>   The preferred addressing for these is node-names, but they accept
>   device names as aliases for the corresponding root node.
> 
> Adding a third category that accepts both as well, but has a semantic
> difference between device name and node-name might end up confusing. I'm
> not totally against it if we think that it's the best way to represent
> things that can be on either level, but maybe we should then deprecate
> using device names as an alias for commands on the BDS level.
> 

How about thinking of it like this: There is only one behavior for the
bitmap creation command. The (node, name) pair used to create the bitmap
will remain a valid address for the bitmap for the duration of its
existence.

In this way, it doesn't "do something different" for a device vs a node
name. It always does the same thing. Of course, internally, we know a
slightly more nuanced story, but that's just implementation detail.

>> For bitmaps created using device names this means floating to the top so
>> that e.g. (drive0, bitmap0) continues to work and address what you think
>> it does.
>>
>> For intermediary nodes or nodes explicitly referenced by their node name
>> instead of their device name, this means it should stick to the node.
>>
>> All of the BlockDirtyBitmap commands use block_dirty_bitmap_lookup for
>> the resolution, which identifies the BDS with bdrv_lookup_bs(node, node,
>> NULL) and then tries to find the bitmap on that singular BDS with
>> bdrv_find_dirty_bitmap(bs, name).
>>
>> For bitmaps created with device names, this means that if the bitmaps
>> don't float up, we actually lose addressability to the bitmap.
>>
>> This may imply a new `bool float` property to be filled in at name
>> resolution time.
> 
> There are no floating BDS properties, nor is there, generally speaking,
> a single top towards which they could be floating. Or to be precise,
> there is one currently, but getting rid of that exactly the goal of my
> work, because it is a precondition for allowing multiple BBs on a single
> BDS.
> 

Yeah, understood. In the case that a bitmap needs to float upwards, it
needs to float towards the BB it was named after. I think it's possible
to eliminate that ambiguity. If we follow the "We must be able to access
the bitmap using the (node, name) pair" rule there's only one option for
which node the bitmap floats to -- the root node under the BB it was
created against.

I think you are against storing pointers to parents in children nodes,
so perhaps we can cache the name of the target the bitmap was created
against and use that information to know if the bitmap needs to migrate
to a new node or not -- and if so, where.

> What this floating really means is that a function will be implemented
> at BlockBackend level instead of BlockDriverState level.
> 
> So what I can offer is forbidding to use device names for now,
> restricting dirty bitmaps to the node level. And then you can later add
> some BB level bitmap code to allow device names again.
> 

Not entirely comfortable with "later," unless it's later in the same
pull request. A little bit of breakage might be ok, but an outright
regression might be hard to swallow.

>>>   I can imagine use cases for both ways, so the interface that would
>>>   make the most sense to me is to generally keep BDSes at their node,
>>>   and to provide a QMP command to move them to a different one.
>>>
>>>   With compatibility in mind, this seems to be a reall tough one,
>>>   though.
>>
>> I am sure that exceptions, as always, exist. If this breaks any behavior
>> I am more than willing to say that this is simply a necessary bugfix and
>> document accordingly. We do not even have any libvirt support yet, and
>> half of the feature is still being actively concocted right now.
>>
>> I think we have some right to change behavior for this feature still.
> 
> That's good news.
> 
> Kevin
> 

Post-script:

It also occurs to me that as the BB graph becomes more fluid that the
"bitmap name is unique per BDS" rule we have may become somewhat
inadequate -- what if a bitmap is floated to another node where there is
a conflict? should the whole operation fail? what if it is too late to fail?

Maybe we need to enforce /globally/ unique bitmap names for user-created
bitmaps to make sure this kind of tomfoolery doesn't occur...

...Well, which might then cause problems with .qcow2 persistence if two
different qcow2s use the same bitmap names and are later used within the
same QEMU instance.

Maybe there needs to be a little bit of namespace mangling going on for
that feature.

--js



reply via email to

[Prev in Thread] Current Thread [Next in Thread]