qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_field


From: John Snow
Subject: Re: [Qemu-block] block: Dirty bitmaps and COR in bdrv_move_feature_fields()
Date: Fri, 4 Mar 2016 14:53:44 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0

On 03/04/2016 02:49 AM, Markus Armbruster wrote:
> John Snow <address@hidden> writes:
> 
>> On 03/03/2016 02:08 PM, Markus Armbruster wrote:
>>> John Snow <address@hidden> writes:
>>>
>>>> On 03/02/2016 04:31 AM, Kevin Wolf wrote:
>>>>> Am 01.03.2016 um 17:51 hat John Snow geschrieben:
>>>>>>
>>>>>>
>>>>>> On 03/01/2016 04:15 AM, Kevin Wolf wrote:
>>>>>>> Am 01.03.2016 um 00:23 hat John Snow geschrieben:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 02/29/2016 09:36 AM, Kevin Wolf wrote:
>>>>>>>>>
>>>>>>>>> * Dirty bitmaps:
>>>>>>>>>
>>>>>>>>>   We're currently trying, and if I'm not mistaken failing, to move 
>>>>>>>>> dirty
>>>>>>>>>   bitmaps to the top. The (back then one) bitmap was first added to 
>>>>>>>>> the
>>>>>>>>>   list in Paolo's commit a9fc4408, with the following commit message:
>>>>>>>>>
>>>>>>>>>     While these should not be in use at the time a transaction is
>>>>>>>>>     started, a command in the prepare phase of a transaction might 
>>>>>>>>> have
>>>>>>>>>     added them, so they need to be brought over.
>>>>>>>>>
>>>>>>>>>   At that point, there was no transactionable command that did this in
>>>>>>>>>   the prepare phase. Today we have mirror and backup, but op blockers
>>>>>>>>>   should prevent them from being mixed with snapshots in a single
>>>>>>>>>   transaction, so I can't see how this change had any effect.
>>>>>>>>>
>>>>>>>>>   The reason why I think we're failing to move dirty bitmaps to the 
>>>>>>>>> top
>>>>>>>>>   today is that we're moving the head of the list to a different 
>>>>>>>>> object
>>>>>>>>>   without updating the prev link in the first element, so in any case
>>>>>>>>>   it's buggy today.
>>>>>>>>>
>>>>>>>>>   I really would like to keep bitmaps on the BDS where they are, but
>>>>>>>>>   unfortunately, we also have user-defined bitmaps by now, and if we
>>>>>>>>>   change whether they stick with the top level, that's a change that 
>>>>>>>>> is
>>>>>>>>>   visible on the QMP interface.
>>>>>>>>>
>>>>>>>>>   On the other hand, the QMP interface clearly describes bitmaps as
>>>>>>>>>   belonging to a node rather than a BB (you can use node-name, even 
>>>>>>>>> with
>>>>>>>>>   no BB attached), so moving them could be considered a bug, even if
>>>>>>>>>   it is the existing behaviour.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Rule of thumb: follow the node such that the (node, name) pair given at
>>>>>>>> creation time continues to work to identify the bitmap.
>>>
>>> Yet another ignorant / forgetful question: what is a (node, name) pair?
>>> Since it's "given at creation time", I guess it must be something in
>>> QMP.
>>>
>>
>> The node=[node-name|device] and name=[bitmap-name] parameters given at
>> QMP creation. You need both parameters to address the bitmap in the
>> future, so I consider them an address tuple of sorts.
> 
> All right, it's (node-name, bitmap-name).
> 
> Why is bitmap-name insufficient?  It's chosen by the user, so it could
> be made unique.
> 

Currently it's unique only to the node it was attached to. We could add
a global table, but what do we do when a user uses two pre-supplied
drive images that both use "bitmap0" as their bitmap? Reject them?
Rename them?

It wasn't my design choice, but my back-rationalization is that it was
perceived as easier to guarantee uniqueness per-node than it was to
guarantee universally.

Making the names unique per-node, per-graph, per-VM all have different
pros/cons and challenges. Which way is best is currently unclear to me.

>>>>>>> That's an option, though not how most (or probably all) QMP commands
>>>>>>> work today. Currently, we have two types of commands:
>>>>>>>
>>>>>>> * BlockBackend level
>>>>>>>   These accept only device names. Specifying a node-name is an error.
>>>>>>>
>>>>>>> * BlockDriverState level
>>>>>>>   The preferred addressing for these is node-names, but they accept
>>>>>>>   device names as aliases for the corresponding root node.
>>>>>>>
>>>>>>> Adding a third category that accepts both as well, but has a semantic
>>>>>>> difference between device name and node-name might end up confusing. I'm
>>>>>>> not totally against it if we think that it's the best way to represent
>>>>>>> things that can be on either level, but maybe we should then deprecate
>>>>>>> using device names as an alias for commands on the BDS level.
>>>>>>>
>>>>>>
>>>>>> How about thinking of it like this: There is only one behavior for the
>>>>>> bitmap creation command. The (node, name) pair used to create the bitmap
>>>>>> will remain a valid address for the bitmap for the duration of its
>>>>>> existence.
>>>>>>
>>>>>> In this way, it doesn't "do something different" for a device vs a node
>>>>>> name. It always does the same thing. Of course, internally, we know a
>>>>>> slightly more nuanced story, but that's just implementation detail.
>>>>>
>>>>> The logic makes sense and I'm not saying that there is any inherent
>>>>> inconsistence in it. The problem I see is in comparison with other,
>>>>> existing commands that use a different logic (which in itself is
>>>>> consistent as well, but both don't go well together).
>>>>>
>>>>
>>>> OK, understood.
>>>>
>>>> (I can't help but feel like this might be the only correct way to do
>>>> bitmaps, though.)
>>>
>>> I can't help but feel like letting device names name their roots was a
>>> mistake.
>>>
>>
>> Had I realized it would lead to this, I would not have gone forward with
>> it. I followed the path of least resistance for how to identify the
>> drive structural objects which were very much in flux at the time.
>>
>> And I was very new. So was BlockBackend.
>>
>> Problems Problems.
>>
>> (Hilariously, I think that using the "either drive OR node-name but not
>> both" approach that everyone hated would be precisely the most
>> semantically accurate right now.)
> 
> My memory might be busy falsifying itself to support the idea that I've
> always been right, but to the degree it still serves: I don't remember
> hating the *concept*, but I do remember hating its realization as "must
> give exactly one of two arguments" constraint (ugly, not
> introspectable), and doubts about its realization as single, overloaded
> argument.
> 

You're off the hook, because I won't go look it up. I'm not that motivated.

> It's not the first time we've painted ourselves into a corner that way.
> As long as we habitually cast every new interface in stone at first
> release, we're setting us up for painful and expensive failure.  We need
> to accept and embrace the fact that complex features take time to
> analyze and implement.  Start with an experimental interface, and let
> things bake properly.  Bless the interface it when the feature is ready
> for use.  We'll get "it's complex" and "it's ready" wrong, too, but
> hopefully less often.
> 

Yeah, we should have done x-block-dirty-bitmap-add until we had all of
the features fully cooked -- though in this particular case, the new
wrinkle came from elsewhere in QEMU and not a new case prompted by
bitmaps themselves.

Lessons for the future ...

>>> It's simple enough if it's really just a funny way to name the root.
>>>
>>> But semantic creep has set in, and we're now grappling with necessity
>>> (real or imagined) to move things when the root changes.  That was never
>>> *my* intention.  Or if it was, I was an shortsighted fool ;)
>>>
>>>>>>>> For bitmaps created using device names this means floating to the top 
>>>>>>>> so
>>>>>>>> that e.g. (drive0, bitmap0) continues to work and address what you 
>>>>>>>> think
>>>>>>>> it does.
>>>>>>>>
>>>>>>>> For intermediary nodes or nodes explicitly referenced by their node 
>>>>>>>> name
>>>>>>>> instead of their device name, this means it should stick to the node.
>>>>>>>>
>>>>>>>> All of the BlockDirtyBitmap commands use block_dirty_bitmap_lookup for
>>>>>>>> the resolution, which identifies the BDS with bdrv_lookup_bs(node, 
>>>>>>>> node,
>>>>>>>> NULL) and then tries to find the bitmap on that singular BDS with
>>>>>>>> bdrv_find_dirty_bitmap(bs, name).
>>>>>>>>
>>>>>>>> For bitmaps created with device names, this means that if the bitmaps
>>>>>>>> don't float up, we actually lose addressability to the bitmap.
>>>>>>>>
>>>>>>>> This may imply a new `bool float` property to be filled in at name
>>>>>>>> resolution time.
>>>>>>>
>>>>>>> There are no floating BDS properties, nor is there, generally speaking,
>>>>>>> a single top towards which they could be floating. Or to be precise,
>>>>>>> there is one currently, but getting rid of that exactly the goal of my
>>>>>>> work, because it is a precondition for allowing multiple BBs on a single
>>>>>>> BDS.
>>>>>>>
>>>>>>
>>>>>> Yeah, understood. In the case that a bitmap needs to float upwards, it
>>>>>> needs to float towards the BB it was named after. I think it's possible
>>>>>> to eliminate that ambiguity. If we follow the "We must be able to access
>>>>>> the bitmap using the (node, name) pair" rule there's only one option for
>>>>>> which node the bitmap floats to -- the root node under the BB it was
>>>>>> created against.
>>>>>>
>>>>>> I think you are against storing pointers to parents in children nodes,
>>>>>> so perhaps we can cache the name of the target the bitmap was created
>>>>>> against and use that information to know if the bitmap needs to migrate
>>>>>> to a new node or not -- and if so, where.
>>>>>
>>>>> I'm not really only against pointers to parents (though these bring
>>>>> additional problems), but really against anything that allows children
>>>>> to know their parents. We want to change the parents without the
>>>>> children having to care about it.
>>>>>
>>>>> If something "floats on top", the right design seems to me to implement
>>>>> it in the BlockBackend layer rather than implementing it in the BDS and
>>>>> trying to adjust when the graph changes.
>>>>>
>>>>
>>>> Yeah, you're right. If we specify a BB name, we should just attach it to
>>>> the BB and be done with it. No tomfoolery.
>>>
>>> Re tomfoolery: automatically moving stuff from one BDS to another when
>>> the user reconfigures the graph is a tar pit.  You need to decide what
>>> the "right" move is, and that may involve guessing the user's intent.
>>> There's just too much the user might want to do.
>>>
>>>> This does differ from our current approach, which always resolves to a
>>>> node. However, I think it does not change the semantics of the interface
>>>> at all -- (device, name) pairs continue to work as expected. Where we
>>>> attach it internally remains a bit of an implementation detail.
>>>>
>>>> Block query would need to be updated to print bitmaps attached to BBs,
>>>> though, but there should be no client making programmatic use of the
>>>> query function just yet. (At least, libvirt doesn't.)
>>>
>>> Just to make sure I'm still with you: are you proposing to support
>>> bitmaps attaching to both BDSes and BBs?
>>>
>>
>> Well, you'd never have to move them.
>>
>> (Yes, that is what I am proposing.)
> 
> Got it.
> 
>> From the viewpoint of: BlockBackends and BlockDriverState objects
>> provide different semantic relations to the drive as a whole. Graph-wide
>> relationships belong with the BB, driver-specific or filter-specific
>> relationships belong with the BDS.
>>
>> Arguably, we want and need bitmaps in both scenarios, no? We're fuzzing
>> drive-wide semantics currently by attaching to the root node, but if
>> that bit of hackery goes away, we'll have to go for the Real McCoy.
> 
> Can you give a use case for a bitmap attached to a BDS, and another one
> for BB?
> 

BB: Drive-wide dirty bitmap tracking. We don't really care where these
writes or going or how, but we know something somewhere changed and this
is data that needs to be copied out to create a worthwhile incremental
backup. This is the vanilla feature.

BDS: layer-specific dirty tracking: We have our drive subdivided in a
specialized way and we want to track writes only to one particular layer
-- either for debugging or analysis reasons, or to create differentials
of individual layers.

I don't have an /explicit/ use case for the BDS version, so if I lose
support for this for 2.6, I won't lose any sleep. It's the drive-wide
version I want to preserve the most. (Without worrying about semantic or
literal API compatibility -- I just want to keep the feature around.)

As the graph manipulation becomes more robust, use cases may evolve (so
let's keep the door unlocked) but I am ill-equipped to reason
comprehensively about those cases now as our filtering/job situation
with dynamic reconfigurations is not a solved problem yet -- so it's
hard to actually say what the use will be.

(So I'm okay with dropping node support for now to avoid making a
mistake in how we implement it.)

>>>>>>> What this floating really means is that a function will be implemented
>>>>>>> at BlockBackend level instead of BlockDriverState level.
>>>>>>>
>>>>>>> So what I can offer is forbidding to use device names for now,
>>>>>>> restricting dirty bitmaps to the node level. And then you can later add
>>>>>>> some BB level bitmap code to allow device names again.
>>>>>>>
>>>>>>
>>>>>> Not entirely comfortable with "later," unless it's later in the same
>>>>>> pull request. A little bit of breakage might be ok, but an outright
>>>>>> regression might be hard to swallow.
>>>>>
>>>>> Hm, fair enough. Changing dirty bitmaps to work on both layers is
>>>>> probably out of scope for my series. Doesn't really matter though,
>>>>> because...
>>>
>>> Well, you fortuitously broke bitmaps moving to the top "in time for
>>> 2.5".  Nobody noticed.  I think that gives you license to break whatever
>>> was broken then differently now, and then some.
>>>
>>
>> Yeah, there's no clear "what to do on reconfiguration" behavior. It
>> hasn't been consistent for any number of releases. (...All two!)
>>
>> What does work, though, is using device names as targets to attach
>> bitmaps to, and I would hate to see that break if I can avoid it.
> 
> Is anybody using it in anger, yet?  If yes, can we ask for forgiveness?
> 

"Probably nobody is using it," but explained in my previous reply (below
here) is that the feature currently only works with drives and _not_
nodes in its entirety, so literally the only thing I don't want to break
is the "device name" semantics, because that's the only way to even use
the feature right now.

> I'm not saying you *should* break things.  I'm merely challenging the
> "must not break anything" dogma.  Let's be pragmatic.
> 

Am I not being a pragmatist? I thought that we could keep the command
but refine the semantics, which is an API compatibility break. Looks
like there's some pushback against that, so I can move on to breaking
things even more. I am being conservative, perhaps, but I am avoiding
staking solid ground.

If we need to REALLY break the commands, though, I'm fine with that.
Really! I probably look like I'm putting up a big stink here, but I just
want to make sure everyone is on the same page and operating with the
same knowledge before we bust out the scalpel and make more changes
we'll hate and regret come 2.7.

>>>>>>>>>   I can imagine use cases for both ways, so the interface that would
>>>>>>>>>   make the most sense to me is to generally keep BDSes at their node,
>>>>>>>>>   and to provide a QMP command to move them to a different one.
>>>>>>>>>
>>>>>>>>>   With compatibility in mind, this seems to be a reall tough one,
>>>>>>>>>   though.
>>>>>>>>
>>>>>>>> I am sure that exceptions, as always, exist. If this breaks any 
>>>>>>>> behavior
>>>>>>>> I am more than willing to say that this is simply a necessary bugfix 
>>>>>>>> and
>>>>>>>> document accordingly. We do not even have any libvirt support yet, and
>>>>>>>> half of the feature is still being actively concocted right now.
>>>>>>>>
>>>>>>>> I think we have some right to change behavior for this feature still.
>>>>>>>
>>>>>>> That's good news.
>>>>>>>
>>>>>>> Kevin
>>>>>>>
>>>>>>
>>>>>> Post-script:
>>>>>>
>>>>>> It also occurs to me that as the BB graph becomes more fluid that the
>>>>>> "bitmap name is unique per BDS" rule we have may become somewhat
>>>>>> inadequate -- what if a bitmap is floated to another node where there is
>>>>>> a conflict? should the whole operation fail? what if it is too late to 
>>>>>> fail?
>>>>>>
>>>>>> Maybe we need to enforce /globally/ unique bitmap names for user-created
>>>>>> bitmaps to make sure this kind of tomfoolery doesn't occur...
>>>>>>
>>>>>> ...Well, which might then cause problems with .qcow2 persistence if two
>>>>>> different qcow2s use the same bitmap names and are later used within the
>>>>>> same QEMU instance.
>>>>>>
>>>>>> Maybe there needs to be a little bit of namespace mangling going on for
>>>>>> that feature.
>>>>>
>>>>> ...here you're really removing my doubts. You probably didn't intend it
>>>>> this way, but what you describe is a very clear sign that letting
>>>>> bitmaps float on the top is a bug and we need to get rid of it.
>>>>>
>>>>> Seriously, if you need back references from BDS to BB, watch graph
>>>>> modifications above the node, automatically creating new names for
>>>>> bitmaps while taking a snapshot and similar things, then you're doing
>>>>> something wrong. I don't even want to start thinking about persistent
>>>>> bitmaps floating towards the top.
>>>>>
>>>>
>>>> Fair.
>>>>
>>>>> So I think what we need to do is simple: Keep the bitmaps at the node
>>>>> for which they were created. For the cases where we must change what a
>>>>> bitmap tracks, we can provide a QMP command to move it. If any name
>>>>> change or anything like that is involved, it needs to be explicitly
>>>>> requested by the management tool and not automatically be done by qemu
>>>>> magic.
>>>
>>> Makes sense to me.
>>>
>>>>> Eric, does this look reasonable for libvirt?
>>>>>
>>>>> Kevin
>>>>>
>>>>
>>>> I'll have to think about that -- The way we've been thinking about it is
>>>> to assume that bitmaps created with a device name always "stick to the
>>>> BB," which we've fuzzily implemented as always being attached to the
>>>> root node under said BB. We have not been paying close attention to what
>>>> happens when you provide a node name because we do not have any use
>>>> cases for that yet.
>>>>
>>>> If we introduce the concept that -- if the user is not vigilant -- the
>>>> bitmap may accidentally wind up on some intermediary node, we'll have to
>>>> re-audit our work so far. I don't think this is the right way to go.
>>>
>>> Well, when a user meddles with the graph, he better be vigilant.  I
>>> figure the explicit graph-meddling commands will be arcane enough to
>>> deter casual use.
>>>
>>
>> If the user is doing explicit graph-meddling, yes. If it's implicit
>> graph meddling as a result from a macro command (A job, for example) I
>> think it's wrong to expect fiddling.
>>
>> So, for example, if we create strange intermediary nodes for the
>> purposes of mirroring, backups or so on, I am fervently against
>> requiring the user to manage bitmap placement explicitly in reaction to
>> implementation detail. The user cannot hope to get this kind of
>> configuration correct if they don't even know what's going on in the
>> job's black box.
> 
> Agreed.
> 
>> However, if the user themselves is using low-level QMP primitives to
>> fiddle with the graph, great. You have opened Pandora's Turing Machine
>> and you are on your own. (Bitmap manipulation primitives here will of
>> course be useful.)
>>
>>> I'm more scared of innocent-looking high-level operations that
>>> reconfigure the graph under the hood.  If the user mixes them with his
>>> own graph meddling, and doesn't fully understand what they do, vigilance
>>> won't save him.
>>>
>>>> I think if we want to say something like:
>>>>
>>>> - If the user provides a BB name, the bitmap attaches to the *BB*,
>>>> - If the user provides a node name, the bitmap attaches to the node (and
>>>> will always stay on THAT node. QMP commands to move the bitmap in this
>>>> scenario are appropriate. As long as the node we attach to always keeps
>>>> the name it had when we attached to it, I'm happy.
>>>
>>> Now you confused me.  Example: user attaches bitmap to the BDS / node
>>> with node name "Alice".  There's a QMP command to move the bitmap to
>>> another node.  What does it move?  The bitmap?  The bitmap and the node
>>> name?
>>>
>>
>> User creates bitmap "foobar" on BDS "Alice". The user-visible address
>> for accessing this bitmap is now and forever (Alice, foobar).
> 
> What if the BDS doesn't have a node-name?
> 

Well, you can't attach the bitmap to a node you can't identify. Sort of;
If that BDS does not have a name and exists as the root node under a BB,
you can use the device/BB name to attach to that node.

An unintended consequence is that if the root node also has its own name
and you happen to know both the BB and the BDS name, you can use either
interchangeably, but that wasn't the intent. The intent was that you
always use the same pair consistently, a fixed address.

>> If the bitmap were ever to migrate to a new node, this means the
>> user-visible address has been invalidated, which is a leading cause of
>> sadness among digital natives.
>>
>> What I mean to say with the above is that as long as the bitmap stays
>> attached to the node, and the node always keeps that name, I am happy.
>> If, for whatever ungodly reason, we incur a graph reconfiguration where
>> /names are swapped/, the bitmap needs to follow the name instead of the
>> node.
> 
> Got it.
> 
>> "Can that happen?"
>>
>> vvvvvv
>>
>>>>                                                    Any action (if any
>>>> exist) that would change the node name need to move the bitmap to
>>>> wherever the name went
>>>
>>> Do we want to let users change a node's name?
>>>
>>
>> **I sure as hell hope not**, I just wanted to be explicitly clear about
>> what invariant I am trying to maintain here, which is bitmap addressibility.
>>
>> "What's broken, then?"
>>
>> Device names -- if the user uses a device name, it goes to the root
>> node, but the address is through the BB. Graph reconfiguration may
>> actually invalidate this address.
>>
>> "Let's just get rid of creating bitmaps against device names"
>>
>> This is what Kevin is proposing, if I am reading him correctly. It's a
>> good idea overall, but the doubt I have is over the loss of the semantic
>> pairing of a bitmap with a /drive/ versus pairing with a node.
> 
> If there is a use case where the bitmap should stay at the top (whatever
> "top" may be), the natural way to support it would be attaching it to
> the BB (which also disambiguates "top").
> 
> If we attach bitmaps to BBs, then interpreting BB names
> (a.k.a. "device") as BB names makes sense.
> 
> Else, interpreting BB names as sugar for the current root BDS
> (a.k.a. "node") might be convenient.  Especially when the node doesn't
> have a name.  It might also be confusing, because unlike a node name,
> which always names the same BDS, the device name may name different
> BDSes over time.
> 

This is a good insight: If the user literally explicitly wants the root
node (and not the drive), that's ambiguous and hard to do unless you
know the name. If it's nameless, you're SOL.

However, I can't think of a use case off the top of my head where you'd
have a nameless root node (Which I will translate as "A node created
automatically by QEMU during the construction of the graph and not one
explicitly configured by the user for some purpose) that is meaningfully
different from the concept of the "drive as a whole."

I think if there is ever any semantic meaning or worth to attaching to
the root node explicitly, I think the user will (should?) always have a
name for it.

> I've grown wary of prematurely added convenience features.  Too often,
> they become inconvenience misfeatures as our understanding of the
> problem deepens.
> 
>> Casual use does not care about which actual node in the storage graph is
>> performing the IO, we care about the abstract concept of the drive.
>>
>> All of the existing documentation uses drive names to reference the
>> creation and management of bitmaps, and the only supported interface for
>> consuming the bitmaps is /drive-backup/, which again only takes
>> backends. blockdev-backup does not currently support naming a bitmap or
>> the incremental backup mode.
>>
>> So it would be odd to tilt the only add mechanism in favor of the node,
>> when the only usage mechanism uses devices exclusively.
> 
> We're talking about existing interfaces.  We should be talking about use
> cases.
> 

Well, I am replying to Kevin's stated desire to rework the command to
support node semantics only. I'm arguing against that.

>> This is why I wonder if we can't attach bitmaps to both BlockBackends
>> *and* BDS nodes, based on what the user provides in the
>> block-dirty-bitmap-add command.
> 
> Forget about the existing interfaces.  Analyze the problem, explore the
> solution space, and only then figure out what to do for backwards
> compatibility.  Anything else hobbles your mind.
> 

I am using the existing interfaces as a starting point to reason about
what new interface we want. I felt that perhaps some minimal breakage in
semantics but not in literal QAPI was possible, hence my bringing them up.

>>>>                        -- or just refuse to work on nodes with bitmaps
>>>> until the user moves them first. either way.)
>>>>
>>>> that's sufficient to make me happy.
>>>>
>>>> The question is the suitability of the current command.
>>>
>>> Ignorant me again: what's the current command?
>>>
>>
>> void qmp_block_dirty_bitmap_add
>> (const char *node, const char *name,
>> bool has_granularity, uint32_t granularity, Error **errp)
>>
>> If you use a device name, it resolves to the root node, or you can
>> specify an actual node name.
> 
> Parameter name "node" strongly suggests that the bitmap is attached to
> the (fixed) node, not to the (variable) root node of the device.
> 

Yes. By naming this parameter "node," accepting a device name is
semantically a mistake -- since the intent was that the bitmap describes
the whole graph if you specify a device name. I think at the time I was
not aware that the root node of a graph might later NOT be the root
node, so I confused the usages.

The rest of the documentation relies pretty heavily on drive/device
semantics, so the "node" name here is pretty much only a mistake. My
belief was that if you specified a device name (which resolves to a node
internally) that it was reasonable to expect that this bitmap always
described what that device name describes: the whole graph.

Resolving to a node in this case is obviously a mistake if we consider
the later insertion of additional layers above.

If we staple the bitmap to the node it lands up on permanently (even if
we specified a device name), graph reconfiguration will invalidate the
address of the bitmap.

The API is therefore implying two competing and exclusive things. Good
reason to change it.

> Its documentation is unhelpful, though: @node: name of device/node which
> the bitmap is tracking.
> 
>>>> Is there a concrete argument against making bitmap-add work in this way?
>>>> Do we *need* a new command, one for BBs and one for BDSs? We already
>>>> overloaded the existing command to work with either -- what's the
>>>> benefit to splitting it out now?
>>>>
>>>> --js
>>
>> I am hereby proposing as the entirety of my grand vision:
>>
>> 1) Allow bitmaps to attach to BB or BDS
>> 2) Modify block-dirty-bitmap-add to attach to either BB or BDS based on
>> the name given without modifying the QMP API
> 
> Semantic change.  Always awkward, but since it's been broken for a
> release without anybody noticing...
> 

Depends on how much breakage we want. To make things less confusing, we
could scrap the "node" parameter entirely to make it more clear what's
going on (and what to expect on reconfiguration.)

I think if magic is to be avoided, we have to break the API. Otherwise,
this is probably the only reasonable way to keep the command.

>> 3) Remove all bitmap migration code from reconfiguration paths. Do not
>> look back.
>> 4) Sleep soundly.
>>
>> I think this differs from Kevin's plan only in my desire to just re-use
>> the existing API as dual-purpose for both node-and-device semantics.
>>
>> I would also really be OK with replacing the "node=" parameter with a
>> "device or node-name but not both" flavor if some API breakage is
>> desired, but I think having two entirely separate commands might be a
>> bit much.
> 
> We need to make up our minds: do we want to continue conflating BB and
> BDS in the external interface, or do we want to separate them?  This
> goes beyond just bitmaps.
> 

In replying to you this time, I found out that I'm OK with scrapping
node support until we have our story straight. It's not useful for
anything yet, I doubt anyone is using it, and it is just confusing
everything right now.

So let me adjust my story:

I'm fine with bitmap manipulation commands that work only for "devices"
(with implementation on the BB) with the design goal of leaving an
obvious extension for adding support for nodes once we have a solid game
plan for them.

I am in favor of making the choice to choose a BDS or a BB more explicit
in the interface -- the traditional way is the
device-or-node-name-but-not-both approach, which is not well liked but
it is at least explicitly clear in this case.

We could also do something like...

bitmap-add-to-node   node-name, name, ...
bitmap-add-to-drive  device, name, ...

which I feel is a little verbose but is explicit, introspectable, etc.
Maybe it's the only right thing.

--js

>> Also feel that bitmap-moving commands are not important until
>> blockdev-backup supports per-node incremental backups and we document
>> and publish any use cases outside of our full-drives-only story that we
>> currently peddle.
> 
> We shouldn't add features without an actual use.
> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]