qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Review of monitor commands identifying BDS / BB by name


From: Kevin Wolf
Subject: Re: [Qemu-devel] Review of monitor commands identifying BDS / BB by name
Date: Wed, 3 Dec 2014 11:30:41 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

[ CCed Benoît and Max, this is blockdev work ]
[ CCed Jeff, we're also talking about op blockers ]

Not stripping quoted text for their convenience.

Am 02.12.2014 um 20:06 hat Markus Armbruster geschrieben:
> = Introduction =
> 
> The block layer and its monitor commands have evolved, resulting in a
> bit of a mess when it comes to referring to block layer objects in
> commands.
> 
> Semantically, we refer to two different kinds of objects: backends and
> nodes within backends.
> 
> Example: eject parameter @device names a backend.
> 
> Example: change-backing-file parameter @image-node-name names a node.
> 
> Backend and node names share a name space.  Names are unique.
> Corollary: a name unambiguously identifies either a backend, or a node,
> or nothing.
> 
> Example: blockdev-add parameter @options is a union with discriminator
> "driver".  With its value is "raw", member "file" is an anonymous union.
> Its string value can name either a backend or a node.
> 
> Node names are a fairly recent feature.  Before, we referenced nodes by
> their file name.  Pretty schlocky.  Should be replaced by node name.
> 
> Example: block-commit parameter @base is the "file name of the backing
> image to write data into".  In other words, it identifies a node.  We
> should add a node name parameter, and deprecate @base.
> 
> Some commands predating the node name feature can reference only
> backends even though nodes could make sense, too.  Such restrictions
> should be lifted.  Others reference backends where only nodes make
> sense.  Should be cleaned up.
> 
> Example: drive-mirror parameter @device names a backend.  This restricts
> mirroring to backends.  If we want to support mirroring nodes, we need
> to extend the command to permit referencing nodes.
> 
> Example: block_passwd parameter @device names a backend.  However,
> backends aren't encryped, nodes are.  In 2.0, we cleaned this up: we
> added parameter @node-name.  We kept @device for backward compatibility.
> Either @device or @node-name must be given.
> 
> Note: in block_passwd, we have separate parameters for backend and node
> name.  In the blockdev-add example above, we have only one, and use its
> value to figure out what it is.
> 
> I find this inconsistency rather ugly.  We discussed cleaning it up in
> the "BB name vs. BDS name in the QAPI schema" thread.
> 
> A backend has a tree of nodes.  We call the tree's root the backend's
> root node.
> 
> For convenience, we sometimes accept a backend name when a node name is
> wanted, and resolve it to the backend's root node.
> 
> Example: block_passwd again; it's how its @device parameter works.
> 
> Recent development (blockdev-add) permits nodes to be shared by multiple
> backends.  Op blockers should ensure the sharing is safe.  I wouldn't
> bet a nickel on this to work today.
> 
> 
> = Block layer data structures =
> 
> A backend is represented by a BlockBackend object (short: BB).
> 
> A node is represented by a BlockDriverState object (short: BDS).
> 
> A backend (BB) has a tree of nodes (BDSes).
> 
> Two of a nodes children carry special meaning: the one stored in BDS
> member file (sometimes called "parent"), and the one stored in BDS
> member backing_hd (sometimes called "backing").  These used to be the
> only children a node could have.
> 
> A BB always has a name.  We sometimes call it device name.  Stored in BB
> member name.
> 
> A BDS may have a name.  We call it node name.  Stored in BDS member
> node_name[], empty when the node has no name.
> 
> A BDS has a file name.  The actual matching of a command's file name
> argument against a BDS's file name is complex, but we don't have to
> worry about that here.
> 
> BBs are a fairly recent invention.  Before, a backend was represented by
> a BDS with a non-empty member device_name[].
> 
> 
> = Commands =
> 
> I'm going to discuss all QMP commands whose handlers are defined in
> block*.  To make sense of it, you'll probably have to peruse the command
> documentation in the schema and/or qmp-commands.hx.
> 
> When I think it's fairly obvious what needs to be done for a command, I
> write it down as TODO.  Please challenge it if you think I'm wrong.
> 
> When it's not so obvious, I write it down as questions.  Answers
> appreciated.
> 
> == block-core.json ==
> 
> * block-commit
> 
>   @device names a backend, @top and @base each name one of its nodes via
>   file name matching.
> 
>   TODO: support specifying the two nodes via node name.
> 
>   Why do we need @device when a top node is specified?
> 
>   * We currently need the backend to set op blockers, and finding a
>     node's backend isn't easy.  Implementation detail shows through the
>     interface, blech.
> 
>   * We need to know whether the top node is the active layer.
> 
>     Complication: if it's shared by multiple backends, it may be the
>     active layer in one but not the other.  Specifying the backend
>     resolves the ambiguity.  Whether that makes sense I don't know.

It doesn't.

The real requirement for the commit job (for inactive layers) is that
base...top (I'm using git-rev-parse syntax for describing nodes and
subtrees from here on) is read-only for the duration of the commit
operation. For committing the active layer, we use a mirror job
internally, which works for images that are written during the
operation.

In theory, the mirror should work in all cases, it's just less efficient
(and the implementation of the completion code which reconfigures the
tree would have to be changed). Jeff, is this correct?

What commit should be doing:
- Check whether it can block (as in op blockers) writes to top, i.e. no
  other user is currently requiring the ability to write
- If it can, block writes on base...top and start a commit job
- If it can't, block writes on base...top^ and start a mirror job
- If it can't block at least base...top^, fail

This automatically solves the problem of multiple parents, as long as
these parents advertise correctly which op blocker capabilities they are
using (no, they don't, and we don't have the infrastructure for them to
do so yet - but I think it's becoming clearer what it would have to look
like).

> * block-job-cancel
> * block-job-complete
> * block-job-pause
> * block-job-resume
> * block-job-set-speed
> 
>   @device names a backend.
> 
>   The question whether we need to support a node name here is moot,
>   because jobs shouldn't be tied to a single backend (or node) in the
>   first place.  They should have their own, independent job name.

Correct.

TODO: add a @name argument to commands starting block jobs
TODO: add a @job-name argument to these job management commands and
      deprecate @device

How to get rid of bs->job to actually allow multiple jobs on one BDS and
make the job IDs useful is a separate question that is harder to answer.

> * block-stream
> 
>   @device names a backend, @base names a node via file name matching.
> 
>   TODO: support specifying the node via node name.
> 
>   I think backend is inappropriate here, we should support streaming up
>   to a node, just like block-commit supports committing down from a
>   node.

Yes. I think Benoît suggested this before. I don't remember if he already
coded up something.

> * block_passwd
> * block_resize
> 
>   @node-name names a node.
> 
>   @device names a backend, and references its root node, for
>   compatibility.
> 
>   Either @device or @node-name must be given.
> 
>   TODO: should have single mandatory parameter instead of two optionals.

How is this "fairly obvious what needs to be done"? We can't get rid of
any field because of compatibility. Do you mean this:

TODO: Make @device accept node names as well

> * blockdev-snapshot-sync
> 
>   @node-name and @device as for block_passwd, including TODO.
> 
>   @snapshot-node-name defines the new node's node name.
> 
> * block_set_io_throttle
> 
>   @device names a backend.
> 
>   TODO: support nodes.
> 
>   Note: we'd like to redo throttling as a block filter node, so maybe
>   we'll have a completely different command instead.
> 
> * blockdev-add
> 
>   This command defines a backend and its node tree, where sub-trees may
>   be references to existing nodes.
> 
>   @id defines the backend's name.
> 
>   @node-name defines its root node's node name.
> 
>   Note: blockdev-add always defines a backend.  When you build up a
>   backend bottom-up with several commands, you get a bunch of unwanted
>   backends "in the middle".  I'd like to make @id optional, and omit the
>   backend when it's missing.

Yes, this makes sense, as we discussed privately a while ago.

> * change-backing-file
> 
>   @device names a backend, @image-node-name names a node.
> 
>   As far as I can tell, we need the backend only to set op blockers.
>   Implementation detail shows through the interface.

Once we have the new op blockers, we'll make @device optional then and
completely ignore its value.

> * drive-backup
> 
>   @device names a backend.
> 
>   Do we want to be able to back up any node, or only a backend?
> 
>   Note: documentation of @target sounds like it could somehow name a
>   backend, but as far as I can tell it's always interpreted as file
>   name.

I can't think of a reasonable use case for backing up non-roots because
the only thing that would write to them are other block jobs. So you
could backup the old snapshot contents while committing to it.
Usefulness questionable.

That said, while there's no urgent need for it, it would be nicer to
allow all operations that can safely be performed, and this is one of
them. Might fall out almost natually once we have op blockers, so that
we really only need to add a @node-name option.

But that's something for later.

> * drive-mirror
> 
>   @device names a backend, @replaces names a node, and @node-name
>   defines the name of the new node.
> 
>   Do we want to be able to mirror any node, or only a backend?
> 
>   Note: documentation of @target sounds like it could somehow name a
>   backend, but as far as I can tell it's always interpreted as file
>   name.
> 
>   Note: drive-mirror supports @replaces, but drive-backup doesn't.  Odd.

We probably want to mirror any node (Benoît has a use case for this with
replacing broken quorum children).

> * query-block
> 
>   Returns information on all backends as array of BlockInfo.  BlockInfo
>   member @device is the backend name.
> 
> * query-named-block-nodes
> 
>   Returns information on all named nodes as an array of BlockDeviceInfo
>   (we suck at naming).  BlockDeviceInfo member @node-name is the node
>   name.
> 
>   You can't get information on nodes that don't have a node name.
> 
> * query-blockstats
> 
>   Returns some stats for all backends as array of BlockStats.
>   BlockStats member @device is the backend name.  Member @stats contains
>   the stats for the backend's root node.  Member @parent is BlockStats
>   for the child node that is stored in BDS member file.  Member @backing
>   is BlockStats for the chold node that is stored in BDS member
>   backing_hd.  Stats for other children aren't returned.
> 
>   TODO: generalize this to the full tree, include node names.

Do we want the same thing for query-block?

> * query-block-jobs
> 
>   Returns information on block jobs as array of BlockJobInfo.  A block
>   job is always tied to a backend, and BlockJobInfo member @device is
>   its name.
> 
>   The question whether we need a node name here is moot; see
>   block-job-cancel above.
> 
> == block.json ==
> 
> * blockdev-snapshot-delete-internal-sync
> * blockdev-snapshot-internal-sync
> 
>   @device names a backend.
> 
>   Do we want to be able to snapshot any node, or only a backend?

Probably only a backend. At least until someone comes up with a
reasonable use case. Internal snapshots and backing files don't mix very
well anyway.

> * eject
> 
>   @device names a backend.  Appropriate, because this is really a
>   backend operation.
> 
> * nbd-server-add
> 
>   @device names a backend.
> 
>   I think backend is appropriate here.  The NBD server sits on top of
>   the block layer just like device models do.  It should therefore use
>   the BB API.  See Max's [PATCH v2 0/6] nbd: Use BlockBackend.

Agreed.

> * nbd-server-start
> * nbd-server-stop
> 
>   No backend or node name parameters.
> 
> == qapi-schema.json ==
> 
> * transaction
> 
>   This is a wrapper around a list of transaction-capable commands.
>   Thus, nothing new here.

Surprisingly few really hard questions.

Kevin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]