Re: [Qemu-devel] [PATCH v6 for 2.1 00/10] Modify block jobs to use node-


From: Jeff Cody
Subject: Re: [Qemu-devel] [PATCH v6 for 2.1 00/10] Modify block jobs to use node-names
Date: Thu, 19 Jun 2014 12:26:00 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

On Thu, Jun 19, 2014 at 05:17:16PM +0800, Stefan Hajnoczi wrote:
> On Tue, Jun 17, 2014 at 05:53:48PM -0400, Jeff Cody wrote:
> > Changes from v5->v6:
> > 
> > * Check for attempt to commit an image to itself (Eric)
> > * Add a comment to the bdrv_find for block-commit, indicating
> >   that libvirt uses the error case for probing (Eric)
> > * Added Benoit's R-b's
> > 
> > Changes from v4->v5:
> > 
> > * Rebased on master
> > * Fixed commit log typos / stale paragraphs (Eric)
> > * Fixed comment typo (Eric)
> > * Added Eric's remaining R-b's
> > 
> > 
> > Changes from v3->v4:
> > 
> > * Rebased on master
> > * Dropped overlay pointers, Eric's concerns are correct
> > * Require "device" for all arguments, in light of the above,
> >   so we can find the active layer in all cases.
> > * Simplify more operations!
> > * Dropped Eric's Reviewed-by: on patches 3,5,6
> >   Added Eric's Reviewed-by: on patches 8,9
> > 
> > 
> > Changes from v2->v3:
> > 
> > * Add Eric's reviewed-by
> > * Addressed Eric's review comments
> > * Dropped HMP changes
> > * Added helper function for setting the overlay, and
> >   set the overlay in bdrv_append()
> > * Use bs->backing_file instead of bs->backing_hd->filename in block_stream 
> > 
> > Using node-names instead of filenames for block job operations
> > over QMP is a superior method of identifying the block driver
> > images to operate on, as it removes all pathname ambiguity.
> > 
> > This series modifies block-commit and block-stream to use node-names,
> > and creates a new QAPI command to allow stand-alone backing file
> > changes on an image file.
> > 
> > So that node-names can be used as desired for all block job
> > operations, this series also auto-generates node-names for every
> > BDS.  User-specified node-names will override any autogenerated
> > 
> > Jeff Cody (10):
> >   block: Auto-generate node_names for each BDS entry
> >   block: add helper function to determine if a BDS is in a chain
> >   block: simplify bdrv_find_base() and bdrv_find_overlay()
> >   block: make 'top' argument to block-commit optional
> >   block: Accept node-name arguments for block-commit
> >   block: extend block-commit to accept a string for the backing file
> >   block: add ability for block-stream to use node-name
> >   block: add backing-file option to block-stream
> >   block: Add QMP documentation for block-stream
> >   block: add QAPI command to allow live backing file change
> > 
> >  block.c                   |  80 ++++++++--------
> >  block/commit.c            |   9 +-
> >  block/stream.c            |  11 +--
> >  blockdev.c                | 238 ++++++++++++++++++++++++++++++++++++++++++----
> >  hmp.c                     |   1 +
> >  include/block/block.h     |   4 +-
> >  include/block/block_int.h |   3 +-
> >  qapi/block-core.json      | 145 +++++++++++++++++++++++++---
> >  qmp-commands.hx           | 181 +++++++++++++++++++++++++++++++++--
> >  tests/qemu-iotests/040    |  28 ++++--
> >  10 files changed, 602 insertions(+), 98 deletions(-)
> 
> This series side-steps lack of child op blockers by checking only the
> root node/drive.
>

Yes.  The lack of child op blockers is a definite issue.

> Existing node-name commands like resize and snapshot-sync check for op
> blockers on the actual node.  They do not take the same approach as this
> patch series.
>
> We have a mess and I don't want to commit this series before we've
> figured out what to do about child op blockers.

Why?  The problem doesn't go away if you don't commit this series;
-commit and -stream will still just check the topmost overlay.  The
problem exists independent of this series.  All this series does for
commit and stream is offer another way to specify an image.  When a
series does address the blockers, it will likely need to go through
and touch all of the block job handlers.

Having said that, to be fair, the new QAPI command change-backing-file
does propagate this top-layer in-use flag semantic, but I would prefer
that patch to be dropped rather than not committing this series.

But if this series went in as-is for 2.1, nothing would change
regarding how -commit or -stream determines when to abort.



On to the discussion:
>
> Let's discuss this topic in a sub-thread and figure out what to do for
> QEMU 2.1.  This is an important issue to solve before the release
> because we can't change QMP command semantics easily later.
> 
> My questions are:
> a. How do we fix resize, snapshot-sync, etc?  It seems like we need to
>    propagate child op blockers.
> 
> b. Is it a good idea to perform op blocker checks on the root node?
>    It's inconsistent with resize, snapshot-sync, etc.  Permissions in
>    BDS graphs with multiple root nodes (e.g. guest device and NBD
>    run-time server) will be different depending on which root you
>    specify.

I don't think (b) is the ultimate solution.  It is used as a stop-gap
because op blockers in the current implementation are essentially
analogous to the in-use flag.  But is it good enough for 2.1?  If
*everything* checks the topmost node in 2.1, then I think we are OK in
all cases except where image files share a common BDS.

The ability for internal BDSs to share a common base BDS makes some
block jobs unsafe currently, I believe.  A crude and ugly fix is to
only allow a single block-job to occur at any given time, but that
doesn't seem feasible, so let's ignore that.

Perhaps, for 2.1, provide an overlay pointer list inside each BDS
(some of my earlier patches in this series had a single overlay, but
that is not enough).  We could then apply op blockers to the topmost
nodes for any affected BDS image in a chain, by navigating upwards.
Not sure how complex this would be in practice, though.  
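
To make the "navigate upwards" idea concrete, here is a rough sketch.
Everything in it is hypothetical: the Node struct, its fixed-size
overlay list, and block_topmost() are stand-ins for illustration, not
QEMU's BlockDriverState or any existing helper.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OVERLAYS 4  /* arbitrary limit, only for this sketch */

/* Hypothetical stand-in for a BDS; not QEMU's BlockDriverState. */
typedef struct Node {
    const char *name;
    struct Node *overlays[MAX_OVERLAYS]; /* images using this one as backing file */
    size_t n_overlays;
    int blocked;                         /* in-use style flag */
} Node;

/*
 * Walk upwards from 'n' through the overlay lists and set the blocker
 * on every topmost node (a node with no overlays) that can be reached.
 * Returns the number of topmost nodes reached.
 */
static size_t block_topmost(Node *n)
{
    size_t count = 0;

    assert(n != NULL);
    if (n->n_overlays == 0) {
        n->blocked = 1;
        return 1;
    }
    for (size_t i = 0; i < n->n_overlays; i++) {
        count += block_topmost(n->overlays[i]);
    }
    return count;
}
```

With a base image shared by two chains, calling block_topmost() on the
base would flag both active layers and leave the intermediates alone.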

We could also apply child blockers to all nodes in all directions in a
graph, if we don't want to rely on the topmost image as a blocker
proxy for the whole drive.

> 
> c. We're painting ourselves into a corner by using the root node for op
>    blocker checks.  We'll have to apply the same op blockers to all
>    nodes in a graph.  There's no opportunity to apply different op
>    blockers to a subset of the child nodes.  I *think* this can be
>    changed later without affecting the QMP API, so it's not a critical
>    issue.

We've already painted ourselves into that corner, alas.

I agree that from a QAPI perspective, the change is not critical:
once op blockers are correctly applied to all child nodes, any API
change (e.g. to commit or stream) would likely only relax requirements
(such as making 'device' optional instead of mandatory), and would
thus be discoverable.

> 
> The answer seems to be that op blockers should be propagated to all
> child nodes and commands should test the node, not the drive/root node.
> That gives us the flexibility for per-node op blockers in the future and
> maintains compatibility with existing node-name users.
> 

Long term thoughts:

So if I think of operations that are done on block devices from a
block job, and chuck them into categories, I think we have:

1) Read of guest-visible data
2) Write of guest-visible data
3) Read of host-visible data (e.g. image file metadata)
4) Write of host-visible data (e.g. image file metadata, such as
   the backing-file)
5) Block chain manipulations (e.g. movement of a BDS, change to r/w
   instead of r/o, etc.)
6) I/O attribute changes (e.g. throttling, etc.)

Does this make sense, and did I miss any (likely so)?  It seems like
we should issue blockers not based on specific commands (e.g.
BLOCK_OP_TYPE_COMMIT), but rather based on what specific category of
interaction on a BDS we want to prohibit / allow.

I don't think specific command blockers provide enough granularity,
and they don't necessarily scale well as new commands are added.  They
force a new block job author to go through the specific
implementation of the other block job commands, and interpret what
operations to prohibit based on what other jobs do.  Whereas if each
command issues blockers based on the operation category, it takes care
of itself, and I just issue blockers based on my block job behavior.

Each command would then issue appropriate operational blockers to each
BDS affected by an operation.  For instance, at first blush, I think
block-commit would want (at the very least) to block (and check) the
following, in this example chain:


     [base] <-- [int1] <--  [int2] <-- [int3] <-- [top] <-- [overlay]

     becomes:

     [base] <-- [overlay]


Blocked operations per image:

* 'base' image

Operation       |  Blocked
-----------------------------
GUEST READ      |   Y
GUEST WRITE     |   Y
HOST READ       |    
HOST WRITE      |    
CHAIN           |   Y
I/O ATTRIBUTE   |


* Intermediate images above 'base', up to and including 'top':

Operation       |  Blocked
-----------------------------
GUEST READ      |   
GUEST WRITE     |   Y
HOST READ       |
HOST WRITE      |   Y
CHAIN           |   Y
I/O ATTRIBUTE   |


* The overlay of 'top', if it exists:

Operation       |  Blocked
-----------------------------
GUEST READ      |   
GUEST WRITE     |    
HOST READ       |
HOST WRITE      |   Y
CHAIN           |   Y
I/O ATTRIBUTE   |
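
The three tables above could be encoded roughly as follows.  This is
only a sketch under assumptions: the CAT_* bits, the CommitRole enum,
and the array-of-masks chain representation are hypothetical, chosen
to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical category bits, matching the table rows. */
enum {
    CAT_GUEST_READ  = 1 << 0,
    CAT_GUEST_WRITE = 1 << 1,
    CAT_HOST_READ   = 1 << 2,
    CAT_HOST_WRITE  = 1 << 3,
    CAT_CHAIN       = 1 << 4,
    CAT_IO_ATTR     = 1 << 5,
};

typedef enum { ROLE_BASE, ROLE_INTERMEDIATE, ROLE_OVERLAY } CommitRole;

/* Categories block-commit would block for an image in the given role. */
static unsigned commit_blocked_mask(CommitRole role)
{
    switch (role) {
    case ROLE_BASE:
        return CAT_GUEST_READ | CAT_GUEST_WRITE | CAT_CHAIN;
    case ROLE_INTERMEDIATE: /* everything above base, including top */
        return CAT_GUEST_WRITE | CAT_HOST_WRITE | CAT_CHAIN;
    case ROLE_OVERLAY:
        return CAT_HOST_WRITE | CAT_CHAIN;
    }
    return 0;
}

/*
 * blocked[0] belongs to base, blocked[top_idx] to top, and
 * blocked[top_idx + 1], if present, to the overlay of top.
 * top_idx must be > 0: committing an image to itself is rejected.
 */
static void commit_apply_blockers(unsigned *blocked, size_t len, size_t top_idx)
{
    assert(blocked != NULL && top_idx > 0 && top_idx < len);
    blocked[0] |= commit_blocked_mask(ROLE_BASE);
    for (size_t i = 1; i <= top_idx; i++) {
        blocked[i] |= commit_blocked_mask(ROLE_INTERMEDIATE);
    }
    if (top_idx + 1 < len) {
        blocked[top_idx + 1] |= commit_blocked_mask(ROLE_OVERLAY);
    }
}
```

For the six-image example chain, commit_apply_blockers(blocked, 6, 4)
would fill in one mask per image: base, int1..top, then overlay.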


