From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
Date: Thu, 13 Jun 2013 10:51:48 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Thu, Jun 13, 2013 at 02:33:40PM +0800, Fam Zheng wrote:
> On Thu, 06/13 14:07, Wenchao Xia wrote:
> > > On 2013-6-13 14:03, Wenchao Xia wrote:
> > > >On 2013-6-7 15:18, Stefan Hajnoczi wrote:
> > >>On Thu, Jun 06, 2013 at 04:56:49PM +0800, Fam Zheng wrote:
> > >>>On Thu, 06/06 10:05, Stefan Hajnoczi wrote:
> > >>>>On Thu, Jun 06, 2013 at 11:56:18AM +0800, Fam Zheng wrote:
> > >>>>>On Thu, 05/30 14:34, Stefan Hajnoczi wrote:
> > >>>>>>+
> > >>>>>>+static int coroutine_fn backup_before_write_notify(
> > >>>>>>+        NotifierWithReturn *notifier,
> > >>>>>>+        void *opaque)
> > >>>>>>+{
> > >>>>>>+    BdrvTrackedRequest *req = opaque;
> > >>>>>>+
> > >>>>>>+    return backup_do_cow(req->bs, req->sector_num,
> > >>>>>>req->nb_sectors, NULL);
> > >>>>>>+}
> > >>>>>
> > >>>>>I'm wondering if we can see the logic here with a backing hd
> > >>>>>relationship?  req->bs is a backing file of job->target, but guest is
> > >>>>>going to write to it, so we need to COW down the data to job->target
> > > >>>>>before overwriting (i.e.  cluster is not allocated in child).
> > >>>>>
> > >>>>>I think if we do this in block layer, there's not much necessity for a
> > >>>>>before-write notifier here (although it may be useful for other
> > >>>>>cases):
> > >>>>>
> > >>>>>     in bdrv_write:
> > >>>>>     for child in req->bs->open_children
> > >>>>>         if not child->is_allocated(req->sectors)
> > >>>>>             do COW to child
> > >>>>>
> > >>>>>The advantage of this is that we won't need to start block-backup
> > >>>>>job in
> > >>>>>sync mode "none" to do point-in-time snapshot (image fleecing), and we
> > >>>>>get writable snapshot (possibility to open backing file writable and
> > >>>>>write to it safely) as a by-product.
> > >>>>>
> > >>>>>But we will need to keep track of parent<->child of block states,
> > >>>>>and we
> > >>>>>still need to take care of overlapping writing between block job and
> > >>>>>guest request.
> > >>>>
> > >>>>There's one catch here: bs->target may not support backing files, it
> > >>>>can
> > >>>>be a raw file, for example.  We'll only use backing files for
> > >>>>point-in-time snapshots but other use cases might not.  raw doesn't
> > >>>>really implement is_allocated(), so the whole concept would have to
> > >>>>change a little:
> > >>>
> > >>>Another use case may be parent modification. Suppose we have
> > >>>
> > >>>                     ,--- child1.qcow2
> > >>>     parent.qcow2  <
> > >>>                     `--- child2.qcow2
> > >>>
> > >>>We can use parent.qcow2 as block device in QEMU without breaking
> > >>>child1.qcow2 or child2.qcow2 by telling QEMU who its children are:
> > >>>
> > >>>   $QEMU -drive file=parent.qcow2,children=child1.qcow2:child2.qcow2
> > >>>
> > >>>Then we open the three images and setup parent_bs->open_children, the
> > >>>children are protected from being corrupted.
> > >>>
> > >>>>
> > >>>>bs->open_children becomes independent of backing files - any
> > >>>>BlockDriverState can be added to this list.  ->is_allocated() basically
> > >>>>becomes the bitmap that we keep in the block job.
> > >>>
> > >>>Yes. But it is possible to keep a bitmap for raw (and those don't
> > >>>implement is_allocated()) in block layer too, or in overlay: could
> > >>>add-cow by Dongxu Wang help here?
> > >>
> > >>Yes absolutely.
> > >>
> > >>Stefan
> > >>
> > >   One advantage of external backup, or backing up chain, is that it
> > >holds 'Delta' data only and is small enough. If it is changed toward a
> > > >'full' data writable snapshot, it becomes bigger. With a backup chain
> > > >qemu-img can restore/clone a writable and usable one, so I don't
> > > >think adding that in qemu emulator helps much, and it will make things
> > > >more complicated. The user won't care who is doing the job, qemu or
> > >qemu-img.
> > >
> >   I mean that "get writable snapshot (possibility to open backing file
> > writable and write to it safely) as a by-product." in this series, is
> > not very valuable.
> > 
> 
> I'm not selling writable snapshot, my point was just that the semantics of
> block-backup, getting a point-in-time snapshot, inherently work like a
> backing chain but writing to parent (guest drive) will not break its
> children (our thin PIT snapshot). If we see it this way, COW is not so
> specific to a block job like block-backup, it can be generic in the
> backing chain logic.
> 
> Though, the value in a writable snapshot is that we can actually
> _modify_ a backing image in place, rather than forking the chain to
> write to the new child. This is not supported with qemu or qemu-img now:
> once you create a child with the image as backing file, you mustn't
> modify it.

Supporting writable snapshots in this style is like traditional LVM
snapshots: a write to the parent can require O(n) copy-on-write writes,
where n is the number of children.  So it does not scale (LVM recently
added the thin provisioning target, which uses a shared storage pool, to
solve this problem).
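
To make the O(n) cost concrete, here is a minimal toy sketch in C.  It is
not QEMU code: Image, Parent, parent_write() and child_read() are
hypothetical names invented for this illustration, and each cluster is
modelled as a single byte.  Before the parent overwrites a cluster, the
old data is copied into every child that has not allocated that cluster
yet, so one guest write can turn into n + 1 cluster writes:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_CLUSTERS 8   /* hypothetical image size, in clusters */
#define MAX_CHILDREN 16  /* hypothetical limit for this sketch */

/* Toy image: one byte of "data" per cluster plus an allocation map. */
typedef struct Image {
    unsigned char data[NUM_CLUSTERS];
    bool allocated[NUM_CLUSTERS];
} Image;

/* Toy parent that knows every open child (cf. open_children above). */
typedef struct Parent {
    Image img;
    Image *children[MAX_CHILDREN];
    size_t num_children;
} Parent;

/* Read through a child: unallocated clusters fall back to the parent. */
static unsigned char child_read(const Parent *p, const Image *child,
                                size_t cluster)
{
    assert(cluster < NUM_CLUSTERS);
    return child->allocated[cluster] ? child->data[cluster]
                                     : p->img.data[cluster];
}

/*
 * Write one cluster of the parent.  The old contents are first copied
 * into every child that still depends on them.  Returns the total
 * number of cluster writes issued (copies plus the guest write).
 */
static size_t parent_write(Parent *p, size_t cluster, unsigned char value)
{
    size_t writes = 0;

    assert(cluster < NUM_CLUSTERS);
    for (size_t i = 0; i < p->num_children; i++) {
        Image *child = p->children[i];

        if (!child->allocated[cluster]) {
            /* COW: preserve the old data before it is overwritten. */
            child->data[cluster] = p->img.data[cluster];
            child->allocated[cluster] = true;
            writes++;
        }
    }

    p->img.data[cluster] = value;
    p->img.allocated[cluster] = true;
    return writes + 1;
}
```

With three children, the first write to a cluster costs four cluster
writes (three copies plus the guest write); a later write to the same
cluster costs only one, because every child already holds its own copy.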

The second challenge with writable snapshots is that you can only use
them when the QEMU process knows about all children.  That means you
cannot use them in the common use-case where there is a template backing
file:

  web-server.qcow2 <- web001.qcow2
                   <- web002.qcow2
                   <- web003.qcow2

The web001 guest doesn't know about web002 and web003.  Even if it did,
it would be dangerous to modify web-server.qcow2 while the other two
QEMU processes have it open.

For these reasons I'm not eager to get into writable backing files;
it is better to create a new writable image.

Stefan


