Re: [Qemu-devel] [PATCH 1/5] RFC: Efficient VM backup for qemu (v1)


From: Kevin Wolf
Subject: Re: [Qemu-devel] [PATCH 1/5] RFC: Efficient VM backup for qemu (v1)
Date: Wed, 21 Nov 2012 11:48:36 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120605 Thunderbird/13.0

On 21.11.2012 10:01, Dietmar Maurer wrote:
> +Some storage types/formats support internal snapshots using some kind
> +of reference counting (rados, sheepdog, dm-thin, qcow2). It would be possible
> +to use that for backups, but for now we want to be storage-independent.
> +
> +Note: It turned out that taking a qcow2 snapshot can take a very long
> +time on larger files.

Hm, really? What are "larger files"? It has always been relatively quick
when I tested it, though internal snapshots are not my focus, so that
need not mean much.

If this is really an important use case for someone, I think qcow2
internal snapshots still have some potential for relatively easy
performance optimisations.

But that's just an aside...

> +
> +=Make it more efficient=
> +
> +To be more efficient, we simply need to avoid unnecessary steps. The
> +following steps are always required:
> +
> +1.) read old data before it gets overwritten
> +2.) write that data into the backup archive
> +3.) write new data (VM write)
> +
> +As you can see, this involves only one read and two writes.

Looks like a nice approach to backup indeed.

The question is how to fit this into the big picture of qemu's live
block operations. Much of it looks like an active mirror (which is still
to be implemented), with the difference that it doesn't write the new,
but the old data, and that it keeps a bitmap of clusters that should not
be mirrored.

I'm not sure if this means that code should be shared between these two
or if the differences are too big. However, both of them have things in
common regarding the design. For example, both have a background part
(copying the existing data) and an active part (mirroring/backing up
data on writes). Block jobs are the right tool for the background part.
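To make the two parts concrete, here is a minimal sketch in C of how a background copy and an active copy-before-write path could share one per-cluster bitmap. It is only an illustration under stated assumptions: BackupState, backup_cluster(), guest_write() and background_copy() are hypothetical names, the disk and the backup target are plain in-memory buffers, and the 4-byte cluster size is chosen purely to keep the example small. None of this is taken from the patch or from block.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CLUSTER_SIZE 4u   /* tiny on purpose; real clusters are far larger */
#define NUM_CLUSTERS 4u

/* Hypothetical in-memory stand-ins for the live image and the backup target. */
typedef struct {
    uint8_t  disk[NUM_CLUSTERS * CLUSTER_SIZE];    /* live image */
    uint8_t  backup[NUM_CLUSTERS * CLUSTER_SIZE];  /* point-in-time copy */
    bool     saved[NUM_CLUSTERS];  /* bitmap: cluster is already in the backup */
    unsigned reads, writes;        /* I/O counters, for illustration only */
} BackupState;

/* Copy one cluster of old data into the backup, at most once per cluster. */
static void backup_cluster(BackupState *s, size_t c)
{
    uint8_t tmp[CLUSTER_SIZE];

    if (s->saved[c]) {
        return;                     /* already backed up, nothing to do */
    }
    memcpy(tmp, &s->disk[c * CLUSTER_SIZE], CLUSTER_SIZE);    /* 1. read old data */
    s->reads++;
    memcpy(&s->backup[c * CLUSTER_SIZE], tmp, CLUSTER_SIZE);  /* 2. write to backup */
    s->writes++;
    s->saved[c] = true;
}

/* Active part: intercept a guest write and save the old data first. */
static void guest_write(BackupState *s, size_t offset,
                        const uint8_t *buf, size_t len)
{
    assert(len > 0 && offset + len <= sizeof(s->disk));

    size_t first = offset / CLUSTER_SIZE;
    size_t last = (offset + len - 1) / CLUSTER_SIZE;

    for (size_t c = first; c <= last; c++) {
        backup_cluster(s, c);
    }
    memcpy(&s->disk[offset], buf, len);                       /* 3. write new data */
    s->writes++;
}

/* Background part: walk the image and copy whatever is not saved yet. */
static void background_copy(BackupState *s)
{
    for (size_t c = 0; c < NUM_CLUSTERS; c++) {
        backup_cluster(s, c);
    }
}
```

In this sketch a guest write that touches one not-yet-saved cluster costs one read and two writes, and the background loop skips any cluster the active path has already handled. In qemu the background loop would presumably live in a block job, while the write interception is the part that a block filter would have to provide.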

The active part is a bit more tricky. You're putting some code into
block.c to achieve it, which is kind of ugly. We have been talking about
"block filters" previously that would provide a generic infrastructure,
and at least in the mid term the additions to block.c must disappear.
(Same for block.h and block_int.h: keep things as separated from the
core as possible.) Maybe we should introduce this infrastructure now.

Another interesting point is how (or whether) to link block jobs with
block filters. I think when the job is started, the filter should be
inserted automatically, and when you cancel it, it should be stopped.
When you pause the job... no idea. :-)

> +
> +To make that work, our backup archive needs to be able to store image
> +data 'out of order'. It is important to notice that this will not work
> +with traditional archive formats like tar.

> +* works on any storage type and image format.
> +* we can define a new and simple archive format, which is able to
> +  store sparse files efficiently.

> +
> +Note: Storing sparse files is a mess with existing archive
> +formats. For example, tar requires information about holes at the
> +beginning of the archive.

> +* we need to define a new archive format
> +
> +Note: Most existing archive formats are optimized to store small files
> +including file attributes. We simply do not need that for VM archives.
> +
> +* archive contains data 'out of order'
> +
> +If you want to access image data in sequential order, you need to
> +re-order archive data. It would be possible to do that on the fly,
> +using temporary files.
> +
> +Fortunately, a normal restore/extract works perfectly with 'out of
> +order' data, because the target files are seekable.
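A minimal sketch in C of why restore copes with 'out of order' data: each record names its drive and cluster, and the restore simply seeks in the target before writing. The Record layout, append_record() and restore() are hypothetical and deliberately simplistic; this is not the format described in docs/specs/vma_spec.txt. Records are written in host struct layout for brevity, whereas a real format would define byte order and packing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CLUSTER_SIZE 4u   /* tiny on purpose, to keep the example readable */

/* Hypothetical archive record: one cluster of one drive. */
typedef struct {
    uint8_t  dev_id;               /* drive index, fewer than 256 per VM */
    uint64_t cluster_num;          /* position of the cluster in that drive */
    uint8_t  data[CLUSTER_SIZE];   /* cluster payload */
} Record;

/* Append one cluster to the archive, in whatever order it arrives. */
static int append_record(FILE *archive, uint8_t dev_id,
                         uint64_t cluster_num, const uint8_t *data)
{
    Record r;

    memset(&r, 0, sizeof r);       /* also zeroes struct padding */
    r.dev_id = dev_id;
    r.cluster_num = cluster_num;
    memcpy(r.data, data, CLUSTER_SIZE);
    return fwrite(&r, sizeof r, 1, archive) == 1 ? 0 : -1;
}

/* Restore: read records sequentially, seek in the target, write the data. */
static int restore(FILE *archive, FILE *targets[], size_t ntargets)
{
    Record r;

    rewind(archive);
    while (fread(&r, sizeof r, 1, archive) == 1) {
        if (r.dev_id >= ntargets) {
            return -1;             /* record refers to an unknown drive */
        }
        FILE *t = targets[r.dev_id];
        if (fseek(t, (long)(r.cluster_num * CLUSTER_SIZE), SEEK_SET) != 0) {
            return -1;
        }
        if (fwrite(r.data, CLUSTER_SIZE, 1, t) != 1) {
            return -1;
        }
    }
    return ferror(archive) ? -1 : 0;
}
```

The archive itself is only ever read front to back here; all random access happens on the target files. That is also why reading an image sequentially out of the archive would need a re-ordering step, while extraction does not.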

> +=Archive format requirements=
> +
> +The basic requirement for such a new format is that we can store image
> +data 'out of order'. It is also very likely that we have less than 256
> +drives/images per VM, and we want to be able to store VM configuration
> +files.
> +
> +We have defined a very simple format with those properties, see:
> +
> +docs/specs/vma_spec.txt
> +
> +Please let us know if you know an existing format which provides the
> +same functionality.

Essentially, what you need is an image format. You want to be
independent from the source image formats, but you're okay with using a
specific format for the backup (or you wouldn't have defined a new
format for it).

The one special thing that you need is storing multiple images in one
file. There's something like this already in qemu: qcow2 with its
internal snapshots is basically a flat file system.

Not saying that this is necessarily the best option, but I think reusing
existing formats and implementation is always a good thing, so it's an
idea to consider.

Kevin


