Re: [Qemu-devel] [PATCH 1/5] RFC: Efficient VM backup for qemu (v1)


From: Dietmar Maurer
Subject: Re: [Qemu-devel] [PATCH 1/5] RFC: Efficient VM backup for qemu (v1)
Date: Wed, 21 Nov 2012 15:47:16 +0000

> >> Ah, this is what you mean by "out of order". Just out of curiosity,
> >> what are these non-seekable backup fds usually?
> >
> > /dev/nst0 ;-)
> 
> Sure. :-)
> 
> > But there are better examples. Usually you want to use some kind of
> > compression, and you do that with existing tools:
> >
> > # backup to stdout|gzip|...
> 
> When you use an image/archive format anyway, you could use a
> compression mechanism that it already supports.

Many archive formats do not support compression internally (tar, cpio, ...).
I also avoided including it in the 'vma' format, so you can use any
external tool.

Different users want different compressors: bzip2, gzip -1, xz, pgzip, ...
Or maybe they want to pipe into some kind of encryption tool ...
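
To illustrate the point, here is a minimal Python sketch (not the actual
backup code) of a writer that only ever appends to a pipe and leaves
compression to whatever external filter the user picks. The function name
`backup_through_filter` and its parameters are made up for this example.

```python
import subprocess


def backup_through_filter(chunks, filter_cmd, out_path):
    """Stream chunks, in order, through an external filter into out_path.

    The writer only calls write() on the filter's stdin, so it never needs
    a seekable target. filter_cmd is any command that reads stdin and
    writes stdout (hypothetical parameter, chosen by the user).
    """
    with open(out_path, "wb") as out:
        # The filter writes straight to the file, so we never have to read
        # its output ourselves while feeding it input.
        proc = subprocess.Popen(filter_cmd, stdin=subprocess.PIPE, stdout=out)
        for chunk in chunks:
            proc.stdin.write(chunk)
        proc.stdin.close()
        returncode = proc.wait()
    if returncode != 0:
        raise RuntimeError("filter exited with status %d" % returncode)
```

With this shape, `filter_cmd` could be `["gzip", "-1"]`, `["xz"]`, or an
encryption tool, without the archive format knowing anything about it.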

> > A common usage scenario is to pipe a backup into a restore (copy)
> >
> > # backup to stdout|ssh to remote host -c 'restore from stdin'
> 
> This is a good one. I believe our usual solution would have been to backup to
> a NBD server on the remote host instead.
> 
> In general I can see that being able to pipe it to other programs could be
> nice.
> I'm not sure if it's an absolute requirement. Would your tools for taking the
> backup employ any specific use of pipes?

Yes, we currently have that functionality, and I do not want to remove features.

> > It is also a performance question. Seeks are terribly slow.
> 
> You wouldn't do it a lot. Only for metadata, and you would only write out the
> metadata once the in-memory cache is full.

IMHO it is still much better to write sequentially, because that has 'zero'
overhead.

Besides, writing data sequentially is so much easier (on the implementation
side).

The current VMA code also uses checksums and special 'uuid' markers, which
makes it possible to find and recover damaged archives. I guess such things
are quite impossible with qcow2, or at least very hard to do?
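
As a rough illustration of why markers plus checksums help with recovery,
here is a small Python sketch. The record layout below (16-byte marker,
length, SHA-256 of the payload) is invented for the example and is NOT the
real 'vma' on-disk format.

```python
import hashlib
import struct
import uuid

# Hypothetical marker and header layout, for illustration only.
MARKER = uuid.UUID("12345678-1234-5678-1234-567812345678").bytes
HEADER = struct.Struct(">16sI32s")  # marker, payload length, sha256(payload)


def pack_record(payload):
    """Prefix a payload with marker, length and checksum."""
    digest = hashlib.sha256(payload).digest()
    return HEADER.pack(MARKER, len(payload), digest) + payload


def recover_records(blob):
    """Return every payload whose header and checksum are still intact."""
    good = []
    pos = 0
    while True:
        pos = blob.find(MARKER, pos)
        if pos < 0 or pos + HEADER.size > len(blob):
            break
        _, length, digest = HEADER.unpack_from(blob, pos)
        start = pos + HEADER.size
        payload = blob[start:start + length]
        if len(payload) == length and hashlib.sha256(payload).digest() == digest:
            good.append(payload)
            pos = start + length
        else:
            # Damaged record or false marker hit: resume the scan one byte
            # later and look for the next marker.
            pos += 1
    return good
```

A reader that hits a damaged record simply scans forward to the next
marker, so the damage stays local to the records it touches.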

> >> In principle even for this qcow2 could be used as an image format,
> >> however the existing implementation wouldn't be of much use for you,
> >> so it loses quite a bit of its attractiveness.
> >>
> >>> Anyways, a qcow2 file is really complex beast - I am quite unsure if
> >>> I would use that for backup if it is possible.
> >>>
> >>> That would require any external tool to include >=50000 LOC
> >>>
> >>> The vma reader code is about 700 LOC (quite easy).
> >>
> >> So what? qemu-img is already there.
> >
> > Anyways, you already pointed out that the existing implementation does
> > not work.
> 
> I'm still trying to figure out the real requirements to think some more about
> it. :-)

Any existing archive format I know works on pipes (without seeks). 
Well, that does not really mean anything.

> > But I already expected such discussion. So maybe it is better we simply pipe
> > all data to an external binary?
> > We just need to define a minimal protocol.
> >
> > In future we can produce different archivers as independent/external
> > binaries?
> 
> You shouldn't look at discussions as a bad thing. We're not trying to block
> your changes, but to understand and possibly improve them.

I do not consider your comments a 'bad thing'; the idea above was a real
suggestion ;-)
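
To make the 'minimal protocol' idea a bit more concrete, it could be as
small as a stream of (offset, length, data) records. The Python sketch
below is only an assumption about what such a protocol might look like;
nothing like it has been defined yet, and the names `REC`, `write_record`
and `read_records` are made up.

```python
import struct

# Hypothetical record header: 64-bit image offset, 32-bit payload length.
REC = struct.Struct(">QI")


def write_record(out, offset, data):
    """Append one record to a write-only stream (no seeks needed)."""
    out.write(REC.pack(offset, len(data)))
    out.write(data)


def read_records(inp):
    """Yield (offset, data) pairs from a buffered binary stream.

    Assumes inp.read(n) returns fewer than n bytes only at end of stream,
    as buffered binary streams do.
    """
    while True:
        header = inp.read(REC.size)
        if not header:
            return
        if len(header) < REC.size:
            raise EOFError("truncated record header")
        offset, length = REC.unpack(header)
        data = inp.read(length)
        if len(data) < length:
            raise EOFError("truncated record payload")
        yield offset, data
```

Because every record carries its own offset, the sender can emit clusters
in whatever order they become available, and the external archiver decides
how to store them.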

I already have plans to use a Content Addressable Storage (instead of 'vma'), so
such a plugin architecture makes it easier to play around with different formats.
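
For readers not familiar with the term, here is a toy Python sketch of
content-addressable storage. It is an in-memory illustration under my own
assumptions (fixed-size chunks, SHA-256 keys), not a design proposal; the
names `ChunkStore`, `backup` and `restore` are hypothetical.

```python
import hashlib


class ChunkStore:
    """Toy content-addressable store: chunks are keyed by their SHA-256."""

    def __init__(self):
        self.chunks = {}

    def put(self, data):
        key = hashlib.sha256(data).hexdigest()
        # An identical chunk is stored only once (deduplication).
        self.chunks.setdefault(key, data)
        return key


def backup(image, store, chunk_size):
    """Split image into fixed-size chunks and return the list of keys."""
    return [store.put(image[i:i + chunk_size])
            for i in range(0, len(image), chunk_size)]


def restore(manifest, store):
    """Rebuild an image from its list of chunk keys."""
    return b"".join(store.chunks[key] for key in manifest)
```

A second backup of a mostly unchanged image adds only the chunks that
differ, which is where the deduplication comes from.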
 
> Yes, discussions mean that it takes a bit longer to get things merged, but
> they also mean that usually something better is merged in the end that actually
> fits well in qemu's design, is maintainable, generic and so on. Evading the
> discussions by keeping code externally wouldn't improve things.

Sure.
 
> Which doesn't mean that external archivers are completely out of the
> question, but I would only consider them if there's a good technical reason to
> do so.

As noted above, I can see room for different formats.

1.) 'vma' is my proof of concept, easy to implement and use.
2.) CAS - very useful to sync backup data across datacenters (this
gives us deduplication and a kind of 'incremental backup')
3.) support existing archive formats like 'tar' (this is possible if we
use temporary files to store out-of-order data)
4.) backup to some kind of external server
5.) plugins for existing backup tools (bacula, ...)?
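
Regarding 3.), the temporary-file approach could look roughly like the
Python sketch below: out-of-order blocks are first spooled into a seekable
temp file, and only then written as one tar member to a stream. This is a
sketch under my own assumptions, and `spool_and_archive` with its
parameters is a made-up name.

```python
import tarfile
import tempfile


def spool_and_archive(blocks, image_size, member_name, out):
    """Write out-of-order (offset, data) blocks as one tar member.

    tar stores each member contiguously and needs its size up front, so
    the blocks are first placed at their offsets in a seekable temp file.
    Only the temp file is seeked; `out` is written strictly sequentially.
    """
    with tempfile.TemporaryFile() as tmp:
        tmp.truncate(image_size)  # unwritten ranges read back as zeros
        for offset, data in blocks:
            tmp.seek(offset)
            tmp.write(data)
        tmp.seek(0)
        # Mode "w|" is tarfile's stream mode for non-seekable targets.
        with tarfile.open(fileobj=out, mode="w|") as tar:
            info = tarfile.TarInfo(name=member_name)
            info.size = image_size
            tar.addfile(info, tmp)
```

The price is one extra copy of the image on local disk, which is exactly
the overhead that a format accepting out-of-order data avoids.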

> So if eventually we come to the conclusion that vma (or for that matter,
> anything else in your patches) is the right solution, let's take it. But first
> please give us the chance to understand the reasons of why you did things
> the way you did them, and to discuss the pros and cons of alternative
> solutions.

Sure. I was not aware that I wrote something negative in the previous reply - 
sorry for that.




