Re: [Qemu-devel] Re: Strategic decision: COW format


From: Anthony Liguori
Subject: Re: [Qemu-devel] Re: Strategic decision: COW format
Date: Mon, 14 Mar 2011 09:47:45 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.14) Gecko/20110223 Lightning/1.0b2 Thunderbird/3.1.8

On 03/14/2011 09:15 AM, Kevin Wolf wrote:
The file system can keep a lot of these things around pretty easily but
with your proposal, it seems like there can only be one.  If you support
many of them, I think you'll degenerate to something as complex as a
reference count table.
IIUC, he already uses a refcount table.

Well, he needs a separate mechanism to make trim/discard work, but for the snapshot discussion, a reference count table is avoided.

The bitmap only covers whether the guest has accessed a block or not. Then there is a separate table that maps guest offsets to offsets within the file.
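To make the two pieces of metadata concrete, here is a rough toy model. It is only a sketch under my own assumptions: the class name, `CLUSTER_SIZE`, and the in-memory lists are hypothetical and say nothing about the actual on-disk layout of the proposal.

```python
# Toy model only: names and layout are hypothetical, not a real image format.
CLUSTER_SIZE = 64 * 1024  # assumed cluster size, chosen for illustration


class ToyImage:
    def __init__(self, num_clusters):
        # Bitmap: has the guest accessed (written) this block?
        self.accessed = [False] * num_clusters
        # Table: guest cluster index -> offset within the image file.
        self.table = [None] * num_clusters
        # Next unused offset in the image file (simple append-only allocator).
        self.next_free = 0

    def write(self, guest_offset):
        """Return the file offset a guest write would land on."""
        idx = guest_offset // CLUSTER_SIZE
        if self.table[idx] is None:
            # First write to this cluster: allocate space at the end of the file.
            self.table[idx] = self.next_free
            self.next_free += CLUSTER_SIZE
        # Both the table and the bitmap change on a first write.
        self.accessed[idx] = True
        return self.table[idx] + guest_offset % CLUSTER_SIZE

    def lookup(self, guest_offset):
        """Return the file offset for a read, or None if the block is untouched."""
        idx = guest_offset // CLUSTER_SIZE
        if not self.accessed[idx]:
            return None  # a real format would fall through to a backing image
        return self.table[idx] + guest_offset % CLUSTER_SIZE
```

Note that a first write touches both the table and the bitmap, which is where an ordering question between the two would come from.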

I haven't thought hard about it, but my guess is that there is an ordering constraint between these two pieces of metadata which is why the journal is necessary. I get worried about the complexity of a journal even more than a reference count table.

  Actually, I think that a
refcount table is a requirement to provide the interesting properties
that internal snapshots have (see my other mail).

Well, the trick here AFAICT is that you're basically storing external snapshots internally. So it's sort of like a bunch of FVD formats embedded into a single image.

Refcount tables aren't a very complex thing either. In fact, it makes a
format much simpler to have one concept like refcount tables instead of
adding another different mechanism for each new feature that would be
natural with refcount tables.

I think it's a reasonable design goal to minimize any metadata updates in the fast path. If we can write 1 piece of metadata versus writing 2, then it's worth exploring IMHO.
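As a rough illustration of the cost being weighed, the following toy model counts metadata updates on the write path when a refcount table sits next to the mapping table. Everything here is an assumption made for illustration (the `RefcountImage` class, its fields, and the way updates are counted); it is not how qcow2 or any proposed format actually lays out or batches its metadata.

```python
# Toy model only: hypothetical names, in-memory structures, no real I/O.
class RefcountImage:
    def __init__(self):
        self.refcount = []         # physical cluster -> number of references
        self.active = {}           # guest cluster -> physical cluster
        self.snapshots = {}        # snapshot name -> frozen copy of the mapping
        self.metadata_updates = 0  # metadata writes issued on the write path

    def _alloc(self):
        # Allocate a new physical cluster with one reference.
        self.refcount.append(1)
        self.metadata_updates += 1  # refcount table update
        return len(self.refcount) - 1

    def write(self, guest_cluster):
        phys = self.active.get(guest_cluster)
        if phys is not None and self.refcount[phys] == 1:
            return phys  # exclusively owned: overwrite in place, no metadata
        if phys is not None:
            # Cluster is shared with a snapshot: drop our reference, then copy.
            self.refcount[phys] -= 1
            self.metadata_updates += 1  # refcount table update
        new = self._alloc()
        self.active[guest_cluster] = new
        self.metadata_updates += 1  # mapping table update
        return new

    def snapshot(self, name):
        # Not counted above: taking a snapshot is not on the write fast path.
        self.snapshots[name] = dict(self.active)
        for phys in self.active.values():
            self.refcount[phys] += 1
```

In this model an allocating write costs two metadata updates (refcount plus mapping), a rewrite of an exclusively owned cluster costs none, and the first write after a snapshot costs three.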

The only problem with them is that they are metadata that must be
updated. However, I think we have discussed enough how to avoid the
greatest part of that cost.

Maybe I missed it, but in the WCE=0 mode, is it really possible to avoid the writes for the refcount table?

On the other hand, I think it's reasonable to just avoid the CoW overlay
entirely and say that moving to a previous snapshot destroys any of its
children.  I think this ends up being a simplifying assumption that is
worth investigating further.

  From the use-cases that I'm aware of (backup and RAS), I think these
semantics are okay.
I don't think these semantics would be expected. And anyway, would this
really allow simplification of the format?

I don't know; I'm really just trying to separate the implementation of the format from the use-cases we're trying to address.

Even if we're talking about qcow3, if we only really care about read-only snapshots, perhaps we can add a feature bit for this and use it to make the WCE=0 case much faster.

But the fundamental question is, does this satisfy the use-cases we care about?

Regards,

Anthony Liguori

  I'm afraid that you would go
for complicated solutions with odd semantics just because of an
arbitrary dislike of refcounts.

Kevin




