
Re: [Qemu-devel] [RFC] qed: Add QEMU Enhanced Disk format


From: Anthony Liguori
Subject: Re: [Qemu-devel] [RFC] qed: Add QEMU Enhanced Disk format
Date: Mon, 13 Sep 2010 08:19:45 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.12) Gecko/20100826 Lightning/1.0b1 Thunderbird/3.0.7

On 09/13/2010 06:48 AM, Kevin Wolf wrote:
On 13.09.2010 13:34, Avi Kivity wrote:
   On 09/13/2010 01:28 PM, Kevin Wolf wrote:
Anytime you grow the freelist with qcow2, you have to write a brand new
freelist table and update the metadata synchronously to point to a new
version of it.  That means for a 1TB image, you're potentially writing
out 128MB of data just to allocate a new cluster.
No. qcow2 has two-level tables.

File size: 1 TB
Number of clusters: 1 TB / 64 kB = 16 M
Number of refcount blocks: (16 M * 2 B) / 64 kB = 512
Total size of all refcount blocks: 512 * 64 kB = 32 MB
Size of refcount table: 512 * 8 B = 4 kB

When we grow an image file, the refcount blocks can stay where they are,
only the refcount table needs to be rewritten. So we have to copy a
total of 4 kB for growing the image file when it's 1 TB in size (all
assuming 64k clusters).

The other result of this calculation is that we only need to grow the
refcount table each time we cross a 16 TB boundary. So in addition to
involving only a small amount of data, it doesn't happen in practice anyway.
Interesting, I misremembered it as 8 bytes per cluster, not 2.  So it's
actually fairly dense (though still not as dense as a bitmap).
Yes, refcounts are 16 bit. Just checked it with the code once again to
be 100% sure. But if it was only that, it would be just a small factor.
The important part is that it's a two-level structure, so Anthony's
numbers are completely off.
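
For reference, the arithmetic quoted above can be checked with a short script. This is only a sketch under the figures stated in the thread (64 kB clusters, 16-bit refcounts, 8-byte refcount table entries); the function and constant names are made up for illustration and are not taken from the qcow2 code.

```python
# Assumptions taken from the quoted calculation, not from the qcow2 source:
CLUSTER_SIZE = 64 * 1024      # 64 kB clusters
REFCOUNT_BYTES = 2            # refcounts are 16 bit
TABLE_ENTRY_BYTES = 8         # one refcount table entry per refcount block


def refcount_layout(image_size, cluster_size=CLUSTER_SIZE):
    """Return the refcount metadata sizes for an image of image_size bytes."""
    clusters = image_size // cluster_size
    entries_per_block = cluster_size // REFCOUNT_BYTES
    # Round up: a partially used refcount block still occupies a full cluster.
    blocks = -(-clusters // entries_per_block)
    return {
        "clusters": clusters,
        "refcount_blocks": blocks,
        "refcount_blocks_bytes": blocks * cluster_size,
        "refcount_table_bytes": blocks * TABLE_ENTRY_BYTES,
    }


def table_cluster_coverage(cluster_size=CLUSTER_SIZE):
    """Image bytes covered by a refcount table that fills one cluster."""
    table_entries = cluster_size // TABLE_ENTRY_BYTES
    entries_per_block = cluster_size // REFCOUNT_BYTES
    return table_entries * entries_per_block * cluster_size


if __name__ == "__main__":
    layout = refcount_layout(1 << 40)  # 1 TB image
    for name, value in layout.items():
        print(f"{name}: {value}")
    print("one table cluster covers (TB):", table_cluster_coverage() >> 40)
```

For a 1 TB image this gives 512 refcount blocks (32 MB in total) and a 4 kB refcount table, and a refcount table that fills one 64 kB cluster covers 16 TB, which matches the numbers quoted above.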

A two-level structure makes growth more efficient. However, searching for a free cluster is still an expensive operation on large disk images. This is an important point because, without snapshots, the argument for a refcount table is that it supports UNMAP, and efficient UNMAP support in qcow2 looks like it will require an additional structure.
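
To make the cost concrete, here is a rough sketch of a free-cluster search over a two-level refcount structure. It is an in-memory model only, assuming every refcount block is present as a plain list of counts; `find_free_cluster` and its parameters are hypothetical names and do not correspond to functions in the qcow2 code.

```python
def find_free_cluster(refcount_table, entries_per_block, start=0):
    """Linear scan for the first cluster whose refcount is zero.

    refcount_table is modelled as a list of refcount blocks, each block
    being a list of entries_per_block integer refcounts.
    Returns (cluster_index, refcounts_examined); cluster_index is None
    when no free cluster exists.
    """
    examined = 0
    total = len(refcount_table) * entries_per_block
    for index in range(start, total):
        # First level: pick the refcount block; second level: the entry in it.
        block = refcount_table[index // entries_per_block]
        examined += 1
        if block[index % entries_per_block] == 0:
            return index, examined
    return None, examined


if __name__ == "__main__":
    # Two blocks of four refcounts each; only cluster 6 is free.
    table = [[1, 1, 1, 1], [1, 1, 0, 1]]
    print(find_free_cluster(table, entries_per_block=4))
```

In this model the scan is proportional to the number of clusters before the first free one. With the figures from the quoted calculation, a nearly full 1 TB image has about 16 M refcounts spread over 512 blocks, so a search that starts from the beginning may have to read most of the 32 MB of refcount blocks.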

One of the troubles with qcow2 as a format is that the redundant metadata on disk is already defined as authoritative. So while in QED we can define the L1/L2 tables as the only authoritative source of information and treat a freelist as an optimization, the refcount table must remain authoritative in qcow2 in order to remain backwards compatible.
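
As an illustration of what "treat a freelist as an optimization" could mean, the sketch below rebuilds a free-cluster list purely from L1/L2 tables. The table model is an assumption made for this example (an L1 entry is either `None` or a pair of the L2 table's own cluster index and its list of data cluster indices); it is not the actual QED on-disk layout, and `rebuild_free_clusters` is a hypothetical name.

```python
def rebuild_free_clusters(l1_table, total_clusters, reserved=(0,)):
    """Derive the free clusters from the L1/L2 tables alone.

    l1_table: list of entries, each None (no L2 table) or a tuple
              (l2_cluster_index, l2_table), where l2_table is a list of
              data cluster indices or None for unallocated entries.
    reserved: clusters assumed to hold the header and L1 table.
    """
    used = set(reserved)
    for entry in l1_table:
        if entry is None:
            continue
        l2_cluster, l2_table = entry
        used.add(l2_cluster)                  # the L2 table itself
        used.update(c for c in l2_table if c is not None)  # mapped data
    # Anything not referenced by the authoritative tables is free.
    return [c for c in range(total_clusters) if c not in used]


if __name__ == "__main__":
    l1 = [(1, [2, None, 4]), None]
    print(rebuild_free_clusters(l1, total_clusters=6))  # [3, 5]
```

Because the list is derived data, a stale or missing copy can be discarded and recomputed this way without any risk to the image. That option is not available when the refcount table itself is the authoritative record.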

You could rewrite the header to be qcow3 in order to relax this restriction, but then you lose image mobility to older versions, which really negates the advantage of not introducing a new format.

Regards,

Anthony Liguori


Kevin



