Re: [Qemu-devel] QCow2 compression

From:	Kevin Wolf
Subject:	Re: [Qemu-devel] QCow2 compression
Date:	Mon, 29 Feb 2016 15:01:24 +0100
User-agent:	Mutt/1.5.21 (2010-09-15)

[ Cc: qemu-block ]

On 27.02.2016 at 06:00, address@hidden wrote:
> Hello, I am hoping someone here can help me. I am implementing QCow2
> support for a PC emulator project and have a couple questions
> regarding compression I haven't been able to figure out on my own.
> 
> First some background: I am using the information I found at
> https://people.gnome.org/~markmc/qcow-image-format.html and I have
> implemented working support for QCow2 images as described there except
> for snapshots, encryption, and compression. Of these, only compression
> is of immediate use to me.

First of all, the preferable source is the qcow2 specification from the
QEMU git repository:

http://git.qemu.org/?p=qemu.git;a=blob;f=docs/specs/qcow2.txt

The description you were using is good, but rather old. Not a problem
for the basics of compression because these things haven't ever changed,
but if you want to make sense of everything in a current image, you'll
need something more recent.

> I have some QCow2 images all using 16-bit clusters created using
> qemu-img 2.1.2 (the version bundled with Debian stable). According to
> the documentation I linked, 8 bits of an L2 table entry following the
> copy flag should say how many 512 byte sectors a compressed cluster
> takes up and the remaining bits are the actual offset of the cluster
> within the file.

The spec says this (which is essentially the same):

L2 table entry:

    Bit  0 -  61:   Cluster descriptor

              62:   0 for standard clusters
                    1 for compressed clusters

              63:   0 for a cluster that is unused or requires COW, 1 if its
                    refcount is exactly one. This information is only accurate
                    in L2 tables that are reachable from the active L1
                    table.

Compressed Clusters Descriptor (x = 62 - (cluster_bits - 8)):

    Bit  0 -  x:    Host cluster offset. This is usually _not_ aligned to a
                    cluster boundary!

       x+1 - 61:    Compressed size of the images in sectors of 512 bytes
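
To illustrate the layout above, here is a minimal Python sketch that
splits an L2 entry into its fields. It is not taken from QEMU. The
function name decode_l2_entry and the dictionary keys are made up for
this example, and it assumes the size field occupies the top
(cluster_bits - 8) bits of the descriptor, starting at bit x, which is
how the arithmetic in the question quoted below reads the entry. The
size field is returned as stored, without interpreting it further.

```python
def decode_l2_entry(entry: int, cluster_bits: int) -> dict:
    """Split a 64-bit qcow2 L2 table entry into its fields."""
    fields = {
        "copied": bool((entry >> 63) & 1),      # bit 63: refcount is exactly one
        "compressed": bool((entry >> 62) & 1),  # bit 62: compressed cluster
    }
    descriptor = entry & ((1 << 62) - 1)        # bits 0-61
    if not fields["compressed"]:
        fields["descriptor"] = descriptor
        return fields
    # Assumption: the size field is the top (cluster_bits - 8) bits of the
    # descriptor, i.e. it starts at bit x = 62 - (cluster_bits - 8).
    x = 62 - (cluster_bits - 8)
    fields["host_offset"] = descriptor & ((1 << x) - 1)
    fields["sectors"] = descriptor >> x         # raw field, 512-byte units
    return fields


# L2 entries are stored big-endian on disk.
entry = int.from_bytes(bytes.fromhex("4AC00000003D9750"), "big")
info = decode_l2_entry(entry, cluster_bits=16)
print(hex(info["host_offset"]), hex(info["sectors"]))  # 0x3d9750 0x2b
```

The usage lines feed in the entry value from the question quoted below
and reproduce the offset and sector count worked out there by hand.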

> I have for example a compressed cluster with an L2 entry value of 4A
> C0 00 00 00 3D 97 50. This would lead me to believe the cluster starts
> at offset 0x3D9750 and has a length of 0x2B 512-byte sectors (or 0x2B
> times 0x200 = 0x5600). Added to the offset this would give an end for
> the cluster at offset 0x3DED50. However, it is clear from looking at
> the image that the compressed cluster extends further, the data ending
> at 0x3DEDD5 and being followed by some zero padding until 0x3DEDF0
> where the file ends. How can I know the data extends beyond the length
> I calculated? Did I misunderstand the documentation somewhere? Why
> does the file end here versus a cluster aligned offset?

This zero padding happens in the very last cluster in the image in order
to ensure that the image file is aligned to a multiple of the cluster
size (qcow2 images are defined to consist of "units of constant size",
i.e. only full clusters).

The zeros are not part of the compressed data, though; that's why the
Compressed Cluster Descriptor indicates a shorter size. Had another
compressed cluster been written to the same image, it might have ended
up where you are seeing the zero padding now. (The trick with
compression is that multiple guest clusters can end up in a single host
cluster.)
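
As a rough illustration of that last point, the following Python sketch
lists which host clusters a compressed byte range touches. The helper
name host_clusters is made up for this example, the first range uses
the offset and length calculated in the question, and the second range
is purely hypothetical.

```python
def host_clusters(offset: int, length: int, cluster_bits: int) -> range:
    """Indices of the host clusters touched by `length` bytes at `offset`."""
    first = offset >> cluster_bits
    last = (offset + length - 1) >> cluster_bits
    return range(first, last + 1)


# Numbers from the question: 0x2B sectors starting at 0x3D9750.
a = host_clusters(0x3D9750, 0x2B * 512, cluster_bits=16)
# Hypothetical second compressed cluster written behind the first one.
b = host_clusters(0x3DEE00, 0x3000, cluster_bits=16)
print([hex(c) for c in a], [hex(c) for c in b])
# ['0x3d'] ['0x3d', '0x3e']
```

Both ranges touch host cluster 0x3d, and the hypothetical second one
also spills into 0x3e, so a compressed cluster need not stay within a
single host cluster either.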

> A final question: I noticed that compressed clusters typically have a
> reference count higher than one, yet there are no snapshots present in
> the image. I suspect the count is incremented for each compressed
> cluster that exists even partially within a normal sized cluster
> region of the file, but I can find no documentation to this effect and
> am merely speculating. Am I correct?

Yes. You have multiple L2 entries referring to the same cluster, so it
needs to have a refcount that represents this.

Once you overwrite a compressed cluster, a copy-on-write operation is
performed and the refcount is decreased. You want to free the (host)
cluster holding the compressed data only after the last L2 entry using
it has gone.
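
The bookkeeping can be modelled with a small Python sketch. This is a
toy model under stated assumptions, not QEMU's refcount code: the class
HostRefcounts and its methods are hypothetical, clusters are assumed to
be 64 KiB, and every compressed cluster is assumed to take one
reference on each host cluster it touches, even partially.

```python
from collections import Counter

CLUSTER_BITS = 16  # assumption: 64 KiB clusters, as in the question


def touched(offset: int, length: int) -> range:
    """Host cluster indices covered by a compressed byte range."""
    first = offset >> CLUSTER_BITS
    last = (offset + length - 1) >> CLUSTER_BITS
    return range(first, last + 1)


class HostRefcounts:
    """Toy refcount table keyed by host cluster index (not QEMU's code)."""

    def __init__(self) -> None:
        self.counts = Counter()

    def add_ref(self, offset: int, length: int) -> None:
        # A new L2 entry points at compressed data in this range.
        for cluster in touched(offset, length):
            self.counts[cluster] += 1

    def drop_ref(self, offset: int, length: int) -> list:
        # An L2 entry stops using this range; return host clusters now free.
        freed = []
        for cluster in touched(offset, length):
            self.counts[cluster] -= 1
            if self.counts[cluster] == 0:
                del self.counts[cluster]
                freed.append(cluster)
        return freed


refs = HostRefcounts()
refs.add_ref(0x3D0000, 0x4000)          # hypothetical first compressed cluster
refs.add_ref(0x3D9750, 0x5600)          # range calculated in the question
print(refs.counts[0x3D])                # 2
print(refs.drop_ref(0x3D0000, 0x4000))  # []
print(refs.drop_ref(0x3D9750, 0x5600))  # [61]
```

Host cluster 0x3D (61) is reported as free only once the second
reference is dropped, which matches the rule described above.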

> If it is the wrong place to ask these questions, I would appreciate it
> if anyone could direct me to a more appropriate venue. Thank you for
> taking the time to read this and thanks in advance for any assistance.

qemu-devel is fine for this kind of question. I'm also copying
qemu-block now, which makes the email thread more visible to the
relevant people (as qemu-devel is relatively high traffic these days),
but qemu-devel should always be included anyway.

Kevin
