qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] QCow2 compression


From: Eric Blake
Subject: Re: [Qemu-devel] QCow2 compression
Date: Fri, 4 Mar 2016 15:19:55 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

[any way you could convince your mailer to not break threading?]

On 03/03/2016 09:24 PM, address@hidden wrote:
>>
>> The zeros are not part of the compressed data, though, that's why the 
>> Compressed Cluster Descriptor indicates a shorter size. Had another 
>> compressed cluster been written to the same image, it might have ended 
>> up where you are seeing the zero padding now. (The trick with 
>> compression is that multiple guest clusters can end up in a single host 
>> cluster.) 
>>
>  
> Thanks, but the given length of 0x5600 is still short by 160(decimal) bytes 
> compared to the 
> non-zero data (which occupies an additional 133 bytes beyond the expected end 
> at 
> 0x3DED50) and zero 
> padding (an additional 27 bytes beyond that). Could there be an off-by-one 
> error 
> somewhere? 
> The file doesn't even end on a sector boundary let alone a cluster boundary. 

Based on an IRC conversation I had when you first asked the question, I
think the spec is indeed weak, and that we DO have some fishy code.

Look at what the code does:

uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
                                               uint64_t offset,
                                               int compressed_size)
...
    nb_csectors = ((cluster_offset + compressed_size - 1) >> 9) -
                  (cluster_offset >> 9);

    cluster_offset |= QCOW_OFLAG_COMPRESSED |
                      ((uint64_t)nb_csectors << s->csize_shift);

That sure does sound like an off-by-one.  cluster_offset does indeed
look like a byte offset (from qcow2_alloc_bytes); so let's consider what
happens if we've already allocated one compressed cluster in the past,
going from 65536 to 66999.  So on this call, cluster_offset would be
67000, and if compressed size is 1025 (just round numbers to make
discussion easy; no idea if gzip would really do this on any particular
data), we are computing ((67000+1025-1)>>9) - (67000>>9) == 2, but 1025
bytes occupies 3 sectors, not 2.

But we offset it by another off-by-one:

int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset)
{
...
        nb_csectors = ((cluster_offset >> s->csize_shift) &
s->csize_mask) + 1;

Yuck.  That is horribly ugly.

We need to fix the documentation to make it obvious that the sector
count is a _LOWER BOUND_ on the number of sectors occupied, and that you
need to read at least one more cluster's worth of data before decompressing.

It would also be nice to fix qcow2 code to not have quite so many
off-by-one computations, but be more precise about what data is going where.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]