qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v8 18/20] qcow2: Switch store_bitmap_data() to b


From: Eric Blake
Subject: Re: [Qemu-devel] [PATCH v8 18/20] qcow2: Switch store_bitmap_data() to byte-based iteration
Date: Tue, 19 Sep 2017 14:42:59 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0

On 09/19/2017 07:44 AM, Kevin Wolf wrote:
> Am 18.09.2017 um 20:58 hat Eric Blake geschrieben:
>> Now that we have adjusted the majority of the calls this function
>> makes to be byte-based, it is easier to read the code if it makes
>> passes over the image using bytes rather than sectors.
>>
>> Signed-off-by: Eric Blake <address@hidden>
>> Reviewed-by: John Snow <address@hidden>
>> Reviewed-by: Vladimir Sementsov-Ogievskiy <address@hidden>

>> @@ -1100,18 +1099,17 @@ static uint64_t *store_bitmap_data(BlockDriverState 
>> *bs,
>>      dbi = bdrv_dirty_iter_new(bitmap);
>>      buf = g_malloc(s->cluster_size);
>>      limit = bytes_covered_by_bitmap_cluster(s, bitmap);

Limit is huge.  For ease of math, let's consider an image with 512-byte
clusters, and an explicit bitmap granularity of 512 bytes.  One cluster
(512 bytes, or 4k bits) of the bitmap covers 4k multiples of the
granularity, or 2M of image; and the minimum serialization alignment of
64 bits covers 32k bytes of image.  bytes_covered_by_bitmap_cluster()
computes granularity (512 bytes per bit), limit (2M bytes per cluster of
bits), and checks that 2M is indeed aligned to
bdrv_dirty_bitmap_serialization_align() (which is 32k).

Next, an image with default 64k clusters and a default bitmap
granularity of 64k bytes; one cluster (64k bytes, or 512k bits) now
covers 512k multiples of the granularity, or 32G of image size.

However...

>> -    sbc = limit >> BDRV_SECTOR_BITS;
>>      assert(DIV_ROUND_UP(bm_size, limit) == tb_size);
>>
>> -    while ((sector = bdrv_dirty_iter_next(dbi) >> BDRV_SECTOR_BITS) >= 0) {
>> -        uint64_t cluster = sector / sbc;
>> +    while ((offset = bdrv_dirty_iter_next(dbi)) >= 0) {
>> +        uint64_t cluster = offset / limit;

bdrv_dirty_iter_next() returns the next dirty bit (which is not
necessarily the first bit in the cluster).  For the purposes of
serialization, we want to serialize the entire cluster in one go, even
though we will be serializing 0 bits up until the first dirty bit.  So
offset at this point may be unaligned,

>>          uint64_t end, write_size;
>>          int64_t off;
>>
>> -        sector = cluster * sbc;
>> -        end = MIN(bm_sectors, sector + sbc);
>> -        write_size = bdrv_dirty_bitmap_serialization_size(bitmap,
>> -            sector * BDRV_SECTOR_SIZE, (end - sector) * BDRV_SECTOR_SIZE);
>> +        offset = QEMU_ALIGN_DOWN(offset, limit);

and we round it down to the start of the cluster that we will actually
be serializing.

> 
> With the question that I asked in v6, apart from changing the spelling
> to be more explicit, I actually hoped that you would explain why
> aligning down is the right thing to do.
> 
> It looks plausible to me that we can do this in correct code because we
> don't support granularities < 512 bytes (a qemu restriction that is
> written down as a note in the qcow2 spec).

It's not granularities < 512 that we have to worry about, but dirty bits
with an offset < 4M (because we are serializing an entire cluster of
bitmap data, where the first dirty offset is not necessarily aligned to
that large).

> 
> The part that I'm missing yet is why we need to do it. The bitmap
> granularity is also the granularity of bdrv_dirty_iter_next(), so isn't
> offset already aligned and we could even assert that instead of aligning
> down? (As long we enforce our restriction, which we seem to do in
> bitmap_list_load().)

Sadly, a quick:

diff --git i/block/qcow2-bitmap.c w/block/qcow2-bitmap.c
index 302fffd6e1..160f3b99f9 100644
--- i/block/qcow2-bitmap.c
+++ w/block/qcow2-bitmap.c
@@ -1106,6 +1106,7 @@ static uint64_t
*store_bitmap_data(BlockDriverState *bs,
         uint64_t end, write_size;
         int64_t off;

+        assert(QEMU_IS_ALIGNED(offset, limit));
         offset = QEMU_ALIGN_DOWN(offset, limit);
         end = MIN(bm_size, offset + limit);
         write_size = bdrv_dirty_bitmap_serialization_size(bitmap, offset,

does NOT fail iotests 165, which appears to be the only test that
actually hammers on qcow2 bitmaps (changing it to an 'assert(false)'
only shows an effect on 165) - which means our test is NOT exercising
all possible alignments.  And it's python-based, with lame output, which
makes debugging it painful.  But never fear, for v9 I will improve the
test to actually affect the bitmap at a point that would fail with my
temporary assertion in place, and thus proving that we DO need to align
down.  Note that test 165 is testing only a 1G image, but I just showed
that 64k clusters with 64k granularity covers up to 32G of image space
in one cluster of the bitmap, so the test is only covering one cluster
of serialization in the first place, and as written, the test is
dirtying byte 0, which explains why it happens to get an offset aligned
to limit, even though that is not a valid assertion.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]