Re: [PATCH v2 3/4] qapi: blockdev-backup: add discard-source parameter
From: Fiona Ebner
Subject: Re: [PATCH v2 3/4] qapi: blockdev-backup: add discard-source parameter
Date: Thu, 25 Jan 2024 13:47:13 +0100
User-agent: Mozilla Thunderbird

On 24.01.24 at 16:03, Fiona Ebner wrote:
> On 17.01.24 at 17:07, Vladimir Sementsov-Ogievskiy wrote:
>> Add a parameter that enables discard-after-copy. This is mostly useful
>> in the "push backup with fleecing" scheme, where the source is a
>> snapshot-access format driver node based on the copy-before-write
>> filter's snapshot-access API:
>>
>> [guest]      [snapshot-access] ~~ blockdev-backup ~~> [backup target]
>>    |            |
>>    | root       | file
>>    v            v
>> [copy-before-write]
>>    |             |
>>    | file        | target
>>    v             v
>> [active disk]   [temp.img]
>>
>> In this case discard-after-copy does two things:
>>
>> - discard data in temp.img to save disk space
>> - avoid further copy-before-write operation in discarded area
>>
>> Note that we have to declare WRITE permission on source in
>> copy-before-write filter, for discard to work.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>
> Ran into another issue when the cluster_size of the fleecing image is
> larger than for the backup target, e.g.
>
>> #!/bin/bash
>> rm /tmp/fleecing.qcow2
>> ./qemu-img create /tmp/disk.qcow2 -f qcow2 1G
>> ./qemu-img create /tmp/fleecing.qcow2 -o cluster_size=2M -f qcow2 1G
>> ./qemu-img create /tmp/backup.qcow2 -f qcow2 1G
>> ./qemu-system-x86_64 --qmp stdio \
>> --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/disk.qcow2 \
>> --blockdev qcow2,node-name=node1,file.driver=file,file.filename=/tmp/fleecing.qcow2,discard=unmap \
>> --blockdev qcow2,node-name=node2,file.driver=file,file.filename=/tmp/backup.qcow2 \
>> <<EOF
>> {"execute": "qmp_capabilities"}
>> {"execute": "blockdev-add", "arguments": { "driver": "copy-before-write",
>> "file": "node0", "target": "node1", "node-name": "node3" } }
>> {"execute": "blockdev-add", "arguments": { "driver": "snapshot-access",
>> "file": "node3", "discard": "unmap", "node-name": "snap0" } }
>> {"execute": "blockdev-backup", "arguments": { "device": "snap0", "target":
>> "node2", "sync": "full", "job-id": "backup0", "discard-source": true } }
>> EOF
>
> will fail with
>
>> qemu-system-x86_64: ../util/hbitmap.c:570: hbitmap_reset: Assertion
>> `QEMU_IS_ALIGNED(count, gran) || (start + count == hb->orig_size)' failed.
>
> Backtrace shows the assert happens while discarding, when resetting the
> BDRVCopyBeforeWriteState access_bitmap
>> #6 0x0000555556142a2a in hbitmap_reset (hb=0x555557e01b80, start=0,
>> count=1048576) at ../util/hbitmap.c:570
>> #7 0x0000555555f80764 in bdrv_reset_dirty_bitmap_locked
>> (bitmap=0x55555850a660, offset=0, bytes=1048576) at
>> ../block/dirty-bitmap.c:563
>> #8 0x0000555555f807ab in bdrv_reset_dirty_bitmap (bitmap=0x55555850a660,
>> offset=0, bytes=1048576) at ../block/dirty-bitmap.c:570
>> #9 0x0000555555f7bb16 in cbw_co_pdiscard_snapshot (bs=0x5555581a7f60,
>> offset=0, bytes=1048576) at ../block/copy-before-write.c:330
>> #10 0x0000555555f8d00a in bdrv_co_pdiscard_snapshot (bs=0x5555581a7f60,
>> offset=0, bytes=1048576) at ../block/io.c:3734
>> #11 0x0000555555fd2380 in snapshot_access_co_pdiscard (bs=0x5555582b4f60,
>> offset=0, bytes=1048576) at ../block/snapshot-access.c:55
>> #12 0x0000555555f8b65d in bdrv_co_pdiscard (child=0x5555584fe790, offset=0,
>> bytes=1048576) at ../block/io.c:3144
>> #13 0x0000555555f78650 in block_copy_task_entry (task=0x555557f588f0) at
>> ../block/block-copy.c:597
>
> My guess for the cause is that in block_copy_calculate_cluster_size() we
> only look at the target. But now that we need to discard the source,
> we'll also need to consider that for the calculation?
>
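To make the failure concrete, here is a minimal standalone sketch of the condition from the failed assertion. It is not QEMU code: reset_range_ok() is a hypothetical helper, and it assumes the access_bitmap granularity equals the 2 MiB cluster size of the fleecing image, while the discard request is the 1 MiB one seen in the backtrace.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU) mirroring the condition in the
 * failed assertion: the byte count must be a multiple of the bitmap
 * granularity, unless the range ends exactly at the end of the bitmap.
 * 'gran' is the granularity in bytes and must be nonzero.
 */
static bool reset_range_ok(uint64_t start, uint64_t count,
                           uint64_t gran, uint64_t orig_size)
{
    return (count % gran == 0) || (start + count == orig_size);
}
```

With start=0, count=1048576, an assumed granularity of 2097152 and the 1G image size, the check is false, which matches the abort. A 2 MiB request at the same offset would pass.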
Just querying the source and picking the maximum won't work either,
because snapshot-access does not currently implement .bdrv_co_get_info,
and because copy-before-write (which doesn't implement .bdrv_co_get_info
either and is a filter) will just return the info of its file child. But
the discard will go to the target child.

If I do

1. .bdrv_co_get_info in snapshot-access: return info from file child
2. .bdrv_co_get_info in copy-before-write: return maximum cluster_size
   from file child and target child
3. block_copy_calculate_cluster_size: return maximum from source and target

then the issue does go away, but I don't know whether that violates any
assumptions, and there is probably a better way to avoid the issue?
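
Roughly, the three steps amount to the following propagation of cluster sizes. This is only an illustrative sketch, not the actual change: struct node_info and all function names below are hypothetical stand-ins and do not match QEMU's BlockDriverInfo or the real callbacks.

```c
#include <stdint.h>

/* Hypothetical stand-in for per-node info; not QEMU's BlockDriverInfo. */
struct node_info {
    int64_t cluster_size; /* in bytes */
};

static int64_t max_i64(int64_t a, int64_t b)
{
    return a > b ? a : b;
}

/* Step 1: snapshot-access simply reports the info of its file child. */
static struct node_info snapshot_access_info(struct node_info file_child)
{
    return file_child;
}

/* Step 2: copy-before-write reports the larger cluster size of its
 * file child and its target child, since discards go to the target. */
static struct node_info cbw_info(struct node_info file_child,
                                 struct node_info target_child)
{
    struct node_info info = {
        .cluster_size = max_i64(file_child.cluster_size,
                                target_child.cluster_size),
    };
    return info;
}

/* Step 3: the copy cluster size is the maximum of source and target. */
static int64_t copy_cluster_size(struct node_info source,
                                 struct node_info target)
{
    return max_i64(source.cluster_size, target.cluster_size);
}
```

For the reproducer (64 KiB default clusters for disk.qcow2 and backup.qcow2, 2 MiB for fleecing.qcow2), this yields a 2 MiB copy cluster size, so discard requests would be multiples of the fleecing image's cluster size.
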
Best Regards,
Fiona