[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 3/3] block: mirror - zero unallocated target sec
From: |
Kevin Wolf |
Subject: |
Re: [Qemu-devel] [PATCH 3/3] block: mirror - zero unallocated target sectors when zero init not present |
Date: |
Mon, 28 Sep 2015 17:07:52 +0200 |
User-agent: |
Mutt/1.5.21 (2010-09-15) |
Am 28.09.2015 um 05:29 hat Jeff Cody geschrieben:
> During mirror, if the target device does not have support zero
> initialization, a mirror may result in a corrupt image.
I think you want to check this sentence. ("During mirror [...], a
mirror may result [...]")
> For instance, on mirror to a host device with format = raw, whatever
> random data is on the target device will still be there for unallocated
> sectors.
>
> This is because during the mirror, we set the dirty bitmap to copy only
> sectors allocated above 'base'. In the case of target devices where we
> cannot assume unallocated sectors will be read as zeroes, we need to
> explicitely zero out this data.
>
> In order to avoid zeroing out all sectors of the target device prior to
> mirroring, we do zeroing as part of the block job. A second dirty
> bitmap cache is created, to track sectors that are unallocated above
> 'base'. These sectors are then checked for status of BDRV_BLOCK_ZERO
> on the target - if they are not, then zeroes are explicitly written.
Why do you need a bitmap? You never change the bitmap after initialising
it, so couldn't you instead just check the allocation status when you
need it?
In fact, why do we need two passes? I would have expected that commit
dcfb3beb already does the trick, with checking allocation status and
writing zeroes during the normal single pass.
If that commit fails to solve the problem, I guess I first need to
understand why before I can continue reviewing this one...
> This only occurs under two conditions:
>
> 1. 'mode' != "existing"
> 2. bdrv_has_zero_init(target) == NULL
>
> We perform the mirroring through mirror_iteration() as before, except
> in two passes. If the above two conditions are met, the first pass
> is using the bitmap tracking unallocated sectors, to write the needed
> zeroes. Then, the second pass is performed, to mirror the actual data
> as before.
>
> If the above two conditions are not met, then the first pass is skipped,
> and only the second pass (the one with the actual data) is performed.
>
> Signed-off-by: Jeff Cody <address@hidden>
> ---
> block/mirror.c | 109
> ++++++++++++++++++++++++++++++++++------------
> blockdev.c | 2 +-
> include/block/block_int.h | 3 +-
> qapi/block-core.json | 6 ++-
> 4 files changed, 87 insertions(+), 33 deletions(-)
>
> diff --git a/block/mirror.c b/block/mirror.c
> index 405e5c4..b599176 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -50,7 +50,9 @@ typedef struct MirrorBlockJob {
> int64_t bdev_length;
> unsigned long *cow_bitmap;
> BdrvDirtyBitmap *dirty_bitmap;
> - HBitmapIter hbi;
> + HBitmapIter zero_hbi;
> + HBitmapIter allocated_hbi;
> + HBitmapIter *hbi;
> uint8_t *buf;
> QSIMPLEQ_HEAD(, MirrorBuffer) buf_free;
> int buf_free_count;
> @@ -60,6 +62,8 @@ typedef struct MirrorBlockJob {
> int sectors_in_flight;
> int ret;
> bool unmap;
> + bool zero_unallocated;
> + bool zero_cycle;
> bool waiting_for_io;
> } MirrorBlockJob;
>
> @@ -166,10 +170,10 @@ static uint64_t coroutine_fn
> mirror_iteration(MirrorBlockJob *s)
> int pnum;
> int64_t ret;
>
> - s->sector_num = hbitmap_iter_next(&s->hbi);
> + s->sector_num = hbitmap_iter_next(s->hbi);
> if (s->sector_num < 0) {
> - bdrv_dirty_iter_init(s->dirty_bitmap, &s->hbi);
> - s->sector_num = hbitmap_iter_next(&s->hbi);
> + bdrv_dirty_iter_init(s->dirty_bitmap, s->hbi);
> + s->sector_num = hbitmap_iter_next(s->hbi);
> trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap));
> assert(s->sector_num >= 0);
> }
> @@ -287,7 +291,7 @@ static uint64_t coroutine_fn
> mirror_iteration(MirrorBlockJob *s)
> */
> if (next_sector > hbitmap_next_sector
> && bdrv_get_dirty(source, s->dirty_bitmap, next_sector)) {
> - hbitmap_next_sector = hbitmap_iter_next(&s->hbi);
> + hbitmap_next_sector = hbitmap_iter_next(s->hbi);
> }
>
> next_sector += sectors_per_chunk;
> @@ -300,25 +304,34 @@ static uint64_t coroutine_fn
> mirror_iteration(MirrorBlockJob *s)
> s->sectors_in_flight += nb_sectors;
> trace_mirror_one_iteration(s, sector_num, nb_sectors);
>
> - ret = bdrv_get_block_status_above(source, NULL, sector_num,
> - nb_sectors, &pnum);
> - if (ret < 0 || pnum < nb_sectors ||
> - (ret & BDRV_BLOCK_DATA && !(ret & BDRV_BLOCK_ZERO))) {
> - bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
> - mirror_read_complete, op);
> - } else if (ret & BDRV_BLOCK_ZERO) {
> - bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
> - s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
> - mirror_write_complete, op);
> + if (s->zero_cycle) {
> + ret = bdrv_get_block_status(s->target, sector_num, nb_sectors,
> &pnum);
> + if (!(ret & BDRV_BLOCK_ZERO)) {
> + bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
> + s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
> + mirror_write_complete, op);
> + }
It seems to be expected that this function always involves an AIO
request and the completion event is what helps making progress. For the
BDRV_BLOCK_ZERO case, we don't do that however. I'm not sure what
exactly this means, but at least I think we are applying block job
throttling to doing nothing with some areas of the image.
> } else {
> - assert(!(ret & BDRV_BLOCK_DATA));
> - bdrv_aio_discard(s->target, sector_num, op->nb_sectors,
> - mirror_write_complete, op);
> + ret = bdrv_get_block_status_above(source, NULL, sector_num,
> + nb_sectors, &pnum);
> + if (ret < 0 || pnum < nb_sectors ||
> + (ret & BDRV_BLOCK_DATA && !(ret & BDRV_BLOCK_ZERO))) {
> + bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
> + mirror_read_complete, op);
> + } else if (ret & BDRV_BLOCK_ZERO) {
> + bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
> + s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
> + mirror_write_complete, op);
> + } else {
> + assert(!(ret & BDRV_BLOCK_DATA));
> + bdrv_aio_discard(s->target, sector_num, op->nb_sectors,
> + mirror_write_complete, op);
> + }
> }
> return delay_ns;
> }
Kevin
- Re: [Qemu-devel] [Qemu-block] [PATCH 1/3] block: allow creation of detached dirty bitmaps, (continued)
- [Qemu-devel] [PATCH 2/3] block: mirror - split out part of mirror_run(), Jeff Cody, 2015/09/27
- [Qemu-devel] [PATCH 3/3] block: mirror - zero unallocated target sectors when zero init not present, Jeff Cody, 2015/09/27
- Re: [Qemu-devel] [PATCH 3/3] block: mirror - zero unallocated target sectors when zero init not present,
Kevin Wolf <=
- Re: [Qemu-devel] [PATCH 3/3] block: mirror - zero unallocated target sectors when zero init not present, Kevin Wolf, 2015/09/29
- Re: [Qemu-devel] [Qemu-block] [PATCH 3/3] block: mirror - zero unallocated target sectors when zero init not present, Stefan Hajnoczi, 2015/09/29
- Re: [Qemu-devel] [Qemu-block] [PATCH 3/3] block: mirror - zero unallocated target sectors when zero init not present, Max Reitz, 2015/09/29