qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH v4] migration/block: use blk_pwrite_zeroes for e


From: Stefan Hajnoczi
Subject: Re: [Qemu-block] [PATCH v4] migration/block: use blk_pwrite_zeroes for each zero cluster
Date: Tue, 11 Apr 2017 16:59:58 +0100
User-agent: Mutt/1.8.0 (2017-02-23)

On Tue, Apr 11, 2017 at 08:05:12PM +0800, address@hidden wrote:
> From: Lidong Chen <address@hidden>
> 
> BLOCK_SIZE is (1 << 20), qcow2 cluster size is 65536 by default,
> this maybe cause the qcow2 file size is bigger after migration.
> This patch check each cluster, use blk_pwrite_zeroes for each
> zero cluster.
> 
> Signed-off-by: Lidong Chen <address@hidden>
> ---
>  migration/block.c | 33 +++++++++++++++++++++++++++++++--
>  1 file changed, 31 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/block.c b/migration/block.c
> index 7734ff7..5d0635a 100644
> --- a/migration/block.c
> +++ b/migration/block.c
> @@ -885,6 +885,8 @@ static int block_load(QEMUFile *f, void *opaque, int 
> version_id)
>      int64_t total_sectors = 0;
>      int nr_sectors;
>      int ret;
> +    BlockDriverInfo bdi;
> +    int cluster_size;
>  
>      do {
>          addr = qemu_get_be64(f);
> @@ -919,6 +921,15 @@ static int block_load(QEMUFile *f, void *opaque, int 
> version_id)
>                      error_report_err(local_err);
>                      return -EINVAL;
>                  }
> +
> +                ret = bdrv_get_info(blk_bs(blk), &bdi);
> +                if (ret == 0 && bdi.cluster_size > 0 &&
> +                    bdi.cluster_size <= BLOCK_SIZE &&
> +                    BLOCK_SIZE % bdi.cluster_size == 0) {
> +                    cluster_size = bdi.cluster_size;
> +                } else {
> +                    cluster_size = BLOCK_SIZE;

This is a nice trick to unify code paths.  It has a disadvantage though:

If the "zero blocks" migration capability is enabled and the drive has
no cluster_size (e.g. raw files), then the source QEMU process has
already scanned for zeroes.  CPU is wasted scanning for zeroes in the
destination QEMU process.

Given that disk images can be large we should probably avoid unnecessary
scanning.  This is especially true because there's no other reason
(besides zero detection) to pollute the CPU cache with data from the
disk image.

In other words, we should only scan for zeroes when
!block_mig_state.zero_blocks || cluster_size < BLOCK_SIZE.

> +                }
>              }
>  
>              if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
> @@ -932,10 +943,28 @@ static int block_load(QEMUFile *f, void *opaque, int 
> version_id)
>                                          nr_sectors * BDRV_SECTOR_SIZE,
>                                          BDRV_REQ_MAY_UNMAP);
>              } else {
> +                int i;
> +                int64_t cur_addr;
> +                uint8_t *cur_buf;
> +
>                  buf = g_malloc(BLOCK_SIZE);
>                  qemu_get_buffer(f, buf, BLOCK_SIZE);
> -                ret = blk_pwrite(blk, addr * BDRV_SECTOR_SIZE, buf,
> -                                 nr_sectors * BDRV_SECTOR_SIZE, 0);
> +                for (i = 0; i < BLOCK_SIZE / cluster_size; i++) {
> +                    cur_addr = addr * BDRV_SECTOR_SIZE + i * cluster_size;
> +                    cur_buf = buf + i * cluster_size;
> +
> +                    if (buffer_is_zero(cur_buf, cluster_size)) {
> +                        ret = blk_pwrite_zeroes(blk, cur_addr,
> +                                                cluster_size,
> +                                                BDRV_REQ_MAY_UNMAP);
> +                    } else {
> +                        ret = blk_pwrite(blk, cur_addr, cur_buf,
> +                                         cluster_size, 0);
> +                    }
> +                    if (ret < 0) {
> +                        break;
> +                    }
> +                }
>                  g_free(buf);
>              }

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]