From: 858585 jemmy
Subject: Re: [Qemu-devel] [PATCH v2] migration/block: use blk_pwrite_zeroes for each zero cluster
Date: Mon, 10 Apr 2017 10:02:23 +0800

On Mon, Apr 10, 2017 at 9:47 AM, Fam Zheng <address@hidden> wrote:
> On Sat, 04/08 21:29, 858585 jemmy wrote:
>> On Sat, Apr 8, 2017 at 12:52 PM, 858585 jemmy <address@hidden> wrote:
>> > On Fri, Apr 7, 2017 at 6:10 PM, Fam Zheng <address@hidden> wrote:
>> >> On Fri, 04/07 16:44, address@hidden wrote:
>> >>> From: Lidong Chen <address@hidden>
>> >>>
>> >>> BLOCK_SIZE is (1 << 20), while the qcow2 cluster size is 65536 by default,
>> >>> so the qcow2 file may end up larger after migration.
>> >>> This patch checks each cluster and uses blk_pwrite_zeroes for each
>> >>> zero cluster.
>> >>>
>> >>> Signed-off-by: Lidong Chen <address@hidden>
>> >>> ---
>> >>>  migration/block.c | 37 +++++++++++++++++++++++++++++++++++--
>> >>>  1 file changed, 35 insertions(+), 2 deletions(-)
>> >>>
>> >>> diff --git a/migration/block.c b/migration/block.c
>> >>> index 7734ff7..c32e046 100644
>> >>> --- a/migration/block.c
>> >>> +++ b/migration/block.c
>> >>> @@ -885,6 +885,11 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>> >>>      int64_t total_sectors = 0;
>> >>>      int nr_sectors;
>> >>>      int ret;
>> >>> +    int i;
>> >>> +    int64_t addr_offset;
>> >>> +    uint8_t *buf_offset;
>> >>
>> >> Poor variable names; they are not offsets. Maybe "cur_addr" and "cur_buf"?
>> >> And they can be moved into the loop block below.
>> > OK, I will change them.
>> >
>> >>
>> >>> +    BlockDriverInfo bdi;
>> >>> +    int cluster_size;
>> >>>
>> >>>      do {
>> >>>          addr = qemu_get_be64(f);
>> >>> @@ -934,8 +939,36 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>> >>>              } else {
>> >>>                  buf = g_malloc(BLOCK_SIZE);
>> >>>                  qemu_get_buffer(f, buf, BLOCK_SIZE);
>> >>> -                ret = blk_pwrite(blk, addr * BDRV_SECTOR_SIZE, buf,
>> >>> -                                 nr_sectors * BDRV_SECTOR_SIZE, 0);
>> >>> +
>> >>> +                ret = bdrv_get_info(blk_bs(blk), &bdi);
>> >>> +                cluster_size = bdi.cluster_size;
>> >>> +
>> >>> +                if (ret == 0 && cluster_size > 0 &&
>> >>> +                    cluster_size < BLOCK_SIZE &&
>> >>
>> >> I think cluster_size == BLOCK_SIZE should work too.
>> > In this case (flags & BLK_MIG_FLAG_ZERO_BLOCK) should be true, so
>> > blk_pwrite_zeroes is already invoked even without this patch.
>> > But the source qemu may not have the zero flag enabled,
>> > so I think cluster_size <= BLOCK_SIZE is OK.
>> >
>> >>
>> >>> +                    BLOCK_SIZE % cluster_size == 0) {
>> >>> +                    for (i = 0; i < BLOCK_SIZE / cluster_size; i++) {
>> >>> +                        addr_offset = addr * BDRV_SECTOR_SIZE
>> >>> +                                        + i * cluster_size;
>> >>> +                        buf_offset = buf + i * cluster_size;
>> >>> +
>> >>> +                        if (buffer_is_zero(buf_offset, cluster_size)) {
>> >>> +                            ret = blk_pwrite_zeroes(blk, addr_offset,
>> >>> +                                                    cluster_size,
>> >>> +                                                    BDRV_REQ_MAY_UNMAP);
>> >>> +                        } else {
>> >>> +                             ret = blk_pwrite(blk, addr_offset, buf_offset,
>> >>> +                                              cluster_size, 0);
>> >>> +                        }
>> >>> +
>> >>> +                        if (ret < 0) {
>> >>> +                            g_free(buf);
>> >>> +                            return ret;
>> >>> +                        }
>> >>> +                    }
>> >>> +                } else {
>> >>> +                    ret = blk_pwrite(blk, addr * BDRV_SECTOR_SIZE, buf,
>> >>> +                                     nr_sectors * BDRV_SECTOR_SIZE, 0);
>> >>> +                }
>> >>>                  g_free(buf);
>> >>>              }
>> >>>
>> >>> --
>> >>> 1.8.3.1
>> >>>
>> >>
>> >> Is it possible to use the (source) cluster size as the transfer chunk
>> >> size, instead of BDRV_SECTORS_PER_DIRTY_CHUNK? Then the existing
>> >> BLK_MIG_FLAG_ZERO_BLOCK logic can help and you don't need to send zero
>> >> bytes on the wire. This may still not be optimal if dest has a larger
>> >> cluster, but it should cover the common use case well.
>> >
>> > Yes, I also think BDRV_SECTORS_PER_DIRTY_CHUNK is too large.
>> > This has two disadvantages:
>> > 1. It makes the dest qcow2 file larger after migration.
>> > 2. It transfers unnecessary data, which may prevent the migration
>> > from completing.
>> >
>> > In my production environment, some VMs write only 2MB/s and the dirty
>> > block migration speed is 70MB/s,
>> > but the migration still times out.
>> >
>> > But if we change the size of BDRV_SECTORS_PER_DIRTY_CHUNK, it will
>> > break the protocol:
>> > an old version of qemu will not be able to migrate to a new version of qemu,
>> > because the stream carries no information about the length of the
>> > migration buffer.
>> >
>> > So I think we should add a new flag to indicate that there is an
>> > additional byte giving the length
>> > of the migration buffer. I will send another patch later and test the result.
>>
>> Hi Fam:
>> Do we need to consider the case of migrating from a new qemu version
>> to an old qemu version?
>
> Yes, usually we use a subsection to achieve that - missing the "chunk size"
> field should result in using the old BDRV_SECTORS_PER_DIRTY_CHUNK value.

OK, I will develop a prototype first and send out the performance
test results later.
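
To make the compatibility idea above concrete, here is a minimal standalone
sketch, not QEMU code. It assumes the existing header layout in which the
flags occupy the low bits of the 64-bit address word below the sector
number, and it invents a flag value (SKETCH_FLAG_HAS_LEN) and a length field
counted in sectors purely for illustration. Whether the chunk size would
really travel behind a new flag or in a subsection is left open in the
thread, and every SKETCH_/sketch_ name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SECTOR_BITS 9                       /* 512-byte sectors */
/* Old fixed chunk: 1 MiB expressed in sectors (2048). */
#define SKETCH_DEFAULT_CHUNK_SECTORS ((1u << 20) >> SKETCH_SECTOR_BITS)
/* Hypothetical flag: "a chunk length field follows". Not a real QEMU flag. */
#define SKETCH_FLAG_HAS_LEN 0x10u

typedef struct {
    uint64_t sector;   /* start sector of the chunk */
    unsigned flags;    /* low bits of the header word */
} SketchHeader;

/* Split the 64-bit header word into sector number and flag bits. */
static SketchHeader sketch_decode(uint64_t raw)
{
    SketchHeader h;
    h.flags = (unsigned)(raw & ((1u << SKETCH_SECTOR_BITS) - 1));
    h.sector = raw >> SKETCH_SECTOR_BITS;
    return h;
}

/*
 * Pick the chunk size for this header. A stream without the (assumed)
 * length field falls back to the old fixed value, so an old sender is
 * still understood.
 */
static uint32_t sketch_chunk_sectors(unsigned flags, uint32_t len_field)
{
    assert((flags >> SKETCH_SECTOR_BITS) == 0);   /* flags fit in low bits */
    if ((flags & SKETCH_FLAG_HAS_LEN) && len_field != 0) {
        return len_field;
    }
    return SKETCH_DEFAULT_CHUNK_SECTORS;
}
```

A receiver would call sketch_decode() on each header word and pass the
resulting flags to sketch_chunk_sectors(); the migration in the other
direction (new to old) is not covered by this sketch.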

>
> Fam
>
>>
>> >
>> > This patch is also valuable: there are many old versions of qemu in my
>> > production environment,
>> > and they will benefit from this patch.
>> >
>> >>
>> >> Fam
>>
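
The per-cluster check in the quoted patch can be tried outside QEMU with a
small standalone sketch. It only counts how many clusters of a received
block are entirely zero, which is the set the patch would hand to
blk_pwrite_zeroes instead of blk_pwrite. The sizes come from the patch
description; the SKETCH_/sketch_ names are made up for this example, and the
plain byte loop merely stands in for QEMU's optimized buffer_is_zero().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BLOCK_SIZE   (1u << 20)   /* migration chunk: 1 MiB */
#define SKETCH_CLUSTER_SIZE 65536u       /* qcow2 default cluster size */

/* Plain byte scan; a stand-in for QEMU's buffer_is_zero(). */
static int sketch_is_zero(const uint8_t *p, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/*
 * Return the number of all-zero clusters in buf, or -1 when the block
 * cannot be split into whole clusters (the case where the patch falls
 * back to a single blk_pwrite of the full block).
 */
static int sketch_count_zero_clusters(const uint8_t *buf, size_t block_size,
                                      size_t cluster_size)
{
    assert(buf != NULL);
    if (cluster_size == 0 || cluster_size > block_size ||
        block_size % cluster_size != 0) {
        return -1;
    }

    int zero_clusters = 0;
    for (size_t i = 0; i < block_size / cluster_size; i++) {
        if (sketch_is_zero(buf + i * cluster_size, cluster_size)) {
            zero_clusters++;
        }
    }
    return zero_clusters;
}
```

With a 1 MiB block and 64 KiB clusters there are 16 clusters per block, so
a block that has data in only two of them would leave 14 clusters that need
no allocation on the destination.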


