Re: [Qemu-devel] [PATCH 7/8] block/qcow2: Speed up zero cluster expansion

From:	Eric Blake
Subject:	Re: [Qemu-devel] [PATCH 7/8] block/qcow2: Speed up zero cluster expansion
Date:	Wed, 30 Jul 2014 10:14:04 -0600
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0

On 07/25/2014 12:07 PM, Max Reitz wrote:
> Actually, we do not need to allocate a new data cluster for every zero
> cluster to be expanded: It is completely sufficient to rely on qcow2's
> COW part and instead create a single zero cluster and reuse it as much
> as possible.
> 
> Signed-off-by: Max Reitz <address@hidden>
> ---
>  block/qcow2-cluster.c | 119 ++++++++++++++++++++++++++++++++++++++------------
>  1 file changed, 92 insertions(+), 27 deletions(-)
> 
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 905beb6..867db03 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -1558,6 +1558,9 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
>      BDRVQcowState *s = bs->opaque;
>      bool is_active_l1 = (l1_table == s->l1_table);
>      uint64_t *l2_table = NULL;
> +    int64_t zeroed_cluster_offset = 0;
> +    int zeroed_cluster_refcount = 0;
> +    int last_zeroed_cluster_l1i = 0, last_zeroed_cluster_l2i = 0;
>      int ret;
>      int i, j;
>  
> @@ -1617,47 +1620,79 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
>                      continue;
>                  }
>  
> -                offset = qcow2_alloc_clusters(bs, s->cluster_size);
> -                if (offset < 0) {
> -                    ret = offset;
> -                    goto fail;
> +                if (zeroed_cluster_offset) {
> +                    zeroed_cluster_refcount += l2_refcount;
> +                    if (zeroed_cluster_refcount > 0xffff) {

Doesn't the qcow2 file format allow a variable refcount width
(refcount_order, header bytes 96-99)?  If so, you should be using the
limit computed from the header rather than hard-coding the assumption
that the image uses the default 16-bit refcounts. [Yeah, I know, we
don't yet have code that supports non-default widths, even though the
file format documents them, but that doesn't mean we should make it
harder to add support down the road...]
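
To make the point above concrete, here is a minimal sketch of deriving the
limit from the header field instead of hard-coding 0xffff. The helper name
max_refcount_for_order() is hypothetical and is not an existing QEMU function;
the sketch only assumes what the qcow2 specification states, namely that the
refcount width is 1 << refcount_order bits and that the order may not exceed 6.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): return the largest refcount that
 * fits in an entry of width (1 << refcount_order) bits.
 */
static uint64_t max_refcount_for_order(unsigned refcount_order)
{
    unsigned bits;

    /* The format caps refcount_order at 6, i.e. 64-bit entries. */
    assert(refcount_order <= 6);
    bits = 1u << refcount_order;

    if (bits == 64) {
        /* Shifting a 64-bit value by 64 is undefined, so special-case it. */
        return UINT64_MAX;
    }
    return (UINT64_C(1) << bits) - 1;
}
```

With the default refcount_order of 4 this yields 0xffff, so the comparison in
the patch could be written against this value without changing behavior for
existing images.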

> +                        zeroed_cluster_refcount = 0;
> +                        zeroed_cluster_offset = 0;
> +                    }
>                  }

This isn't a maximal packing.  As long as we don't mind complexity to
gain compactness, couldn't we also expand the existing
zeroed_cluster_offset all the way up to full refcount, and decrement
l2_refcount by the difference, before spilling over to allocating the
next zero cluster?
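
The packing idea above can be sketched as pure bookkeeping. This models only
the counting: it assumes, without checking it against the real L2 handling,
that the references being added can be split between the current zero cluster
and a freshly allocated one. ZeroClusterState and pack_zero_refs() are
hypothetical names for illustration; real code would also have to allocate and
zero the new cluster.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for the zero cluster currently being reused. */
typedef struct ZeroClusterState {
    bool have_cluster;      /* false until the first zero cluster exists */
    uint64_t refcount;      /* references placed on the current cluster */
    uint64_t max_refcount;  /* limit derived from the image header */
} ZeroClusterState;

/*
 * Account for `needed` additional references, filling the current cluster
 * up to max_refcount before spilling over to a new one.
 * Returns the number of new zero clusters the caller would have to allocate.
 */
static uint64_t pack_zero_refs(ZeroClusterState *st, uint64_t needed)
{
    uint64_t new_clusters = 0;

    assert(st->max_refcount > 0);

    while (needed > 0) {
        uint64_t room, take;

        if (!st->have_cluster || st->refcount == st->max_refcount) {
            /* No cluster yet, or the current one is full: start a new one. */
            st->have_cluster = true;
            st->refcount = 0;
            new_clusters++;
        }

        room = st->max_refcount - st->refcount;
        take = needed < room ? needed : room;
        st->refcount += take;
        needed -= take;
    }
    return new_clusters;
}
```

By contrast, the quoted hunk abandons the current cluster as soon as adding
l2_refcount would exceed the limit, which leaves its remaining capacity unused.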

Also, I have to wonder - since the all-zero cluster is the most likely
cluster to have a large refcount, even during normal runtime, should we
special case the normal qcow2 write code to track the current all-zero
cluster (if any), and merely increase its refcount rather than allocate
a new cluster any time it is detected that an all-zero cluster is
needed?  [Of course, the tracking would be runtime only, since
the compat=0.10 header doesn't provide any way to track the location of an
all-zero cluster across file reloads.  Each new runtime would probably
settle on a new location for the all-zero cluster used during that run,
rather than trying to find an existing one.  And there's really no point
to adding a header to track an all-zero cluster in compat=1.1 images,
since those images already have the ability to track zero clusters
without needing one allocated.]

> +                if (!zeroed_cluster_offset) {
> +                    offset = qcow2_alloc_clusters(bs, s->cluster_size);
> +                    if (offset < 0) {
> +                        ret = offset;
> +                        goto fail;
> +                    }
>  
> -                if (l2_refcount > 1) {
> -                    /* For shared L2 tables, set the refcount accordingly (it is
> -                     * already 1 and needs to be l2_refcount) */
> -                    ret = qcow2_update_cluster_refcount(bs,
> -                            offset >> s->cluster_bits, l2_refcount - 1,
> -                            QCOW2_DISCARD_OTHER);
> +                    ret = qcow2_pre_write_overlap_check(bs, 0, offset,
> +                                                        s->cluster_size);
> +                    if (ret < 0) {
> +                        qcow2_free_clusters(bs, offset, s->cluster_size,
> +                                            QCOW2_DISCARD_OTHER);
> +                        goto fail;
> +                    }
> +
> +                    ret = bdrv_write_zeroes(bs->file, offset / BDRV_SECTOR_SIZE,
> +                                            s->cluster_sectors, 0);

That is, if bdrv_write_zeroes knew how to reuse an already existing
all-zero cluster, this code would need less special casing, yet we would
still get the benefit of maximizing the refcount during the amend
operation, provided all expanded clusters go through bdrv_write_zeroes.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org