qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v5 4/5] qcow2: Don't allow overflow during clust


From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH v5 4/5] qcow2: Don't allow overflow during cluster allocation
Date: Wed, 25 Apr 2018 16:44:12 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0

On 2018-04-24 00:33, Eric Blake wrote:
> Our code was already checking that we did not attempt to
> allocate more clusters than what would fit in an INT64 (the
> physical maximimum if we can access a full off_t's worth of
> data).  But this does not catch smaller limits enforced by
> various spots in the qcow2 image description: L1 and normal
> clusters of L2 are documented as having bits 63-56 reserved
> for other purposes, capping our maximum offset at 64PB (bit
> 55 is the maximum bit set).  And for compressed images with
> 2M clusters, the cap drops the maximum offset to bit 48, or
> a maximum offset of 512TB.  If we overflow that offset, we
> would write compressed data into one place, but try to
> decompress from another, which won't work.
> 
> I don't have 512TB handy to prove whether things break if we
> compress so much data that we overflow that limit, and don't
> think that iotests can (quickly) test it either.  Test 138
> comes close (it corrupts an image into thinking something lives
> at 32PB, which is half the maximum for L1 sizing - although
> it relies on 512-byte clusters).  But that test points out
> that we will generally hit other limits first (such as running
> out of memory for the refcount table, or exceeding file system
> limits like 16TB on ext4, etc), so this is more a theoretical
> safety valve than something likely to be hit.

You don't need 512 TB, though, 36 MB is sufficient.

Here's what you do:
(1) Create a 513 TB image with cluster_size=2M,refcount_bits=1
(2) Take a hex editor and enter 16 refblocks into the reftable
(3) Fill all of those refblocks with 1s

(Funny side note: qemu-img check thinks that image is clean because it
doesn't check refcounts beyond the image end...)

I've attached a compressed test image (unsurprisingly, it compresses
really well).

Before this series:
$ ./qemu-io -c 'write -c 0 2M' test.qcow2
qcow2: Marking image as corrupt: Preventing invalid write on metadata
(overlaps with refcount block); further corruption events will be suppressed
write failed: Input/output error

Aw.

After this series:
$ ./qemu-io -c 'write -c 0 2M' test.qcow2
write failed: Input/output error

(Normal writes just work fine.)


Maybe you want to add a test still -- creating the image is rather quick
(well, you have to write 64 MB of 1s, but other than that).  The only
thing that takes a bit of time is qemu figuring out where the first free
cluster is...  That takes like 15 seconds here.

And another issue of course is...

$ ls -lhs test.qcow2
42M -rw-r--r--. 1 maxx maxx 513T 25. Apr 16:42 test.qcow2

Yeah, that.  Depends on the host file system, of course, whether that is
a real issue. O:-)

Max

> Signed-off-by: Eric Blake <address@hidden>
> Reviewed-by: Alberto Garcia <address@hidden>
> 
> ---
> v3: use s->cluster_offset_mask instead of open-coding it [Berto],
> use MIN() to clamp offset on small cluster size, add spec patch
> first to justify clamping even on refcount allocations
> ---
>  block/qcow2.h          |  6 ++++++
>  block/qcow2-refcount.c | 21 ++++++++++++++-------
>  2 files changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/block/qcow2.h b/block/qcow2.h
> index 1df15a18aa1..a3b9faa9d53 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -41,6 +41,12 @@
>  #define QCOW_MAX_CRYPT_CLUSTERS 32
>  #define QCOW_MAX_SNAPSHOTS 65536
> 
> +/* Field widths in qcow2 mean normal cluster offsets cannot reach
> + * 64PB; depending on cluster size, compressed clusters can have a
> + * smaller limit (64PB for up to 16k clusters, then ramps down to
> + * 512TB for 2M clusters).  */
> +#define QCOW_MAX_CLUSTER_OFFSET ((1ULL << 56) - 1)
> +
>  /* 8 MB refcount table is enough for 2 PB images at 64k cluster size
>   * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
>  #define QCOW_MAX_REFTABLE_SIZE 0x800000
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index 6bfc11bb48f..fcbea3c9644 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -31,7 +31,8 @@
>  #include "qemu/bswap.h"
>  #include "qemu/cutils.h"
> 
> -static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size);
> +static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size,
> +                                    uint64_t max);
>  static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
>                              int64_t offset, int64_t length, uint64_t addend,
>                              bool decrease, enum qcow2_discard_type type);
> @@ -362,7 +363,8 @@ static int alloc_refcount_block(BlockDriverState *bs,
>      }
> 
>      /* Allocate the refcount block itself and mark it as used */
> -    int64_t new_block = alloc_clusters_noref(bs, s->cluster_size);
> +    int64_t new_block = alloc_clusters_noref(bs, s->cluster_size,
> +                                             QCOW_MAX_CLUSTER_OFFSET);
>      if (new_block < 0) {
>          return new_block;
>      }
> @@ -954,7 +956,8 @@ int qcow2_update_cluster_refcount(BlockDriverState *bs,
> 
> 
>  /* return < 0 if error */
> -static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size)
> +static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size,
> +                                    uint64_t max)
>  {
>      BDRVQcow2State *s = bs->opaque;
>      uint64_t i, nb_clusters, refcount;
> @@ -979,9 +982,9 @@ retry:
>      }
> 
>      /* Make sure that all offsets in the "allocated" range are representable
> -     * in an int64_t */
> +     * in the requested max */
>      if (s->free_cluster_index > 0 &&
> -        s->free_cluster_index - 1 > (INT64_MAX >> s->cluster_bits))
> +        s->free_cluster_index - 1 > (max >> s->cluster_bits))
>      {
>          return -EFBIG;
>      }
> @@ -1001,7 +1004,7 @@ int64_t qcow2_alloc_clusters(BlockDriverState *bs, 
> uint64_t size)
> 
>      BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC);
>      do {
> -        offset = alloc_clusters_noref(bs, size);
> +        offset = alloc_clusters_noref(bs, size, QCOW_MAX_CLUSTER_OFFSET);
>          if (offset < 0) {
>              return offset;
>          }
> @@ -1083,7 +1086,11 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int 
> size)
>      free_in_cluster = s->cluster_size - offset_into_cluster(s, offset);
>      do {
>          if (!offset || free_in_cluster < size) {
> -            int64_t new_cluster = alloc_clusters_noref(bs, s->cluster_size);
> +            int64_t new_cluster;
> +
> +            new_cluster = alloc_clusters_noref(bs, s->cluster_size,
> +                                               MIN(s->cluster_offset_mask,
> +                                                   QCOW_MAX_CLUSTER_OFFSET));
>              if (new_cluster < 0) {
>                  return new_cluster;
>              }
> 

Attachment: test.qcow2.xz
Description: application/xz

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]