qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 1/2] qcow2: handle discard-no-unref in measure


From: Hanna Czenczek
Subject: Re: [PATCH v2 1/2] qcow2: handle discard-no-unref in measure
Date: Wed, 10 Jul 2024 14:58:57 +0200
User-agent: Mozilla Thunderbird

On 05.06.24 15:25, Jean-Louis Dupond wrote:
When doing a measure on an image with a backing file and
discard-no-unref is enabled, the code should take this into account.

That doesn’t make sense to me.  As far as I understand, 'measure' is supposed to report how much space you need for a given image, i.e. if you were to convert it to a new image.  discard-no-unref doesn’t factor into that, because for a 'convert' target (a new image), nothing can be discarded.

Reading the issue, I understand that oVirt uses measure to determine the size of the target of a 'commit' operation.  Seems a bit like abuse to me, precisely because of the issue you’re facing.  More specifically, a 'commit' operation is a complex thing with a lot of variables, so the outcome depends on a lot.

For example, this patch just checks the discard-no-unref setting on the top image.  But AFAIU it doesn’t matter what the setting on the top image is, it matters what the setting on the commit target is. 'measure' can’t know this because it doesn’t know what the commit target is.  As far as I can see, this patch actually assumes the commit target is the first backing image (it specifically checks in the image whether a block is allocated) – why?

So to me that means if 'measure' is supposed to give reliable data on the commit case, it needs to be extended.  Best thing I can come up with off the top of my head would be to add an option e.g. 'commit=<target-node-name>', so we (A) that we’re looking at a commit and not a convert, and (B) we know what data will be collapsed into which image and where we need to check for discard-no-unref.

Hanna

If for example you have a snapshot image with a base, and you do a
discard within the snapshot, it will be ZERO and ALLOCATED, but without
host offset.
Now if we commit this snapshot, and the clusters in the base image have
a host offset, the clusters will only be set to ZERO, but the host offset
will not be cleared.
Therefor non-data clusters in the top image need to check the
base to see if space will be freed or not, to have a correct measure
output.

Bug-Url: https://gitlab.com/qemu-project/qemu/-/issues/2369
Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be>
---
  block/qcow2.c | 32 +++++++++++++++++++++++++++++---
  1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 956128b409..50354e5b98 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -5163,9 +5163,16 @@ static BlockMeasureInfo *qcow2_measure(QemuOpts *opts, 
BlockDriverState *in_bs,
          } else {
              int64_t offset;
              int64_t pnum = 0;
+            BlockDriverState *parent = bdrv_filter_or_cow_bs(in_bs);
+            BDRVQcow2State *s = NULL;
+
+            if (parent) {
+                s = parent->opaque;
+            }
for (offset = 0; offset < ssize; offset += pnum) {
                  int ret;
+                int retp = 0;
ret = bdrv_block_status_above(in_bs, NULL, offset,
                                                ssize - offset, &pnum, NULL,
@@ -5176,10 +5183,29 @@ static BlockMeasureInfo *qcow2_measure(QemuOpts *opts, 
BlockDriverState *in_bs,
                      goto err;
                  }
- if (ret & BDRV_BLOCK_ZERO) {
+                /* If we have a parent in the chain and the current block is 
not data,
+                 * then we want to check the allocation state of the parent 
block.
+                 * If it has a valid offset, then we want to include it into
+                 * the calculation, cause blocks with an offset will not be 
freed when
+                 * committing the top into base with discard-no-unref enabled.
+                 */
+                if (parent && s->discard_no_unref && !(ret & BDRV_BLOCK_DATA)) 
{
+                        int64_t pnum_parent = 0;
+                        retp = bdrv_block_status_above(parent, NULL, offset,
+                                              ssize - offset, &pnum_parent, 
NULL,
+                                              NULL);
+                        /* If the parent continuous block is smaller, use that 
pnum,
+                         * so the next iteration starts with the smallest 
offset.
+                         */
+                        if (pnum_parent < pnum) {
+                            pnum = pnum_parent;
+                        }
+                }
+                if (ret & BDRV_BLOCK_ZERO && !parent && !(parent && 
s->discard_no_unref)) {
                      /* Skip zero regions (safe with no backing file) */
-                } else if ((ret & (BDRV_BLOCK_DATA | BDRV_BLOCK_ALLOCATED)) ==
-                           (BDRV_BLOCK_DATA | BDRV_BLOCK_ALLOCATED)) {
+                } else if (((ret & (BDRV_BLOCK_DATA | BDRV_BLOCK_ALLOCATED)) ==
+                            (BDRV_BLOCK_DATA | BDRV_BLOCK_ALLOCATED)) ||
+                           (retp & BDRV_BLOCK_OFFSET_VALID)) {
                      /* Extend pnum to end of cluster for next iteration */
                      pnum = ROUND_UP(offset + pnum, cluster_size) - offset;




reply via email to

[Prev in Thread] Current Thread [Next in Thread]