
Re: [PATCH v2 6/7] block/block-copy: reduce intersecting request lock


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [PATCH v2 6/7] block/block-copy: reduce intersecting request lock
Date: Thu, 20 Feb 2020 10:21:57 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1

17.02.2020 16:38, Max Reitz wrote:
On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:
Currently, the block_copy operation locks the whole requested region. But
there is no reason to lock clusters that are already copied: it only
disturbs other parallel block_copy requests.

Let's instead do the following:

Lock only the sub-region we are actually going to operate on. Then, after
copying all dirty sub-regions, wait for intersecting block-copy requests;
if they failed, retry these newly dirty clusters.

Just a thought spoken aloud:

I would expect the number of intersecting CBW requests to be low in
general, so I don’t know how useful this change is in practice.  OTOH,
it makes block_copy call the existing implementation in a loop, which
seems just worse.

But then again, in the common case, block_copy_dirty_clusters() won’t
copy anything because it’s all been copied already, so there is no
change; and even if something is copied, the second call will just
re-check the dirty bitmap to see that the area’s clean (which will be
quick compared to the copy operation).  So there’s probably nothing to
worry about.

Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
---
  block/block-copy.c | 116 +++++++++++++++++++++++++++++++++++++--------
  1 file changed, 95 insertions(+), 21 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 20068cd699..aca44b13fb 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -39,29 +39,62 @@ static BlockCopyInFlightReq *block_copy_find_inflight_req(BlockCopyState *s,
      return NULL;
  }
-static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
-                                                       int64_t offset,
-                                                       int64_t bytes)
+/*
+ * If there are no intersecting requests return false. Otherwise, wait for the
+ * first found intersecting request to finish and return true.
+ */
+static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t start,
+                                             int64_t end)

s/end/bytes/?

(And maybe s/start/offset/, too)

  {
-    BlockCopyInFlightReq *req;
+    BlockCopyInFlightReq *req = block_copy_find_inflight_req(s, start, end);
-    while ((req = block_copy_find_inflight_req(s, offset, bytes))) {
-        qemu_co_queue_wait(&req->wait_queue, NULL);
+    if (!req) {
+        return false;
      }
+
+    qemu_co_queue_wait(&req->wait_queue, NULL);
+
+    return true;
  }
+/* Called only on full-dirty region */
  static void block_copy_inflight_req_begin(BlockCopyState *s,
                                            BlockCopyInFlightReq *req,
                                            int64_t offset, int64_t bytes)
  {
+    assert(!block_copy_find_inflight_req(s, offset, bytes));
+
+    bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
+
      req->offset = offset;
      req->bytes = bytes;
      qemu_co_queue_init(&req->wait_queue);
      QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
  }
-static void coroutine_fn block_copy_inflight_req_end(BlockCopyInFlightReq *req)
+static void coroutine_fn block_copy_inflight_req_shrink(BlockCopyState *s,
+        BlockCopyInFlightReq *req, int64_t new_bytes)

It took me a while to understand that this operation drops the tail
of the request.  I think there should be a comment on this.

(I thought it would successively drop the head after each copy, and so I
was wondering why the code didn’t match that.)
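
Maybe something along these lines (just a suggestion, matching what the
code below actually does):

    /*
     * block_copy_inflight_req_shrink
     *
     * Drop the tail of the request: mark the dropped area
     * [offset + new_bytes, offset + req->bytes) dirty again and wake up
     * waiters, as the request no longer covers it.
     */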

  {
+    if (new_bytes == req->bytes) {
+        return;
+    }
+
+    assert(new_bytes > 0 && new_bytes < req->bytes);
+
+    bdrv_set_dirty_bitmap(s->copy_bitmap,
+                          req->offset + new_bytes, req->bytes - new_bytes);
+
+    req->bytes = new_bytes;
+    qemu_co_queue_restart_all(&req->wait_queue);
+}
+
+static void coroutine_fn block_copy_inflight_req_end(BlockCopyState *s,
+                                                     BlockCopyInFlightReq *req,
+                                                     int ret)
+{
+    if (ret < 0) {
+        bdrv_set_dirty_bitmap(s->copy_bitmap, req->offset, req->bytes);
+    }
      QLIST_REMOVE(req, list);
      qemu_co_queue_restart_all(&req->wait_queue);
  }
@@ -344,12 +377,19 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
      return ret;
  }
-int coroutine_fn block_copy(BlockCopyState *s,
-                            int64_t offset, uint64_t bytes,
-                            bool *error_is_read)
+/*
+ * block_copy_dirty_clusters
+ *
+ * Copy dirty clusters in @start/@bytes range.
+ * Returns 1 if dirty clusters found and successfully copied, 0 if no dirty
+ * clusters found and -errno on failure.
+ */
+static int coroutine_fn block_copy_dirty_clusters(BlockCopyState *s,
+                                                  int64_t offset, int64_t bytes,
+                                                  bool *error_is_read)
  {
      int ret = 0;
-    BlockCopyInFlightReq req;
+    bool found_dirty = false;
/*
       * block_copy() user is responsible for keeping source and target in same
@@ -361,10 +401,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
      assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
      assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
-    block_copy_wait_inflight_reqs(s, offset, bytes);
-    block_copy_inflight_req_begin(s, &req, offset, bytes);
-
      while (bytes) {
+        BlockCopyInFlightReq req;
          int64_t next_zero, cur_bytes, status_bytes;
        if (!bdrv_dirty_bitmap_get(s->copy_bitmap, offset)) {
@@ -374,6 +412,8 @@ int coroutine_fn block_copy(BlockCopyState *s,
              continue; /* already copied */
          }
+        found_dirty = true;
+
          cur_bytes = MIN(bytes, s->copy_size);
        next_zero = bdrv_dirty_bitmap_next_zero(s->copy_bitmap, offset,
@@ -383,10 +423,12 @@ int coroutine_fn block_copy(BlockCopyState *s,
              assert(next_zero < offset + cur_bytes); /* no need to do MIN() */
              cur_bytes = next_zero - offset;
          }
+        block_copy_inflight_req_begin(s, &req, offset, cur_bytes);
        ret = block_copy_block_status(s, offset, cur_bytes, &status_bytes);
+        block_copy_inflight_req_shrink(s, &req, status_bytes);

block_copy_inflight_req_shrink() asserts that status_bytes <= cur_bytes.
That isn’t necessarily correct, as block_copy_block_status() rounds up
on the last cluster.  So this should use the same MIN() as for the
cur_bytes update after the next block.

Would it make sense to move the block_copy_inflight_req_shrink() there
and pass the updated cur_bytes to it?
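
I.e. (just sketching the suggestion):

    ret = block_copy_block_status(s, offset, cur_bytes, &status_bytes);
    ...
    cur_bytes = MIN(cur_bytes, status_bytes);
    block_copy_inflight_req_shrink(s, &req, cur_bytes);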

          if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
-            bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, status_bytes);
+            block_copy_inflight_req_end(s, &req, 0);
              s->progress_reset_callback(s->progress_opaque);
              trace_block_copy_skip_range(s, offset, status_bytes);
              offset += status_bytes;
@@ -398,15 +440,13 @@ int coroutine_fn block_copy(BlockCopyState *s,
        trace_block_copy_process(s, offset);

-        bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
-
          co_get_from_shres(s->mem, cur_bytes);
          ret = block_copy_do_copy(s, offset, cur_bytes, ret & BDRV_BLOCK_ZERO,
                                   error_is_read);
          co_put_to_shres(s->mem, cur_bytes);
+        block_copy_inflight_req_end(s, &req, ret);
          if (ret < 0) {
-            bdrv_set_dirty_bitmap(s->copy_bitmap, offset, cur_bytes);
-            break;
+            return ret;
          }
        s->progress_bytes_callback(cur_bytes, s->progress_opaque);
@@ -414,7 +454,41 @@ int coroutine_fn block_copy(BlockCopyState *s,
          bytes -= cur_bytes;
      }
-    block_copy_inflight_req_end(&req);
+    return found_dirty;
+}
-    return ret;
+int coroutine_fn block_copy(BlockCopyState *s, int64_t start, uint64_t bytes,
+                            bool *error_is_read)
+{
+    while (true) {
+        int ret = block_copy_dirty_clusters(s, start, bytes, error_is_read);
+
+        if (ret < 0) {
+            /*
+             * IO operation failed, which means the whole block_copy request
+             * failed.
+             */
+            return ret;
+        }
+        if (ret) {
+            /*
+             * Something was copied, which means that there were yield points
+             * and some new dirty bits may appered (due to failed parallel

s/appered/have appeared/

+             * block-copy requests).
+             */
+            continue;
+        }
+
+        /*
+         * Here ret == 0, which means that there is no dirty clusters in
+         * requested region.
+         */
+
+        if (!block_copy_wait_one(s, start, bytes)) {
+            /* No dirty bits and nothing to wait: the whole request is done */

Wouldn’t it make more sense to keep block_copy_wait_one() a loop (i.e.,
keep it as block_copy_wait_inflight_reqs()) that returns whether it
waited or not?  Because I suppose if we had to wait for anything, we
might as well wait for everything in the range.

No, we don't need to wait for all of them. If some dirty bits appeared, we
can start copying them immediately instead of waiting for the other
requests: after the wait, the loop re-runs block_copy_dirty_clusters(),
which picks up whatever became dirty.
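
(A wait-for-all variant would be just

    while (block_copy_wait_one(s, start, bytes)) {
        /* wait for the next intersecting request, if any */
    }

but the outer loop re-checks the dirty bitmap after every wakeup anyway, so
the extra waiting would only delay copying newly dirtied clusters.)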


+            break;
+        }
+    }

Continuing my loud thought from the beginning, I would have written this
as a tail-recursive function to stress that this isn’t really a
(potentially expensive) loop but more of a re-check to be sure.

(i.e.

int ret = block_copy_dirty...();
if (ret < 0) {
     return ret;
}

if (ret || block_copy_wait_one()) {
     /* Something might have changed, re-check */
     return block_copy();
}

/* Done */
return 0;
)

But who cares.


Hm, I'd prefer to avoid recursion. But combining the two if statements
seems good anyway.
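
Something like this, perhaps (a sketch with the same names as in the patch):

    while (true) {
        int ret = block_copy_dirty_clusters(s, start, bytes, error_is_read);

        if (ret < 0) {
            /* IO operation failed: the whole block_copy request failed */
            return ret;
        }

        /*
         * If we copied something, new dirty bits may have appeared while
         * we yielded; if we copied nothing but waited for an intersecting
         * request, it may have failed and re-dirtied our range.  Either
         * way, re-check.
         */
        if (ret == 0 && !block_copy_wait_one(s, start, bytes)) {
            /* No dirty bits and nothing to wait for: done */
            return 0;
        }
    }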


With this patch I have two aims:

1. Maybe you are right and we don't have many intersecting requests.
Actually, the only case is the intersection of a guest write with a
block-job block_copy call, as two job requests cannot intersect and
intersecting guest writes are not a normal thing. But anyway, it should
work better for intersecting requests, a bit more interactively.

2. As block-status handling and aio-tasking should be done here in
block-copy, there is no reason to keep a loop in backup.c. So at the end
of the original series only one block_copy(0, disk_size) call is left in
backup.c. And that definitely needs this patch. So, it's an architectural
patch, which makes block_copy a more universal thing.
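
I.e. the backup job's copy path collapses to roughly (just a sketch; the
exact names in the final series may differ):

    /* hypothetical shape of the backup copy loop at the end of the series */
    static int coroutine_fn backup_loop(BackupBlockJob *job)
    {
        bool error_is_read;

        return block_copy(job->bcs, 0, job->len, &error_is_read);
    }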

--
Best regards,
Vladimir


