
From: Vladimir Sementsov-Ogievskiy
Subject: Re: [PATCH v2 2/8] nbd: allow reconnect on open, with corresponding new options
Date: Fri, 22 Jan 2021 13:56:48 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.6.1

21.01.2021 04:44, Eric Blake wrote:
On 11/30/20 7:40 AM, Vladimir Sementsov-Ogievskiy wrote:
Note: currently, using the new option with a long timeout in the QMP command
blockdev-add is not a good idea, as the QMP interface is blocking; so
don't add it now, let's add it later, after the
"monitor: Optionally run handlers in coroutines" series is merged.

If I'm not mistaken, that landed as of eb94b81a94.  Is it just the
commit message that needs an update, or does this patch need a respin?

Oh yes, you are right. I think the most reasonable thing is to keep this patch
separate (for simple backporting to downstream without Kevin's series), and
add QMP support for the feature as an additional new patch. Will do it on respin.



Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
  block/nbd.c | 115 +++++++++++++++++++++++++++++++++++++++++-----------
  1 file changed, 92 insertions(+), 23 deletions(-)


@@ -474,6 +484,11 @@ nbd_co_establish_connection(BlockDriverState *bs, Error **errp)
      s->wait_connect = true;
      qemu_coroutine_yield();
+    if (!s->connect_thread) {
+        error_setg(errp, "Connection attempt cancelled by other operation");
+        return NULL;
+    }

Does this need to use atomics for proper access to s->connect_thread
across threads?  Or are all the operations done by other coroutines but
within the same thread, so we are safe?

s->connect_thread is not accessed from connect_thread_func, so in that respect we
are safe. And the variables that are shared between connect_thread_func and the rest
of the driver code are protected by a mutex.

What about accessing the nbd bds from different threads.. In my observation, all the
code is written under the assumption that everything inside a block driver may be
called from different coroutines, but all within one thread.. And we have a lot of s->*
variables that are not atomic and not protected by mutexes, and all this somehow
works :)

I remember Paolo answered me somewhere on the mailing list that, actually,
everything in block drivers and block/io must be thread-safe.. But I don't see
this thread-safety in the current code, so I don't introduce it for the new variables.
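As a standalone illustration of the pattern described above (this is not QEMU code; all names here are hypothetical), the fields shared between a connect thread and the driver code can be confined to a mutex like this:

```c
/* Sketch: a connect thread and the caller share a small state struct,
 * and every shared field is read or written only under a mutex. */
#include <pthread.h>
#include <stdbool.h>

typedef struct ConnectState {
    pthread_mutex_t lock;
    bool done;  /* shared: set by the connect thread when it finishes */
    int ret;    /* shared: result of the connection attempt */
} ConnectState;

static void *connect_thread_fn(void *opaque)
{
    ConnectState *st = opaque;

    /* ... the (possibly blocking) connect would happen here ... */

    pthread_mutex_lock(&st->lock);
    st->done = true;
    st->ret = 0;    /* pretend the connect succeeded */
    pthread_mutex_unlock(&st->lock);
    return NULL;
}

int connect_and_wait(void)
{
    ConnectState st = { .lock = PTHREAD_MUTEX_INITIALIZER };
    pthread_t th;
    int ret;

    pthread_create(&th, NULL, connect_thread_fn, &st);
    pthread_join(th, NULL);

    pthread_mutex_lock(&st.lock);
    ret = st.done ? st.ret : -1;
    pthread_mutex_unlock(&st.lock);
    return ret;
}
```

The point is only that fields touched from both threads go under the lock; fields used by a single thread (like s->connect_thread here) need no protection.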



@@ -624,10 +645,15 @@ static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
      bdrv_inc_in_flight(s->bs);
out:
-    s->connect_status = ret;
-    error_free(s->connect_err);
-    s->connect_err = NULL;
-    error_propagate(&s->connect_err, local_err);
+    if (s->connect_status == -ETIMEDOUT) {
+        /* Don't rewrite timeout error by following cancel-provoked error */

Maybe:

/* Don't propagate a timeout error caused by a job cancellation. */

No, we want to keep the ETIMEDOUT error



+static void open_timer_cb(void *opaque)
+{
+    BDRVNBDState *s = opaque;
+
+    if (!s->connect_status) {
+        /* First attempt was not finished. We should set an error */
+        s->connect_status = -ETIMEDOUT;
+        error_setg(&s->connect_err, "First connection attempt is cancelled by "
+                   "timeout");
+    }
+
+    nbd_teardown_connection_async(s->bs);
+    open_timer_del(s);
+}
+
+static void open_timer_init(BDRVNBDState *s, uint64_t expire_time_ns)
+{
+    assert(!s->open_timer && s->state == NBD_CLIENT_OPENING);
+    s->open_timer = aio_timer_new(bdrv_get_aio_context(s->bs),
+                                  QEMU_CLOCK_REALTIME,
+                                  SCALE_NS,
+                                  open_timer_cb, s);
+    timer_mod(s->open_timer, expire_time_ns);
+}
+


@@ -2180,6 +2235,14 @@ static QemuOptsList nbd_runtime_opts = {
                      "future requests before a successful reconnect will "
                      "immediately fail. Default 0",
          },
+        {
+            .name = "open-timeout",
+            .type = QEMU_OPT_NUMBER,
+            .help = "In seconds. If zero, nbd driver tries to establish "
+                    "connection only once, on fail open fails. If non-zero, "

If zero, the nbd driver tries the connection only once, and fails to
open if the connection fails.

+                    "nbd driver may do several attempts until success or "
+                    "@open-timeout seconds passed. Default 0",

If non-zero, the nbd driver will repeat connection attempts until
successful or until @open-timeout seconds have elapsed.

+        },

Where is the QMP counterpart for setting this option?

Absent (as described in commit msg). Will do in a separate patch.


          { /* end of list */ }
      },
  };
@@ -2235,6 +2298,7 @@ static int nbd_process_options(BlockDriverState *bs, QDict *options,
      }
 
     s->reconnect_delay = qemu_opt_get_number(opts, "reconnect-delay", 0);
+    s->open_timeout = qemu_opt_get_number(opts, "open-timeout", 0);
 
     ret = 0;
@@ -2268,6 +2332,11 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
      bdrv_inc_in_flight(bs);
      aio_co_schedule(bdrv_get_aio_context(bs), s->connection_co);
+    if (s->open_timeout) {
+        open_timer_init(s, qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
+                        s->open_timeout * NANOSECONDS_PER_SECOND);
+    }
+
      if (qemu_in_coroutine()) {
          s->open_co = qemu_coroutine_self();
          qemu_coroutine_yield();




--
Best regards,
Vladimir


