Re: [Qemu-block] [PATCH v2 3/8] nbd: BLOCK_STATUS for standard get

From:	Eric Blake
Subject:	Re: [Qemu-block] [PATCH v2 3/8] nbd: BLOCK_STATUS for standard get_block_status function: server part
Date:	Tue, 13 Mar 2018 08:47:17 -0500
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0

On 03/12/2018 10:21 AM, Vladimir Sementsov-Ogievskiy wrote:

Minimal realization: only one extent in server answer is supported.

Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
---

v2: - constants and type defs were split out by Eric, except for
     NBD_META_ID_BASE_ALLOCATION

The constant for NBD_META_ID_BASE_ALLOCATION was intentionally not split out; it is the only constant that is relevant only to the server side ;) In fact,...

     - add nbd_opt_skip, to skip meta query remainder, if we are already sure,
     that the query selects nothing
     - check meta export name in OPT_EXPORT_NAME and OPT_GO
     - always set context_id = 0 for NBD_OPT_LIST_META_CONTEXT
     - negotiation rewritten to avoid wasting time and memory on reading long,
     obviously invalid meta queries
     - fixed ERR_INVALID->ERR_UNKNOWN if export not found in
     nbd_negotiate_meta_queries
     - check client->export_meta.valid in "case NBD_CMD_BLOCK_STATUS"


  include/block/nbd.h |   2 +
  nbd/server.c        | 310 ++++++++++++++++++++++++++++++++++++++++++++++++++++
  2 files changed, 312 insertions(+)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 2285637e67..9f2be18186 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -188,6 +188,8 @@ typedef struct NBDExtent {
  #define NBD_CMD_FLAG_REQ_ONE    (1 << 3) /* only one extent in BLOCK_STATUS
                                            * reply chunk */

+#define NBD_META_ID_BASE_ALLOCATION 0


...I will be squashing in a change to move it out of the .h and into the .c.

  /* Supported request types */
  enum {
      NBD_CMD_READ = 0,
diff --git a/nbd/server.c b/nbd/server.c
index 085e14afbf..16d7388085 100644
--- a/nbd/server.c

@@ -371,6 +396,12 @@ static int nbd_negotiate_handle_list(NBDClient *client, Error **errp)
      return nbd_negotiate_send_rep(client, NBD_REP_ACK, errp);
  }

+static void nbd_check_meta_export_name(NBDClient *client)
+{
+    client->export_meta.valid = client->export_meta.valid &&
+        strcmp(client->exp->name, client->export_meta.export_name) == 0;

The indentation makes this harder for me to parse (at first glance, I thought you had (a) && (b), and were either missing a side effect or return statement). It's a lot more obvious what you are doing with:


client->export_meta.valid &= !strcmp(client->exp->name,
                                     client->export_meta.export_name);
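
A minimal sketch of why the two spellings agree, assuming the flag is a C99 bool. The struct and function names below are hypothetical stand-ins, not the QEMU definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the relevant part of the client state. */
struct meta_sketch {
    bool valid;
    const char *export_name;
};

/* Form used in the patch: logical AND with an explicit comparison. */
static void check_name_and(struct meta_sketch *m, const char *name)
{
    assert(m != NULL && name != NULL);
    m->valid = m->valid && strcmp(name, m->export_name) == 0;
}

/* Suggested form: !strcmp() is 0 or 1, so &= can only clear the flag. */
static void check_name_andeq(struct meta_sketch *m, const char *name)
{
    assert(m != NULL && name != NULL);
    m->valid &= !strcmp(name, m->export_name);
}
```

One behavioral difference in this sketch: the && form skips strcmp() when the flag is already false, while the &= form always calls it. That only matters if export_name might not be a valid string at that point.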

+/* nbd_meta_base_query
+ *
+ * Handle query to 'base' namespace. For now, only base:allocation context is
+ * available in it.
+ *
+ * Return -errno on I/O error, 0 if option was completely handled by
+ * sending a reply about inconsistent lengths, or 1 on success. */
+static int nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
+                               uint32_t len, Error **errp)

The comments don't describe what 'len' represents, I had to go read the call-site before I could understand this function. If I understand correctly, this function is called at the point that we have parsed "base:" out of the longer overall name given to LIST or SET, and len is the remaining length of the overall name that still needs parsing.

+{
+    int ret;
+    char query[sizeof("allocation") - 1];

Why discard the trailing NUL from the size? It doesn't hurt to leave it in, unless...

+    size_t alen = strlen("allocation");

...Better than strlen() would be sizeof(query), as long as the trailing NUL is not changing the size of the array.
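
To make the size arithmetic concrete, here is a small sketch, assuming the array is sized without the terminator as in the patch. LITERAL_LEN and is_allocation are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Length of a string literal without its trailing NUL, known at compile time. */
#define LITERAL_LEN(s) (sizeof(s) - 1)

/*
 * Hypothetical helper: does the wire data (wire_len bytes, not
 * NUL-terminated) spell exactly "allocation"?
 */
static bool is_allocation(const char *wire, size_t wire_len)
{
    char query[LITERAL_LEN("allocation")];   /* 10 bytes, no room for a NUL */

    /* With the NUL dropped, sizeof(query) and strlen() agree. */
    assert(sizeof(query) == strlen("allocation"));

    if (wire_len != sizeof(query)) {
        return false;
    }
    memcpy(query, wire, sizeof(query));

    /* query has no terminator, so only a bounded comparison is safe. */
    return memcmp(query, "allocation", sizeof(query)) == 0;
}
```

If the array kept the trailing NUL, sizeof(query) would be 11 and could no longer double as the expected length on the wire.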

+
+    if (len == 0) {
+        if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
+            meta->base_allocation = true;
+        }
+        return 1;
+    }

Okay, so here, the user requested "base:"; on list we return all contexts that we know (base:allocation); on set we fall through.

+
+    if (len != alen) {
+        return nbd_opt_skip(client, len, errp);
+    }

Here, the user requested "base:garbage", where the garbage (including empty string on set) is a different length than "base:allocation". It may be a valid string for a future NBD version, but for us, we know right away it is not something we recognize, so we gracefully skip it.

Checking myself: if nbd_opt_skip returned -1, we have detected communication problems with the client; it does not matter if there is any unparsed data remaining in the current option. It can only return 0 if nbd_opt_invalid has already finished parsing the entire option (we are ready to parse the next NBD_OPT command, no further queries in the current option matter). It can only return 1 if we finished parsing the current query, and are positioned ready to parse the next query. [1]

+
+    ret = nbd_opt_read(client, query, len, errp);
+    if (ret <= 0) {
+        return ret;
+    }
+
+    if (strncmp(query, "allocation", alen) == 0) {

Here, you HAD to use strncmp because you didn't leave room for the trailing NUL in the array above. Tradeoffs. So I guess your approach is okay.

+        meta->base_allocation = true;
+    }
+
+    return 1;

And if we get here, the user requested exactly "base:allocation", so we enabled exactly that context.
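
The cases walked through above can be summarized in a standalone sketch. It assumes the "base:" prefix has already been consumed and leaves out all network I/O; opt_kind and base_query_selects_allocation are hypothetical names, not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum opt_kind { OPT_LIST, OPT_SET };   /* stand-ins for the two NBD_OPT values */

/*
 * rest/len: the part of the query after "base:" (not NUL-terminated).
 * Returns true if base:allocation ends up selected by this query.
 */
static bool base_query_selects_allocation(enum opt_kind opt,
                                          const char *rest, size_t len)
{
    static const char ctx[] = "allocation";
    const size_t alen = sizeof(ctx) - 1;

    assert(rest != NULL || len == 0);

    if (len == 0) {
        /* Bare "base:": LIST enumerates the namespace, SET selects nothing. */
        return opt == OPT_LIST;
    }
    if (len != alen) {
        /* Wrong length cannot match, so skip without inspecting the bytes. */
        return false;
    }
    /* Same length: selected only if the bytes really are "allocation". */
    return memcmp(rest, ctx, alen) == 0;
}
```

Note the last case: a name of the right length that is not "allocation" also reaches the final return without selecting anything.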

+}
+
+/* nbd_negotiate_meta_query
+ *
+ * Parse namespace name and call corresponding function to parse body of the
+ * query.
+ *
+ * The only supported namespace now is 'base'.
+ *
+ * The function aims not wasting time and memory to read long unknown namespace
+ * names.
+ *
+ * Return -errno on I/O error, 0 if option was completely handled by
+ * sending a reply about inconsistent lengths, or 1 on success. */
+static int nbd_negotiate_meta_query(NBDClient *client,
+                                    NBDExportMetaContexts *meta, Error **errp)
+{
+    int ret;
+    char query[sizeof("base:") - 1];
+    size_t baselen = strlen("base:");

And since this matches the approach in the previous function, we'll keep it consistent.

+    uint32_t len;
+
+    ret = nbd_opt_read(client, &len, sizeof(len), errp);
+    if (ret <= 0) {
+        return ret;
+    }
+    cpu_to_be32s(&len);
+
+    /* The only supported namespace for now is 'base'. So query should start
+     * with 'base:'. Otherwise, we can ignore it and skip the remainder. */
+    if (len < baselen) {
+        return nbd_opt_skip(client, len, errp);
+    }
+
+    len -= baselen;
+    ret = nbd_opt_read(client, query, baselen, errp);
+    if (ret <= 0) {
+        return ret;
+    }
+    if (strncmp(query, "base:", baselen) != 0) {


Again, strncmp is a bit awkward compared to strcmp, but it works.

+        return nbd_opt_skip(client, len, errp);
+    }
+
+    return nbd_meta_base_query(client, meta, len, errp);
+}
+
+/* nbd_negotiate_meta_queries
+ * Handle NBD_OPT_LIST_META_CONTEXT and NBD_OPT_SET_META_CONTEXT
+ *
+ * @meta may be NULL, if caller isn't interested in selected contexts (for
+ *     NBD_OPT_LIST_META_CONTEXT)
+ *
+ * Return -errno on I/O error, 0 if option was completely handled by
+ * sending a reply about inconsistent lengths, or 1 on success. */

Comment is wrong - this function never returns 1 (nor should it, as nbd_negotiate_options() expects a return of 1 only from NBD_OPT_ABORT).

+static int nbd_negotiate_meta_queries(NBDClient *client,
+                                      NBDExportMetaContexts *meta, Error **errp)
+{
+    int ret;
+    NBDExport *exp;
+    NBDExportMetaContexts local_meta;
+    uint32_t nb_queries;
+    int i;
+
+    assert(client->structured_reply);

Perhaps worth a comment that this is safe because we already filtered it out at the caller.

+
+    if (!meta) {
+        meta = &local_meta;
+    }
+
+    memset(meta, 0, sizeof(*meta));
+
+    ret = nbd_opt_read_name(client, meta->export_name, NULL, errp);
+    if (ret <= 0) {
+        return ret;
+    }
+
+    exp = nbd_export_find(meta->export_name);
+    if (exp == NULL) {
+        return nbd_opt_drop(client, NBD_REP_ERR_UNKNOWN, errp,
+                            "export '%s' not present", meta->export_name);
+    }
+


It's nice to see my review comments from v1 fixed here ;)

+    ret = nbd_opt_read(client, &nb_queries, sizeof(nb_queries), errp);
+    if (ret <= 0) {
+        return ret;
+    }
+    cpu_to_be32s(&nb_queries);
+
+    for (i = 0; i < nb_queries; ++i) {
+        ret = nbd_negotiate_meta_query(client, meta, errp);
+        if (ret <= 0) {
+            return ret;

[1] Okay, I've convinced myself we are good. We can only early return from this loop if we encountered a disconnect (result -1, either read or write to client failed, no further communication is worth trying, so it doesn't matter if we are left mid-option) or if we encountered an inconsistent length and already replied successfully to the client about their bogus request (result 0, we've already skipped to the end of the current option, ready to parse the next NBD_OPT).

+        }
+    }

Missing: On LIST, if nb_queries is 0 before the loop, then we must reply with ALL supported contexts, rather than none (the behavior for SET is correct, though).
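
One possible shape for that missing rule, as a sketch only. It is not the change that will be squashed in; the types and the function name are hypothetical, and it assumes base:allocation is the only context the server knows:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum opt_kind { OPT_LIST, OPT_SET };   /* stand-ins for the two NBD_OPT values */

/* Hypothetical reduction of the per-export context selection. */
struct contexts_sketch {
    bool base_allocation;
};

/*
 * Apply the zero-query rule: LIST with no queries means "everything the
 * server supports"; SET with no queries selects nothing.
 */
static void apply_empty_query_rule(enum opt_kind opt, uint32_t nb_queries,
                                   struct contexts_sketch *meta)
{
    assert(meta != NULL);
    if (opt == OPT_LIST && nb_queries == 0) {
        meta->base_allocation = true;
    }
}
```

With one or more queries, the selection is left exactly as the per-query parsing produced it.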

+
+    if (meta->base_allocation) {
+        ret = nbd_negotiate_send_meta_context(client, "base:allocation",
+                                              NBD_META_ID_BASE_ALLOCATION,
+                                              errp);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    ret = nbd_negotiate_send_rep(client, NBD_REP_ACK, errp);
+    if (ret == 0) {
+        meta->valid = true;
+    }
+
+    return ret;

Code is correct - all early returns and this final return are negative or 0, where 0 means we parsed the entire NBD_OPT, gave a reply, and the connection is ready for the next NBD_OPT.

+}
+
  /* nbd_negotiate_options
   * Process all NBD_OPT_* client option commands, during fixed newstyle
   * negotiation.
@@ -856,6 +1064,22 @@ static int nbd_negotiate_options(NBDClient *client, uint16_t myflags,
                  }
                  break;

+            case NBD_OPT_LIST_META_CONTEXT:
+            case NBD_OPT_SET_META_CONTEXT:
+                if (!client->structured_reply) {
+                    ret = nbd_opt_invalid(
+                            client, errp,
+                            "request option '%s' when structured reply "
+                            "is not negotiated", nbd_opt_lookup(option));
+                } else if (option == NBD_OPT_LIST_META_CONTEXT) {
+                    ret = nbd_negotiate_meta_queries(client, NULL, errp);
+                } else {
+                    ret = nbd_negotiate_meta_queries(client,
+                                                     &client->export_meta,
+                                                     errp);
+                }


Looks good.

If we WANTED to split this patch into two, then part 1 would be NBD_OPT handling (where we just always return 0 contexts in reply to LIST or SET), and part 2 would be NBD_CMD_BLOCK_STATUS handling plus enabling base:allocation advertisement during NBD_OPT handling. But I'm not going to ask for a split now.

+                break;
+
              default:
                  ret = nbd_opt_drop(client, NBD_REP_ERR_UNSUP, errp,
                                     "Unsupported option %" PRIu32 " (%s)",
@@ -1485,6 +1709,79 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
      return ret;
  }

+static int blockstatus_to_extent_be(BlockDriverState *bs, uint64_t offset,
+                                    uint64_t bytes, NBDExtent *extent)
+{
+    uint64_t remaining_bytes = bytes;
+
+    while (remaining_bytes) {
+        uint32_t flags;
+        int64_t num;
+        int ret = bdrv_block_status_above(bs, NULL, offset, remaining_bytes,
+                                          &num, NULL, NULL);
+        if (ret < 0) {
+            return ret;
+        }
+
+        flags = (ret & BDRV_BLOCK_ALLOCATED ? 0 : NBD_STATE_HOLE) |
+                (ret & BDRV_BLOCK_ZERO      ? NBD_STATE_ZERO : 0);

I still need to fix what block_status_above returns for protocol drivers per Kevin's review of my byte-based status patches, but that will be during soft freeze (as it is in the bug fix category); it may have a minor impact to this code. But it shouldn't hold up this series.
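
For reference, the flag mapping in the quoted hunk can be exercised in isolation. The SK_* values below are stand-ins chosen for this sketch; the real BDRV_BLOCK_* and NBD_STATE_* definitions live in QEMU's headers:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in constants for this sketch only. */
#define SK_BLOCK_ZERO       0x02        /* block reads as zeroes */
#define SK_BLOCK_ALLOCATED  0x10        /* block is allocated in this layer */
#define SK_STATE_HOLE       (1u << 0)   /* NBD: extent is a hole */
#define SK_STATE_ZERO       (1u << 1)   /* NBD: extent reads as zeroes */

/* Same expression as the patch: "not allocated" becomes HOLE, ZERO maps to ZERO. */
static uint32_t status_to_nbd_flags(int status)
{
    assert(status >= 0);   /* negative values are errors, handled by the caller */
    return ((status & SK_BLOCK_ALLOCATED) ? 0 : SK_STATE_HOLE) |
           ((status & SK_BLOCK_ZERO) ? SK_STATE_ZERO : 0);
}
```

The two bits are independent, so an allocated extent can still be reported as reading zeroes, and an unallocated one is a hole whether or not it reads as zeroes.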

@@ -1562,6 +1859,8 @@ static int nbd_co_receive_request(NBDRequestData *req, NBDRequest *request,
          valid_flags |= NBD_CMD_FLAG_DF;
      } else if (request->type == NBD_CMD_WRITE_ZEROES) {
          valid_flags |= NBD_CMD_FLAG_NO_HOLE;
+    } else if (request->type == NBD_CMD_BLOCK_STATUS) {
+        valid_flags |= NBD_CMD_FLAG_REQ_ONE;
      }
      if (request->flags & ~valid_flags) {
          error_setg(errp, "unsupported flags for command %s (got 0x%x)",
@@ -1690,6 +1989,17 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
          return nbd_send_generic_reply(client, request->handle, ret,
                                        "discard failed", errp);
+    case NBD_CMD_BLOCK_STATUS:
+        if (client->export_meta.valid && client->export_meta.base_allocation) {
+            return nbd_co_send_block_status(client, request->handle,
+                                            blk_bs(exp->blk), request->from,
+                                            request->len,
+                                            NBD_META_ID_BASE_ALLOCATION, errp);

Will obviously be expanded as we add more namespaces (for dirty bitmap queries), but works for your first cut of just reporting block status.

+        } else {
+            return nbd_send_generic_reply(client, request->handle, -EINVAL,
+                                          "CMD_BLOCK_STATUS not negotiated",
+                                          errp);
+        }
      default:
          msg = g_strdup_printf("invalid request type (%" PRIu32 ") received",
                                request->type);

I'm making tweaks as mentioned above, but this is close enough to get into softfreeze.


Reviewed-by: Eric Blake <address@hidden>

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
