qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH 3/9] nbd: BLOCK_STATUS for standard get_block_st


From: Eric Blake
Subject: Re: [Qemu-block] [PATCH 3/9] nbd: BLOCK_STATUS for standard get_block_status function: server part
Date: Fri, 16 Feb 2018 11:01:00 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0

On 02/16/2018 08:43 AM, Vladimir Sementsov-Ogievskiy wrote:
16.02.2018 16:21, Eric Blake wrote:
On 02/15/2018 07:51 AM, Vladimir Sementsov-Ogievskiy wrote:
Minimal realization: only one extent in server answer is supported.

Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
---

Again, function comments are useful.

+    if (query[0] == '\0' || strcmp(query, "allocation") == 0) {
+        /* Note: empty query should select all contexts within base
+         * namespace. */
+        meta->base_allocation = true;

From the client perspective, this handling of the empty leaf-name works well for NBD_OPT_LIST_META_CONTEXT (I want to see what leaf nodes the server supports), but not so well for NBD_OPT_SET_META_CONTEXT (asking the server to send ALL base allocations, even when I don't necessarily know how to interpret anything other than base:allocation, is a waste). So this function needs a bool parameter that says whether it is being invoked from _LIST (empty string okay, to advertise ALL base leaf names back to client, which for now is just base:allocation) or from _SET (empty string is ignored as invalid; client has to specifically ask for base:allocation by name).

"empty string is ignored as invalid", hm, do we have this in spec? I think SET and LIST must select exactly same sets of contexts.

No, it is specifically NOT intended that SET and LIST have to produce the same set of contexts; although I do see your point that as currently written, it does not appear to require SET to ignore "base:" or to treat it as an error. At any rate, the spec gives the example of:

In either case, however, for any given namespace the server MAY, instead of 
exhaustively listing every matching context available to select (or every 
context available to select where no query is given), send sufficient context 
records back to allow a client with knowledge of the namespace to select any 
context. This may be helpful where a client can construct algorithmic queries. 
For instance, a client might reply simply with the namespace with no leaf-name 
(e.g. 'X-FooBar:') or with a range of values (e.g. 
'X-ModifiedDate:20160310-20161214'). The semantics of such a reply are a matter 
for the definition of the namespace. However each namespace returned MUST begin 
with the relevant namespace, followed by a colon, and then other UTF-8 
characters, with the entire string following the restrictions for strings set 
out earlier in this document.

with the intent being that for some namespaces, it may be easy to perform an algorithmic query via _LIST to see what ranges are supported, but that you cannot select ALL elements in the range simultaneously. The empty query for _LIST exists to enumerate what is supported, but does not have to equate to an empty query for _SET selecting everything possible. I could even see it being possible to have some round-trips, depending on the namespace (of course, any namespace other than "base:" will be tightly coordinated between both client and server, so they understand each other - the point was that the NBD spec didn't want to constrain what a client and server could do as long as they stay within the generic framework):

C> LIST ""
S> REPLY "base:allocation" id 0
S> REPLY "X-FooBar:" id 0
S> ACK
C> LIST "X-FooBar:"
S> REPLY "X-FooBar:A_Required", id 0
S> REPLY "X-FooBar:A_Min=100", id 0
S> REPLY "X-FooBar:A_Max=200", id 0
S> REPLY "X-FooBar:B_Default=300", id 0
S> REPLY "X-FooBar:B_Min=300", id 0
S> REPLY "X-FooBar:B_Max=400", id 0
S> ACK
C> SET "X-FooBar:A=150" "base:allocation"
S> REPLY "X-FooBar:A=150,B=300", id 1
S> REPLY "base:allocation", id 2
S> ACK

where the global query of all available contexts merely lists that X-FooBar: is understood, but that a specific query is needed for more details (to avoid the client having to parse those specifics if it doesn't care about X-FooBar:), and the specific query sets up the algorithmic description (parameter A is required, between 100 and 200; parameter B is optional, between 300 and 400, defaulting to 300), and the final SET gives the actual request (A given a value, B left to its default; but the reply names the entire context rather than repeating the shorter request). So the spec is written to permit something like that for third-party namespaces, while also trying to be very specific about the "base:" context as that is the one that needs the most interoperability.

It is strange behavior of client to set "base:", but it is its decision. And I don't thing that it is invalid.

For LIST, querying "base:" makes total sense (for the sake of example, we may add "base:frob" down the road that does something new. Being able to LIST whether "base:" turns into just "base:allocation" or into "base:allocation"+"base:frob" may be useful to a client that understands both formats and wants to probe if the server is new; and even for a client right now, the client can gracefully note that it doesn't want to select "base:frob"). But for SET, it does not (if "base:" turns into "base:allocation" + "base:frob" down the road, then the server is wasting time preparing the response to "base:frob" for every NBD_CMD_BLOCK_STATUS, and the client is wasting time unpacking from the wire and ignoring it), so having the empty query work on LIST but not on SET makes sense.

Formally we may answer with NBD_REP_ERR_TOO_BIG, but it will look weird, as client see that both base: and base:allocation returns _one_ context, but in one case it is too big. But if we will have several base: contextes, server may fairly answer with NBD_REP_ERR_TOO_BIG.

Hmm, you have a point that while a client can ask for "namespace:", the server should always respond with "namespace:leaf" for the actual contexts that it supports/selects, so that the client knows which leaves it actually got, if it does not fail with ERR_TOO_BIG. You are also right that failing with ERR_TOO_BIG for "base:" seems odd, but it may make more sense for other namespaces.


So, I think for now the code is ok.

Then this is probably worth something to bring up on the NBD list, if we need to tweak wording to be more explicit (whether we should allow/forbid wildcards during SET, or if wildcard queries are intended only for LIST). Sounds like I have more spec emails to write to the NBD list.


Also, I don't see NBD_REP_ERR_TOO_BIG possible reply in NBD_OPT_LIST_META_CONTEXT description. Should it be here?

Yeah, that's probably also worth adding to the upstream spec, even though it already encourages LIST results to send compressed information back that allows a client to contruct valid specific queries, rather than an exhaustive list of selecting everything possible.

+/* nbd_negotiate_meta_query
+ * Return -errno on I/O error, 0 if option was completely handled by
+ * sending a reply about inconsistent lengths, or 1 on success. */
+static int nbd_negotiate_meta_query(NBDClient *client,
+                                    NBDExportMetaContexts *meta, Error **errp)
+{
+    int ret;
+    char *query, *colon, *namespace, *subquery;

Is it worth stack-allocating query here, so you don't have to g_free() it later?  NBD limits the maximum string to 4k, which is a little bit big for stack allocation (on an operating system with 4k pages, allocating more than 4k on the stack in one function risks missing the guard page on stack overflow), but we also have the benefit that we KNOW that the set of meta-context namespaces that we support have a much smaller maximum limit of what we even care about.

it is not stack allocated, nbd_alloc_read_size_string calls g_malloc.

Hence my question - do we NEED the malloc'd version, or can we get away with a stack-allocated space? Although I then revised my question...



+
+    ret = nbd_alloc_read_size_string(client, &query, errp);
+    if (ret <= 0) {
+        return ret;
+    }
+
+    colon = strchr(query, ':');
+    if (colon == NULL) {
+        ret = nbd_opt_invalid(client, errp, "no colon in query");
+        goto out;

Hmm, that puts a slight wrinkle into my proposal, or else maybe it is something I should bring up on the NBD list.  If we only read 5 characters (because the max namespace WE support is "base:"), but a client asks for namespace "X-longname:", we should gracefully ignore the client's request; while we still want to reply with an error to a client that asks for "garbage" with no colon at all. The question for the NBD spec, then, is whether detecting bad client requests that didn't use colon is mandatory for the server (meaning we MUST read the entire namespace request, and search for the colon) or merely best effort (we only have to read 5 characters, and if we silently ignore instead of diagnose a bad namespace request that was longer than that, oh well). Worded from the client, it switches to a question of whether the client should expect the server to diagnose all requests, or must be prepared for the server to ignore requests even where those requests are bogus. Or, the NBD spec may change slightly to pass namespace and leafname as separate fields, both with lengths, rather than a colon, to make it easier for the server to skip over an unknown namespace/leaf pair without having to parse whether a colon was present.  I'll send that in a separate email (the upstream NBD list doesn't need to see all my review comments on this thread).

... in light of this thread now on the NBD list.


Thank you for careful review!

No problem. We still have some things to sort out on the NBD list as well, but I want to make sure we get something that is likely to work well with other implementations (I'm also trying, on the side, to get nbdkit to support structured reads so I have something available for testing cross-implementation support, but it is slow going).

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



reply via email to

[Prev in Thread] Current Thread [Next in Thread]