[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH RFC v2 45/47] qapi: New QMP command query-schema
From: |
Markus Armbruster |
Subject: |
Re: [Qemu-devel] [PATCH RFC v2 45/47] qapi: New QMP command query-schema for QMP schema introspection |
Date: |
Wed, 29 Jul 2015 11:19:10 +0200 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) |
Eric Blake <address@hidden> writes:
> On 07/28/2015 08:33 AM, Markus Armbruster wrote:
>> Eric Blake <address@hidden> writes:
>>
>>> On 07/01/2015 02:22 PM, Markus Armbruster wrote:
>>>> Caution, rough edges.
>>>
>>> No joke. It doesn't even compile without this fixup to a rebase snafu
>>> (see [0] below):
>>
>> Uh, how did that happen? I compiled it a million times... (except for
>> the last time, obviously).
>
> At least you're not alone; I've done dumb things like that, too. And at
> least it was easy to figure out, so I could test it.
>
>>>>
>>>> FIXME it can generate awfully long lines
>>>
>>> We already have long lines in generated output, but I agree that finding
>>> ways to break it up might be nice. Actually,
>>
>> This one's different in that the length is due to string literals.
>> indent is happy to break lines for us, but not string literals.
>>
>> Longest line is a bit over 4KiB for me.
>>
>
> If we break up string literals, at least use some indentation to make it
> obvious that multiple lines merge to a single array entry. For example
> (after patch 47):
>
> ...
> "{ 'name': ':abr', 'meta-type': 'object', "
> "'members': [ "
> "{ 'name': 'device', 'type': ':acg', 'default': null }, "
> "{ 'name': 'node-name', 'type': ':acg', 'default': null }, "
> "{ 'name': 'snapshot-file', 'type': ':acg' }, "
> "{ 'name': 'snapshot-node-name', 'type': ':acg', 'default': null
> }, "
> "{ 'name': 'format', 'type': ':acg', 'default': null }, "
> "{ 'name': 'mode', 'type': ':afo', 'default': null } ] }, "
> "{ 'name': ... "
Unconventional indentation, but if it helps the reader...
> Oh, and just thought of something while typing: although patch 47 masks
> type names from the end user, it would be VERY worthwhile if the C code
> had strategic comments that pointed back to the qapi type name:
>
> "{ 'name': ':abr', " /* TypeFoo */
> "'meta-type': 'object'"
> ...
If we do that for uses and definitions, we don't need a flag to generate
nice typenames. But doing it for uses might complicate the code too
much.
> (and then there's the '' vs. "" issue which impacts the output, as
> well). And if you do the "," between entries as a separate line, that
> might impact whether you do a trailing "]" on a line by its own (see [3]
> below).
>
>>>> +
>>>> +{ 'struct': 'SchemaInfoEnum',
>>>> + 'data': { 'values': ['str'] } }
>>>
>>> Do we want to document anything about sort ordering of this list? Is it
>>> worth sorting the array by name, to allow clients to bsearch for whether
>>> a particular enum value is supported, rather than having to linear search?
>>
>> Haven't thought about it. We currently emit them in definition order,
>> which is not sorted. Largest enums:
>>
>> Q_KEY_CODE_MAX = 125,
>> BLKDEBUG_EVENT_MAX = 43,
>> BLOCKDEV_DRIVER_MAX = 28,
>> CHARDEV_BACKEND_KIND_MAX = 19,
>> RUN_STATE_MAX = 15,
>> NET_CLIENT_OPTIONS_KIND_MAX = 12,
>>
>> Most enums are small: median is 3.
>>
>> Would libvirt prefer them sorted?
>
> Libvirt can probably live without sorting of enum constants (searching
> Q_KEY_CODE_MAX isn't going to be that noticeable of an impact on O(n)
> vs. O(logn); I predict much more time will be spent searching for type
> "xyz" referenced by command "abc" from the overall array, and repeating
> that search for multiple feature probes).
>
> But if we do want sorting, we need it up front (it will be easier to
> decide to use bsearch down the road if we are guaranteed that output is
> sorted, than it would be to try and learn whether whether we are talking
> to a new vs. old qemu to learn if sorting is in effect because it was
> added as an after-thought).
I agree adding sorting later is prone to add another instance of rarely
used, bit-rotting backward-compatibility code to libvirt, along with the
logic whether to use it. Best avoided.
> And if we don't want sorting, documenting
> that data is NOT guaranteed to be position-dependent, in spite of being
> in a JSON array, is a nice touch.
What do you mean by "position-dependent"?
>>>> +
>>>> +{ 'struct': 'SchemaInfoObjectVariant',
>>>> + 'data': { 'case': 'str',
>>>> + 'members': [ 'SchemaInfoObjectMember' ] } }
>>>
>>> Would it be simpler to just have:
>>>
>>> 'data': { 'case': 'str', 'type': 'str' }
>>>
>>> and make the user refer recursively to the (possibly-implicit) type for
>>> the members?
>>
>> Hmmm...
>>
>> QAPISchemaObjectTypeVariant has members name, type, flat, and a few more
>> that don't matter here.
>>
>> For a non-flat variant with name=N, type=T, my code creates
>>
>> { 'case': 'N', 'members': [ { 'name': 'data', 'type': 'T' } ] }
>>
>> This means when the tag is 'N', we have a member 'data' of type 'T'.
>>
>> For a flat variant, it creates
>>
>> { 'case': 'N', 'members': [ { ... the members of T ... } ] }
>>
>> This means when the tag is 'N', we have all the members of T.
>>
>> If I understand you correctly, you're proposing
>>
>> { 'case': 'N', 'type': 'T' }
>>
>> to mean when the tag is 'N', we have all the members of T. For the flat
>> variant above, we'd create exactly that.
>>
>> T must not have variants, but the schema doesn't reflect that.
>
> That's our current restriction, but it's one we might decide to lift in
> the future. Having a type with two different discriminators could get a
> bit weird to think about, but doesn't seem to be technically impossible.
>
>>
>> For the simple variant, we'd then create
>>
>> { 'case': 'N', 'type': 'TT' }
>>
>> where TT is a new implicit object type with a single member with name
>> data and type T.
>>
>> Correct?
>
> Yes.
I'm starting to like it.
>>> In particular, if we ever decide to allow a flat union to have another
>>> union as a branch, rather than the current restriction that all branches
>>> must be structs, then referring to the type of a branch may be easier
>>> than breaking out all members of a struct.
>>
>> Not sure I completely get you here. Using an object type instead of a
>> member list is obviously more flexible, because for any member list we
>> can make up an object type with the same meaning, but not vice versa.
>
> Indeed, and that's the restriction that I mention we might someday want
> to lift. Or, in (modified, due to inline {} types) qapi terms, if we
> ever want to allow:
>
> { 'union': 'Helper', 'data': { 'number': 'int', 'string': 'str' } }
> { 'enum': 'MyEnum', 'data': [ 'a', 'b' ] }
> { 'union': 'Main', 'base': { 'switch': 'MyEnum' },
> 'discriminator': 'switch', 'data': { 'a': { 'boolean': 'bool' },
> 'b': 'Helper' } }
> { 'command': 'foo', 'data': { 'data': 'Main' } }
>
> then that would permit QMP invocations of:
> { "execute": "foo", "arguments": { "data": {
> "switch": "a", "boolean": true } } }
> { "execute": "foo", "arguments": { "data": {
> "switch": "b", "type": "number", "data": 1 } } }
> { "execute": "foo", "arguments": { "data": {
> "switch": "b", "type": "string", "data": "hello" } } }
>
> which we can express as a list of case/types for the primary variants of
> 'Main' (those types in turn refer to the secondary variants of type
> Helper), but which we cannot express as a [ 'SchemaInfoObjectMember' ]
> list, because the type of the "data" member depends on the secondary
> discriminator that is called into use on the "b" case of the primary
> discriminator.
>
> So I think we're saying the same thing, that a [
> 'SchemaInfoObjectMember' ] can always be written as a reference to an
> object type name, but not all object type names can be broken back into
> an array of SchemaInfoObjectMember (only those types that are pure
> structs without variants); and that although we currently do not allow
> sub-variants within a union, we should not get in the way of that being
> a possible future extension.
>
>>
>>> And if that's the case, it
>>> may have knock-on simplifications to your earlier patches for tracking
>>> variants. See [1] below for more thoughts...
>>>
>>> Do we want to guarantee anything about the sort ordering in this list?
>>
>> Again, haven't thought about it.
>>
>> Do we expect member lists to get so large binary search is called for?
>
> Probably not, since such a list would be unwieldy for both client and
> server. We tend to add boxing and optional sub-structs rather than
> direct parameters if we have that much information to pass along (think
> about how adding throttling parameters to a new block device was done
> with a single top-level parameter pointing to a throttling sub-struct,
> rather than adding lots of throttling parameters at top level).
>
> But, as with enum sorting, actually documenting our choice will help
> cement the expectations of clients on what they have to do when learning
> if a parameter was added.
We may want to adopt a consistent rule on sorting stuff.
>>>> +
>>>> +{ 'struct': 'SchemaInfoObject',
>>>> + 'data': { 'members': [ 'SchemaInfoObjectMember' ],
>>>> + '*tag': 'str',
>>>> + '*variants': [ 'SchemaInfoObjectVariant' ] } }
>>>
>>> or these?
>>
>> Same question.
>>
>
> Here, if enums are sorted, then case branches within variants should be
> sorted. If enums are unsorted, then I'm fine if case branches are also
> unsorted (and possibly in a different order than the enum was),
Okay.
> but be
> sure to document that.
Documentation I produce tends to err on the side of brevity, sometimes
too much.
If something is meant to produce sorted results, I document the
sortedness. If not, I habitually say nothing. Since nothing's said,
you better assume nothing.
Please keep calling out instances of excessive brevity :)
>>>> +
>>>> +{ 'struct': 'SchemaInfoAlternate',
>>>> + 'data': { 'members': [ 'SchemaInfoObjectMember' ] } }
>>>
>>> Here's an example of what you generated:
>>> "{ 'name': 'BlockdevRef', 'meta-type': 'alternate', 'members': [ {
>>> 'name': 'definition', 'type': 'BlockdevOptions' }, { 'name':
>>> 'reference', 'type': 'str' } ] }, "
>>>
>>> I think you could get away with something simpler:
>>>
>>> 'data': { 'types': [ 'str' ] }
>>>
>>> as in:
>>> "{ 'name': 'BlockdevRef', 'meta-type': 'alternate', 'types': [
>>> 'BlockdevOptions', 'str' ] }, "
>>
>> I.e. have a list of types instead of a list of members.
>>
>> Let's see what we'd lose, by enumerating the members of
>> SchemaInfoObjectMember:
>>
>> * name: not ABI, should not be examined (see commit message), thus no
>> loss.
>>
>> * type: kept.
>>
>> * default: never present (see commit message), thus no loss
>>
>>> the only worry is whether we might want future extensions, where we'd
>>> want additional information per element of that array, vs. being forced
>>> to return two arrays in parallel (arrays of structs are more extensible
>>> than arrays of strings).
>>> Seems like this would be just a
>>
>> Yes?
>>
>> Choices:
>>
>> * The only piece of information we need on an alternative right now is
>> the type, so make members a list of types. Nice now, awkward if we
>> ever need more,
>>
>> * To provide for future additions, make it a list of
>> SchemaInfoAlternateMember, where SchemaInfoAlternateMember has just
>> one member type now.
>>
>> * Reuse existing SchemaInfoObjectMember, because that's close and I'm
>> lazy.
>>
>> Preferences?
>
> At this point, my vote is with a new SchemaInfoAlternateMember class
> (SchemaInfoObjectMember may diverge in a different direction, and it
> would eliminate the question of how to not expose the branch names as
> ABI; but keeping things as a (one-member, for now) dictionary will allow
> future extensions).
Since Kevin also questioned my reuse of SchemaInfoObjectMember for
alternates in his review of RFC v1, let's go with a separate
SchemaInfoAlternateMember.
>>>> @@ -265,7 +265,7 @@ class QAPISchemaGenCommandVisitor(QAPISchemaVisitor):
>>>> self.defn = None
>>>> self.regy = None
>>>> self.visited_rets = None
>>>> - def visit_begin(self):
>>>> + def visit_begin(self, schema):
>>>
>>> And again my python object-oriented newness is showing through; where I
>>> guess all children have to update signatures to still be polymorphic to
>>> a parent adding a parameter.
>>
>> Maybe they need to be updated, maybe not, but updating them neatly
>> sidesteps Python polymorphism questions from mere mortals like you and
>> me.
>
> see reference to [4] below
>
>>
>>>> +''',
>>>> + c_name=c_name(name))
>>>> + self.defn = mcgen('''
>>>> +char %(c_name)s[] = "["
>>>> + "%(c_jsons)s]";
>>>
>>> And again. Also, I'd consider putting the "]" on its own line, like the
>>> "[" was, so that you can more easily cut and paste individual lines of
>>> generated output (but since JSON doesn't allow trailing comma, I guess
>>> the last line is still always going to be special).
>>
>> That's my reason for keepint the "]" there.
>>
>
> [3] and that works for me, unless we want to break long lines into
> shorter ones by virtue of putting "," (and thus "]") on their own line
> to make it even more obvious where we are breaking between elements of
> the overall array.
>
>>>> +''',
>>>> + c_name=c_name(name),
>>>> + c_jsons=', "\n "'.join(self.jsons))
>>>
>>> Cool syntax :)
>>
>> Possibly bordering on too cool :)
>
> Java was the first language I encountered where "".foo() was valid
> syntax; sometimes, I wish C had made strings a first class type. I'm
> fine with it as it is.
>
>>> [1] Ah, so .flat is still in use here, to avoid having to create
>>> implicit types everywhere. But if we create implicit types for simple
>>> unions, and just track variants by their case name/type instead of case
>>> name/[members], it will allow us to have a union as a case branch (I
>>> don't know that we need that much flexibility), and not have to worry
>>> about exposing .flat everywhere. It may even result in a smaller JSON
>>> string (you'd have to play with it to know for sure).
>>
>> We should try hard to get the introspection schema right from the start.
>>
>> But the internal representation is malleable. I know we can implement
>> non-flat unions completely in terms of flat ones (and that means no
>> .flat), but I also know I failed at doing it in this series. I'm not
>> sure your idea will do the trick completely. It's fine, we can finish
>> the job later.
>
> Yes, I've come to that conclusion myself - doesn't matter what we do on
> the first round internally as long as the output is right; cleaning up
> the internals can come later.
>
>
>> I could shoehorn both views into a single visitor function, by passing
>> both views, .base + .local_members, and .members. All implementations
>> will use only one of the views, but it's not immediately obvious which
>> one. So I chose to have two visitor functions. Matter of taste.
>
> I can live with it (documentation that sub-classes should override at
> most only one of the two visitors might help the cause, though).
Actually, overriding both would be just fine. We just haven't had a use
for that.
>>>> +++ b/scripts/qapi.py
>>>> @@ -764,7 +764,7 @@ class QAPISchemaEntity(object):
>>>> pass
>>>>
>>>> class QAPISchemaVisitor(object):
>>>> - def visit_begin(self):
>>>> + def visit_begin(self, schema):
>>>
>>> I don't know enough python to know if making schema optional in the
>>> parent class affects what the child class is allowed to implement while
>>> still overriding things.
>>
>> Not sure I get you here.
>
> It was an idle musing on whether
>
> class QAPISchemaVisitor(object):
> def visit_begin(self, schema=None):
>
> would permit:
> class QAPIIntrospectVisitor(QAPISchemaVisitor):
> def visit_begin(self):
>
> with proper polymorphism. But you've already come to the conclusion
> above [4] that it's easier to not mess with optional parameters (leave
> that for the python gurus), and that mere mortals are better off using
> something that obviously works instead of requiring knowledge about
> language internals.
>
>
>>>> +++ b/tests/.gitignore
>>>> @@ -19,6 +19,7 @@ test-opts-visitor
>>>> test-qapi-event.[ch]
>>>> test-qapi-types.[ch]
>>>> test-qapi-visit.[ch]
>>>> +test-qapi-introspect.[ch]
>>>
>>> [2] Ah, maybe this is the file that wasn't quite right.
>>
>> Anything I need to fix?
>
> The generated files created by 'make check' wer named
> test-qmp-introspect.[ch] (either s/qapi/qmp/ here, or else fix a
> Makefile rule to generate the desired name).
Scales falling from my eyes...
>> Thanks! I really appreciate your review. Must have been a big chunk of
>> work.
>
> Yes, doing a thorough review took me the better part of a week to go
> through it all, so I can only imagine how much time you've invested in
> it. Hopefully v3 will be easier to review, because it will be diffs
> against this version and mainly focusing on whether review comments were
> addressed.
I'll do my best to keep the diffs in check.
- Re: [Qemu-devel] [PATCH RFC v2 33/47] qapi: Clean up after recent conversions to QAPISchemaVisitor, (continued)
[Qemu-devel] [PATCH RFC v2 19/47] qapi: Generated code cleanup, Markus Armbruster, 2015/07/01
[Qemu-devel] [PATCH RFC v2 45/47] qapi: New QMP command query-schema for QMP schema introspection, Markus Armbruster, 2015/07/01
[Qemu-devel] [PATCH RFC v2 44/47] qapi: Pseudo-type '**' is now unused, drop it, Markus Armbruster, 2015/07/01
[Qemu-devel] [PATCH RFC v2 40/47] qapi: Introduce a first class 'any' type, Markus Armbruster, 2015/07/01
[Qemu-devel] [PATCH RFC v2 36/47] qapi: Rename qmp_marshal_input_FOO() to qmp_marshal_FOO(), Markus Armbruster, 2015/07/01