qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v4 04/19] qapi: Document type-safety considerati


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v4 04/19] qapi: Document type-safety considerations
Date: Mon, 22 Sep 2014 14:37:00 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Eric Blake <address@hidden> writes:

> Go into more details about the various types of valid expressions
> in a qapi schema, including tweaks to document fixes being done
> later in the current patch series.
>
> Signed-off-by: Eric Blake <address@hidden>
> ---
>  docs/qapi-code-gen.txt | 249 
> ++++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 217 insertions(+), 32 deletions(-)
>
> diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt
> index 8313ba6..3a79629 100644
> --- a/docs/qapi-code-gen.txt
> +++ b/docs/qapi-code-gen.txt
> @@ -22,35 +22,136 @@ code are used.
>  This file defines the types, commands, and events used by QMP.  It should
>  fully describe the interface used by QMP.
>
> -This file is designed to be loosely based on JSON although it's technically
> -executable Python.  While dictionaries are used, they are parsed as
> -OrderedDicts so that ordering is preserved.
> +A QAPI file is designed to be loosely based on JSON although it's
> +technically executable Python.  A valid QAPI schema consists of a list
> +of top-level expressions, with no commas between them.  While
> +dictionaries are used, they are parsed as OrderedDicts so that
> +ordering is preserved; ordering doesn't matter for top-level
> +expressions, but does matter within 'data' members.  QAPI input is
> +written using 'single quotes' instead of JSON's "double quotes" (in
> +contrast, QMP is strict JSON and only uses "double quotes").  As in
> +JSON, trailing commas are not permitted in arrays or dictionaries.
>
> -There are two basic syntaxes used, type definitions and command definitions.
> +Comments are allowed; anything between an unquoted # and the following
> +newline is ignored.  Although there is not yet a documentation
> +generator, a form of stylized comments has developed for consistently
> +documenting details about an expression and when it was added to the
> +schema.  The documentation is delimited between two lines of ##, then
> +the first line names the expression, an optional overview is provided,
> +then individual documentation about each member of 'data' is provided,
> +and finally, a 'Since: x.y.z' tag lists the release that introduced
> +the expression.  Optional fields are tagged with the phrase
> +'#optional', often with their default value; and extensions added
> +after the expression was first released are also given a '(since
> +x.y.z)' comment.  For example:
>
> -The first syntax defines a type and is represented by a dictionary.  There 
> are
> -three kinds of user-defined types that are supported: complex types,
> -enumeration types and union types.
> +    ##
> +    # @BlockStats:
> +    #
> +    # Statistics of a virtual block device or a block backing device.
> +    #
> +    # @device: #optional If the stats are for a virtual block device, the 
> name
> +    #          corresponding to the virtual block device.
> +    #
> +    # @stats:  A @BlockDeviceStats for the device.
> +    #
> +    # @parent: #optional This describes the file block device if it has one.
> +    #
> +    # @backing: #optional This describes the backing block device if it has 
> one.
> +    #           (Since 2.0)
> +    #
> +    # Since: 0.14.0
> +    ##
> +    { 'type': 'BlockStats',
> +      'data': {'*device': 'str', 'stats': 'BlockDeviceStats',
> +               '*parent': 'BlockStats',
> +               '*backing': 'BlockStats'} }
>
> -Generally speaking, types definitions should always use CamelCase for the 
> type
> -names. Command names should be all lower case with words separated by a 
> hyphen.
> +The schema sets up a series of types, as well as commands and events
> +that will use those types.  Forward references are allowed: the parser
> +scans in two passes, where the first pass learns all type names, and
> +the second validates the schema and generates the code.  This allows
> +the definition of complex structs that can have mutually recursive
> +types, and allows for indefinite nesting of QMP that satisfies the
> +schema.  A type name should not be defined more than once.
>
> +There are six top-level expressions recognized by the parser:
> +'include', 'command', 'type', 'enum', 'union', and 'event'.  There are
> +several built-in types, such as 'int' and 'str'; additionally, the
> +top-level expressions can define complex types, enumeration types, and
> +union types.  The 'command' expression can refer to existing types by
> +name, or list an anonymous type as a dictionary.  Listing a type name
> +inside an array refers to a single-dimension array of that type;
> +multi-dimension arrays are not directly supported (although an array
> +of a complex struct that contains an array member is possible).
> +
> +Generally speaking, types definitions should always use CamelCase for
> +user-defined type names, while built-in types are lowercase. Command
> +names, and field names within a type, should be all lower case with
> +words separated by a hyphen.  However, some existing older commands
> +and complex types use underscore; when extending such expressions,
> +consistency is preferred over blindly avoiding underscore.  Event
> +names should be ALL_CAPS with words separated by underscore.  The
> +special string '**' appears for some commands that manually perform
> +their own type checking rather than relying on the type-safe code
> +produced by the qapi code generators.
> +
> +Any command, type, or field name beginning with "x-" is marked
> +experimental, and may be withdrawn in a future release.  Downstream
> +vendors may add extensions; such extensions should begin with a prefix
> +matching "__RFQDN_" (for the reverse-fully-qualified-domain-name of
> +the vendor), even if the rest of the command or field name uses dash
> +(example: __com.redhat_drive-mirror).  Other than the dots used in
> +RFQDN of a downstream extension, all command, type, and field names
> +should begin with a letter, and contain only ASCII letters, numbers,
> +dash, and underscore.  It is okay to reuse names that might collide
> +with programming languages; the generator will rename a field named
> +"default" in the QAPI to "q_default" in the generated C code.
> +
> +
> +=== Built-in Types ===
> +
> +The following types are built-in to the parser:
> +  'str' - arbitrary UTF-8 string
> +  'int' - 64-bit signed integer (although the C code may place further
> +          restrictions on acceptable range)
> +  'number' - floating point number
> +  'bool' - JSON value of true or false
> +  'int8', 'int16', 'int32', 'int64' - like 'int', but enforce maximum
> +                                      bit size
> +  'uint8', 'uint16', 'uint32', 'uint64' - unsigned counterparts
> +  'size' - like 'uint64', but allows scaled suffix from command line
> +           visitor
>
>  === Includes ===
>
> +Usage: { 'include': 'str' }
> +
>  The QAPI schema definitions can be modularized using the 'include' directive:
>
>   { 'include': 'path/to/file.json'}
>
>  The directive is evaluated recursively, and include paths are relative to the
> -file using the directive. Multiple includes of the same file are safe.
> +file using the directive. Multiple includes of the same file are
> +safe.  No other keys should appear in the expression, and the include
> +value should be a string.
> +
> +As a matter of style, it is a good idea to have all files be
> +self-contained, but at the moment, nothing prevents an included file
> +from making a forward reference to a type that is only introduced by
> +an outer file.  The parser may be made stricter in the future to
> +prevent incomplete include files.
>
>
>  === Complex types ===
>
> -A complex type is a dictionary containing a single key whose value is a
> -dictionary.  This corresponds to a struct in C or an Object in JSON.  An
> -example of a complex type is:
> +Usage: { 'type': 'str', 'data': 'dict', '*base': 'complex-type-name' }
> +
> +A complex type is a dictionary containing a single 'data' key whose
> +value is a dictionary.  This corresponds to a struct in C or an Object
> +in JSON. Each value of the 'data' dictionary must be the name of a
> +complex, enum, union, or built-in type, or a one-element array
> +containing a type name.  An example of a complex type is:
>
>   { 'type': 'MyType',
>     'data': { 'member1': 'str', 'member2': 'int', '*member3': 'str' } }
> @@ -100,15 +201,42 @@ both fields like this:
>   { "file": "/some/place/my-image",
>     "backing": "/some/place/my-backing-file" }
>
> +
>  === Enumeration types ===
>
> -An enumeration type is a dictionary containing a single key whose value is a
> -list of strings.  An example enumeration is:
> +Usage: { 'enum': 'str', 'data': [ 'str' ] }
> +
> +An enumeration type is a dictionary containing a single 'data' key
> +whose value is a list of strings.  An example enumeration is:
>
>   { 'enum': 'MyEnum', 'data': [ 'value1', 'value2', 'value3' ] }
>
> +Nothing prevents an empty enumeration, although it is probably not
> +useful.  The list of strings should be lower case; if an enum name
> +represents multiple words, use '-' between words.  The string 'max' is
> +not allowed as an enum value, and values should not be repeated.
> +
> +The enumeration values are passed as strings over the QMP protocol,
> +but are encoded as C enum integral values in generated code.  While
> +the C code starts numbering at 0, it is better to use explicit
> +comparisons to enum values than implicit comparisons to 0; the C code
> +will also include a generated enum member ending in _MAX for tracking
> +the size of the enum, useful when using common functions for
> +converting between strings and enum values.  Since the wire format
> +always passes by name, it is acceptable to reorder or add new
> +enumeration members in any location without breaking QMP clients;
> +however, removing enum values would break compatibility.  For any
> +complex type that has a field that will only contain a finite set of
> +string values, using an enum type for that field is better than
> +open-coding the field to be type 'str'.
> +
> +
>  === Union types ===
>
> +Usage: { 'union': 'str', 'data': 'dict', '*base': 'complex-type-name',
> +         '*discriminator': 'enum-type-name' }
> +or:    { 'union': 'str', 'data': 'dict', 'discriminator': {} }
> +

Suggest to split usage into simple union, flat union and anonymous
union, like this:

Usage: { 'union': 'str', 'data': 'dict', '*base': 'complex-type-name' }
or:    { 'union': 'str', 'data': 'dict', 'base': 'complex-type-name',
         '*discriminator': 'enum-type-name' }
or:    { 'union': 'str', 'data': 'dict', 'discriminator': {} }

>  Union types are used to let the user choose between several different data
>  types.  A union type is defined using a dictionary as explained in the
>  following paragraphs.
> @@ -153,8 +281,10 @@ And it looks like this on the wire:
>
>  Flat union types avoid the nesting on the wire. They are used whenever a
>  specific field of the base type is declared as the discriminator ('type' is
> -then no longer generated). The discriminator must be of enumeration type.
> -The above example can then be modified as follows:
> +then no longer generated). The discriminator must be of enumeration
> +type, and the keys of the 'data' dictionary must match the enumeration
> +keys (although not necessarily in the same order). The above example
> +can then be modified as follows:
>
>   { 'enum': 'BlockdevDriver', 'data': [ 'raw', 'qcow2' ] }
>   { 'type': 'BlockdevCommonOptions',
> @@ -200,23 +330,78 @@ This example allows using both of the following example 
> objects:
>
>  === Commands ===
>
> -Commands are defined by using a list containing three members.  The first
> -member is the command name, the second member is a dictionary containing
> -arguments, and the third member is the return type.
> +Usage: { 'command': 'str', '*data': 'dict-or-complex-type-name',
> +         '*returns': 'type',
> +         '*gen': 'no', '*success-response': 'no' }
>
> -An example command is:
> +Commands are defined by using a dictionary containing several members,
> +where three members are most common.

Suggest paragraph break here.

> +                                      The 'data' member is optional; if
> +absent, the command accepts an optional empty dictionary.  If present,
> +it must be the string name of a complex type, a one-element array
> +containing the name of a complex type, or a dictionary that declares
> +an anonymous type with the same semantics as a 'type' expression, with
> +one exception noted below.

Suggest to say explicitly that 'data' specifies command arguments.

Suggest paragraph break here.

> +                            The 'returns' member is optional; if
> +absent, the command returns an empty dictionary.  If present, it must
> +be the string name of a complex or built-in type, a one-element array
> +containing the name of a complex or built-in type, or a dictionary
> +that declares an anonymous type with the same semantics as a 'type'
> +expression, with one exception noted below.
> +
> +Although it is permitted to have the 'returns' member name a built-in
> +type, any command that does this cannot be extended to return
> +additional information in the future; thus, new commands should
> +strongly consider returning a dictionary-based type, even if it only
> +contains one field at the present.  Besides, all commands return a
> +dictionary to report failure.

Suggest to say explicitly that 'returns' specifies the reply on success.

Suggest to replace the last sentence by a separate paragraph discussing
the reply on failure.

> +
> +Some example commands:
> +
> + { 'command': 'my-first-command',
> +   'data': { 'arg1': 'str', '*arg2': 'str' } }
> + { 'type': 'MyType', 'data': { '*value': 'str' } }
> + { 'command': 'my-second-command',
> +   'returns': [ 'MyType' ] }
> +
> +which would validate this QMP transaction:
> +
> + => { "execute": "my-first-command",
> +      "arguments": { "arg1": "hello" } }
> + <= { "return": { } }
> + => { "execute": "my-second-command" }
> + <= { "return": [ { "value": "one" }, { } ] }
> +
> +In rare cases, QAPI cannot express a type-safe representation of a
> +corresponding QMP command.  In these cases, if the command expression
> +includes the key 'gen' with value 'no', then the 'data' or 'returns'

The implementation actually ignores the value of 'gen', but specifying
it must be 'no' doesn't hurt.

> +member that intends to bypass generated type-safety and do its own
> +manual validation should use '**' rather than a valid type name.
> +Please try to avoid adding new commands that rely on this, and instead
> +use type-safe unions.  For an example of bypass usage:
> +
> + { 'command': 'netdev_add',
> +   'data': {'type': 'str', 'id': 'str', '*props': '**'},
> +   'gen': 'no' }
> +
> +Normally, the QAPI schema is used to describe synchronous exchanges,
> +where a response is expected.  But in some cases, the action of a
> +command is expected to change state in a way that a successful
> +response is not possible (a failure message still returns a
> +dictionary).  In this case, the command expression should include the
> +optional key 'success-response' with value 'no'.

Learned something new here.

Suggest to change the parenthesis to something like (it still sends a
normal error reply on failure).

>
> - { 'command': 'my-command',
> -   'data': { 'arg1': 'str', '*arg2': 'str' },
> -   'returns': 'str' }
>
>  === Events ===
>
> -Events are defined with the keyword 'event'.  When 'data' is also specified,
> -additional info will be included in the event.  Finally there will be C API
> -generated in qapi-event.h; when called by QEMU code, a message with timestamp
> -will be emitted on the wire.  If timestamp is -1, it means failure to 
> retrieve
> -host time.
> +Usage: { 'event': 'str', '*data': 'dict-or-complex-type-name' }
> +
> +Events are defined with the keyword 'event'.  It is not allowed to
> +name an event 'MAX', since the generator also produces a C enumeration
> +of all event names with a generated _MAX value at the end.

One of the several places where the generator can thoughtlessly produce
clashing identifiers.  You're documenting one of them, which is an
improvement of sorts.

>                                                              When
> +'data' is also specified, additional info will be included in the
> +event, with similar semantics to a 'type' expression.  Finally there
> +will be C API generated in qapi-event.h; when called by QEMU code, a
> +message with timestamp will be emitted on the wire.  If timestamp is
> +-1, it means failure to retrieve host time.

As far as I can tell, this can happen only if gettimeofday() fails.
Highly unlikely.  But since the code could do it, the spec should
mention it.

>
>  An example event is:
>
> @@ -311,7 +496,7 @@ Example:
>      #ifndef EXAMPLE_QAPI_TYPES_H
>      #define EXAMPLE_QAPI_TYPES_H
>
> -[Builtin types omitted...]
> +[Built-in types omitted...]
>
>      typedef struct UserDefOne UserDefOne;
>
> @@ -324,7 +509,7 @@ Example:
>          struct UserDefOneList *next;
>      } UserDefOneList;
>
> -[Functions on builtin types omitted...]
> +[Functions on built-in types omitted...]
>
>      struct UserDefOne
>      {
> @@ -423,7 +608,7 @@ Example:
>      #ifndef EXAMPLE_QAPI_VISIT_H
>      #define EXAMPLE_QAPI_VISIT_H
>
> -[Visitors for builtin types omitted...]
> +[Visitors for built-in types omitted...]
>
>      void visit_type_UserDefOne(Visitor *m, UserDefOne **obj, const char 
> *name, Error **errp);
>      void visit_type_UserDefOneList(Visitor *m, UserDefOneList **obj, const 
> char *name, Error **errp);

Major improvement, thank you very much!



reply via email to

[Prev in Thread] Current Thread [Next in Thread]