qemu-devel

Re: Historical QAPI schema parser, "compiled schema", and qapi-schema-diff


From: Daniel P . Berrangé
Subject: Re: Historical QAPI schema parser, "compiled schema", and qapi-schema-diff
Date: Thu, 13 Jun 2024 17:12:39 +0100
User-agent: Mutt/2.2.12 (2023-09-09)

On Thu, Jun 13, 2024 at 02:13:15AM -0400, John Snow wrote:
> Hi, recently I've been working on overhauling our QMP documentation; see
> https://jsnow.gitlab.io/qemu/qapi/index.html for a recent work-in-progress
> page showcasing this.
> 
> As part of this project, Markus and I decided it'd be nice to be able to
> auto-generate "Since" information. The short reason for 'why' is because
> since info hard-coded into doc comments may not be accurate with regards to
> the wire protocol availability for a given field when a QAPI definition is
> shared or inherited by multiple sources. If we can generate it, it should
> always be accurate.
> 
> So, I've prototyped three things:
> 
> (1) An out-of-tree fork of the QAPI generator that is capable of parsing
> qemu-commands.hx, qmp-commands.hx, and all versions of our qapi-schema.json
> files going all the way back to v0.12.0.
> 
> It accomplishes this with some fairly brutish hacks that I never expect to
> need to check in to qemu.git.
> 
> (2) A schema "compiler", a QAPI generator module that takes a parsed Schema
> and produces a single-file JSON Schema document that describes every
> command and event as it exists on the wire without using type names or any
> information not considered to be "API".
> 
> This part *would* need to be checked in to qemu.git (if we go in this
> direction.)
> The compiled historical schema would also get checked in, for the QAPI
> parser to reference against to generate the since information.

The upside with checking in every historical schema is that we
have a set of self-contained schemas where you can see everything
at a glance for each version.

The downside with checking in every historical schema is that between
any adjacent pair of schemas 99% of the content is identical. IOW we
are very wasteful of storage.

Looking at your other mail about schema diffs, I wonder if the
diff format you show there can kill two birds with one stone.

  https://lists.nongnu.org/archive/html/qemu-devel/2024-06/msg02398.html

In my reply I had illustrated a variant of your format:

 - x-query-rdma
 -     returns.human-readable-text: str
 . blockdev-backup
 +     arguments.discard-source: Optional<boolean>
 . migrate
 -    arguments.blk: Optional<boolean>
 -    arguments.inc: Optional<boolean>
 . object-add
 .    arguments.qom-type: enum
 +        'sev-snp-guest'
 +    arguments[sev-guest].legacy-vm-type: Optional<boolean>
 +    arguments[sev-snp-guest].author-key-enabled: Optional<boolean>
 +    arguments[sev-snp-guest].cbitpos: Optional<integer>


Where '.' is just pre-existing context, and +/- have the obvious
meaning for the 2 given versions.

What if we add a version number to *every* line, and exclusively
use +/-?

Taking just one small command:

 + 6.2.0: x-query-rdma
 + 6.2.0:    returns.human-readable-text: str
 - 9.1.0: x-query-rdma

This tells us 'x-query-rdma' was added in 6.2.0, the
'human-readable-text' return member arrived at the same
time, and the whole command was then deleted in 9.1.0.
That has implicit property deletion, but for completeness
we could be explicit about each property when deleting
a command:

 + 6.2.0: x-query-rdma
 + 6.2.0:    returns.human-readable-text: str
 - 9.1.0:    returns.human-readable-text: str
 - 9.1.0: x-query-rdma
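
To make the proposed line format concrete, here is a minimal Python sketch
of a parser for it. The exact grammar is an assumption inferred from the
examples above ('+' or '-', a version, a colon, then indented text); the
names LINE_RE, Delta and parse_line are hypothetical and not part of any
posted patch.

```python
import re
from typing import NamedTuple

# Assumed layout: <'+' or '-'> <version>: <indentation><text>
LINE_RE = re.compile(r"^\s*([+-])\s+(\d+(?:\.\d+)*):\s(\s*)(\S.*)$")


class Delta(NamedTuple):
    op: str                   # '+' (added) or '-' (removed)
    version: tuple[int, ...]  # e.g. (6, 2, 0), so versions compare numerically
    depth: int                # extra indentation after the colon (0 = command/event)
    text: str                 # command name, member path with type, or enum value


def parse_line(line: str) -> Delta:
    """Parse one version-annotated history line into a Delta."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError(f"not a history line: {line!r}")
    op, version, indent, text = m.groups()
    numeric = tuple(int(part) for part in version.split("."))
    return Delta(op, numeric, len(indent), text.rstrip())
```

Storing the version as a tuple of integers rather than a string means
that 2.11.0 sorts after 2.9.0, which plain string comparison gets wrong.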

Taking the more complex 'object-add' command:

 +  2.0.0: object-add
 +  2.0.0:   arguments.qom-type: enum
 +  2.0.0:     '....'
 + 2.11.0:     'sev-guest'
 +  9.1.0:     'sev-snp-guest'
 + 2.11.0:   arguments[sev-guest].policy: uint32
 + 2.11.0:   arguments[sev-guest].session-file: str
 + 2.11.0:   arguments[sev-guest].dh-cert: str
 +  9.1.0:   arguments[sev-guest].legacy-vm-type: Optional<boolean>
 +  9.1.0:   arguments[sev-snp-guest].author-key-enabled: Optional<boolean>
 +  9.1.0:   arguments[sev-snp-guest].cbitpos: Optional<integer>


IOW, object-add was introduced in 2.0.0. The 'sev-guest' enum
variant was added in 2.11.0 with various fields at the same
time. The 'sev-guest' enum variant got an extra field in 9.1.0.
The 'sev-snp-guest' enum variant was added in 9.1.0 with some
fields.
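
Since the original motivation was generating "Since" information, the
reading above can be mechanised roughly as follows. This is only a sketch
under the assumption that the history uses the line format shown in this
mail; since_table and LINE_RE are hypothetical names, and keying entries
by the text before the first ": " is a simplification (enum values are
not qualified by the member they belong to).

```python
import re

# Assumed layout: <'+' or '-'> <version>: <text>
LINE_RE = re.compile(r"^\s*([+-])\s+(\d+(?:\.\d+)*):\s+(.*\S)\s*$")


def since_table(history: str) -> dict[str, str]:
    """Map each entry name to the earliest version that added it."""
    since: dict[str, tuple[int, ...]] = {}
    for line in history.splitlines():
        m = LINE_RE.match(line)
        if m is None or m.group(1) != "+":
            continue  # only additions contribute to "Since"
        version = tuple(int(part) for part in m.group(2).split("."))
        name = m.group(3).split(": ")[0]  # drop the type, keep the name/path
        if name not in since or version < since[name]:
            since[name] = version
    return {name: ".".join(map(str, v)) for name, v in since.items()}
```

Fed the 'object-add' history above, this would report 2.0.0 for the
command itself and 9.1.0 for arguments[sev-guest].legacy-vm-type.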


For fields which change from Optional <-> Required, that could
be modelled simply as parameter deletion + addition in the
same version, e.g. hypothetically, let's say the 'sev-guest' field
'policy' had changed from required to optional in 6.2.0, we would see:

 +  2.0.0: object-add
 +  2.0.0:   arguments.qom-type: enum
 +  2.0.0:     '....'
 + 2.11.0:     'sev-guest'
 +  9.1.0:     'sev-snp-guest'
 + 2.11.0:   arguments[sev-guest].policy: uint32
 -  6.2.0:   arguments[sev-guest].policy: uint32
 +  6.2.0:   arguments[sev-guest].policy: Optional<uint32>
 + 2.11.0:   arguments[sev-guest].session-file: str
 + 2.11.0:   arguments[sev-guest].dh-cert: str
 +  9.1.0:   arguments[sev-guest].legacy-vm-type: Optional<boolean>
 +  9.1.0:   arguments[sev-snp-guest].author-key-enabled: Optional<boolean>
 +  9.1.0:   arguments[sev-snp-guest].cbitpos: Optional<integer>
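
A tool could recover such type changes by pairing a '-' and a '+' line
for the same member in the same version. The following is a sketch only,
assuming typed members are written as "<path>: <type>" as in the example
above; type_changes and LINE_RE are hypothetical names.

```python
import re

# Assumed layout for typed members: <op> <version>: <path>: <type>
LINE_RE = re.compile(r"^\s*([+-])\s+(\d+(?:\.\d+)*):\s+(\S+): (.+?)\s*$")


def type_changes(history: str) -> list[tuple[str, str, str, str]]:
    """Return (version, member, old_type, new_type) for same-version -/+ pairs."""
    removed: dict[tuple[str, str], str] = {}
    added: dict[tuple[str, str], str] = {}
    for line in history.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # command names and enum values carry no type
        op, version, member, type_ = m.groups()
        (added if op == "+" else removed)[(version, member)] = type_
    return [
        (version, member, old, added[(version, member)])
        for (version, member), old in removed.items()
        if (version, member) in added and added[(version, member)] != old
    ]
```

A member that is removed in one version and re-added in a later one is
deliberately not reported, since that is a real removal rather than a
type change.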


Incidentally, if going down this route, I think I would NOT
have 1 file with the whole schema history, but have 1 file
per command / event, e.g. qapi/history/object-add.txt,
qapi/history/x-query-rdma.txt, qapi/history/VFIO_MIGRATION.txt,
etc. This will make it trivial for a person to focus in on
changes in the command they care about, likely without even
needing a schema diff tool much of the time, as the per-command
files will often be concise enough you can consider the full
history without filtering.
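
If a combined history already existed, splitting it into per-entity
files could look roughly like this. It is a sketch that assumes a
top-level command or event line has exactly one space after the version
colon, as in the examples in this mail; split_by_entity, history_path and
TOP_LEVEL_RE are hypothetical names, and only the qapi/history/ directory
comes from the suggestion above.

```python
import re
from pathlib import Path

# Assumed: top-level lines look like "<op> <version>: <name>" with no extra indent
TOP_LEVEL_RE = re.compile(r"^\s*[+-]\s+\d+(?:\.\d+)*: (\S+)\s*$")


def split_by_entity(lines: list[str]) -> dict[str, list[str]]:
    """Group history lines under the command/event they belong to."""
    files: dict[str, list[str]] = {}
    current = None
    for line in lines:
        if not line.strip():
            continue  # ignore blank separator lines
        m = TOP_LEVEL_RE.match(line)
        if m:
            current = m.group(1)  # a top-level line switches the current entity
        if current is not None:
            files.setdefault(current, []).append(line)
    return files


def history_path(name: str) -> Path:
    """Location of the per-entity history file, e.g. qapi/history/object-add.txt."""
    return Path("qapi/history") / f"{name}.txt"
```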


> (3) A script that can diff two compiled schema, showing a change report
> between two versions. (I sent an email earlier today/yesterday showing
> example output of this script.) This one was more for "fun", but it helped
> prove all the other parts were working correctly, and it might be useful in
> the future when auditing changes during the RC phase. We may well decide to
> commit this script upstream, or one like it.

With a single file containing all deltas, where each line is
version annotated, the "diff" tool becomes little more than
something which can 'grep' for lines in the file which have
a version number within the desired range. In fact it can also
optionally offer something better than a diff, as instead of
showing you only the original state and result state, it
can trivially show you any intermediate changes and what
version they happened in.

e.g. if you asked for a diff between 2.0.0 and 9.1.0, and there
was a command or property that was added in 4.0.0 and deleted
in 6.0.0, a traditional diff will not tell you about this. You'll
never notice it existed.

A "history grep" showing the set of changes between 2 versions
will highlight things that come + go, which can be quite
useful for understanding API evolution I think.
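
A "history grep" along these lines might be as small as the sketch
below. The range semantics (changes made after the old version, up to
and including the new one) are an assumption, as are the names
history_grep, parse_version and LINE_RE.

```python
import re

# Assumed layout: <'+' or '-'> <version>: ...
LINE_RE = re.compile(r"^\s*[+-]\s+(\d+(?:\.\d+)*):")


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '2.11.0' into (2, 11, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def history_grep(lines: list[str], old: str, new: str) -> list[str]:
    """Keep lines whose version v satisfies old < v <= new."""
    lo, hi = parse_version(old), parse_version(new)
    selected = []
    for line in lines:
        m = LINE_RE.match(line)
        if m and lo < parse_version(m.group(1)) <= hi:
            selected.append(line)
    return selected
```

Note that a nested line can be selected while its parent command line
falls outside the range, so a real tool would probably want to re-emit
the parent as context; with one file per command that context is
implied by the file name.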



With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



