qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH RFC v2 27/47] qapi-visit: Convert to QAPISchemaV


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH RFC v2 27/47] qapi-visit: Convert to QAPISchemaVisitor, fixing bugs
Date: Mon, 27 Jul 2015 19:53:21 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

Eric Blake <address@hidden> writes:

> On 07/01/2015 02:22 PM, Markus Armbruster wrote:
>> Fixes flat unions to visit the base's base members (the previous
>> commit merely added them to the struct).  Same test case.
>> 
>> Patch's effect on visit_type_UserDefFlatUnion():
>> 
>>      static void visit_type_UserDefFlatUnion_fields(Visitor *m, 
>> UserDefFlatUnion **obj, Error **errp)
>>      {
>>          Error *err = NULL;
>> 
>>     +    visit_type_int(m, &(*obj)->integer, "integer", &err);
>>     +    if (err) {
>>     +        goto out;
>>     +    }
>>          visit_type_str(m, &(*obj)->string, "string", &err);
>>          if (err) {
>>              goto out;
>> 
>> Test cases updated for the bug fix.
>> 
>> Fixes alternates to generate a visitor for their implicit enumeration
>> type.  None of them are currently used, obviously.  Example:
>> block-core.json's BlockdevRef now generates
>> visit_type_BlockdevRefKind().
>> 
>> Signed-off-by: Markus Armbruster <address@hidden>
>> ---
>>  scripts/qapi-visit.py                   | 254 
>> ++++++++++++--------------------
>>  tests/qapi-schema/qapi-schema-test.json |   3 -
>>  tests/test-qmp-input-strict.c           |   2 +-
>>  tests/test-qmp-input-visitor.c          |   4 +-
>>  4 files changed, 100 insertions(+), 163 deletions(-)
>
> Another conversion that results in a fairly large diffstat to the
> generated files:
>
>  qapi-visit.c                        | 4542 
> ++++++++++++++++++------------------
>  qapi-visit.h                        |  256 --
>  qga/qapi-generated/qga-qapi-visit.c |   88
>  qga/qapi-generated/qga-qapi-visit.h |   36
>  4 files changed, 2355 insertions(+), 2567 deletions(-)
>
> Same complaints as in 26/47, where splitting some of the cleanups into
> separate patches would make it easier to validate that the final
> conversion is correct.
>
> Here, a very common thing in the generated .c file is that you end up
> swapping the order of visit_type_foo and visit_type_fooList, for any
> time when foo is an enum. [1]

Again, this is because the old code generates the array visitor from
within the code generating the element type visitor, while the new code
makes all array types full-fledged types, which get visited like any
other type.

>> 
>> diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
>> index a52a572..135e7c1 100644
>> --- a/scripts/qapi-visit.py
>> +++ b/scripts/qapi-visit.py
>> @@ -12,7 +12,6 @@
>>  # This work is licensed under the terms of the GNU GPL, version 2.
>>  # See the COPYING file in the top-level directory.
>>  
>> -from ordereddict import OrderedDict
>>  from qapi import *
>>  import re
>>  
>> @@ -24,13 +23,13 @@ def generate_visit_implicit_struct(type):
>>          return ''
>>      implicit_structs_seen.add(type)
>>      ret = ''
>> -    if type not in struct_fields_seen:
>> +    if type.name not in struct_fields_seen:
>>          # Need a forward declaration
>>          ret += mcgen('''
>>  
>>  static void visit_type_%(c_type)s_fields(Visitor *m, %(c_type)s **obj, 
>> Error **errp);
>>  ''',
>> -                     c_type=type_name(type))
>> +                     c_type=type.c_name())
>
> This looks a little fishy on first read; why are we calling
> type.c_name() (and not type.c_type()) when assigning to a placeholder
> named %(c_type)s?  But on second thought, it looks correct: since we
> really do want the name of the type (and not the magic '*' pointer suffix).

Most of the time, c_type is an identifier naming a type (something like
typ.c_name()), but sometimes it's a type expression (something like
typ.c_type()).  If that turns out to be too confusing, we can use
different names for the two.  Not exactly looking forward to tracking
them down, though...

> Still, it might be nicer to name things %(c_name)s here; and in that
> case, it's more of a pre-existing cleanup that might be better floating
> into one of your earlier patches.

This patch tries very hard not to touch lines without need.  Cleanup is
left for PATCH 33.

I could do some (but not all) cleanup before converting to
QAPISchemaVisitor.  But I'm afraid that just complicates things even
more.

>>  
>>      ret += mcgen('''
>>  
>> @@ -46,7 +45,7 @@ static void visit_type_implicit_%(c_type)s(Visitor *m, 
>> %(c_type)s **obj, Error *
>>      error_propagate(errp, err);
>>  }
>>  ''',
>> -                 c_type=type_name(type))
>> +                 c_type=type.c_name())
>
> same here.
>
>>      return ret
>>  
>>  def generate_visit_struct_fields(name, members, base = None):
>> @@ -74,24 +73,24 @@ if (err) {
            ret += mcgen('''
    visit_type_implicit_%(c_type)s(m, &(*obj)->%(c_name)s, &err);
    if (err) {
>>      goto out;
>>  }
>>  ''',
>> -                     type=type_name(base), c_name=c_name('base'))
>> +                     type=base.c_name(), c_name=c_name('base'))
>
> And this one's pointless: c_name('base') == 'base'. Pointless since
> commit 622f557 introduced type inheritance.  Why do we even need
> %(c_name)s if we are always passing a constant string?

My best guess: to keep the code similar to other code generating
visit_type_FOO() calls?

> Oh, and that means our generator has a collision bug that none of my
> added tests have exposed yet: you cannot have a base class and
> simultaneously add a member named 'base':
>
> { 'struct': 'Base', 'data': { 'i': 'int' } }
> { 'struct': 'Sub', 'base': 'Base', 'data': { 'base': 'str' } }
>
> because the generated C code is trying to use the name 'base' for its
> own purposes.

*sigh*

>                I guess that means more pre-req patches to the series to
> expose the bug, and either tighten the parser to reject things for now
> (easiest) or update the generator to not collide (harder, and fine for a
> later series).

Yes.

Life would be easier if the original authors had adopted sane naming
conventions from the start.

> By the way, now that we are emitting flat unions in such a way that you
> can cast to the base class, why don't we change our C code to do
> likewise?  That is, where we now have this generated C:
>
> struct BlockdevOptionsGenericFormat {
>     BlockdevRef *file;
> };
>
> struct BlockdevOptionsGenericCOWFormat {
>     BlockdevOptionsGenericFormat *base;
>     bool has_backing;
>     BlockdevRef *backing;
> };
>
> why can't we instead have an unboxed representation:
>
> struct BlockdevOptionsGenericFormat {
>     BlockdevRef *file;
> };
>
> /* This struct can be cast to BlockdevOptionsGenericFormat */
> struct BlockdevOptionsGenericCOWFormat {
>     BlockdevRef *file;
>     /* end of fields from base class BlockdevOptionsGenericFormat */
>     bool has_backing;
>     BlockdevRef *backing;
> };
>
> where client code that was referring to o->base->file now refers to o->file.

That's where I want to go, but not in this series.

>> +def gen_visit_union(name, base, variants):
>> +    ret = ''
>>  
>>      if base:
>> -        assert discriminator
>> -        base_fields = find_struct(base)['data'].copy()
>> -        del base_fields[discriminator]
>> -        ret += generate_visit_struct_fields(name, base_fields)
>> +        members = [m for m in base.members if m != variants.tag_member]
>> +        ret += generate_visit_struct_fields(name, members)
>
> Why not just visit ALL base class members, unconditionally?

This patch converts to QAPISchemaVisitor.  It tries hard to keep the
generated code the same.

The old code generates the discriminator visit into the visit_type_FOO()
and not the the visit_type_FOO_fields() helper (can't tell why), so the
new code does the same.

Simply visiting all members makes more sense, but it needs to be done in
a followup patch.  Since this series is already huge, it'll likely be in
a followup series.

>> -    if discriminator:
>> -        for key in members:
>> -            ret += generate_visit_implicit_struct(members[key])
>> +    for var in variants.variants:
>> +        if var.flat:
>> +            ret += generate_visit_implicit_struct(var.type)
>
> Okay, I see where you are using .flat from the initial parse.  I still
> think it is a bit odd that you are defining '.flat' for each 'variant'
> within 'variants', even though, for a given 'variants', all members will
> have the same setting of '.flat'.  That makes me wonder if '.flat'
> should belong instead to the top-level 'variants' struct rather than to
> each 'variant' member.

Two reasons for putting flat where it is:

* The philosophical one: from the generator's point of view, there's no
  fundamental reason why all variants need to be flat or none.  The
  generator really doesn't care.

* The pragmatic one (a.k.a. the real one): there are places where I use
  v.flat, but don't have the variants owning v handy.

> But again I wonder what would happen if you had instead normalized the
> input of simple unions into always having an implicit struct (with
> single member 'data'), so that by the time you get here, you only have
> to deal with a single representation of unions instead of having to
> still emit different things for flat vs. simple (since on the wire, we
> already proved simple is shorthand that can be duplicated by a flat union).

I hope we can get there!  But at this stage of the conversion, I want to
minimize output change, and .flat makes preserving all its warts much
easier.

>>  
>>      ret += mcgen('''
>>  
>> @@ -300,41 +268,39 @@ void visit_type_%(c_name)s(Visitor *m, %(c_name)s 
>> **obj, const char *name, Error
>>  ''',
>>                       name=c_name(name))
>>  
>> -    if not discriminator:
>> -        tag = 'kind'
>> -        disc_key = "type"
>> -    else:
>> -        tag = discriminator
>> -        disc_key = discriminator
>> +    disc_key = variants.tag_member.name
>> +    if not variants.tag_name:
>> +        # we pointlessly use a different key for simple unions
>
> We could fix that (as a separate patch); wonder how much C code it would
> affect.  A lot of these things that we can alter in generated code are
> certainly easier to see now that we have a clean generator :)

Yup, the warts stand out now.

>> +def gen_visit_decl(name, scalar=False):
>> +    c_type = c_name(name) + ' *'
>> +    if not scalar:
>> +        c_type += '*'
>>      return mcgen('''
>> -
>> -void visit_type_%(name)s(Visitor *m, %(name)s *obj, const char *name, Error 
>> **errp);
>> +void visit_type_%(c_name)s(Visitor *m, %(c_type)sobj, const char *name, 
>> Error **errp);
>>  ''',
>> -                 name=c_name(name))
>> +                 c_name=c_name(name), c_type=c_type)
>
> Nice way to consolidate several near-identical copies.
>
>> +
>> +class QAPISchemaGenVisitVisitor(QAPISchemaVisitor):
>> +    def __init__(self):
>> +        self.decl = None
>> +        self.defn = None
>> +        self.btin = None
>> +    def visit_begin(self):
>> +        self.decl = ''
>> +        self.defn = ''
>> +        self.btin = guardstart('QAPI_VISIT_BUILTIN_VISITOR_DECL')
>> +    def visit_end(self):
>> +        # to avoid header dependency hell, we always generate
>> +        # declarations for built-in types in our header files and
>> +        # simply guard them
>> +        self.btin += guardend('QAPI_VISIT_BUILTIN_VISITOR_DECL')
>> +        self.decl = self.btin + self.decl
>> +        self.btin = None
>> +        # ...this doesn't work for cases where we link in multiple
>> +        # objects that have the functions defined, so we use
>> +        # do_builtins (option -b) to provide control
>
> And once again, as in 26/47, this floats the .h file to have all builtin
> representations in one chunk (for continuity with pre-patch), but fails
> to do the same for the .c code...

Yes, because they all need to be under the
QAPI_VISIT_BUILTIN_VISITOR_DECL guard.

>> +    def visit_enum_type(self, name, info, values):
>> +        self.decl += gen_visit_decl(name, scalar=True)
>> +        self.defn += generate_visit_enum(name)
>> +    def visit_array_type(self, name, info, element_type):
>> +        decl = gen_visit_decl(name)
>> +        defn = gen_visit_list(name, element_type)
>> +        if isinstance(element_type, QAPISchemaBuiltinType):
>> +            self.btin += decl
>> +            if do_builtins:
>> +                self.defn += defn
>
> ...where the builtins are now interleaved with everything else instead
> of bunched together, making the generated diff larger and more confusing
> than necessary.

Yes, because there's no need to separate them out.

>> +        else:
>> +            self.decl += decl
>> +            self.defn += defn
>> +    def visit_object_type(self, name, info, base, members, variants):
>> +        if info:
>> +            self.decl += gen_visit_decl(name)
>> +            if variants:
>> +                self.defn += gen_visit_union(name, base, variants)
>
> Worth adding 'assert not members'?

Yes, with a TODO, because it's an artificial restriction I'd like to
lift some day.

>> +            else:
>> +                self.defn += gen_visit_struct(name, base, members)
>
> Or maybe we can someday consolidate these two into a single
> gen_visit_object, that handles all members and variants in a uniform
> manner, instead of our current differences. I wonder how much C code
> would be impacted?

Yes, that's how we'd lift the restriction.

>> +    def visit_alternate_type(self, name, info, variants):
>> +        self.decl += gen_visit_decl(name)
>> +        self.defn += gen_visit_alternate(name, variants)
>>  
>>  do_builtins = False
>>  
>> @@ -442,56 +428,10 @@ fdecl.write(mcgen('''
>>  ''',
>>                    prefix=prefix))
>>  
>> -exprs = QAPISchema(input_file).get_exprs()
>> -
>> -# to avoid header dependency hell, we always generate declarations
>> -# for built-in types in our header files and simply guard them
>> -fdecl.write(guardstart("QAPI_VISIT_BUILTIN_VISITOR_DECL"))
>> -for typename in builtin_types.keys():
>> -    fdecl.write(generate_declaration(typename, builtin_type=True))
>> -fdecl.write(guardend("QAPI_VISIT_BUILTIN_VISITOR_DECL"))
>> -
>> -# ...this doesn't work for cases where we link in multiple objects that
>> -# have the functions defined, so we use -b option to provide control
>> -# over these cases
>> -if do_builtins:
>> -    for typename in builtin_types.keys():
>> -        fdef.write(generate_visit_list(typename))
>
> Again, a well-placed sorted() over these two loops in a pre-req patch
> will minimize the churn on the builtins.

Can try and see.

>> -
>> -for expr in exprs:
>> -    if expr.has_key('struct'):
>> -        ret = generate_visit_struct(expr)
>> -        ret += generate_visit_list(expr['struct'])
>> -        fdef.write(ret)
>> -
>> -        ret = generate_declaration(expr['struct'])
>> -        fdecl.write(ret)
>> -    elif expr.has_key('union'):
>> -        ret = generate_visit_union(expr)
>> -        ret += generate_visit_list(expr['union'])
>> -        fdef.write(ret)
>> -
>> -        enum_define = discriminator_find_enum_define(expr)
>> -        ret = ""
>> -        if not enum_define:
>> -            ret = generate_decl_enum('%sKind' % expr['union'])
>
> Nice that the new visitor automatically visits any implicit enum,
> without us having to special case it.

Making implicit stuff explicit in the internal representation kills
special cases and de-duplicates code :)

>> -        ret += generate_declaration(expr['union'])
>> -        fdecl.write(ret)
>> -    elif expr.has_key('alternate'):
>> -        ret = generate_visit_alternate(expr['alternate'], expr['data'])
>> -        ret += generate_visit_list(expr['alternate'])
>> -        fdef.write(ret)
>> -
>> -        ret = generate_decl_enum('%sKind' % expr['alternate'])
>> -        ret += generate_declaration(expr['alternate'])
>> -        fdecl.write(ret)
>> -    elif expr.has_key('enum'):
>> -        ret = generate_visit_list(expr['enum'])
>> -        ret += generate_visit_enum(expr['enum'])
>
> [1] swapping these two lines in a pre-req patch will minimize the churn
> of this conversion.

Can try and see.

>> -        fdef.write(ret)
>> -
>> -        ret = generate_decl_enum(expr['enum'])
>> -        ret += generate_enum_declaration(expr['enum'])
>> -        fdecl.write(ret)
>> +schema = QAPISchema(input_file)
>> +gen = QAPISchemaGenVisitVisitor()
>> +schema.visit(gen)
>> +fdef.write(gen.defn)
>> +fdecl.write(gen.decl)
>
> Again, overall impression is that your series is headed in the right
> direction.
>
> And nice that the TODOs in the testsuite pointed out what this fixes,
> for visiting indirect bases.

Thanks!



reply via email to

[Prev in Thread] Current Thread [Next in Thread]