qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v9 05/17] qapi: Unbox base members


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v9 05/17] qapi: Unbox base members
Date: Wed, 21 Oct 2015 15:34:19 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

Eric Blake <address@hidden> writes:

> On 10/20/2015 06:09 AM, Markus Armbruster wrote:
>> Eric Blake <address@hidden> writes:
>> 
>>> Rather than storing a base class as a pointer to a box, just
>>> store the fields of that base class in the same order, so that
>>> a child struct can be safely cast to its parent.
>
> ^^^^
>
>>> Compare to the earlier commit 1e6c1616a "qapi: Generate a nicer
>>> struct for flat unions".
>> 
>> Should we mention the struct can be cast to its base?
>
> As in the ^^^ sentence above? Or did you mean somewhere additionally in
> the comments in the code below?

Wish I could read %-}  Sorry for the noise!

>>> -def gen_struct_fields(members):
>>> +def gen_struct_fields(members, base, nested=False):
>>>      ret = ''
>>> +    if base:
>>> +        if not nested:
>>> +            ret += mcgen('''
>>> +    /* Members inherited from %(c_name)s: */
>>> +''',
>>> +                         c_name=base.c_name())
>>> +        ret += gen_struct_fields(base.local_members, base.base, True)
>>> +        if not nested:
>>> +            ret += mcgen('''
>>> +    /* Own members: */
>>> +''')
>> 
>> Before: gen_struct_fields() emits *own* fields.
>> 
>> After: it emits *all* fields.
>> 
>> Since the signature changes, all callers are visible in the patch, and
>> can be reviewed.
>> 
>> Code looks a bit awkward, but I don't have better ideas.  Perhaps we can
>> do better when we fuse gen_struct() and gen_union().
>
> I haven't tried to do that fusion yet; but we are probably quite close
> to it.  I'll see whether it is worth adding on as an 18/17 in this
> subset of the series, or saving for later.

Doesn't feel urgent to me.

>> This is gen_union().
>> 
>>>  ''',
>>>                  c_name=c_name(name))
>>>      if base:
>>> -        ret += mcgen('''
>>> -    /* Members inherited from %(c_name)s: */
>>> -''',
>>> -                     c_name=c_name(base.name))
>>> -        ret += gen_struct_fields(base.members)
>>> -        ret += mcgen('''
>>> -    /* Own members: */
>>> -''')
>>> +        ret += gen_struct_fields([], base)
>>>      else:
>>>          ret += mcgen('''
>>>      %(c_type)s kind;
>> 
>> Before: emit base fields, then own fields.
>> 
>> After: emit all fields.  Good, but please mention in the commit message
>> that you're touching gen_union().  You could also do this part in a
>> separate patch, but that's probably overkill.
>
> Sure, I can improve the commit message, and maybe split the patch. The
> idea at play is that both structs and unions want to emit all fields, as
> well as strategic comments about which fields are inherited, so a shared
> function is ideal for that; this patch moved code from gen_union() into
> gen_struct_fields() so that gen_struct() could reuse it.

I feel there's a variant record caught inside our struct and union
types, struggling to get out.  It's already out in introspection.  Your
patch is a welcome step towards getting it out elsewhere.

>>>      if base:
>>> -        ret += gen_visit_implicit_struct(base)
>>> +        ret += gen_visit_struct_fields(base.name, base.base,
>>> +                                       base.local_members)
>>>
>>>      ret += mcgen('''
>>>
>> 
>> This change looks innocent enough on first glance, but it's actually
>> quite hairy.
>> 
>> = Before =
>> 
>> The visit_type_FOO_fields() are generated in QAPISchema visit order,
>> i.e. right when QAPISchemaObjectType 'FOO' is visited.
>> 
>> The visit_type_implicit_FOO() are generated on demand, right before
>> their first use.  It's used by visit_type_STRUCT_fields() when STRUCT
>> has base FOO, and by visit_type_UNION() when flat UNION has a member
>> FOO.
>> 
>> Aside: only flat unions, because simple unions are an ugly special case
>> here.  To be unified.
>> 
>> Generating visit_type_implicit_FOO() on demand makes sense, because the
>> function isn't always needed.
>> 
>> Since visit_type_implicit_FOO() calls visit_type_FOO_fields(), and the
>> former can be generated before the latter, we may need a forward
>> declaration.  struct_fields_seen is used to detect this.  See commit
>> 8c3f8e7.
>> 
>> = After =
>> 
>> visit_type_implicit_FOO() is now used only when flat UNION has a member
>> FOO.  May need a forward declaration of visit_type_FOO_fields() anyway.
>> 
>> visit_type_FOO_fields() is now also generated on demand, right before
>> first use other than visit_type_implicit_FOO().  Weird.
>> 
>> The resulting code motion makes the change to generated code difficult
>> to review.
>> 
>> Generating visit_type_FOO_fields() on demand makes less sense, because
>> the function is always needed.  All it can accomplish is saving a few
>> forward declarations, at the cost of making gen_visit_struct_fields()
>> recursive, which begs the recursion termination question, and less
>> uniform generated code.
>> 
>> The naive answer to the recursion termination question is that types
>> can't contain themselves, therefore we can't ever get into a cycle.
>> That begs the next question: why do we need the if name in
>> struct_fields_seen conditional then?  We do need it (turning it into an
>> assertion promptly fails it), but I can't tell why offhand.
>> 
>> I'm sure I could figure out why this works, but I don't want to.  Let's
>> keep the code simple instead: keep generating visit_type_FOO_fields() in
>> QAPISchema visit order, emit necessary forward declarations.
>> 
>> Suggest to factor the code to emit a forward declaration out of
>> gen_visit_implicit_struct() and reuse it in gen_visit_struct_fields().
>
> Sounds like I need to split this patch then, anyways.  I'll see what I
> can come up with for v10.

Looking forward to it.  I'll continue reviewing v9 as time permits.

>>> +++ b/tests/qapi-schema/qapi-schema-test.json
>>> @@ -40,6 +40,10 @@
>>>    'data': { 'string0': 'str',
>>>              'dict1': 'UserDefTwoDict' } }
>>>
>>> +# ensure that we don't have an artificial collision on 'base'
>>> +{ 'struct': 'UserDefThree',
>>> +  'base': 'UserDefOne', 'data': { 'base': 'str' } }
>>> +
>
>> This is the positive test I challenged above.
>
> I can drop it if you don't like it; I felt better with it, but as you
> say, it only proves that 'base' is usable as a member name, and not that
> someone won't pick some other name when refactoring back into a boxed
> base.  See also my comments below about using containers (rather than
> boxes or inlined members), where we may need to deal with the naming
> clash anyways.
>
>>> @@ -218,9 +216,11 @@ static void channel_event(int event,
>>> SpiceChannelEventInfo *info)
>>>      }
>>>
>>>      if (info->flags & SPICE_CHANNEL_EVENT_FLAG_ADDR_EXT) {
>>> -        add_addr_info(client->base, (struct sockaddr *)&info->paddr_ext,
>>> +        add_addr_info((SpiceBasicInfo *)client,
>> 
>> Ah, you're already exploiting the ability to cast to the base type!
>
> Absolutely :)
>
>> 
>> Idea: should we generate a type-safe macro or inline function for this?
>
> Hmm. DO_UPCAST() (and its more powerful underlying container_of())
> doesn't fit here, because we inlined the fields rather than directly
> including the base.

Yes, because it results in slightly more readable code: always simply
p->m instead of things like p->base.base.m when m is inherited (which is
usually of no concern when it's used).

Drawbacks of inlining the base include:

* You have to keep the copies consistent.  Too much bother in
  hand-written code, but trivially safe in generated code like ours.

* Loss of the obvious safe way to convert to base: &p->base

* Loss of the conventional way to convert from base: container_of().

I remember seeing conversion to base in your patch, but not the other
way round.

> Below, I'm using this qapi:
> {'struct':'Parent', 'data':{'i':'int'}}
> {'struct':'Child', 'base':'Parent', 'data':{'b':'bool'}}
>
> What we have without this patch is a box that requires separate
> allocation (two layers of pointers, the boxing that we don't want):
>
> struct Child {
>     Parent *base;
>     bool b;
> };
>
> And this patch takes things to be completely inlined (since that's what
> we did earlier with flat unions), so that there is no intermediate
> structure in the child (and thus nothing for DO_UPCAST() to grab):
>
> struct Child {
>     /* Inherited fields from Parent */
>     int i;
>     /* own fields */
>     bool b;
> };
>
> But there is a third possible layout (if we do it, we should do it for
> flat unions as well), which is using the base as a container rather than
> a box, where at least we don't need separate allocation:
>
> struct Child {
>     Parent base;
>     bool b;
> };
>
> or similarly, but in longhand:
>
> struct Child {
>     struct {
>         int i;
>     } base;
>     bool b;
> };
>
> but then we are back to having to avoid a collision with the C name
> 'base' with a QMP name in own fields, and all client code has to spell
> out child->base.i (instead of the boxed child->base->i, or the inlined
> child->i).

Yes.

> There's also the ugly approach of exposing things in a dual naming
> system via an anonymous union and struct:
>
> struct Child {
>     union {
>         struct {
>             int i;
>         };
>         Parent base;
>     };
>     bool b;
> };
>
> which would allow 'child->i' to be the same storage as 'child->base.i',
> so that clients can use the short spelling when accessing fields but
> also have handy access to the base member for DO_UPCAST().  I'm not sure
> I want to go there, though.

Seems to clever for its own sake :)

> Taking your idea from another review message, you mentioned that we
> could allow 'base' by tweaking c_name() to munge the user's member
> 'base' into 'q_base', while reserving 'base' for ourselves; or we could
> use '_base' for our own use, which shouldn't collide with user's names
> (user names should not start with underscore except for downstream names
> which have double-underscore).  Or use '_b' instead of 'base' - the
> point remains that if we want to avoid a collision but still use the
> container approach, we could pick the C name appropriately.  But
> ultimately, we should either commit to the name munging pattern, or
> outlaw the name that we plan to use internally, or stick to inlined
> members that don't require munging in the first place.
>
> Meanwhile, if we decide that we like the convenience of inlined data,
> you are correct in the idea that we could have the qapi generator
> automatically spit out an appropriate type-specific upcast inline
> function for each type with a base.  So given the same example, this
> might mean:
>
> static inline Parent *qapi_Child_to_Parent(Child *child) {
>     return (Parent*)child;
> }
>
> But while such a representation would add compiler type-safety (hiding
> the cast in generated code, where we can prove the generator knew it was
> safe, and so that clients don't have to cast), it also adds verbosity.
> I can't think of any representation that would be shorter than making
> the clients do the cast, short of using a container rather than inline
> approach.  Even foo(qapi_baseof_Child(child), blah) is longer than
> foo((Parent *)child, blah).
>
> Preferences?

The least verbose naming convention for a conversion function I can
think of right now is TBase(), where T is the name of a type with a
base.  Compare:

    foo((Parent *)child, blah)
    foo(ChildBase(child), blah)

Tolerable?  Worthwhile?

Note: *Base instead of *_base not because I like StudlyCaps (I do not),
but for consistency with *List and *Kind.

>> Turned out nicely, I anticipated more churn.
>
> Yes, that was my feeling as well - we haven't quite used base classes in
> enough places to make the conversion painful - but the longer we wait,
> the more usage of base classes will sneak into .json files making a
> future conversion to a new layout more painful.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]