Re: [Qemu-devel] Error propagation in generated visitors and command mar

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Error propagation in generated visitors and command mar

From:	Markus Armbruster
Subject:	Re: [Qemu-devel] Error propagation in generated visitors and command marshallers
Date:	Fri, 11 Apr 2014 15:41:13 +0200
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/24.2 (gnu/linux)

Peter Crosthwaite <address@hidden> writes:

> On Thu, Apr 10, 2014 at 1:48 AM, Markus Armbruster <address@hidden> wrote:
>> I stumbled over this while trying to purge error_is_set() from the code.
>>
>>
>> Here's how we commonly use the Error API:
>>
>>     Error *err = NULL;
>>
>>     foo(arg, &err)
>>     if (err) {
>>         goto out;
>>     }
>>     bar(arg, &err)
>>     if (err) {
>>         goto out;
>>     }
>>
>> This ensures that err is null on entry, both for foo() and for bar().
>> Many functions rely on that, like this:
>>
>>     void foo(ArgType arg, Error **errp)
>>     {
>>         if (frobnicate(arg) < 0) {
>>             error_setg(errp, "Can't frobnicate");
>>                                 // This asserts errp != NULL
>>         }
>>     }
>>
>>
>> Here's how some of our visitor code uses the Error API (for real code,
>> check out generated qmp-marshal.c):
>>
>>     Error *err = NULL;
>>     QmpInputVisitor *mi = qmp_input_visitor_new_strict(QOBJECT(args));
>>     Visitor *v = qmp_input_get_visitor(mi);
>>     char *foo = NULL;
>>     char *bar = NULL;
>>
>>     visit_type_str(v, &foo, "foo", &err);
>>     visit_type_str(v, &bar, "bar", &err);
>>     if (err) {
>>         goto out;
>>     }
>>
>> Unlike above, this may pass a non-null errp to the second
>> visit_type_str(), namely when the first one fails.
>>
>> The visitor functions guard against that, like this:
>>
>>     void visit_type_str(Visitor *v, char **obj, const char *name, Error 
>> **errp)
>>     {
>>         if (!error_is_set(errp)) {
>>             v->type_str(v, obj, name, errp);
>>         }
>>     }
>>
>> As discussed before, error_is_set() is almost almost wrong, fragile or
>> unclean.  What if errp is null?  Then we fail to stop visiting after an
>> error.
>
> That should be the callers problem. If you pass a NULL errp then the
> intended semantic is to ignore errors.

The *caller* isn't interested in an error.  But the callee's behavior
should *not* be affected by that at all other than not returning an
error.

In particular, the callee should never continue after an error just
because the caller isn't interested in detailed error information.

That's why "if (error_is_set(errp)) bail" and similar are always wrong
when errp is a parameter that may be null.

>
>>
>> The function could be improved like this:
>>
>>     void visit_type_str(Visitor *v, char **obj, const char *name, Error 
>> **errp)
>>     {
>>         assert(errp);
>
> And this is irregular in that you are now mandating the Error ** and
> thus removing the capability to ignore errors.

It is irregular.  The irregularity is necessary as long as the function
isn't prepared for a null errp.

>>         if (!*errp) {
>>             v->type_str(v, obj, name, errp);
>>         }
>>     }
>>
>>
>> But: is it a good idea to have both patterns in the code?  Should we
>> perhaps use the common pattern for visiting, too?  Like this:
>>
>>     visit_type_str(v, &foo, "foo", &err);
>>     if (err) {
>>         goto out;
>>     }
>>     visit_type_str(v, &bar, "bar", &err);
>>     if (err) {
>>         goto out;
>>     }
>>
>> Then we can assume *errp is clear on function entry, like this:
>>
>>     void visit_type_str(Visitor *v, char **obj, const char *name, Error 
>> **errp)
>>     {
>>         v->type_str(v, obj, name, errp);
>>     }
>>
>> Should execute roughly the same number of conditional branches.
>>
>> Tedious repetition of "if (err) goto out" in the caller, but that's what
>> we do elsewhere, and unlike elsewhere, these one's are generated.
>>
>> Opinions?
>
> I think this code as-is is a good example of what we should do
> elsewhere. The code base has bloated with the if (error) { bail; } on
> every Error ** accepting API call. I proposed a while back a semantic
> that Error ** Accepting APIs perform no action when the error is
> already set to allow for long sequences of calls to run without the
> constant checks. You then report the first error in a catchall at the
> end of the run.
>
> I think this particular code is probably good, provided your case of
> NULL errp is enforced against by the caller.

My point isn't that this technique is bad, only that it's different from
what we do everywhere else, and the two techniques do not combine well.

Here's how we handle errors everywhere else:

    void frob([...], Error **errp)
    {
        Error *err = NULL;

        foo(&err)
        if (err) {
            goto out;
        }
        bar(&err) 
        if (err) {
            goto out;
        }
        [...]
    out:
        error_propagate(errp, err);
        [...]
    }

Both foo() and bar() are never entered with an error set.  Consequently,
they don't check for that case.

If you screw up and call them with an error set, they'll die on the
first error of their own, because error_set() asserts the error is
clear.

You might be tempted to elide err, like this:

        foo(errp)
        if (error_is_set(errp)) {
            goto out;
        }
        bar(errp) 
        if (error_is_set(errp)) {
            goto out;
        }
        [...]
   out:
        [...]

But that's *wrong*, because it executes bar() after foo() failed when
errp is null.  Ignoring errors from frob() also changes what frob()
does!  Not an acceptable interface.

You can elide err only in cases where all you ever do with it is pass it
on unexamined.

An alternative technique is to partly move error checks into the
callees, like this:

        err = NULL;
        foo(&err)
        bar(&err) 
        if (err) {
            goto out;
        }
        [...]
    out:
        error_propagate(errp, err);
        [...]

Now bar() must not do anything when called with an error set.  It needs
to begin with code like this:

        if (error_is_set(errp) {
            return;
        }

Trades bloating fewer call sites against bloating all the functions.
Tradeoff.

Note that adding the check above to bar() makes bar() less suited to the
technique I described first.  Remember, with that technique, bar() must
not be called with an error set, and we rely on an assertion to protect
us against mistakes.  The check above completely bypasses the assertion.

Moreover, having both techniques in the tree is bound to lead to
confusion.  Do I have to check after my function call?  Do I have to
check before doing anything in my function?  Sounds like a surefire
recipe for error handling botches to me.

This is what I mean by "the two techniques do not combine well".  Let's
pick one and stick to it.

The first technique is overwhelmingly prevalent now.  The only holdout
of the second technique is some QAPI-related code.

I'm going to try converting that to the first technique.  If you can
come up with patches converting everything else to the second technique,
we can discuss which one is better :)

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, (continued)
- Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Eric Blake, 2014/04/09
- Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Anthony Liguori, 2014/04/09
  - Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Markus Armbruster, 2014/04/11
- Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Dr. David Alan Gilbert, 2014/04/09
  - Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Markus Armbruster, 2014/04/11
    - Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Dr. David Alan Gilbert, 2014/04/11
- Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Kevin Wolf, 2014/04/10
  - Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Markus Armbruster, 2014/04/11
    - Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Kevin Wolf, 2014/04/11
- Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Peter Crosthwaite, 2014/04/11
  - Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Markus Armbruster <=
    - Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Peter Crosthwaite, 2014/04/11
    - Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Markus Armbruster, 2014/04/14

Prev by Date: Re: [Qemu-devel] paravirtual mouse/tablet, v5
Next by Date: Re: [Qemu-devel] [PATCH 6/8] spapr: move interrupt allocator to xics
Previous by thread: Re: [Qemu-devel] Error propagation in generated visitors and command marshallers
Next by thread: Re: [Qemu-devel] Error propagation in generated visitors and command marshallers
Index(es):
- Date
- Thread