Re: [Qemu-devel] Error propagation in generated visitors and command mar

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Error propagation in generated visitors and command mar

From:	Peter Crosthwaite
Subject:	Re: [Qemu-devel] Error propagation in generated visitors and command marshallers
Date:	Sat, 12 Apr 2014 08:46:15 +1000

On Fri, Apr 11, 2014 at 11:41 PM, Markus Armbruster <address@hidden> wrote:
> Peter Crosthwaite <address@hidden> writes:
>
>> On Thu, Apr 10, 2014 at 1:48 AM, Markus Armbruster <address@hidden> wrote:
>>> I stumbled over this while trying to purge error_is_set() from the code.
>>>
>>>
>>> Here's how we commonly use the Error API:
>>>
>>>     Error *err = NULL;
>>>
>>>     foo(arg, &err)
>>>     if (err) {
>>>         goto out;
>>>     }
>>>     bar(arg, &err)
>>>     if (err) {
>>>         goto out;
>>>     }
>>>
>>> This ensures that err is null on entry, both for foo() and for bar().
>>> Many functions rely on that, like this:
>>>
>>>     void foo(ArgType arg, Error **errp)
>>>     {
>>>         if (frobnicate(arg) < 0) {
>>>             error_setg(errp, "Can't frobnicate");
>>>                                 // This asserts errp != NULL
>>>         }
>>>     }
>>>
>>>
>>> Here's how some of our visitor code uses the Error API (for real code,
>>> check out generated qmp-marshal.c):
>>>
>>>     Error *err = NULL;
>>>     QmpInputVisitor *mi = qmp_input_visitor_new_strict(QOBJECT(args));
>>>     Visitor *v = qmp_input_get_visitor(mi);
>>>     char *foo = NULL;
>>>     char *bar = NULL;
>>>
>>>     visit_type_str(v, &foo, "foo", &err);
>>>     visit_type_str(v, &bar, "bar", &err);
>>>     if (err) {
>>>         goto out;
>>>     }
>>>
>>> Unlike above, this may pass a non-null errp to the second
>>> visit_type_str(), namely when the first one fails.
>>>
>>> The visitor functions guard against that, like this:
>>>
>>>     void visit_type_str(Visitor *v, char **obj, const char *name, Error 
>>> **errp)
>>>     {
>>>         if (!error_is_set(errp)) {
>>>             v->type_str(v, obj, name, errp);
>>>         }
>>>     }
>>>
>>> As discussed before, error_is_set() is almost almost wrong, fragile or
>>> unclean.  What if errp is null?  Then we fail to stop visiting after an
>>> error.
>>
>> That should be the callers problem. If you pass a NULL errp then the
>> intended semantic is to ignore errors.
>
> The *caller* isn't interested in an error.  But the callee's behavior
> should *not* be affected by that at all other than not returning an
> error.
>
> In particular, the callee should never continue after an error just
> because the caller isn't interested in detailed error information.
>

But the error is from a previous call not the current call. Callers
job to inform second call that first one failed (or in current status
quo - not call the second call at all). But its caller job to know the
dependancy otherwise calls are not self contained.

> That's why "if (error_is_set(errp)) bail" and similar are always wrong
> when errp is a parameter that may be null.
>

Agreed. Don't see the problem here though - it's bad caller code too.

>>
>>>
>>> The function could be improved like this:
>>>
>>>     void visit_type_str(Visitor *v, char **obj, const char *name, Error 
>>> **errp)
>>>     {
>>>         assert(errp);
>>
>> And this is irregular in that you are now mandating the Error ** and
>> thus removing the capability to ignore errors.
>
> It is irregular.  The irregularity is necessary as long as the function
> isn't prepared for a null errp.
>

My understanding is all functions should be prepared for NULL errp.

>>>         if (!*errp) {
>>>             v->type_str(v, obj, name, errp);
>>>         }
>>>     }
>>>
>>>
>>> But: is it a good idea to have both patterns in the code?  Should we
>>> perhaps use the common pattern for visiting, too?  Like this:
>>>
>>>     visit_type_str(v, &foo, "foo", &err);
>>>     if (err) {
>>>         goto out;
>>>     }
>>>     visit_type_str(v, &bar, "bar", &err);
>>>     if (err) {
>>>         goto out;
>>>     }
>>>
>>> Then we can assume *errp is clear on function entry, like this:
>>>
>>>     void visit_type_str(Visitor *v, char **obj, const char *name, Error 
>>> **errp)
>>>     {
>>>         v->type_str(v, obj, name, errp);
>>>     }
>>>
>>> Should execute roughly the same number of conditional branches.
>>>
>>> Tedious repetition of "if (err) goto out" in the caller, but that's what
>>> we do elsewhere, and unlike elsewhere, these one's are generated.
>>>
>>> Opinions?
>>
>> I think this code as-is is a good example of what we should do
>> elsewhere. The code base has bloated with the if (error) { bail; } on
>> every Error ** accepting API call. I proposed a while back a semantic
>> that Error ** Accepting APIs perform no action when the error is
>> already set to allow for long sequences of calls to run without the
>> constant checks. You then report the first error in a catchall at the
>> end of the run.
>>
>> I think this particular code is probably good, provided your case of
>> NULL errp is enforced against by the caller.
>
> My point isn't that this technique is bad, only that it's different from
> what we do everywhere else, and the two techniques do not combine well.
>
> Here's how we handle errors everywhere else:
>
>     void frob([...], Error **errp)
>     {
>         Error *err = NULL;
>
>         foo(&err)
>         if (err) {
>             goto out;
>         }
>         bar(&err)
>         if (err) {
>             goto out;
>         }
>         [...]
>     out:
>         error_propagate(errp, err);
>         [...]
>     }
>
> Both foo() and bar() are never entered with an error set.  Consequently,
> they don't check for that case.
>
> If you screw up and call them with an error set, they'll die on the
> first error of their own, because error_set() asserts the error is
> clear.
>
> You might be tempted to elide err, like this:
>
>         foo(errp)
>         if (error_is_set(errp)) {
>             goto out;
>         }
>         bar(errp)
>         if (error_is_set(errp)) {
>             goto out;
>         }
>         [...]
>    out:
>         [...]
>
> But that's *wrong*, because it executes bar() after foo() failed when
> errp is null.  Ignoring errors from frob() also changes what frob()
> does!  Not an acceptable interface.
>
> You can elide err only in cases where all you ever do with it is pass it
> on unexamined.
>

Yeh so you must define an Error * locally in cases where you want the
first call failure to inhibit the second. I think this is reasonable
when its the callers wish (frob()) to behave so.

> An alternative technique is to partly move error checks into the
> callees, like this:
>
>         err = NULL;
>         foo(&err)
>         bar(&err)
>         if (err) {
>             goto out;
>         }
>         [...]
>     out:
>         error_propagate(errp, err);
>         [...]
>
> Now bar() must not do anything when called with an error set.  It needs
> to begin with code like this:
>
>         if (error_is_set(errp) {
>             return;
>         }
>

The other compormise is to only bail the second function in its own
error case. Let it run as normal and when it encounters an already set
error, ignore it. This is useful when you have lots of independant
calls one after the other . For an extreme example, check the
ppc/e500.c device tree API calls which I want to convert to Error API
but having to "if (err) { bail };" every one of them is not going to
fly.

> Trades bloating fewer call sites against bloating all the functions.
> Tradeoff.
>

In many API cases, number of callers greatly outnumbers number of functions.

> Note that adding the check above to bar() makes bar() less suited to the
> technique I described first.  Remember, with that technique, bar() must
> not be called with an error set, and we rely on an assertion to protect
> us against mistakes.  The check above completely bypasses the assertion.
>
> Moreover, having both techniques in the tree is bound to lead to
> confusion.  Do I have to check after my function call?  Do I have to
> check before doing anything in my function?  Sounds like a surefire
> recipe for error handling botches to me.
>
> This is what I mean by "the two techniques do not combine well".  Let's
> pick one and stick to it.
>
> The first technique is overwhelmingly prevalent now.  The only holdout
> of the second technique is some QAPI-related code.
>
> I'm going to try converting that to the first technique.  If you can
> come up with patches converting everything else to the second technique,
> we can discuss which one is better :)
>

Yeh one day. Will be an awesome -ve diffstat if it happens though.

Regards,
Peter

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, (continued)
- Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Anthony Liguori, 2014/04/09
  - Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Markus Armbruster, 2014/04/11
- Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Dr. David Alan Gilbert, 2014/04/09
  - Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Markus Armbruster, 2014/04/11
    - Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Dr. David Alan Gilbert, 2014/04/11
- Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Kevin Wolf, 2014/04/10
  - Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Markus Armbruster, 2014/04/11
    - Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Kevin Wolf, 2014/04/11
- Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Peter Crosthwaite, 2014/04/11
  - Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Markus Armbruster, 2014/04/11
    - Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Peter Crosthwaite <=
    - Re: [Qemu-devel] Error propagation in generated visitors and command marshallers, Markus Armbruster, 2014/04/14

Prev by Date: [Qemu-devel] [PATCH 2/2] tests/.gitignore: Ignore test-rfifolock
Next by Date: Re: [Qemu-devel] [Qemu-ppc] [V2 PATCH 0/6] Refactor AES Instructions
Previous by thread: Re: [Qemu-devel] Error propagation in generated visitors and command marshallers
Next by thread: Re: [Qemu-devel] Error propagation in generated visitors and command marshallers
Index(es):
- Date
- Thread