[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] RFC: double free in qmp_output_visitor_cleanup()
From: |
Michael Roth |
Subject: |
Re: [Qemu-devel] RFC: double free in qmp_output_visitor_cleanup() |
Date: |
Fri, 16 Mar 2012 12:00:56 -0500 |
User-agent: |
Mutt/1.5.21 (2010-09-15) |
On Fri, Mar 16, 2012 at 03:37:10PM +0100, Laszlo Ersek wrote:
> Hi,
>
> we seem to have found a double free in qmp_output_visitor_cleanup().
> Please read the analysis below (that is based on commit e4e6aa14) and
> please tell me if you'd like me to write a patch for solution (a) or
> solution (b), as described at the bottom.
>
> Paolo wrote a test case to trigger the problem, I'm attaching it.
>
> Thank you,
> Laszlo
>
> -------- Original Message --------
> Date: Fri, 16 Mar 2012 13:55:48 +0100
> From: Laszlo Ersek <address@hidden>
>
> [...]
>
> Consider the following example: we want to put an int (I0) in a dict
> (D1) in a dict (D0). D0 is the root element.
>
> When we start D0 with qmp_output_start_struct(), the stack is empty, so
>
> qmp_output_start_struct()
> -> qmp_output_add_obj(D0)
> -> qmp_output_push_obj(D0) [1]
> -> qmp_output_push(D0) [2]
>
> [1] After this step completes, we have a single entry (entry#0) in the
> stack, and it points to D0. refcnt(D0) equals 1 (still from
> qdict_new()). Entry#0 is an *owning* reference to D0 ("aggregation",
> contributes to refcounts).
>
> [2] In this step we push D0 again, so entry#1 will point at D0 too.
> refcnt(D0) still equals 1. Entry#1 is only a *navigation* reference
> (weak reference etc.), it's only here so that we can continue adding
> elements to D0 after we finished building a subtree below -- see step [6].
>
> Then we start D1:
>
> qmp_output_start_struct()
> -> qmp_output_add(D1)
> -> qdict_put_obj(D0, D1) [3]
> -> qmp_output_push(D1) [4]
>
> [3] refcnt(D1) == 1, from qdict_new(). qdict_put_obj() transfers
> ownership of D1 to D0, without changing refcnt(D1), so that still equals
> 1. At the end of this step, D1 is *owned* by D0.
>
> [4] Here we push D1 to the stack, without changing refcounts. Entry#2 is
> only a *navigation* reference to D1.
>
> Then we add I0:
> qmp_output_type_int()
> -> qmp_output_add(I0)
> -> qdict_put_obj(D1, I0) [5]
>
> [5] In this step we transfer ownership of I0 to D1, utilizing the
> entry#2 *navigation* link to find D1. refcnt(I0) == 1, from
> qint_from_int(). The status quo is as follows:
>
> {Figure-1}
>
> entry#0 --- owns [1] --------------+
> |
> v
>
> entry#1 --- navigates-to [2] ---> D0 (refcnt=1)
>
> |
> owns [3]
> |
> v
>
> entry#2 --- navigates-to [4] ---> D1 (refcnt=1)
>
> |
> owns [5]
> |
> v
>
> I0 (refcnt=1)
>
> Then we close D1 and return to D0:
>
> qmp_output_end_struct()
> -> qmp_output_pop() [6]
>
> [6] In this step we simply throw away (release) entry#2. We could
> continue adding elements to D0 via the entry#1 navigation link, that was
> set up in step [2].
>
> Then we close D0 too:
> qmp_output_end_struct()
> -> qmp_output_pop() [7]
>
> We drop entry#1.
>
> We *never* drop entry #0 during this. After we issued an equal number of
> pushes and pops, that is, when "we're done", this is how it looks:
>
> {Figure-2}
>
> entry#0 --- owns [1] --------------+
> |
> v
>
> D0 (refcnt=1)
>
> |
> owns [3]
> |
> v
>
> D1 (refcnt=1)
>
> |
> owns [5]
> |
> v
>
> I0 (refcnt=1)
>
>
> Now, consider running qmp_output_visitor_cleanup() on {Figure-2}. Since
> there's only entry#0, we remove it, and issue
>
> qobject_decref(D0)
> -> qdict_destroy_obj(D0)
> -> qentry_destroy()
> -> qobject_decref(D1)
> -> qdict_destroy_obj(D1)
> -> qentry_destroy()
> -> qobject_decref(I0)
> -> qint_destroy_obj(I0)
> -> qemu_free(I0)
> -> qemu_free(D1)
> -> qemu_free(D0)
>
> Everything's gone.
>
> Consider instead running qmp_output_visitor_cleanup() on {Figure-1},
> that is, in a state where there have been more qmp_output_start_struct()
> calls than qmp_output_end_struct() calls:
>
> {Figure-1 again}
>
> entry#0 --- owns [1] --------------+
> |
> v
>
> entry#1 --- navigates-to [2] ---> D0 (refcnt=1)
>
> |
> owns [3]
> |
> v
>
> entry#2 --- navigates-to [4] ---> D1 (refcnt=1)
>
> |
> owns [5]
> |
> v
>
> I0 (refcnt=1)
>
> qmp_output_visitor_cleanup()'s loop implements "popping order" (both
> qmp_output_pop() and this loop starts removing entries at "tqh_first").
>
> We issue a qobject_decref(D1) via entry#2. That kills I0 and D1 (in that
> order) for good. Notice that the "owns [3]" link from D0 becomes a
> dangling *owning* pointer!
>
> Then we invoke qobject_decref(D0) via entry#1:
>
> qobject_decref(D0)
> -> qdict_destroy_obj(D0)
> -> qentry_destroy()
> -> qobject_decref(D1)
> -> "--D1->refcnt" -- BOOM
>
>
> -- How to fix this:
>
> (a) When pushing, increase the refcount (ie. turn stack entries #1, #2,
> and so on from navigation links into contributing ownership links).
>
> (b) Alternatively, in qmp_output_visitor_cleanup(), throw away all
> entries except entry#0, and issue a decref on that one alone (because
> it's an owning link).
>
> Notice that qmp_ *input* _visitor_cleanup() does exactly (b): it throws
> away the stack at once (the navigation links in the array), and issues a
> decref on the separately maintained pointer that *owns* the root object.
>
> -- Also, you don't have to have "three levels" to trigger the problem.
> As long as you have at least one "layer of bracketing imbalance" when
> you call cleanup, that is, you have at least entry#0 (owner) and entry#1
> (navigator) on the stack, it will blow up. Entry#1 will kill completely
> whatever entry#0 *thinks* it owns.
>
> -- Why would you call qmp_output_visitor_cleanup() halfway into building
> the object tree? Because you run into an error. For example,
> qmp_output_type_enum() sets an error when an invalid "int" value is
> passed to it (ie. one without a corresponding symbolic name). At that
> point you probably want to "abort" your half-done object tree.
>
> ... In fact this example should provide a good test case. Try to
> serialize/send a native (ie. C) struct hierarchy with a deeply nested
> *invalid* enum. (In C you can easily set it to any value.)
>
>
> I hope I'm not hallucinating things... :)
Nope, it's definitely there, just triggered it easily with simple test case. I
think we've been lucky so far with qmp commands not returning responses that
trigger this, but it's definitely a bug. I'm inclined to go with option
b), just need to look at it some more before sending out a patch. Thanks
for catching this!
>
> Thanks,
> Laszlo
> diff --git a/qapi-schema-test.json b/qapi-schema-test.json
> index 8c7f9f7..9eae350 100644
> --- a/qapi-schema-test.json
> +++ b/qapi-schema-test.json
> @@ -8,7 +8,7 @@
>
> # for testing nested structs
> { 'type': 'UserDefOne',
> - 'data': { 'integer': 'int', 'string': 'str' } }
> + 'data': { 'integer': 'int', 'string': 'str', '*enum1': 'EnumOne' } }
>
> { 'type': 'UserDefTwo',
> 'data': { 'string': 'str',
> diff --git a/test-qmp-output-visitor.c b/test-qmp-output-visitor.c
> index 78e5f2d..4e0ecd7 100644
> --- a/test-qmp-output-visitor.c
> +++ b/test-qmp-output-visitor.c
> @@ -274,6 +274,24 @@ static void
> test_visitor_out_struct_nested(TestOutputVisitorData *data,
> qapi_free_UserDefNested(ud2);
> }
>
> +static void test_visitor_out_struct_errors(TestOutputVisitorData *data,
> + const void *unused)
> +{
> + EnumOne bad_values[] = { ENUM_ONE_MAX, -1 };
> + UserDefOne u = { 0 }, *pu = &u;
> + Error *errp;
> + int i;
> +
> + for (i = 0; i < ARRAY_SIZE(bad_values) ; i++) {
> + errp = NULL;
> + u.has_enum1 = true;
> + u.enum1 = bad_values[i];
> + visit_type_UserDefOne(data->ov, &pu, "unused", &errp);
> + g_assert(error_is_set(&errp) == true);
> + error_free(errp);
> + }
> +}
> +
> typedef struct TestStructList
> {
> TestStruct *value;
> @@ -444,6 +462,8 @@ int main(int argc, char **argv)
> &out_visitor_data, test_visitor_out_struct);
> output_visitor_test_add("/visitor/output/struct-nested",
> &out_visitor_data,
> test_visitor_out_struct_nested);
> + output_visitor_test_add("/visitor/output/struct-errors",
> + &out_visitor_data,
> test_visitor_out_struct_errors);
> output_visitor_test_add("/visitor/output/list",
> &out_visitor_data, test_visitor_out_list);
> output_visitor_test_add("/visitor/output/list-qapi-free",