Re: [Qemu-devel] [PATCH] char: kick main loop after adding a watch
From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH] char: kick main loop after adding a watch
Date: Fri, 31 Mar 2017 18:53:56 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.0

On 31/03/2017 18:43, Stefan Hajnoczi wrote:
> The ISA serial port device's output can hang when the pipe on stdout
> becomes full. This is a race condition where the vcpu thread executing
> serial emulation code adds a watch on stdout while the main loop thread
> is blocked in ppoll(2). If no timer or other event wakes up the main
> loop, there will be no further output from the serial device even when
> the pipe becomes writable.
>
> Richard W. M. Jones was able to reproduce the hang on recent versions of
> guestfs-tools-c and libglib2 on Fedora 26 hosts.
>
> This patch kicks the main loop so the next iteration invokes ppoll(2)
> with the watch fd.
>
> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1435432
> Reported-by: Richard W. M. Jones <address@hidden>
> Tested-by: Richard W. M. Jones <address@hidden>
> Signed-off-by: Stefan Hajnoczi <address@hidden>
> ---
> chardev/char.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/chardev/char.c b/chardev/char.c
> index 3df1163..6c99c34 100644
> --- a/chardev/char.c
> +++ b/chardev/char.c
> @@ -1059,6 +1059,11 @@ guint qemu_chr_fe_add_watch(CharBackend *be, GIOCondition cond,
>      tag = g_source_attach(src, NULL);
>      g_source_unref(src);
>
> +    /* The main loop may be blocked waiting on events in another thread.
> +     * Kick it so the new watch will be added.
> +     */
> +    qemu_notify_event();
> +
>      return tag;
>  }
>
>
Thanks for looking at this, I was quite stuck and now I understand
what's going on. However, I don't believe your patch is the right
solution.

According to Richard's bisection, the bug was introduced by the patch
at https://bug761102.bugzilla-attachments.gnome.org/attachment.cgi?id=319699.
The g_wakeup_signal that is removed (actually made conditional) in that
patch is doing exactly the same thing as qemu_notify_event, which is
fishy... It would still be a QEMU bug according to the theory below but,
depending on how they handle backwards compatibility, the glib
maintainers might consider undoing this change.
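
For readers who want to see what "kicking" the loop amounts to, here is a
minimal sketch of the self-pipe wakeup idea in plain POSIX C. It is not
QEMU or glib code: struct mini_loop and the mini_* functions are
hypothetical names made up for this illustration, and the race between
two threads is modelled by a fixed call order (prepare, then add the
watch, then wait) in a single thread.

```c
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical miniature event loop: a wakeup pipe plus at most one watch. */
struct mini_loop {
    int wake_rd, wake_wr;   /* self-pipe used to kick the loop */
    int watch_fd;           /* -1 until a watch is added */
    struct pollfd pfd[2];   /* fd set snapshot taken by mini_prepare() */
    nfds_t nfds;
};

enum { MINI_IDLE = 0, MINI_KICKED = 1, MINI_WATCH_READY = 2 };

int mini_init(struct mini_loop *l)
{
    int p[2];

    if (pipe(p) < 0) {
        return -1;
    }
    l->wake_rd = p[0];
    l->wake_wr = p[1];
    l->watch_fd = -1;
    l->nfds = 0;
    return 0;
}

/* Snapshot the fd set, as a main loop does just before it blocks. */
void mini_prepare(struct mini_loop *l)
{
    l->pfd[0].fd = l->wake_rd;
    l->pfd[0].events = POLLIN;
    l->pfd[0].revents = 0;
    l->nfds = 1;
    if (l->watch_fd >= 0) {
        l->pfd[1].fd = l->watch_fd;
        l->pfd[1].events = POLLOUT;
        l->pfd[1].revents = 0;
        l->nfds = 2;
    }
}

/* Stand-in for adding a watch from another thread; optionally kick. */
void mini_add_watch(struct mini_loop *l, int fd, bool kick)
{
    char c = 0;

    l->watch_fd = fd;
    if (kick && write(l->wake_wr, &c, 1) < 0) {
        /* A failed kick only delays the wakeup; ignored in this sketch. */
    }
}

/* Block on the snapshot and report why poll() returned. */
int mini_wait(struct mini_loop *l, int timeout_ms)
{
    char c;

    if (poll(l->pfd, l->nfds, timeout_ms) <= 0) {
        return MINI_IDLE;           /* timed out: nothing in the snapshot fired */
    }
    if (l->pfd[0].revents & POLLIN) {
        if (read(l->wake_rd, &c, 1) < 0) {
            /* Nothing to drain; ignored in this sketch. */
        }
        return MINI_KICKED;         /* caller should prepare again */
    }
    if (l->nfds == 2 && (l->pfd[1].revents & POLLOUT)) {
        return MINI_WATCH_READY;
    }
    return MINI_IDLE;
}
```

Without the kick, mini_wait() sleeps on a snapshot that does not contain
the new fd, so a writable fd goes unnoticed until something else wakes
the loop. With the kick, the wait returns early and the next
mini_prepare() picks up the watch.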

glib is expecting QEMU to use g_main_context_acquire around accesses to
GMainContext. However, QEMU is not doing that; instead it is taking its
own mutex. So we should add g_main_context_acquire and
g_main_context_release in the two implementations of
os_host_main_loop_wait; these should undo the effect of Frediano's
glib patch.

In all fairness, the docs do say "You must be the owner of a context
before you can call g_main_context_prepare(), g_main_context_query(),
g_main_context_check(), g_main_context_dispatch()". However, it has
worked until now and the documentation does not say exactly why that
is necessary.

Paolo