From: Jan Kiszka
Subject: Re: [Qemu-devel] [RFC PATCH] fix select(2) race between main_loop_wait and qemu_aio_wait
Date: Mon, 05 Mar 2012 10:07:47 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12) Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666

On 2012-03-05 09:34, Paolo Bonzini wrote:
> This is quite ugly.  Two threads, one running main_loop_wait and
> one running qemu_aio_wait, can race with each other on running the
> same iohandler.  The result is that an iohandler could run while the
> underlying socket is not readable or writable, with possibly ill effects.

Hmm, isn't it a problem already that a socket is polled by two threads
at the same time? Can't that be avoided?

Long-term, I'd like to cut out certain file descriptors from the main
loop and process them completely in separate threads (for separate
locking, prioritization etc.). Dunno how NBD works, but maybe it should
be reworked like this already.

Jan

> 
> This shows as a failure to boot an IDE disk using the NBD device.
> We can consider it a bug in NBD or in the main loop.  The patch fixes
> this in main_loop_wait, which is always going to lose the race because
> qemu_aio_wait runs select with the global lock held.
> 
> Reported-by: Laurent Vivier <address@hidden>
> Signed-off-by: Paolo Bonzini <address@hidden>
> ---
>       Anthony, if you think this is too ugly tell me and I can
>       post an NBD fix too.
> 
>  main-loop.c |    7 +++++++
>  1 files changed, 7 insertions(+), 0 deletions(-)
> 
> diff --git a/main-loop.c b/main-loop.c
> index db23de0..3beccff 100644
> --- a/main-loop.c
> +++ b/main-loop.c
> @@ -458,6 +458,13 @@ int main_loop_wait(int nonblocking)
>  
>      if (timeout > 0) {
>          qemu_mutex_lock_iothread();
> +
> +        /* Poll again.  A qemu_aio_wait() on another thread
> +         * could have made the fdsets stale.
> +         */
> +        tv.tv_sec = 0;
> +        tv.tv_usec = 0;
> +        ret = select(nfds + 1, &rfds, &wfds, &xfds, &tv);
>      }
>  
>      glib_select_poll(&rfds, &wfds, &xfds, (ret < 0));
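
For illustration, the stale-fdset effect and the zero-timeout re-poll from the
hunk above can be reproduced outside QEMU. This is a minimal single-threaded
sketch, not QEMU code: a pipe stands in for the NBD socket, a plain read()
stands in for the iohandler that qemu_aio_wait() runs on the other thread, and
the helper names repoll_readable() and demo_stale_fdset() are hypothetical.

```c
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: re-run select() with a zero timeout on an fd_set
 * that an earlier select() already filled in.  Bits can only be cleared
 * here, never added.  Returns 1 if fd is still readable, 0 if not,
 * -1 on error. */
static int repoll_readable(int fd, fd_set *rfds)
{
    struct timeval tv = { 0, 0 };   /* zero timeout: poll, do not block */

    if (select(fd + 1, rfds, NULL, NULL, &tv) < 0) {
        return -1;
    }
    return FD_ISSET(fd, rfds) ? 1 : 0;
}

/* Hypothetical demo: replays the race sequentially.  On success returns 0
 * and stores what the stale set claims (*stale) and what the re-poll
 * reports (*fresh). */
static int demo_stale_fdset(int *stale, int *fresh)
{
    int p[2];
    char c = 'x';
    fd_set rfds;
    struct timeval tv = { 1, 0 };
    int rc = -1;

    if (pipe(p) < 0) {
        return -1;
    }
    /* Make the read end readable. */
    if (write(p[1], &c, 1) != 1) {
        goto out;
    }

    /* First poll, as main_loop_wait would do without the global lock. */
    FD_ZERO(&rfds);
    FD_SET(p[0], &rfds);
    if (select(p[0] + 1, &rfds, NULL, NULL, &tv) < 0) {
        goto out;
    }

    /* Stand-in for the other thread: the data is consumed before the
     * first poller gets to act on its result. */
    if (read(p[0], &c, 1) != 1) {
        goto out;
    }

    *stale = FD_ISSET(p[0], &rfds) ? 1 : 0;   /* set still says "readable" */
    *fresh = repoll_readable(p[0], &rfds);    /* re-poll corrects it */
    rc = (*fresh < 0) ? -1 : 0;

out:
    close(p[0]);
    close(p[1]);
    return rc;
}
```

Calling demo_stale_fdset(&stale, &fresh) shows the first fd_set still marking
the descriptor readable after the data is gone, while the re-polled set does
not. The sketch says nothing about whether polling one socket from two threads
is a good idea in the first place.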

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux