[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] QEMU question: is eventfd not thread safe?
From: |
Alexey Kardashevskiy |
Subject: |
Re: [Qemu-devel] QEMU question: is eventfd not thread safe? |
Date: |
Sun, 01 Jul 2012 22:57:29 +1000 |
User-agent: |
Mozilla/5.0 (X11; Linux i686; rv:12.0) Gecko/20120428 Thunderbird/12.0.1 |
On 01/07/12 22:43, Michael S. Tsirkin wrote:
> On Sun, Jul 01, 2012 at 09:06:20PM +1000, Alexey Kardashevskiy wrote:
>> QEMU from qemu.org/master (not qemu-kvm), linux host 3.4, ppc64.
>>
>> Shortly the problem is in the host kernel: closing a file in one thread does
>> not interrupt select() waiting on the same file description in another
>> thread.
>>
>> Longer story is:
>> I'll use VFIO as an example as I hit this when I was debugging it but VFIO
>> itself has nothing to do with the issue.
>>
>> The bug looks like: I start the guest with MSI-capable PCI card passed via
>> VFIO. The guest does dhclient from .bashrc and this dhclient does not
>> receive anything till I press any key.
>> I did not see it for a while as I always start the guest and then typed
>> "dhclient" manually in the guest console.
>>
>> What happens:
>> VFIO initializes INTx eventfd via event_notifier_init() (returns fd=XX) and
>> qemu_set_fd_handler() before the guest starts. Then QEMU starts the guest
>> and enters a loop is os_host_main_loop_wait() which stays in select() until
>> something happens.
>>
>> >From the host kernel prospective, the XX fd was allocated, struct file*
>> >(P1) with eventfd specific private_data allocated and initialized. select()
>> >added a file refcounter (called get_file() in __pollwait()) and started
>> >waiting on file* P1 but not on fd's number XX (what is mmm weird but ok).
>>
>> The guest starts and tries to initialize MSI for the PCI card passed
>> through. The guest does PCI config write and this write creates a second
>> thread in QEMU.
>
> Why does this create a thread btw?
It is the thread where the guest is running I guess (?). This is the backtrace
of this second thread:
(gdb) bt
#0 vfio_enable_msi (vdev=0x110200f0) at
/home/aik/qemu-impreza/hw/ppc/../vfio_pci.c:699
#1 0x000000001036c948 in vfio_pci_write_config (pdev=0x110200f0, addr=0xd2,
val=0x81, len=0x2) at /home/aik/qemu-impreza/hw/ppc/../vfio_pci.c:979
#2 0x00000000101b1388 in pci_host_config_write_common (pci_dev=0x110200f0,
addr=0xd2, limit=0x100, val=0x81, len=0x2)
at /home/aik/qemu-impreza/hw/pci_host.c:54
#3 0x000000001035d70c in finish_write_pci_config (spapr=0x10efa860, buid=0x2,
addr=0xd2, size=0x2, val=0x81, rets=0xd64758)
at /home/aik/qemu-impreza/hw/ppc/../spapr_pci.c:170
#4 0x000000001035d874 in rtas_ibm_write_pci_config (spapr=0x10efa860,
token=0x2010, nargs=0x5, args=0xd64744, nret=0x1, rets=0xd64758)
at /home/aik/qemu-impreza/hw/ppc/../spapr_pci.c:194
#5 0x0000000010360bcc in spapr_rtas_call (spapr=0x10efa860, token=0x2010,
nargs=0x5, args=0xd64744, nret=0x1, rets=0xd64758)
at /home/aik/qemu-impreza/hw/ppc/../spapr_rtas.c:218
#6 0x0000000010358150 in h_rtas (env=0xfffb733f040, spapr=0x10efa860,
opcode=0xf000, args=0xfffb7fea030)
at /home/aik/qemu-impreza/hw/ppc/../spapr_hcall.c:560
#7 0x0000000010358c4c in spapr_hypercall (env=0xfffb733f040, opcode=0xf000,
args=0xfffb7fea030)
at /home/aik/qemu-impreza/hw/ppc/../spapr_hcall.c:734
#8 0x00000000103dab2c in kvm_arch_handle_exit (env=0xfffb733f040,
run=0xfffb7fea000) at /home/aik/qemu-impreza/target-ppc/kvm.c:769
#9 0x00000000103a8a94 in kvm_cpu_exec (env=0xfffb733f040) at
/home/aik/qemu-impreza/kvm-all.c:1536
#10 0x00000000102ede24 in qemu_kvm_cpu_thread_fn (arg=0xfffb733f040) at
/home/aik/qemu-impreza/cpus.c:752
#11 0x00000fffb7ab19f0 in .start_thread () from /lib64/libpthread.so.0
#12 0x00000fffb7706774 in .__clone () from /lib64/libc.so.6
>
>> In this thread QEMU-VFIO closes fd XX which makes the host kernel release
>> fd=XX, release the corresponding file* P1 (fput() in filp_close()) but this
>> does not free this P1 as there is select() waiting on it.
>
> Doesn't qemu remove an fd handler before closing the fd?
> If not that's the bug right there.
QEMU does not literally remove it but it marks the event as deleted and
actually removes it much later from qemu_iohandler_poll() which is called after
os_host_main_loop_wait(). Removal is done by calling qemu_set_fd_handler(fd,
NULL, NULL, vdev);
But even if it did remove it, what would it change?
>
>> eventfd_release() could wake it up but it is called when its refcounter
>> becomes 0 and this won't happen till select() interrupts.
>>
>> Doing MSI init stuff, QEMU-VFIO calls the same event_notifier_init()
>> (returns recycled fd=XX what is correct but confuses) and
>> qemu_set_fd_handler() which adds a handler but select() does not pick it up.
>> The eventfd() (called by event_notifier_init()) allocates new struct file
>> *P2 in the kernel, QEMU-VFIO passes this fd=XX to the kernel's VFIO. When
>> MSI interrupt comes to the host kernel, the VFIO interrupt handler calls
>> eventfd_signal() on the correct file* P2. However do_select() in the kernel
>> does not interrupt to deliver eventfd event as it is still waiting on P1 -
>> nobody interrupted select(). It can only interrupt on stdin events (like
>> typing keys) and INTx interrupt (which does not happen as MSI is enabled).
>>
>> So we need to sync eventfd()/close() calls in one thread with select() in
>> another. Below is the patch which simply sends SIGUSR2 to the main thread
>> (the sample patch is below). Another solution could be adding new IO handler
>> to wake select() up. Or to do something with the kernel but I am not sure
>> what - implementing file_operations::flush for eventfd to do wakeup did not
>> help and I did not dig any further.
>>
>>
>> Does it make sense? What do I miss? What would be the right solution?
>>
>>
>> ---
>> iohandler.c | 1 +
>> main-loop.c | 17 +++++++++++++++++
>> 2 files changed, 18 insertions(+)
>>
>> diff --git a/iohandler.c b/iohandler.c
>> index 3c74de6..54f4c48 100644
>> --- a/iohandler.c
>> +++ b/iohandler.c
>> @@ -77,6 +77,7 @@ int qemu_set_fd_handler2(int fd,
>> ioh->fd_write = fd_write;
>> ioh->opaque = opaque;
>> ioh->deleted = 0;
>> + kill(getpid(), SIGUSR2);
>> }
>> return 0;
>> }
>> diff --git a/main-loop.c b/main-loop.c
>> index eb3b6e6..f65e048 100644
>> --- a/main-loop.c
>> +++ b/main-loop.c
>> @@ -199,10 +199,27 @@ static int qemu_signal_init(void)
>> }
>> #endif
>>
>> +static void sigusr2_print(int signal)
>> +{
>> +}
>> +
>> +static void sigusr2_init(void)
>> +{
>> + struct sigaction action;
>> +
>> + memset(&action, 0, sizeof(action));
>> + sigfillset(&action.sa_mask);
>> + action.sa_handler = sigusr2_print;
>> + action.sa_flags = 0;
>> + sigaction(SIGUSR2, &action, NULL);
>> +}
>> +
>> int main_loop_init(void)
>> {
>> int ret;
>>
>> + sigusr2_init();
>> +
>> qemu_mutex_lock_iothread();
>> ret = qemu_signal_init();
>> if (ret) {
>> --
>> 1.7.10
--
Alexey
Re: [Qemu-devel] QEMU question: is eventfd not thread safe?, Benjamin Herrenschmidt, 2012/07/01
Re: [Qemu-devel] QEMU question: is eventfd not thread safe?, Paolo Bonzini, 2012/07/01