[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PULL 5/6] console: remove do_safe_dpy_refresh
From: |
Paolo Bonzini |
Subject: |
Re: [Qemu-devel] [PULL 5/6] console: remove do_safe_dpy_refresh |
Date: |
Sun, 23 Jul 2017 15:05:42 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 |
On 18/07/2017 15:56, Laurent Vivier wrote:
> On 18/07/2017 15:07, Laurent Vivier wrote:
>> On 21/06/2017 15:23, Gerd Hoffmann wrote:
>>> Drop the temporary workaround for the broken display updates.
>>> All display adapters are updated, so this should be safe without
>>> causing regressions.
>>
>> It seems it breaks QMP command 'migrate "exec:cat>mig"'.
>>
>> The command hangs and doesn't create the file.
>>
>> It happens with qemu-system-ppc64 on x86 (so TCG mode).
>>
>> my command:
>>
>> ./ppc64-softmmu/qemu-system-ppc64 -serial mon:stdio
>>
>> I wait SLOF fails to find an OS, and:
>>
>> Ctrl-a c
>> (qemu) migrate -d "exec:cat>mig"
>>
>> The file is not created and the command hangs:
>>
>> #0 in __lll_lock_wait
>> #1 in pthread_mutex_lock
>> #2 in qemu_mutex_lock
>> #3 in rcu_init_lock
>> #4 in fork
>> #5 in qemu_fork
>> #6 in qio_channel_command_new_spawn
>> #7 in exec_start_outgoing_migration
>> #8 in qmp_migrate
>> ...
>>
>> It looks like a deadlock.
>
> I think this patch is not the cause of the problem, the one it removes
> just unlocks the deadlock by playing with locks.
>
> We have a rcu_init_lock() on fork() because of:
>
> utils/rcu.c:
>
> static void __attribute__((__constructor__)) rcu_init(void)
> {
> #ifdef CONFIG_POSIX
> pthread_atfork(rcu_init_lock, rcu_init_unlock, rcu_init_unlock);
> #endif
> rcu_init_complete();
> }
>
> The QMP thread hangs on:
>
> (gdb) p rcu_sync_lock
> $1 = {lock = {__data = {__lock = 2, __count = 0, __owner = 23865,
> __nusers = 1, __kind = 0, __spins = 0, __elision = 0, __list = {
> __prev = 0x0, __next = 0x0}},
> __size = "\002\000\000\000\000\000\000\000\071]\000\000\001", '\000'
> <repeats 26 times>, __align = 2}, initialized = true}
>
>
> The lock is already taken by thread 2:
>
> (gdb) info thread
> Id Target Id Frame
> 1 Thread 0x7f1cf02fdf00 (LWP 23864) "qemu-system-ppc"
> 0x00007f1cd914b37d in __lll_lock_wait () from /lib64/libpthread.so.0
> * 2 Thread 0x7f1cc9762700 (LWP 23865) "qemu-system-ppc"
> 0x00007f1cd410daa9 in syscall () from /lib64/libc.so.6
> 3 Thread 0x7f1cbf8d5700 (LWP 23866) "qemu-system-ppc"
> 0x00007f1cd914b37d in __lll_lock_wait () from /lib64/libpthread.so.0
>
> (gdb) bt
> #0 0x00007f1cd410daa9 in syscall () at /lib64/libc.so.6
> #1 0x000055ab028ddda2 in qemu_futex_wait
> #2 0x000055ab028ddda2 in qemu_event_wait
> #3 0x000055ab028eda2b in wait_for_readers
> #4 0x000055ab028eda2b in synchronize_rcu
> #5 0x000055ab028edc5b in call_rcu_thread
> #6 0x00007f1cd914273a in start_thread ()
> #7 0x00007f1cd4113e0f in clone ()
>
> So it seems we cannot fork() from QMP?
> [cc: Paolo]
There have been other similar bugs, as David reported. The plan was to
disable pthread_atfork soon after daemonize (basically assuming that
after daemonize fork is immediately followed by exec), but I've been
lazy and never finished those patches. Looks like it's time.
Paolo