qemu-devel
[Qemu-devel] the whole virtual machine hangs when IO does not come back!


From: Bin Wu
Subject: [Qemu-devel] the whole virtual machine hangs when IO does not come back!
Date: Mon, 11 Aug 2014 16:33:21 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Thunderbird/24.2.0

Hi,

I tested the reliability of QEMU in an IPSAN environment as follows:
(1) Create one VM on an x86 server that is connected to an IPSAN. The VM has only one system volume, which resides on the IPSAN.
(2) Disconnect the network between the server and the IPSAN. The server runs "multipath" software that can hold the IO for a long (configurable) time when the network is disconnected.
(3) About 30 seconds later, the whole VM hangs and nothing can be done with it.

Then I used the "gstack" tool to collect the stacks of all QEMU threads. The output looked like this:

Thread 8 (Thread 0x7fd840bb5700 (LWP 6671)):
#0  0x00007fd84253a4f6 in poll () from /lib64/libc.so.6
#1  0x00007fd84410ceff in aio_poll ()
#2  0x00007fd84429bb05 in qemu_aio_wait ()
#3  0x00007fd844120f51 in bdrv_drain_all ()
#4  0x00007fd8441f1a4a in bmdma_cmd_writeb ()
#5  0x00007fd8441f216e in bmdma_write ()
#6  0x00007fd8443a93cf in memory_region_write_accessor ()
#7  0x00007fd8443a94a6 in access_with_adjusted_size ()
#8  0x00007fd8443a9901 in memory_region_iorange_write ()
#9  0x00007fd8443a19bd in ioport_writeb_thunk ()
#10 0x00007fd8443a13a8 in ioport_write ()
#11 0x00007fd8443a1f55 in cpu_outb ()
#12 0x00007fd8443a5b12 in kvm_handle_io ()
#13 0x00007fd8443a64a9 in kvm_cpu_exec ()
#14 0x00007fd844330962 in qemu_kvm_cpu_thread_fn ()
#15 0x00007fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
#16 0x00007fd8425439cd in clone () from /lib64/libc.so.6
#17 0x0000000000000000 in ?? ()

Thread 7 (Thread 0x7fd8403b4700 (LWP 6672)):
#0  0x00007fd8427ee294 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fd8427e9619 in _L_lock_1008 () from /lib64/libpthread.so.0
#2  0x00007fd8427e942e in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007fd8444526bd in qemu_mutex_lock ()
#4  0x00007fd844330f47 in qemu_mutex_lock_iothread ()
#5  0x00007fd8443a63b9 in kvm_cpu_exec ()
#6  0x00007fd844330962 in qemu_kvm_cpu_thread_fn ()
#7  0x00007fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
#8  0x00007fd8425439cd in clone () from /lib64/libc.so.6
#9  0x0000000000000000 in ?? ()

Thread 6 (Thread 0x7fd83fbb3700 (LWP 6673)):
#0  0x00007fd8427ee294 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fd8427e9619 in _L_lock_1008 () from /lib64/libpthread.so.0
#2  0x00007fd8427e942e in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007fd8444526bd in qemu_mutex_lock ()
#4  0x00007fd844330f47 in qemu_mutex_lock_iothread ()
#5  0x00007fd8443a63b9 in kvm_cpu_exec ()
#6  0x00007fd844330962 in qemu_kvm_cpu_thread_fn ()
#7  0x00007fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
#8  0x00007fd8425439cd in clone () from /lib64/libc.so.6
#9  0x0000000000000000 in ?? ()

Thread 5 (Thread 0x7fd83f3b2700 (LWP 6674)):
#0  0x00007fd8427ee294 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fd8427e9619 in _L_lock_1008 () from /lib64/libpthread.so.0
#2  0x00007fd8427e942e in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007fd8444526bd in qemu_mutex_lock ()
#4  0x00007fd844330f47 in qemu_mutex_lock_iothread ()
#5  0x00007fd8443a63b9 in kvm_cpu_exec ()
#6  0x00007fd844330962 in qemu_kvm_cpu_thread_fn ()
#7  0x00007fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
#8  0x00007fd8425439cd in clone () from /lib64/libc.so.6
#9  0x0000000000000000 in ?? ()

Thread 4 (Thread 0x7fd83ebb1700 (LWP 6675)):
#0  0x00007fd8427ee294 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fd8427e9619 in _L_lock_1008 () from /lib64/libpthread.so.0
#2  0x00007fd8427e942e in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007fd8444526bd in qemu_mutex_lock ()
#4  0x00007fd844330f47 in qemu_mutex_lock_iothread ()
#5  0x00007fd8443a63b9 in kvm_cpu_exec ()
#6  0x00007fd844330962 in qemu_kvm_cpu_thread_fn ()
#7  0x00007fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
#8  0x00007fd8425439cd in clone () from /lib64/libc.so.6
#9  0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7fd83e3b0700 (LWP 6676)):
#0  0x00007fd8427ee294 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fd8427e9619 in _L_lock_1008 () from /lib64/libpthread.so.0
#2  0x00007fd8427e942e in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007fd8444526bd in qemu_mutex_lock ()
#4  0x00007fd844330f47 in qemu_mutex_lock_iothread ()
#5  0x00007fd8443a63b9 in kvm_cpu_exec ()
#6  0x00007fd844330962 in qemu_kvm_cpu_thread_fn ()
#7  0x00007fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
#8  0x00007fd8425439cd in clone () from /lib64/libc.so.6
#9  0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7fd23b7ff700 (LWP 6679)):
#0  0x00007fd8427eb61c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fd8444528f0 in qemu_cond_wait ()
#2  0x00007fd844312d9d in vnc_worker_thread_loop ()
#3  0x00007fd844313315 in vnc_worker_thread ()
#4  0x00007fd8427e77b6 in start_thread () from /lib64/libpthread.so.0
#5  0x00007fd8425439cd in clone () from /lib64/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7fd844068840 (LWP 6662)):
#0  0x00007fd8427ee294 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fd8427e9619 in _L_lock_1008 () from /lib64/libpthread.so.0
#2  0x00007fd8427e942e in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007fd8444526bd in qemu_mutex_lock ()
#4  0x00007fd844330f47 in qemu_mutex_lock_iothread ()
#5  0x00007fd84429b991 in os_host_main_loop_wait ()
#6  0x00007fd84429ba50 in main_loop_wait ()
#7  0x00007fd844322793 in main_loop ()
#8  0x00007fd844329a9f in main ()

I think the VM hangs because one VCPU thread (Thread 8 above) holds the global QEMU mutex while it waits in bdrv_drain_all() for the IO to come back. In my test the IO never comes back, because the multipath software holds it. As a result, the VCPU thread never releases the global lock, and the other VCPU threads and the main loop thread, which are all blocked in qemu_mutex_lock_iothread(), can never acquire it. Is there any way to avoid hanging the whole VM in this situation? I ran the same test on the VMware platform: the IO hangs there too, but the VM keeps working. Thanks!
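
To make the suspected pattern concrete, here is a small Python sketch. It is not QEMU code: the function and variable names are made up for illustration, and the waits use short timeouts so that the script terminates. One thread stands in for the VCPU thread that waits for IO, the other for the main loop thread that needs the global lock. The second mode, which drops the lock while waiting, is only there for contrast and is not a claim about how QEMU should be fixed.

```python
import threading


def main_loop_starved(release_lock_while_waiting, io_timeout=0.3, lock_timeout=0.1):
    """Return True if the 'main loop' thread could not take the global lock.

    All names here are hypothetical; this models the pattern only.
    """
    global_lock = threading.Lock()    # stands in for the global QEMU mutex
    io_done = threading.Event()       # never set: the IO does not come back
    vcpu_waiting = threading.Event()  # signals that the VCPU thread is now waiting
    result = {}

    def vcpu_thread():
        global_lock.acquire()
        try:
            if release_lock_while_waiting:
                # Contrast case: drop the lock for the duration of the wait.
                global_lock.release()
                vcpu_waiting.set()
                io_done.wait(io_timeout)
                global_lock.acquire()
            else:
                # Pattern from the stack trace: wait for IO with the lock held.
                vcpu_waiting.set()
                io_done.wait(io_timeout)
        finally:
            global_lock.release()

    def main_loop_thread():
        vcpu_waiting.wait()
        # Bounded wait so the demo ends; a plain acquire() would block
        # for as long as the lock holder keeps waiting for IO.
        got_lock = global_lock.acquire(timeout=lock_timeout)
        result["starved"] = not got_lock
        if got_lock:
            global_lock.release()

    threads = [
        threading.Thread(target=vcpu_thread),
        threading.Thread(target=main_loop_thread),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result["starved"]


if __name__ == "__main__":
    print("lock held while waiting, main loop starved:", main_loop_starved(False))
    print("lock dropped while waiting, main loop starved:", main_loop_starved(True))
```

With the lock held across the wait, the second thread cannot make progress until the wait ends, which matches what Threads 1 and 3 to 7 show in the dump above.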





