qemu-devel

Re: [Qemu-devel] Hang with migration multi-thread compression under high


From: Li, Liang Z
Subject: Re: [Qemu-devel] Hang with migration multi-thread compression under high load
Date: Thu, 28 Apr 2016 03:27:39 +0000

> I've been testing various features of migration and have hit a problem with
> the multi-thread compression. It works fine when I have 2 or more threads,
> but if I tell it to only use a single thread, then it almost always hangs.
> 
> I'm doing a migration between 2 guests on the same machine over a tcp
> localhost socket, using this command line to launch them:
> 
>   /home/berrange/src/virt/qemu/x86_64-softmmu/qemu-system-x86_64
>      -chardev socket,id=mon,path=/var/tmp/qemu-src-4644-monitor.sock
>      -mon chardev=mon,mode=control
>      -display none
>      -vga none
>      -machine accel=kvm
>      -kernel /boot/vmlinuz-4.4.7-300.fc23.x86_64
>      -initrd /home/berrange/src/virt/qemu/tests/migration/initrd-stress.img
>      -append "noapic edd=off printk.time=1 noreplace-smp
> cgroup_disable=memory pci=noearly console=ttyS0 debug ramsize=1"
>      -chardev stdio,id=cdev0
>      -device isa-serial,chardev=cdev0
>      -m 1536
>      -smp 1
> 
> The target VM also gets
> 
>     -incoming tcp:localhost:9000
> 
> 
> When the VM hangs, the source QEMU shows this stack trace:
> 

What do you mean by "VM hangs"? Does the VM stop responding,
or is it just that the live migration process cannot complete?

I ran the test in my environment, and it works for me.

Could you try running 'info migrate' in the QEMU monitor on the source side
to check whether the live migration is still making progress? If the
'transferred ram' value stays unchanged, that indicates a deadlock has occurred.

Liang

> For some reason it isn't shown in the stack trace for thread
> 1 above, but when initially connecting GDB it says the main thread is at:
> 
> decompress_data_with_multi_threads (len=702, host=0x7fd78fe06000,
> f=0x55901af09950) at /home/berrange/src/virt/qemu/migration/ram.c:2254
> 2254          for (idx = 0; idx < thread_count; idx++) {
> 
> 
> Looking at the target QEMU, we see the do_data_decompress method is
> waiting on a condition var:
> 
>         while (!param->start && !quit_decomp_thread) {
>             qemu_cond_wait(&param->cond, &param->mutex);
>             /* ... do stuff ... */
>             param->start = false;
>         }
> 
> 
> Now the decompress_data_with_multi_threads is checking param->start
> without holding the param->mutex lock.
> 
> Changing decompress_data_with_multi_threads to acquire param->mutex
> lock makes it work, but isn't ideal, since that now blocks the
> decompress_data_with_multi_threads() method on the completion of each
> thread, which defeats the point of having multiple threads.
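
To illustrate the trade-off described above, here is a minimal sketch of one
possible handshake, written against plain POSIX threads rather than QEMU's
qemu_mutex/qemu_cond wrappers. It is not the code in migration/ram.c and not
a proposed patch. The names Worker, idle, dispatch() and run_demo() are all
hypothetical. The assumption is that the worker drops the lock while it does
the actual work, so the dispatcher only ever holds the mutex long enough to
read or set the flags.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-thread decompression parameter block. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond;   /* signalled whenever start, idle or quit changes */
    bool start;            /* dispatcher -> worker: a job is pending */
    bool idle;             /* worker -> dispatcher: ready for the next job */
    bool quit;             /* dispatcher -> worker: exit once no job is pending */
    int job;               /* payload of the pending job */
    long processed;        /* number of jobs the worker has finished */
} Worker;

static void *worker_main(void *opaque)
{
    Worker *w = opaque;

    pthread_mutex_lock(&w->mutex);
    for (;;) {
        /* The flags are only ever examined with the mutex held. */
        while (!w->start && !w->quit) {
            pthread_cond_wait(&w->cond, &w->mutex);
        }
        if (!w->start) {
            break;                      /* quit requested, nothing pending */
        }
        int job = w->job;
        w->start = false;
        pthread_mutex_unlock(&w->mutex);

        (void)job;                      /* the real work would run here, unlocked */

        pthread_mutex_lock(&w->mutex);
        w->processed++;
        w->idle = true;
        pthread_cond_broadcast(&w->cond);
    }
    pthread_mutex_unlock(&w->mutex);
    return NULL;
}

/* Hand one job to the worker, waiting only until the worker is idle. */
static void dispatch(Worker *w, int job)
{
    pthread_mutex_lock(&w->mutex);
    while (!w->idle) {
        pthread_cond_wait(&w->cond, &w->mutex);
    }
    w->idle = false;
    w->job = job;
    w->start = true;
    pthread_cond_broadcast(&w->cond);
    pthread_mutex_unlock(&w->mutex);
}

/* Run `jobs` jobs through a single worker; returns how many were processed. */
long run_demo(int jobs)
{
    Worker w = { .idle = true };
    pthread_t tid;

    pthread_mutex_init(&w.mutex, NULL);
    pthread_cond_init(&w.cond, NULL);
    int rc = pthread_create(&tid, NULL, worker_main, &w);
    assert(rc == 0);

    for (int i = 0; i < jobs; i++) {
        dispatch(&w, i);
    }

    pthread_mutex_lock(&w.mutex);
    w.quit = true;
    pthread_cond_broadcast(&w.cond);
    pthread_mutex_unlock(&w.mutex);
    pthread_join(tid, NULL);

    pthread_cond_destroy(&w.cond);
    pthread_mutex_destroy(&w.mutex);
    return w.processed;
}
```

With a single worker the dispatcher still has to wait whenever that worker is
busy, which is unavoidable. The point of the sketch is only that every read
and write of the flags happens under the mutex, so a wakeup cannot be missed,
while the work itself runs with the mutex released.
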
> 
> 
> As mentioned above I'm only seeing the hang when using 1 decompress
> thread. If I let QEMU have multiple decompress threads everything is fine.
> Also, it only happens if I have a very heavy guest workload.
> If the guest is completely idle, it again works fine. So clearly there is some
> kind of race condition I'm unlucky enough to hit here.
> 
> In terms of monitor commands I'm just running
> 
> 
>   migrate_set_capabilities compress on        (src + dst)
>   migrate_set_parameters compress-threads 1   (src only)
>   migrate_set_parameters decompress-threads 1 (dst only)
> 
> Then migrate -d tcp:localhost:9000
> 
> Regards,
> Daniel
> --
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

