qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race


From: Peter Lieven
Subject: Re: [Qemu-devel] race between kvm-kmod-3.0 and kvm-kmod-3.3 // was: race condition in qemu-kvm-1.0.1
Date: Thu, 28 Jun 2012 11:11:37 +0200
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.23) Gecko/20110921 Thunderbird/3.1.15

On 27.06.2012 18:54, Jan Kiszka wrote:
On 2012-06-27 17:39, Peter Lieven wrote:
Hi all,

i debugged this further and found out that kvm-kmod-3.0 is working with
qemu-kvm-1.0.1 while kvm-kmod-3.3 and kvm-kmod-3.4 are not. What is
working as well is kvm-kmod-3.4 with an old userspace (qemu-kvm-0.13.0).
Has anyone a clue which new KVM feature could cause this if a vcpu is in
an infinite loop?
Before accusing kvm-kmod ;), can you check if the effect is visible with
an original Linux 3.3.x or 3.4.x kernel as well?
sorry, i should have been more specific. maybe I also misunderstood sth.
I was believing that kvm-kmod-3.0 is basically what is in vanialla kernel
3.0. If I use the ubuntu kernel from ubuntu oneiric (3.0.0) it works, if I use
a self-compiled kvm-kmod-3.3/3.4 with that kernel it doesn't.
however, maybe we don't have to dig to deep - see below.
Then, bisection the change in qemu-kvm that apparently resolved the
issue would be interesting.

If we have to dig deeper, tracing [1] the lockup would likely be helpful
(all events of the qemu process, not just KVM related ones: trace-cmd
record -e all qemu-system-x86_64 ...).
that here is bascially whats going on:

qemu-kvm-1.0-2506 [010] 60996.908000: kvm_mmio: mmio read len 3 gpa 0xa0000 val 0x10ff qemu-kvm-1.0-2506 [010] 60996.908000: vcpu_match_mmio: gva 0xa0000 gpa 0xa0000 Read GPA qemu-kvm-1.0-2506 [010] 60996.908000: kvm_mmio: mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0 qemu-kvm-1.0-2506 [010] 60996.908000: kvm_userspace_exit: reason KVM_EXIT_MMIO (6) qemu-kvm-1.0-2506 [010] 60996.908000: kvm_mmio: mmio read len 3 gpa 0xa0000 val 0x10ff qemu-kvm-1.0-2506 [010] 60996.908000: vcpu_match_mmio: gva 0xa0000 gpa 0xa0000 Read GPA qemu-kvm-1.0-2506 [010] 60996.908000: kvm_mmio: mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0 qemu-kvm-1.0-2506 [010] 60996.908000: kvm_userspace_exit: reason KVM_EXIT_MMIO (6) qemu-kvm-1.0-2506 [010] 60996.908000: kvm_mmio: mmio read len 3 gpa 0xa0000 val 0x10ff qemu-kvm-1.0-2506 [010] 60996.908000: vcpu_match_mmio: gva 0xa0000 gpa 0xa0000 Read GPA qemu-kvm-1.0-2506 [010] 60996.908000: kvm_mmio: mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0 qemu-kvm-1.0-2506 [010] 60996.908000: kvm_userspace_exit: reason KVM_EXIT_MMIO (6) qemu-kvm-1.0-2506 [010] 60996.908000: kvm_mmio: mmio read len 3 gpa 0xa0000 val 0x10ff qemu-kvm-1.0-2506 [010] 60996.908000: vcpu_match_mmio: gva 0xa0000 gpa 0xa0000 Read GPA qemu-kvm-1.0-2506 [010] 60996.908000: kvm_mmio: mmio unsatisfied-read len 1 gpa 0xa0000 val 0x0 qemu-kvm-1.0-2506 [010] 60996.908000: kvm_userspace_exit: reason KVM_EXIT_MMIO (6)

its doing that forever. this is tracing the kvm module. doing the qemu-system-x86_64 trace is a bit compilcated, but maybe this is already sufficient. otherwise i will of course gather this info as well.

thanks
peter


Jan

[1] http://www.linux-kvm.org/page/Tracing

---

Hi,

we recently came across multiple VMs racing and stopping working. It
seems to happen when the system is at 100% cpu.
One way to reproduce this is:
qemu-kvm-1.0.1 with vnc-thread enabled

cmdline (or similar):
/usr/bin/qemu-kvm-1.0.1 -net
tap,vlan=141,script=no,downscript=no,ifname=tap15,vnet_hdr -net
nic,vlan=141,model=virtio,macaddr=52:54:00:ff:00:f7 -drive
format=host_device,file=/dev/mapper/iqn.2001-05.com.equallogic:0-8a0906-efdf4e007-16700198c7f4fead-02-debug-race-hd01,if=virtio,cache=none,aio=native
-m 2048 -smp 2,sockets=1,cores=2,threads=1 -monitor
tcp:0:4026,server,nowait -vnc :26 -qmp tcp:0:3026,server,nowait -name
02-debug-race -boot order=dc,menu=off -cdrom
/home/kvm/cdrom//root/ubuntu-12.04-server-amd64.iso -k de -pidfile
/var/run/qemu/vm-221.pid -mem-prealloc -cpu
host,+x2apic,model_id=Intel(R) Xeon(R) CPU           L5640  @
2.27GHz,-tsc -rtc base=utc -usb -usbdevice tablet -no-hpet -vga cirrus

it is important that the attached virtio image contains only zeroes. if
the system boots from cd, select boot from first harddisk.
the hypervisor then hangs at 100% cpu and neither monitor nor qmp are
responsive anymore.

i have also seen customers reporting this when a VM is shut down.

if this is connected to the threaded vnc server it might be important to
connected at this time.

debug backtrace attached.

Thanks,
Peter

--

(gdb) file /usr/bin/qemu-kvm-1.0.1
Reading symbols from /usr/bin/qemu-kvm-1.0.1...done.
(gdb) attach 5145
Attaching to program: /usr/bin/qemu-kvm-1.0.1, process 5145
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols
found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
[Thread debugging using libthread_db enabled]
[New Thread 0x7f54d08b9700 (LWP 5253)]
[New Thread 0x7f5552757700 (LWP 5152)]
[New Thread 0x7f5552f58700 (LWP 5151)]
0x00007f5553c6b5a3 in select () from /lib/libc.so.6
(gdb) info threads
   4 Thread 0x7f5552f58700 (LWP 5151)  0x00007f5553c6a747 in ioctl ()
from /lib/libc.so.6
   3 Thread 0x7f5552757700 (LWP 5152)  0x00007f5553c6a747 in ioctl ()
from /lib/libc.so.6
   2 Thread 0x7f54d08b9700 (LWP 5253)  0x00007f5553f1a85c in
pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
* 1 Thread 0x7f555550d700 (LWP 5145)  0x00007f5553c6b5a3 in select ()
from /lib/libc.so.6
(gdb) thread apply all bt

Thread 4 (Thread 0x7f5552f58700 (LWP 5151)):
#0  0x00007f5553c6a747 in ioctl () from /lib/libc.so.6
#1  0x00007f5555727830 in kvm_vcpu_ioctl (env=0x7f5557652f10,
type=44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101
#2  0x00007f555572728a in kvm_cpu_exec (env=0x7f5557652f10) at
/usr/src/qemu-kvm-1.0.1/kvm-all.c:987
#3  0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=0x7f5557652f10) at
/usr/src/qemu-kvm-1.0.1/cpus.c:740
#4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
#5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7f5552757700 (LWP 5152)):
#0  0x00007f5553c6a747 in ioctl () from /lib/libc.so.6
#1  0x00007f5555727830 in kvm_vcpu_ioctl (env=0x7f555766ae60,
type=44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101
#2  0x00007f555572728a in kvm_cpu_exec (env=0x7f555766ae60) at
/usr/src/qemu-kvm-1.0.1/kvm-all.c:987
#3  0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=0x7f555766ae60) at
/usr/src/qemu-kvm-1.0.1/cpus.c:740
#4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
#5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7f54d08b9700 (LWP 5253)):
#0  0x00007f5553f1a85c in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib/libpthread.so.0
#1  0x00007f5555679f5d in qemu_cond_wait (cond=0x7f5557ede1e0,
mutex=0x7f5557ede210) at qemu-thread-posix.c:113
#2  0x00007f55556b06a1 in vnc_worker_thread_loop (queue=0x7f5557ede1e0)
at ui/vnc-jobs-async.c:222
#3  0x00007f55556b0b7f in vnc_worker_thread (arg=0x7f5557ede1e0) at
ui/vnc-jobs-async.c:318
#4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
#5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f555550d700 (LWP 5145)):
#0  0x00007f5553c6b5a3 in select () from /lib/libc.so.6
#1  0x00007f55556516be in main_loop_wait (nonblocking=0) at main-loop.c:456
#2  0x00007f5555647ad0 in main_loop () at /usr/src/qemu-kvm-1.0.1/vl.c:1482
#3  0x00007f555564c698 in main (argc=38, argv=0x7ffff9d894a8,
envp=0x7ffff9d895e0) at /usr/src/qemu-kvm-1.0.1/vl.c:3523
(gdb) thread apply all bt full

Thread 4 (Thread 0x7f5552f58700 (LWP 5151)):
#0  0x00007f5553c6a747 in ioctl () from /lib/libc.so.6
No symbol table info available.
#1  0x00007f5555727830 in kvm_vcpu_ioctl (env=0x7f5557652f10,
type=44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101
         ret = 32597
         arg = 0x0
         ap = {{gp_offset = 24, fp_offset = 48, overflow_arg_area =
0x7f5552f57e50, reg_save_area = 0x7f5552f57d90}}
#2  0x00007f555572728a in kvm_cpu_exec (env=0x7f5557652f10) at
/usr/src/qemu-kvm-1.0.1/kvm-all.c:987
         run = 0x7f55553e2000
         ret = 0
         run_ret = 0
#3  0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=0x7f5557652f10) at
/usr/src/qemu-kvm-1.0.1/cpus.c:740
         env = 0x7f5557652f10
         r = 0
#4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
No symbol table info available.
#6  0x0000000000000000 in ?? ()
No symbol table info available.

Thread 3 (Thread 0x7f5552757700 (LWP 5152)):
#0  0x00007f5553c6a747 in ioctl () from /lib/libc.so.6
No symbol table info available.
#1  0x00007f5555727830 in kvm_vcpu_ioctl (env=0x7f555766ae60,
type=44672) at /usr/src/qemu-kvm-1.0.1/kvm-all.c:1101
         ret = 0
         arg = 0x0
         ap = {{gp_offset = 24, fp_offset = 48, overflow_arg_area =
0x7f5552756e50, reg_save_area = 0x7f5552756d90}}
#2  0x00007f555572728a in kvm_cpu_exec (env=0x7f555766ae60) at
/usr/src/qemu-kvm-1.0.1/kvm-all.c:987
         run = 0x7f55553df000
         ret = 32597
         run_ret = 1433358864
#3  0x00007f55556f5c08 in qemu_kvm_cpu_thread_fn (arg=0x7f555766ae60) at
/usr/src/qemu-kvm-1.0.1/cpus.c:740
         env = 0x7f555766ae60
         r = 65536
#4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
No symbol table info available.
#6  0x0000000000000000 in ?? ()
No symbol table info available.

Thread 2 (Thread 0x7f54d08b9700 (LWP 5253)):
#0  0x00007f5553f1a85c in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib/libpthread.so.0
---Type<return>  to continue, or q<return>  to quit---
No symbol table info available.
#1  0x00007f5555679f5d in qemu_cond_wait (cond=0x7f5557ede1e0,
mutex=0x7f5557ede210) at qemu-thread-posix.c:113
         err = 32597
         __func__ = "qemu_cond_wait"
#2  0x00007f55556b06a1 in vnc_worker_thread_loop (queue=0x7f5557ede1e0)
at ui/vnc-jobs-async.c:222
         job = 0x7f5557edd510
         entry = 0x0
         tmp = 0x0
         vs = {csock = -1, ds = 0x7f5557e8ec40, dirty = {{0, 0, 0}
<repeats 2048 times>}, lossy_rect = 0x7f5557edd570, vd = 0x7f54d08ba010,
need_update = 0, force_update = 0,
           features = 227, absolute = 0, last_x = 0, last_y = 0,
client_width = 0, client_height = 0, vnc_encoding = 6, major = 0, minor
= 0, auth = 0,
           challenge = '\000'<repeats 15 times>, info = 0x0, output =
{capacity = 1230913, offset = 1448, buffer = 0x7f5558176d60 ""}, input =
{capacity = 0, offset = 0,
             buffer = 0x0}, write_pixels = 0x7f55556b2aaf
<vnc_write_pixels_generic>, clientds = {flags = 0 '\000', width = 640,
height = 480, linesize = 1280,
             data = 0x7f54d0c00000 "B\a", pf = {bits_per_pixel = 32 ' ',
bytes_per_pixel = 4 '\004', depth = 24 '\030', rmask = 16711680, gmask =
65280, bmask = 255, amask = 0,
               rshift = 16 '\020', gshift = 8 '\b', bshift = 0 '\000',
ashift = 0 '\000', rmax = 255 '\377', gmax = 255 '\377', bmax = 255
'\377', amax = 0 '\000',
               rbits = 8 '\b', gbits = 8 '\b', bbits = 8 '\b', abits = 0
'\000'}}, audio_cap = 0x0, as = {freq = 0, nchannels = 0, fmt =
AUD_FMT_U8, endianness = 0},
           read_handler = 0, read_handler_expect = 0, modifiers_state =
'\000'<repeats 255 times>, led = 0x0, abort = false, output_mutex =
{lock = {__data = {__lock = 0,
                 __count = 0, __owner = 0, __nusers = 0, __kind = 0,
__spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000'
<repeats 39 times>, __align = 0}},
           bh = 0x0, jobs_buffer = {capacity = 0, offset = 0, buffer =
0x0}, tight = {type = 0, quality = 255 '\377', compression = 9 '\t',
pixel24 = 0 '\000', tight = {
               capacity = 0, offset = 0, buffer = 0x0}, tmp = {capacity =
0, offset = 0, buffer = 0x0}, zlib = {capacity = 0, offset = 0, buffer =
0x0}, gradient = {
               capacity = 0, offset = 0, buffer = 0x0}, levels = {0, 0,
0, 0}, stream = {{next_in = 0x0, avail_in = 0, total_in = 0, next_out =
0x0, avail_out = 0,
                 total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, zfree
= 0, opaque = 0x0, data_type = 0, adler = 0, reserved = 0}, {next_in =
0x0, avail_in = 0, total_in = 0,
                 next_out = 0x0, avail_out = 0, total_out = 0, msg = 0x0,
state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0, adler =
0, reserved = 0}, {
                 next_in = 0x0, avail_in = 0, total_in = 0, next_out =
0x0, avail_out = 0, total_out = 0, msg = 0x0, state = 0x0, zalloc = 0,
zfree = 0, opaque = 0x0,
                 data_type = 0, adler = 0, reserved = 0}, {next_in = 0x0,
avail_in = 0, total_in = 0, next_out = 0x0, avail_out = 0, total_out =
0, msg = 0x0, state = 0x0,
                 zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0,
adler = 0, reserved = 0}}}, zlib = {zlib = {capacity = 1229488, offset =
1152000,
               buffer = 0x7f5557fe9320 ""}, tmp = {capacity = 1230913,
offset = 20, buffer = 0x7f5558176d60 ""}, stream = {next_in =
0x7f5558102720 "\030\b ", avail_in = 0,
               total_in = 7658880, next_out = 0x7f5558177308 "",
avail_out = 1229465, total_out = 49027, msg = 0x0, state = 0x7f55581155e0,
               zalloc = 0x7f55556a622f<vnc_zlib_zalloc>, zfree =
0x7f55556a626c<vnc_zlib_zfree>, opaque = 0x7f54d08a6810, data_type = 0,
adler = 197164569, reserved = 0},
             level = 9}, hextile = {send_tile = 0x7f55556a1180
<send_hextile_tile_generic_32>}, zrle = {type = 0, fb = {capacity = 0,
offset = 0, buffer = 0x0}, zrle = {
               capacity = 0, offset = 0, buffer = 0x0}, tmp = {capacity =
0, offset = 0, buffer = 0x0}, zlib = {capacity = 0, offset = 0, buffer =
0x0}, stream = {next_in = 0x0,
               avail_in = 0, total_in = 0, next_out = 0x0, avail_out = 0,
total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, zfree = 0, opaque =
0x0, data_type = 0, adler = 0,
               reserved = 0}, palette = {pool = {{idx = 0, color = 0,
next = {le_next = 0x0, le_prev = 0x0}}<repeats 256 times>}, size = 0,
max = 0, bpp = 0, table = {{
                   lh_first = 0x0}<repeats 256 times>}}}, zywrle = {buf
= {0<repeats 4096 times>}}, mouse_mode_notifier = {notify = 0, node =
{tqe_next = 0x0,
               tqe_prev = 0x0}}, next = {tqe_next = 0x0, tqe_prev = 0x0}}
         n_rectangles = 1
         saved_offset = 2
#3  0x00007f55556b0b7f in vnc_worker_thread (arg=0x7f5557ede1e0) at
ui/vnc-jobs-async.c:318
         queue = 0x7f5557ede1e0
#4  0x00007f5553f159ca in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#5  0x00007f5553c72cdd in clone () from /lib/libc.so.6
No symbol table info available.
#6  0x0000000000000000 in ?? ()
No symbol table info available.

---Type<return>  to continue, or q<return>  to quit---
Thread 1 (Thread 0x7f555550d700 (LWP 5145)):
#0  0x00007f5553c6b5a3 in select () from /lib/libc.so.6
No symbol table info available.
#1  0x00007f55556516be in main_loop_wait (nonblocking=0) at main-loop.c:456
         rfds = {fds_bits = {14197552, 0<repeats 15 times>}}
         wfds = {fds_bits = {0<repeats 16 times>}}
         xfds = {fds_bits = {0<repeats 16 times>}}
         ret = 32597
         nfds = 23
         tv = {tv_sec = 0, tv_usec = 817147}
         timeout = 1000
#2  0x00007f5555647ad0 in main_loop () at /usr/src/qemu-kvm-1.0.1/vl.c:1482
         nonblocking = false
         last_io = 1
#3  0x00007f555564c698 in main (argc=38, argv=0x7ffff9d894a8,
envp=0x7ffff9d895e0) at /usr/src/qemu-kvm-1.0.1/vl.c:3523
         gdbstub_dev = 0x0
         i = 64
         snapshot = 0
         linux_boot = 0
         icount_option = 0x0
         initrd_filename = 0x0
         kernel_filename = 0x0
         kernel_cmdline = 0x7f55557d8bef ""
         boot_devices = "dc", '\000'<repeats 30 times>
         ds = 0x7f5557e8ec40
         dcl = 0x0
         cyls = 0
         heads = 0
         secs = 0
         translation = 0
         hda_opts = 0x0
         opts = 0x7f5557637cb0
         olist = 0x7ffff9d89268
         optind = 38
         optarg = 0x7ffff9d89bed "cirrus"
         loadvm = 0x0
         machine = 0x7f5555b53500
         cpu_model = 0x7ffff9d89b6f "host,+x2apic,model_id=Intel(R)
Xeon(R) CPU", ' '<repeats 11 times>, "L5640  @ 2.27GHz,-tsc"
         pid_file = 0x7ffff9d89b43 "/var/run/qemu/vm-221.pid"
         incoming = 0x0
         show_vnc_port = 0
         defconfig = 1
         log_mask = 0x0
         log_file = 0x0
         mem_trace = {malloc = 0x7f5555649081<malloc_and_trace>, realloc
= 0x7f55556490b6<realloc_and_trace>, free = 0x7f55556490fa
<free_and_trace>, calloc = 0,
---Type<return>  to continue, or q<return>  to quit---
           try_malloc = 0, try_realloc = 0}
         trace_events = 0x0
         trace_file = 0x0
(gdb)










reply via email to

[Prev in Thread] Current Thread [Next in Thread]