qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] can we disable the migration-test for TCG targets ?


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] can we disable the migration-test for TCG targets ?
Date: Tue, 26 Feb 2019 12:17:01 +0000
User-agent: Mutt/1.10.1 (2018-07-13)

* Peter Maydell (address@hidden) wrote:
> On Mon, 25 Feb 2019 at 14:00, Peter Maydell <address@hidden> wrote:
> >
> > Hi; the migration-test is failing really frequently for me
> > for TCG targets (usually on sparc64 host) -- can we just disable
> > it from the test set until we've figured out and implemented a
> > fix for the underlying TCG bug, please?
> 
> Here's a backtrace from the hang.
> 
> Process status:
> pm215    125126 35.8  0.0  17896  4008 pts/2    S+   13:51  18:09
>             \_ tests/migration-test -m=quick -k --tap
> pm215    125450  126  0.2 901808 151184 pts/2   Sl+  13:51  64:17
>             |   \_ aarch64-softmmu/qemu-system-aarch64 -qtest
> unix:/tmp/qtest-125126.
> pm215    125560  0.0  0.1 755368 68960 pts/2    Sl+  13:51   0:00
>             |   \_ aarch64-softmmu/qemu-system-aarch64 -qtest
> unix:/tmp/qtest-125126.
> 
> (This isn't specific to aarch64 guests, I've seen others.)
> 

This doesn't look like the other TCG corruption problem we knew of;
in that bug we had a misordering so that we would lose guest writes
and the migration data that arrived on the destination was
old/inconsistent.

> Backtrace of migration-test:
> #0  0xfff8000100f8f9c8 in __GI___nanosleep
> (requested_time=0x7feff8c2618, remaining=0x0) at
> ../sysdeps/unix/sysv/linux/nanosleep.c:28
> #1  0xfff8000100fbd1f4 in usleep (useconds=<optimized out>) at
> ../sysdeps/posix/usleep.c:32
> #2  0x000001000000a380 in wait_for_migration_pass (who=0x10000161cd0)
> at /home/pm215/qemu/tests/migration-test.c:279
> #3  0x000001000000ab44 in migrate_postcopy_prepare
> (from_ptr=0x7feff8c2868, to_ptr=0x7feff8c2870, hide_error=<optimized
> out>)
>     at /home/pm215/qemu/tests/migration-test.c:627

But here, we've not even got that far - we've not entered postcopy yet;
we're just waiting for either:
   * The migration to finish
   * An iteration to complete


> #4  0x000001000000b384 in test_postcopy () at
> /home/pm215/qemu/tests/migration-test.c:653
> #5  0xfff8000100496370 in ?? () from 
> /usr/lib/sparc64-linux-gnu/libglib-2.0.so.0
> 
> Backtrace of process 125450:
> Thread 6 (Thread 0xfff800012de0b900 (LWP 127434)):
> #0  0xfff80001034c5cdc in futex_abstimed_wait_cancelable (private=0,
> abstime=0xfff800012de09f88, expected=0, futex_word=0x10001236574)
>     at ../sysdeps/unix/sysv/linux/futex-internal.h:205
> #1  0xfff80001034c5cdc in do_futex_wait (sem=0x3c,
> abstime=0xfff800012de09f88) at sem_waitcommon.c:111
> #2  0xfff80001034c5e00 in __new_sem_wait_slow (sem=0x10001236570,
> abstime=0xfff800012de09f88) at sem_waitcommon.c:181
> #3  0x000001000091dacc in qemu_sem_timedwait (sem=0x10001236570,
> ms=<optimized out>) at /home/pm215/qemu/util/qemu-thread-posix.c:289
> #4  0x000001000078ae28 in migration_thread (opaque=0x100012364a0) at
> /home/pm215/qemu/migration/migration.c:3125

So migration is still apparently running, it's rate-limiting 
using a timedwait; but 'ms' has been unhelpfully optimised out; could
it be stuck in here for some reason?

If I had this on my machine I'd try adding debug to make sure
ram_save_iterate is beign called/exiting OK, and just see how long this
delay in this timed_wait is being calculated as.

Dave

> #5  0x000001000091cdb0 in qemu_thread_start (args=0x1000175e730) at
> /home/pm215/qemu/util/qemu-thread-posix.c:502
> #6  0xfff80001034bbe68 in start_thread (arg=0xfff800012de0b900) at
> pthread_create.c:486
> #7  0xfff80001036c00b8 in __thread_start () at
> ../sysdeps/unix/sysv/linux/sparc/sparc64/clone.S:78
> 
> Thread 5 (Thread 0xfff80001139cb900 (LWP 127433)):
> #0  0xfff80001034c7e0c in __libc_recvmsg (flags=1073741824,
> msg=0xfff80001139c9930, fd=17) at
> ../sysdeps/unix/sysv/linux/recvmsg.c:28
> #1  0xfff80001034c7e0c in __libc_recvmsg (fd=<optimized out>,
> msg=0xfff80001139c9930, flags=<optimized out>) at
> ../sysdeps/unix/sysv/linux/recvmsg.c:25
> #2  0x00000100008ae9e0 in qio_channel_socket_readv (ioc=<optimized
> out>, iov=<optimized out>, niov=<optimized out>, fds=0x0, nfds=0x0,
> errp=0x0)
>     at /home/pm215/qemu/io/channel-socket.c:484
> #3  0x00000100008ab428 in qio_channel_readv_full (ioc=0x100019f9e00,
> iov=0xfff80001139c9b28, niov=1, fds=0x0, nfds=0x0, errp=0x0)
>     at /home/pm215/qemu/io/channel.c:65
> #4  0x00000100008ab594 in qio_channel_read (ioc=0x100019f9e00,
> buf=0x100012ad938 "", buflen=32768, errp=0x0) at
> /home/pm215/qemu/io/channel.c:216
> #5  0x000001000079bb58 in channel_get_buffer (opaque=0x100019f9e00,
> buf=0x100012ad938 "", pos=8, size=32768)
>     at /home/pm215/qemu/migration/qemu-file-channel.c:84
> #6  0x0000010000799d7c in qemu_fill_buffer (f=0x100012ad900) at
> /home/pm215/qemu/migration/qemu-file.c:297
> #7  0x000001000079abe4 in qemu_peek_byte (f=0x100012ad900,
> offset=<optimized out>) at /home/pm215/qemu/migration/qemu-file.c:557
> #8  0x000001000079aef0 in qemu_get_byte (f=0x100012ad900) at
> /home/pm215/qemu/migration/qemu-file.c:642
> #9  0x000001000079aef0 in qemu_get_be16 (f=0x100012ad900) at
> /home/pm215/qemu/migration/qemu-file.c:642
> #10 0x000001000078862c in source_return_path_thread
> (opaque=0x100012364a0) at /home/pm215/qemu/migration/migration.c:2206
> #11 0x000001000091cdb0 in qemu_thread_start (args=0x10001257600) at
> /home/pm215/qemu/util/qemu-thread-posix.c:502
> #12 0xfff80001034bbe68 in start_thread (arg=0xfff80001139cb900) at
> pthread_create.c:486
> #13 0xfff80001036c00b8 in __thread_start () at
> ../sysdeps/unix/sysv/linux/sparc/sparc64/clone.S:78
> 
> Thread 4 (Thread 0xfff8000112e83900 (LWP 125524)):
> #0  0x000001000039cdec in cpu_physical_memory_get_dirty
> (start=54124544, client=<optimized out>, length=1) at
> /home/pm215/qemu/include/exec/ram_addr.h:144
> #1  0x000001000039eec4 in cpu_physical_memory_get_dirty_flag
> (client=1, addr=54124544) at
> /home/pm215/qemu/include/exec/ram_addr.h:214
> #2  0x000001000039eec4 in cpu_physical_memory_is_clean (addr=54124544)
> at /home/pm215/qemu/include/exec/ram_addr.h:214
> #3  0x000001000039eec4 in tlb_set_page_with_attrs
>     (cpu=<optimized out>, vaddr=<optimized out>, paddr=<optimized
> out>, attrs=..., prot=<optimized out>, mmu_idx=<optimized out>,
> size=<optimized out>)
>     at /home/pm215/qemu/accel/tcg/cputlb.c:819
> #4  0x000001000050fb58 in arm_tlb_fill (cs=0x10001258b60,
> address=1127866368, access_type=<optimized out>, mmu_idx=<optimized
> out>, fi=<optimized out>)
>     at /home/pm215/qemu/target/arm/helper.c:11914
> #5  0x00000100004f6e1c in tlb_fill
>     (cs=0x10001258b60, addr=1127866368, size=<optimized out>,
> access_type=<optimized out>, mmu_idx=<optimized out>,
> retaddr=18444492278459273036)
>     at /home/pm215/qemu/target/arm/op_helper.c:206
> #6  0x000001000039f5a8 in helper_ret_ldub_mmu (env=0x10001260fd0,
> addr=1127866368, oi=<optimized out>, retaddr=18444492278459273036)
>     at /home/pm215/qemu/target/arm/cpu.h:919
> #7  0xfff8000110000f54 in code_gen_buffer ()
> #8  0x00000100003c0688 in cpu_tb_exec (itb=<optimized out>,
> cpu=<optimized out>) at /home/pm215/qemu/accel/tcg/cpu-exec.c:171
> #9  0x00000100003c0688 in cpu_loop_exec_tb (tb_exit=<synthetic
> pointer>, last_tb=<synthetic pointer>, tb=<optimized out>,
> cpu=<optimized out>)
>     at /home/pm215/qemu/accel/tcg/cpu-exec.c:618
> #10 0x00000100003c0688 in cpu_exec (cpu=<optimized out>) at
> /home/pm215/qemu/accel/tcg/cpu-exec.c:729
> #11 0x000001000036ae9c in tcg_cpu_exec (cpu=0x10001258b60) at
> /home/pm215/qemu/cpus.c:1429
> #12 0x000001000036daec in qemu_tcg_cpu_thread_fn (arg=0x10001258b60)
> at /home/pm215/qemu/cpus.c:1733
> #13 0x000001000036daec in qemu_tcg_cpu_thread_fn (arg=0x10001258b60)
> at /home/pm215/qemu/cpus.c:1707
> #14 0x000001000091cdb0 in qemu_thread_start (args=0x100012a6670) at
> /home/pm215/qemu/util/qemu-thread-posix.c:502
> #15 0xfff80001034bbe68 in start_thread (arg=0xfff8000112e83900) at
> pthread_create.c:486
> #16 0xfff80001036c00b8 in __thread_start () at
> ../sysdeps/unix/sysv/linux/sparc/sparc64/clone.S:78
> 
> Thread 3 (Thread 0xfff800010be17900 (LWP 125519)):
> #0  0xfff8000103657bd0 in __GI___libc_malloc (bytes=14) at malloc.c:3034
> #1  0xfff8000101984ef0 in g_malloc () at
> /usr/lib/sparc64-linux-gnu/libglib-2.0.so.0
> #2  0x0000010000910e14 in json_token (type=<optimized out>,
> x=<optimized out>, y=<optimized out>, tokstr=0x10001200520)
>     at /home/pm215/qemu/qobject/json-parser.c:565
> #3  0x000001000090f9a0 in json_message_process_token
> (lexer=0x100011b4af8, input=0x10001200520, type=<optimized out>,
> x=<optimized out>, y=<optimized out>)
>     at /home/pm215/qemu/qobject/json-streamer.c:82
> #4  0x000001000093d1c4 in json_lexer_feed_char (lexer=0x100011b4af8,
> ch=<optimized out>, flush=<optimized out>)
>     at /home/pm215/qemu/qobject/json-lexer.c:313
> #5  0x000001000093d340 in json_lexer_feed (lexer=0x100011b4af8,
> buffer=<optimized out>, size=<optimized out>) at
> /home/pm215/qemu/qobject/json-lexer.c:350
> #6  0x0000010000896de8 in qemu_chr_be_write_impl (len=<optimized out>,
> buf=0xfff800010be14d38 "{", s=0x100011ecc00) at
> /home/pm215/qemu/chardev/char.c:175
> #7  0x0000010000896de8 in qemu_chr_be_write (s=0x100011ecc00,
> buf=0xfff800010be14d38 "{", len=<optimized out>) at
> /home/pm215/qemu/chardev/char.c:187
> #8  0x000001000089f608 in tcp_chr_read (chan=<optimized out>,
> cond=<optimized out>, opaque=<optimized out>) at
> /home/pm215/qemu/chardev/char-socket.c:515
> #9  0x00000100008b111c in qio_channel_fd_source_dispatch
> (source=0xfff8000114005e40, callback=0x1000089f500 <tcp_chr_read>,
> user_data=0x100011ecc00)
>     at /home/pm215/qemu/io/channel-watch.c:84
> #10 0xfff800010197ed1c in g_main_context_dispatch () at
> /usr/lib/sparc64-linux-gnu/libglib-2.0.so.0
> #11 0xfff800010197f130 in  () at /usr/lib/sparc64-linux-gnu/libglib-2.0.so.0
> 
> Thread 2 (Thread 0xfff800010b615900 (LWP 125511)):
> #0  0xfff80001036bc040 in syscall () at
> ../sysdeps/unix/sysv/linux/sparc/sparc64/syscall.S:33
> #1  0x000001000091de6c in qemu_futex_wait (val=<optimized out>,
> f=<optimized out>) at /home/pm215/qemu/util/qemu-thread-posix.c:438
> #2  0x000001000091de6c in qemu_event_wait (ev=0x100010cc0c8
> <rcu_gp_event>) at /home/pm215/qemu/util/qemu-thread-posix.c:442
> #3  0x0000010000931e9c in wait_for_readers () at 
> /home/pm215/qemu/util/rcu.c:135
> #4  0x0000010000931e9c in synchronize_rcu () at 
> /home/pm215/qemu/util/rcu.c:171
> #5  0x00000100009322d4 in call_rcu_thread (opaque=<optimized out>) at
> /home/pm215/qemu/util/rcu.c:268
> #6  0x000001000091cdb0 in qemu_thread_start (args=0x100010dcc60) at
> /home/pm215/qemu/util/qemu-thread-posix.c:502
> #7  0xfff80001034bbe68 in start_thread (arg=0xfff800010b615900) at
> pthread_create.c:486
> #8  0xfff80001036c00b8 in __thread_start () at
> ../sysdeps/unix/sysv/linux/sparc/sparc64/clone.S:78
> 
> Thread 1 (Thread 0xfff800010006e6e0 (LWP 125450)):
> #0  0xfff80001036b57f4 in __GI_ppoll (fds=0x10001a08c90, nfds=6,
> timeout=<optimized out>, sigmask=0x0) at
> ../sysdeps/unix/sysv/linux/ppoll.c:39
> #1  0x0000010000917d6c in ppoll (__ss=0x0, __timeout=0x7feffae6a48,
> __nfds=6, __fds=0x10001a08c90) at
> /usr/include/sparc64-linux-gnu/bits/poll2.h:77
> #2  0x0000010000917d6c in qemu_poll_ns (fds=0x10001a08c90,
> nfds=<optimized out>, timeout=<optimized out>) at
> /home/pm215/qemu/util/qemu-timer.c:334
> #3  0x00000100009191cc in os_host_main_loop_wait (timeout=1000000000)
> at /home/pm215/qemu/util/main-loop.c:233
> #4  0x00000100009191cc in main_loop_wait (nonblocking=<optimized out>)
> at /home/pm215/qemu/util/main-loop.c:514
> #5  0x00000100005cbfc4 in main_loop () at /home/pm215/qemu/vl.c:1923
> #6  0x00000100003173a0 in main (argc=<optimized out>, argv=<optimized
> out>, envp=<optimized out>) at /home/pm215/qemu/vl.c:4578
> 
> 
> 
> Backtrace of process 125560:
> 
> Thread 4 (Thread 0xfff8000112e83900 (LWP 125642)):
> #0  0xfff8000103502bbc in futex_wait_cancelable (private=0,
> expected=0, futex_word=0x100012a665c) at
> ../sysdeps/unix/sysv/linux/futex-internal.h:88
> #1  0xfff8000103502bbc in __pthread_cond_wait_common (abstime=0x0,
> mutex=0x100010958a0 <qemu_global_mutex>, cond=0x100012a6630) at
> pthread_cond_wait.c:502
> #2  0xfff8000103502bbc in __pthread_cond_wait (cond=0x100012a6630,
> mutex=0x100010958a0 <qemu_global_mutex>) at pthread_cond_wait.c:655
> #3  0x000001000091d670 in qemu_cond_wait_impl
>     (cond=0x100012a6630, mutex=0x100010958a0 <qemu_global_mutex>,
> file=0x100009d4d80 "/home/pm215/qemu/cpus.c", line=<optimized out>)
>     at /home/pm215/qemu/util/qemu-thread-posix.c:161
> #4  0x000001000036b448 in qemu_wait_io_event (cpu=0x10001258b60) at
> /home/pm215/qemu/cpus.c:1242
> #5  0x000001000036daa8 in qemu_tcg_cpu_thread_fn (arg=0x10001258b60)
> at /home/pm215/qemu/cpus.c:1760
> #6  0x000001000036daa8 in qemu_tcg_cpu_thread_fn (arg=0x10001258b60)
> at /home/pm215/qemu/cpus.c:1707
> #7  0x000001000091cdb0 in qemu_thread_start (args=0x100012a6670) at
> /home/pm215/qemu/util/qemu-thread-posix.c:502
> #8  0xfff80001034fbe68 in start_thread (arg=0xfff8000112e83900) at
> pthread_create.c:486
> #9  0xfff80001037000b8 in __thread_start () at
> ../sysdeps/unix/sysv/linux/sparc/sparc64/clone.S:78
> 
> Thread 3 (Thread 0xfff800010be57900 (LWP 125637)):
> #0  0xfff80001036f56dc in __GI___poll (fds=0xfff8000114005650, nfds=3,
> timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:29
> #1  0xfff80001019bf0a0 in  () at /usr/lib/sparc64-linux-gnu/libglib-2.0.so.0
> 
> Thread 2 (Thread 0xfff800010b655900 (LWP 125629)):
> #0  0xfff80001036fc040 in syscall () at
> ../sysdeps/unix/sysv/linux/sparc/sparc64/syscall.S:33
> #1  0x000001000091de6c in qemu_futex_wait (val=<optimized out>,
> f=<optimized out>) at /home/pm215/qemu/util/qemu-thread-posix.c:438
> #2  0x000001000091de6c in qemu_event_wait (ev=0x100010cc040
> <rcu_call_ready_event>) at
> /home/pm215/qemu/util/qemu-thread-posix.c:442
> #3  0x00000100009322a4 in call_rcu_thread (opaque=<optimized out>) at
> /home/pm215/qemu/util/rcu.c:261
> #4  0x000001000091cdb0 in qemu_thread_start (args=0x100010dcc60) at
> /home/pm215/qemu/util/qemu-thread-posix.c:502
> #5  0xfff80001034fbe68 in start_thread (arg=0xfff800010b655900) at
> pthread_create.c:486
> #6  0xfff80001037000b8 in __thread_start () at
> ../sysdeps/unix/sysv/linux/sparc/sparc64/clone.S:78
> 
> Thread 1 (Thread 0xfff800010006e6e0 (LWP 125560)):
> #0  0xfff80001036f57f4 in __GI_ppoll (fds=0x10001a04080, nfds=7,
> timeout=<optimized out>, sigmask=0x0) at
> ../sysdeps/unix/sysv/linux/ppoll.c:39
> #1  0x0000010000917d6c in ppoll (__ss=0x0, __timeout=0x7feff8b49f8,
> __nfds=7, __fds=0x10001a04080) at
> /usr/include/sparc64-linux-gnu/bits/poll2.h:77
> #2  0x0000010000917d6c in qemu_poll_ns (fds=0x10001a04080,
> nfds=<optimized out>, timeout=<optimized out>) at
> /home/pm215/qemu/util/qemu-timer.c:334
> #3  0x00000100009191cc in os_host_main_loop_wait (timeout=1000000000)
> at /home/pm215/qemu/util/main-loop.c:233
> #4  0x00000100009191cc in main_loop_wait (nonblocking=<optimized out>)
> at /home/pm215/qemu/util/main-loop.c:514
> #5  0x00000100005cbfc4 in main_loop () at /home/pm215/qemu/vl.c:1923
> #6  0x00000100003173a0 in main (argc=<optimized out>, argv=<optimized
> out>, envp=<optimized out>) at /home/pm215/qemu/vl.c:4578
> 
> 
> thanks
> -- PMM
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK



reply via email to

[Prev in Thread] Current Thread [Next in Thread]