Re: [Qemu-devel] [PATCH 0/4] slirp: Fix for requeuing crash, cleanups


From: Jan Kiszka
Subject: Re: [Qemu-devel] [PATCH 0/4] slirp: Fix for requeuing crash, cleanups
Date: Wed, 29 Feb 2012 22:52:33 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12) Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666

On 2012-02-29 22:48, Stefan Weil wrote:
> On 29.02.2012 22:33, Jan Kiszka wrote:
>> On 2012-02-29 22:00, Stefan Weil wrote:
>>> On 29.02.2012 20:15, Jan Kiszka wrote:
>>>> This is an alternative, more complete approach to fix the
>>>> requeuing-related crashes reported recently. See patch 2 for
>>>> details. The rest are simple cleanups.
>>>>
>>>> Please check carefully if I messed something up.
>>>>
>>>
>>> Hi Jan,
>>>
>>> here is the result of MIPS Malta with your patch series applied:
>>>
>>> Program received signal SIGSEGV, Segmentation fault.
>>> 0x000055555577db5b in slirp_remque (a=0x555556cff360) at
>>> /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/misc.c:39
>>> 39 ((struct quehead *)(element->qh_rlink))->qh_link =
>>> element->qh_link;
>>> (gdb) i s
>>> #0 0x000055555577db5b in slirp_remque (a=0x555556cff360) at
>>> /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/misc.c:39
>>> #1 0x000055555577b7a2 in if_start (slirp=0x5555564bfb80) at
>>> /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/if.c:208
>>> #2 0x000055555577b607 in if_output (so=0x555556ea0b70,
>>> ifm=0x555556cff9e0) at
>>> /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/if.c:139
>>> #3 0x000055555577d040 in ip_output (so=0x555556ea0b70,
>>> m0=0x555556cff9e0) at
>>> /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/ip_output.c:84
>>> #4 0x00005555557865d6 in tcp_output (tp=0x555556ea0c20) at
>>> /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/tcp_output.c:456
>>> #5 0x000055555577ff5a in slirp_select_poll (readfds=0x7fffffffda10,
>>> writefds=0x7fffffffda90, xfds=0x7fffffffdb10, select_error=0)
>>> at /home/stefan/src/qemu/repo.or.cz/qemu/ar7/slirp/slirp.c:480
>>> #6 0x000055555572d8c0 in main_loop_wait (nonblocking=0) at
>>> /home/stefan/src/qemu/repo.or.cz/qemu/ar7/main-loop.c:469
>>> #7 0x0000555555721a61 in main_loop () at
>>> /home/stefan/src/qemu/repo.or.cz/qemu/ar7/vl.c:1558
>>> #8 0x00005555557284a2 in main (argc=25, argv=0x7fffffffdfe8,
>>> envp=0x7fffffffe0b8) at
>>> /home/stefan/src/qemu/repo.or.cz/qemu/ar7/vl.c:3667
>>> (gdb) p element
>>> $1 = (struct quehead *) 0x555556cff360
>>> (gdb) p *element
>>> $2 = {qh_link = 0x555556cff360, qh_rlink = 0x0}
>>> (gdb) p (struct quehead *)(element->qh_rlink)
>>> $3 = (struct quehead *) 0x0
>>
>> Hmm. Two options:
>>
>> - you try to debug what happens to that mbuf, why its queue anchors
>> get corrupted (maybe while in if_encap?)
>> - you tell me how to reproduce it (image file, host characteristics)
>>
>> Jan
> 
> I'm afraid that the first variant won't happen this or next week
> because of a lack of time.
> 
> This is my test environment:
> 
> Debian Squeeze x86_64 host, Debian Squeeze mips guest.
> 
> I use NFS root, and the latest crash happened during boot.
> All other crashes happened after the guest had booted,
> when I started apt-get update, so maybe booting from a
> Debian CDROM might also reproduce the crash.
> 
> I compiled QEMU with a default configuration, but used
> CFLAGS=-g (no optimization) and started QEMU like this:
> 
> gdb --args
> /home/stefan/src/qemu/repo.or.cz/qemu/ar7/bin/debug/x86/mips-softmmu/qemu-system-mips
> --kernel /tftpboot/malta/boot/vmlinux-2.6.26-2-4kc-malta --initrd
> /tftpboot/malta/boot/initrd.img-2.6.26-2-4kc-malta --append "debug
> nohz=off root=/dev/nfs rw ip=::::malta::dhcp
> nfsroot=10.0.2.2:/tftpboot/malta -bootp abc -tftp /tftpboot/malta" -M
> malta --cpu 4KEc -m 256 --net nic,model=pcnet --net user,hostname=malta
> --redir tcp:5800::5800 --redir tcp:5900::5900 --redir tcp:10022::22
> --redir tcp:10080::80
> 
> Kernel and initrd are from Debian Squeeze (mips).

OK, thanks.

Here is a last shot (on top of my queue) before I try to reproduce:

diff --git a/slirp/if.c b/slirp/if.c
index 90bf398..d3bdf58 100644
--- a/slirp/if.c
+++ b/slirp/if.c
@@ -181,13 +181,12 @@ void if_start(Slirp *slirp)
         from_batchq = from_batchq_next;
 
         ifm_next = ifm->ifq_next;
-        if (!from_batchq) {
-            if (ifm_next == &slirp->if_fastq) {
-                /* No more packets in fastq, switch to batchq */
-                ifm_next = slirp->next_m;
-                from_batchq_next = true;
-            }
-        } else if (ifm_next == &slirp->if_batchq) {
+        if (ifm_next == &slirp->if_fastq) {
+            /* No more packets in fastq, switch to batchq */
+            ifm_next = slirp->next_m;
+            from_batchq_next = true;
+        }
+        if (ifm_next == &slirp->if_batchq) {
             /* end of batchq */
             ifm_next = NULL;
         }
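
To illustrate the case this change appears to target, here is a minimal
standalone sketch. It is not QEMU code: struct node and the pick_next_*
helpers are hypothetical stand-ins, and it assumes that next_m points at
the if_batchq head while batchq is empty. Under that assumption, the old
logic hands the batchq head back as if it were a packet once fastq runs
dry, while the new logic returns NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the mbuf queue links; not the real struct mbuf. */
struct node {
    struct node *next;
};

/* Next-element selection as in the removed ("-") lines of the diff. */
static struct node *pick_next_old(struct node *ifm, bool from_batchq,
                                  struct node *fastq, struct node *batchq,
                                  struct node *next_m, bool *from_batchq_next)
{
    struct node *ifm_next = ifm->next;

    if (!from_batchq) {
        if (ifm_next == fastq) {
            /* No more packets in fastq, switch to batchq */
            ifm_next = next_m;
            *from_batchq_next = true;
        }
    } else if (ifm_next == batchq) {
        /* end of batchq */
        ifm_next = NULL;
    }
    return ifm_next;
}

/* Selection as in the added ("+") lines: the end-of-batchq check always runs. */
static struct node *pick_next_new(struct node *ifm,
                                  struct node *fastq, struct node *batchq,
                                  struct node *next_m, bool *from_batchq_next)
{
    struct node *ifm_next = ifm->next;

    if (ifm_next == fastq) {
        /* No more packets in fastq, switch to batchq */
        ifm_next = next_m;
        *from_batchq_next = true;
    }
    if (ifm_next == batchq) {
        /* end of batchq */
        ifm_next = NULL;
    }
    return ifm_next;
}

/* Scenario: one packet in fastq, batchq empty, next_m at the batchq head. */
static void demo_empty_batchq(void)
{
    struct node fastq, batchq, pkt;
    bool next_flag = false;

    fastq.next = &pkt;      /* fastq ring: head -> pkt -> head */
    pkt.next = &fastq;
    batchq.next = &batchq;  /* empty ring: head points at itself */

    /* Old logic returns the queue head itself, which is not a packet. */
    assert(pick_next_old(&pkt, false, &fastq, &batchq, &batchq,
                         &next_flag) == &batchq);
    /* New logic stops the walk. */
    assert(pick_next_new(&pkt, &fastq, &batchq, &batchq,
                         &next_flag) == NULL);
}
```

Calling demo_empty_batchq() from a test main exercises both variants.
Whether this is the path behind the reported SIGSEGV is not confirmed by
anything in this thread.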

> 
> I had no slirp problems with that test environment during the last two
> years.

Yes, these regressions are unfortunate. I hope we can resolve them
quickly.

Jan
