Re: [Qemu-devel] [PATCH 3/3] Delayed IP packets


From: Alexander Graf
Subject: Re: [Qemu-devel] [PATCH 3/3] Delayed IP packets
Date: Tue, 22 Nov 2011 13:03:36 +0100

On 29.09.2011, at 18:06, Amit Shah wrote:

> On (Wed) 03 Aug 2011 [13:24:22], Jan Kiszka wrote:
>> From: Fabien Chouteau <address@hidden>
>> 
>> In the current implementation, if Slirp tries to send an IP packet to a client
>> with an unknown hardware address, the packet is simply dropped and an ARP
>> request is sent (if_encap in slirp/slirp.c).
>> 
>> With this patch, Slirp will send the ARP request, re-queue the packet and try
>> to send it later. The packet is dropped after one second if the ARP reply is
>> not received.
> 
> This patch causes a segfault when guests wake up from hibernate.
> 
> Recipe:
> 1. Start guest with -net user -net nic,model=virtio
> 2. (guest) ping 10.0.2.2
> 3. (guest) echo "disk" > /sys/power/state
> 4. Re-start guest with same command line
> 5. Ping has stopped receiving replies.
> 6. Kill that ping process and start a new one.  qemu segfaults.
> 
> This needs the not-upstream-yet virtio S4 handling patches, found at
> http://thread.gmane.org/gmane.linux.kernel/1197141
> 
> The backtrace is:
> 
> (gdb) bt
> #0  0x00007ffff7e421f7 in slirp_insque (a=0x0, b=0x7ffff8f95d50) at /home/amit/src/qemu/slirp/misc.c:27
> #1  0x00007ffff7e40738 in if_start (slirp=0x7ffff8a9cdf0) at /home/amit/src/qemu/slirp/if.c:194
> #2  0x00007ffff7e44828 in slirp_select_poll (readfds=0x7fffffffd930, writefds=0x7fffffffd9b0, xfds=0x7fffffffda30, select_error=0) at /home/amit/src/qemu/slirp/slirp.c:588
> #3  0x00007ffff7e110f1 in main_loop_wait (nonblocking=<optimized out>) at /home/amit/src/qemu/vl.c:1549
> #4  0x00007ffff7d7dc47 in main_loop () at /home/amit/src/qemu/vl.c:1579
> #5  main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /home/amit/src/qemu/vl.c:3574
> 
> 
> Reverting the patch keeps the ping going on after resume.  

I get the same thing with yesterday's HEAD (close to 1.0-rc3), but without 
hibernation.

I'm running KVM Autotest on PPC machines to check my ppc-next queue, and every 
single test failed because of segmentation faults in the slirp code. 
Reverting this patch (and the follow-up patch which fixes the struct mbuf 
definition) makes the segfaults go away in all tests, so I'm fairly sure this 
is the offending one :).
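
For anyone who hasn't read the patch, here is a rough sketch of the behaviour 
its description promises: send the ARP request, re-queue the packet, and drop 
it after one second without a reply. All names below (struct pkt, try_send, 
ARP_WAIT_NS) are hypothetical and this is not the slirp code; in particular, 
sending the ARP request only once per packet is my assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is NOT the code from the patch. */
#define ARP_WAIT_NS 1000000000ULL /* one second, per the patch description */

struct pkt {
    uint64_t expiration_ns; /* deadline for the ARP reply */
    bool     arp_requested; /* assumption: the request is sent only once */
};

enum verdict { PKT_SENT, PKT_REQUEUED, PKT_DROPPED };

/* Decide what happens to one packet on one pass over the output queue. */
static enum verdict try_send(struct pkt *p, bool hwaddr_known, uint64_t now_ns)
{
    assert(p != NULL);

    if (hwaddr_known) {
        return PKT_SENT;                  /* normal path, nothing delayed */
    }
    if (!p->arp_requested) {
        p->arp_requested = true;          /* the ARP request would go out here */
        p->expiration_ns = now_ns + ARP_WAIT_NS;
        return PKT_REQUEUED;              /* keep the packet, retry later */
    }
    if (now_ns >= p->expiration_ns) {
        return PKT_DROPPED;               /* no reply within one second */
    }
    return PKT_REQUEUED;                  /* still waiting for the reply */
}
```

The relevant point for this thread is that a re-queued packet now stays on a 
list across several passes, so its link pointers have to remain valid the 
whole time.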

I'm not saying that the patch is actually wrong; maybe it only exposes another 
bug that has been hidden so far. Either way, the breakage looks very much 
like memory corruption to me.
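
To illustrate what frame #0 of the backtrace above suggests: slirp_insque is 
called with a=0x0, so the queue insert is presumably handed a NULL element and 
faults as soon as it touches it. A minimal sketch of such an insert on a 
circular doubly linked list, with a NULL guard added, could look like this. 
The names (struct qnode, q_init, q_insert_after) are hypothetical and this is 
not the code in slirp/misc.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical circular doubly linked list node; not the slirp definition. */
struct qnode {
    struct qnode *next;
    struct qnode *prev;
};

/* An empty queue is a head that points at itself in both directions. */
static void q_init(struct qnode *head)
{
    assert(head != NULL);
    head->next = head;
    head->prev = head;
}

/*
 * Link elem right after pos. Without the NULL check, the first store
 * through elem would be a write to address 0 when elem is NULL.
 * Returns 0 on success, -1 if either pointer is NULL.
 */
static int q_insert_after(struct qnode *elem, struct qnode *pos)
{
    if (elem == NULL || pos == NULL) {
        return -1;
    }
    elem->next = pos->next;
    elem->prev = pos;
    pos->next->prev = elem;
    pos->next = elem;
    return 0;
}
```

A guard like this would only hide the symptom, of course. The interesting 
question is why the caller ends up with a NULL pointer in the first place.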

Also, I'm having a hard time reproducing the problem manually. It triggers 
every time in Autotest, but never when I try it by hand. Essentially, 
Autotest just tries to connect to the guest using ssh every couple of 
seconds, so I don't know why I can't reproduce it without it.

Please fix or revert this for 1.0.


Alex
