
From: Scott Tsai
Subject: [Qemu-devel] qemu: async sending in tap causes "NFS not responding" error
Date: Sun, 25 Oct 2009 02:12:31 +0800

Dear all,

I recently found that this changeset:
http://git.savannah.gnu.org/cgit/qemu.git/commit/?id=e19eb22486f258a421108ac22b8380a4e2f16b97
"net: make use of async packet sending API in tap client"
causes NFS root Linux guest setups using TAP networking to fail with
error messages like:
nfs: server 172.20.0.1 not responding, still trying
nfs: server 172.20.0.1 OK
< .... repeat infinitely ...>
This happens on both the "master" and "stable-0.11" branches of qemu.

The attached '0001-net-revert-e19eb22486f258a421108ac22b8380a4e2f16b97.patch'
makes NFS root on qemu emulated "arm-integrator-cp" boards work for me
again.

I've uploaded Wireshark captures of qemu-0.10 (good, nfsroot works) and
qemu-0.11 (bad) here:
http://scottt.tw/bug/qemu-async-tap-drops-packets/qemu-nfsroot-good.pcap
http://scottt.tw/bug/qemu-async-tap-drops-packets/qemu-nfsroot-bad.pcap

Inspecting frame 268 in "qemu-nfsroot-bad.pcap", I see:
"ICMP Fragment reassembly time exceeded", reply to request in frame
53, duplicate to the reply in frame 56
and suspect qemu is dropping Ethernet frames from the larger, fragmented
IP packets used for NFS READ replies.
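
To illustrate why even occasional frame loss would hurt so much here, the
following is a rough sketch, not taken from the captures. It assumes NFS
over UDP, a 1500-byte MTU, a 20-byte IPv4 header without options, and a
hypothetical 8200-byte IP payload (an 8192-byte read plus the 8-byte UDP
header, ignoring RPC overhead). The function names are made up for this
example. A fragmented datagram can only be reassembled if every fragment
arrives, so the per-frame loss rate compounds.

```python
import math

IP_HEADER = 20  # bytes; assumes an IPv4 header without options


def fragment_count(ip_payload_len, mtu=1500):
    """Number of IPv4 fragments needed to carry ip_payload_len bytes."""
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(ip_payload_len / per_fragment)


def datagram_survival(ip_payload_len, frame_loss, mtu=1500):
    """Probability that all fragments arrive, assuming independent loss."""
    return (1.0 - frame_loss) ** fragment_count(ip_payload_len, mtu)


if __name__ == "__main__":
    # Hypothetical 8 KiB read reply: 8192 bytes of data + 8-byte UDP header.
    print(fragment_count(8200))
    print(datagram_survival(8200, 0.05))
```

Under these assumptions the reply needs 6 fragments, so a 5% frame loss
rate would already leave only about a 74% chance that a given reply can
be reassembled; the rest would end in the reassembly timeout seen above.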

After finding:
http://lists.gnu.org/archive/html/qemu-devel/2009-09/msg01173.html
through Google and reading through the potentially bad commits that
Sven found through bisection,
I patched "tap_send()" to not run in a loop ("drain the tap send queue
in one go"?) and the error goes away.
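
To make the difference between the two behaviours concrete, here is a toy
model in Python. It is not QEMU code and does not claim to be the actual
cause of the bug: the receiver with a small fixed-size RX ring that drops
frames when full, the one-frame-per-tick guest, and all names (ToyNic,
run, drain_in_one_go) are assumptions made up for this sketch. It only
shows how draining a queue in a loop could overrun a slow receiver, while
handing over one frame per event-loop iteration would not.

```python
from collections import deque


class ToyNic:
    """Hypothetical receiver with a fixed-size RX ring (not a QEMU device)."""

    def __init__(self, ring_size):
        self.ring = deque()
        self.ring_size = ring_size
        self.received = 0
        self.dropped = 0

    def deliver(self, frame):
        # Assumption: a full ring silently drops the frame.
        if len(self.ring) >= self.ring_size:
            self.dropped += 1
            return False
        self.ring.append(frame)
        return True

    def guest_poll(self):
        # Assumption: the guest driver consumes one frame per tick.
        if self.ring:
            self.ring.popleft()
            self.received += 1


def run(frames, ring_size, drain_in_one_go):
    """Return (received, dropped) after all queued frames are processed."""
    nic = ToyNic(ring_size)
    tap_queue = deque(range(frames))
    while tap_queue or nic.ring:
        if drain_in_one_go:
            # Looping sender: push everything that is queued right now.
            while tap_queue:
                nic.deliver(tap_queue.popleft())
        elif tap_queue:
            # Non-looping sender: one frame per event-loop iteration.
            nic.deliver(tap_queue.popleft())
        nic.guest_poll()
    return nic.received, nic.dropped


if __name__ == "__main__":
    print(run(6, 4, drain_in_one_go=True))
    print(run(6, 4, drain_in_one_go=False))
```

With six queued frames and a four-entry ring, the looping variant in this
model loses two frames and the non-looping variant loses none.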

To reproduce this, download "zImage", "scripts/" and "trigger-bug" from:
http://scottt.tw/bug/qemu-async-tap-drops-packets/
and run the "trigger-bug" script.

I've only just started reading
linux/Documentation/networking/tuntap.txt after encountering this
problem, and I currently find the code called from "tap_send()"
(e.g. qemu_send_packet_async and qemu_deliver_packet) and the
semantics of their return values pretty confusing.
I'm sure my patch should be refined so that both NFS root and the
originally intended optimization work.

Attachment: 0001-net-revert-e19eb22486f258a421108ac22b8380a4e2f16b97.patch
Description: Text Data

