qemu-devel

Re: [Qemu-devel] Intermittent e1000 failure on qemu-kvm 1.0


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] Intermittent e1000 failure on qemu-kvm 1.0
Date: Thu, 12 Apr 2012 11:48:35 +0100

On Thu, Apr 12, 2012 at 10:37 AM, Chris Webb <address@hidden> wrote:
> Stefan Hajnoczi <address@hidden> writes:
>
>> On Tue, Apr 3, 2012 at 5:37 PM, Chris Webb <address@hidden> wrote:
>> > Stefan Hajnoczi <address@hidden> writes:
>> >
>> >
>> >> >> Are you sure no other guest has the same MAC address or IP address?
>> >> >> This weird behavior sounds similar to what happens when you have
>> >> >> multiple devices on a network using the same address - the results are
>> >> >> very confusing :).
>> >> >
>> >> > Yes, I agree! However, in this case there's no other guest with the same MAC
>> >> > or IP address on the network. I've explicitly rechecked this to be sure, and
>> >> > also deliberately varied the MAC address to something I know can't be
>> >> > generated by our scripts. In any case, I'm using the same MAC and IP address
>> >> > for every reboot of this VM, and usually (19 times out of 20) it works fine.
>> >>
>> >> The lack of ARP reply is a host networking problem.  Have you checked
>> >> host dmesg(1) output just in case there was a kernel message related
>> >> to this?
>> >
>> > Nothing there I'm afraid. Just the usual
>> >
>> >  device tap1 entered promiscuous mode
>> >  ADDRCONF(NETDEV_UP): tap1: link is not ready
>> >  ADDRCONF(NETDEV_CHANGE): tap1: link becomes ready
>> >  br0: port 2(tap1) entering forwarding state
>> >  br0: port 2(tap1) entering forwarding state
>> >  kvm: 20288: cpu0 unhandled rdmsr: 0xc0010112
>> >  kvm: 20288: cpu0 unhandled rdmsr: 0xc0010048
>> >  tap1: no IPv6 routers present
>> >  br0: port 2(tap1) entering forwarding state
>> >  br0: port 2(tap1) entering forwarding state
>> >  br0: port 2(tap1) entering forwarding state
>> >  br0: port 2(tap1) entering forwarding state
>> >  br0: port 2(tap1) entering disabled state
>> >
>> > cycle. It looks just the same for a working guest as for a non-working
>> > guest.
>>
>> Is the "disabled state" because QEMU exited?
>
> Yes, that's right.
>
>> I'm afraid I don't have any suggestions beyond debugging the
>> bridge->tap code in the kernel since packets are not being forwarded
>> for some reason.
>
> Many thanks for your help and suggestions nonetheless. It's reassuring to hear
> it's not something completely obvious I'm overlooking.
>
> Does the fact that this only happens with model=e1000, not model=virtio or
> model=rtl8139 give us a clue as to what might be going wrong in the host
> kernel? The observation which particularly baffles me if it's a host kernel
> issue is that removing -usbdevice tablet from the guest makes the problem go
> away!
>
> More generally, my confusion with this bug is that guest changes like
> model=e1000 -> model=rtl8139 fixing it or removing -usbdevice tablet fixing
> it seem to imply a qemu problem rather than a host kernel bug, but -net tap
> -> -net user fixing it seems to imply a host kernel bug rather than a qemu
> problem!

Yes, it's odd that QEMU changes make the issue go away but tcpdump
suggests the packet is not being sent from the bridge to the tap
device.

e1000 and rtl8139 both use the same QEMU network subsystem code.  I
don't see an obvious difference between the two.

Since this issue only happens once in many QEMU runs, are you sure that
-usbdevice tablet really makes the issue go away?

Are you using ebtables?  I know you mentioned disabling iptables and
it would be good to try the same for ebtables if you use it.

In order to debug the host networking issue you may be able to use
ebtables/iptables LOG targets to collect information on how far
exactly the packets are getting.  For example, you could try logging
all packets destined for the guest MAC address - and if the log
information includes the network interface you should see the packet
move between its source, the bridge, and the destination interface.  I
have never tried this but it might work.
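
As a rough, untested sketch of that logging idea, the script below only
builds and prints ebtables commands; it does not run them. The MAC address
and the log prefixes are placeholders I made up, so substitute the guest's
real MAC. It assumes the ebtables log watcher options (--log-prefix,
--log-ip, --log-arp) are available on the host.

```python
import shlex

# Placeholder MAC: replace with the MAC address configured for the guest NIC.
GUEST_MAC = "52:54:00:12:34:56"


def log_rules(mac, chains=("INPUT", "FORWARD", "OUTPUT")):
    """Build one ebtables logging rule per filter chain for frames sent to `mac`.

    Each rule only logs the frame and then continues with the next rule,
    so it should not change forwarding behaviour.
    """
    rules = []
    for chain in chains:
        rules.append([
            "ebtables", "-A", chain,
            "-d", mac,                                  # match destination MAC
            "--log-prefix", "guest-" + chain.lower(),   # hypothetical prefix
            "--log-ip", "--log-arp",                    # include IP/ARP details
            "-j", "CONTINUE",                           # log only, keep going
        ])
    return rules


if __name__ == "__main__":
    # Print the commands for review; run them by hand as root if they look right.
    for rule in log_rules(GUEST_MAC):
        print(shlex.join(rule))
```

The resulting log entries should show up in the host kernel log with the
input and output interface names, which is what would tell you whether a
frame reached br0 but never left via tap1. Note that a destination-MAC
match will not catch broadcast frames such as ARP requests, so a second
rule matching on the source MAC (-s) may be needed to see the whole
exchange. The same rules can be removed again by using -D in place of -A.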

Stefan


