Re: [Qemu-devel] dropped pkts with Qemu on tap interface (RX)


From: Wei Xu
Subject: Re: [Qemu-devel] dropped pkts with Qemu on tap interface (RX)
Date: Thu, 4 Jan 2018 11:09:12 +0800
User-agent: NeoMutt/20170113-14-7f1397-dirty (1.7.2)

On Wed, Jan 03, 2018 at 04:07:44PM +0100, Stefan Priebe - Profihost AG wrote:
> 
> Am 03.01.2018 um 04:57 schrieb Wei Xu:
> > On Tue, Jan 02, 2018 at 10:17:25PM +0100, Stefan Priebe - Profihost AG 
> > wrote:
> >>
> >> Am 02.01.2018 um 18:04 schrieb Wei Xu:
> >>> On Tue, Jan 02, 2018 at 04:24:33PM +0100, Stefan Priebe - Profihost AG 
> >>> wrote:
> >>>> Hi,
> >>>> Am 02.01.2018 um 15:20 schrieb Wei Xu:
> >>>>> On Tue, Jan 02, 2018 at 12:17:29PM +0100, Stefan Priebe - Profihost AG 
> >>>>> wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> currently i'm trying to fix a problem where we have "random" missing
> >>>>>> packets.
> >>>>>>
> >>>>>> We're doing an ssh connect from machine a to machine b every 5 minutes
> >>>>>> via rsync and ssh.
> >>>>>>
> >>>>>> Sometimes it happens that we get this cron message:
> >>>>>> "Connection to 192.168.0.2 closed by remote host.
> >>>>>> rsync: connection unexpectedly closed (0 bytes received so far) 
> >>>>>> [sender]
> >>>>>> rsync error: unexplained error (code 255) at io.c(226) [sender=3.1.2]
> >>>>>> ssh: connect to host 192.168.0.2 port 22: Connection refused"
> >>>>>
> >>>>> Hi Stefan,
> >>>>> What kind of virtio-net backend are you using? Can you paste your qemu
> >>>>> command line here?
> >>>>
> >>>> Sure netdev part:
> >>>> -netdev
> >>>> type=tap,id=net0,ifname=tap317i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on
> >>>> -device
> >>>> virtio-net-pci,mac=EA:37:42:5C:F3:33,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300
> >>>> -netdev
> >>>> type=tap,id=net1,ifname=tap317i1,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on,queues=4
> >>>> -device
> >>>> virtio-net-pci,mac=6A:8E:74:45:1A:0B,netdev=net1,bus=pci.0,addr=0x13,id=net1,vectors=10,mq=on,bootindex=301
> >>>
> >>> According to what you have mentioned, the traffic is not heavy for the 
> >>> guests,
> >>> so the dropping shouldn't happen in the regular case.
> >>
> >> The avg traffic is around 300kb/s.
> >>
> >>> What is your hardware platform?
> >>
> >> Dual Intel Xeon E5-2680 v4
> >>
> >>> and Which versions are you using for both
> >>> guest/host kernel
> >> Kernel v4.4.103
> >>
> >>> and qemu?
> >> 2.9.1
> >>
> >>> Are there other VMs on the same host?
> >> Yes.
> > 
> > What about the CPU load? 
> 
> Host:
> 80-90% Idle
> LoadAvg: 6-7
> 
> VM:
> 97%-99% Idle
> 

OK, then this shouldn't be a concern.

> >>>>> 'Connection refused' usually means that the client gets a TCP Reset 
> >>>>> rather
> >>>>> than losing packets, so this might not be a relevant issue.
> >>>>
> >>>> Mhm, so you mean these might be two separate issues?
> >>>
> >>> Yes.
> >>>
> >>>>
> >>>>> Also you can do a tcpdump on both guests and see what happened to SSH 
> >>>>> packets
> >>>>> (tcpdump -i tapXXX port 22).
> >>>>
> >>>> Sadly not as there's too much traffic on that part as rsync is syncing
> >>>> every 5 minutes through ssh.
> >>>
> >>> You can do a tcpdump of the entire traffic from both the guest and the 
> >>> host and
> >>> compare which packets are dropped, if the traffic is not overloaded.
> >>
> >> Are you sure? I don't get why the same amount and the same kind of packets
> >> should be received by both taps, which are connected to different bridges,
> >> different HW and different physical interfaces.
> > 
> > Exactly; this is more likely a host or guest kernel bug than a QEMU issue,
> > since you are using the vhost kernel backend and the two stats are
> > independent. You might have to check out what is happening inside the
> > traffic.
> 
> What do you mean by inside the traffic?

You might need to figure out what kind of packets are dropped on the host tap
interface: are they random packets or specific ones?
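One way to narrow that down without wading through the full rsync noise is to
watch the kernel's per-interface RX drop counter and filter port 22 out of the
capture. A minimal sketch, assuming a Linux host; the tap name is taken from
the -netdev line above, and the snippet falls back to lo so it runs anywhere:

```shell
# Read the kernel's RX drop counter for a tap interface.
IF=${IF:-tap317i0}                      # tap name from the -netdev line above
[ -e /sys/class/net/"$IF" ] || IF=lo    # fall back so the sketch runs anywhere

# Kernel-side RX drop counter for the interface:
cat /sys/class/net/"$IF"/statistics/rx_dropped

# Capture everything except the rsync-over-ssh noise, to keep the dump small
# enough to compare guest vs host (suggested filter, tune to your traffic):
# tcpdump -i "$IF" -w /tmp/capture.pcap 'not port 22'
```

Comparing the counter before and after a failed cron run tells you whether the
kernel itself is dropping, as opposed to the packets never reaching the tap.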

There are a few other tests that can help to see what happened besides triaging
the traffic; you can also try alternative tests according to your test bed.

1). Upgrade the host & guest kernels to the latest kernel and see if the issue
still shows up; you can use the net-next tree:
    git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git

2). Run some throughput traffic (netperf, iperf, etc.) on both guests (traffic
from guest to host, if the guests are isolated as per your comments) and check
out the statistics.
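A minimal sketch of (2), assuming a Linux host; the interface name and the
iperf3 peer are placeholders, and the actual traffic run is only hinted at in
a comment:

```shell
#!/bin/sh
# Sketch: count how many RX drops a throughput run adds on one interface.
# tap317i0 and the iperf3 peer address are assumptions; adjust for your setup.
IF=${IF:-tap317i0}
[ -e /sys/class/net/"$IF" ] || IF=lo    # fall back so the sketch runs anywhere

drops() { cat /sys/class/net/"$IF"/statistics/rx_dropped; }

before=$(drops)
# Generate the load here, e.g. on the guest:
#   iperf3 -c <host-ip> -t 30          (with "iperf3 -s" running on the host)
sleep 1                                 # stands in for the traffic run
after=$(drops)

echo "rx_dropped delta on $IF: $((after - before))"
```

A nonzero delta under load, with the guest-side counters staying flat, points
at the host side of the path rather than the guest.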

Wei

> 
> Stefan
> 


