qemu-devel

Re: [Qemu-devel] TCP Segmentation Offloading


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] TCP Segmentation Offloading
Date: Mon, 9 May 2016 15:12:58 +0300

On Fri, May 06, 2016 at 05:28:55PM +0100, Stefan Hajnoczi wrote:
> On Fri, May 06, 2016 at 06:34:33AM +0200, Ingo Krabbe wrote:
> > > On Sun, May 01, 2016 at 02:31:57PM +0200, Ingo Krabbe wrote:
> > >> Good Mayday Qemu Developers,
> > >> 
> > >> today I tried to find a reference to a networking problem that seems to
> > >> be of a quite general nature: TCP Segmentation Offloading (TSO) in
> > >> virtual environments.
> > >> 
> > >> When I set up a TAP network adapter for a virtual machine and put it into
> > >> a host bridge, the known best practice is to manually set "tso off gso
> > >> off" with ethtool for the guest driver if I use a hardware emulation
> > >> such as e1000, and/or "tso off gso off" for the host driver and/or for
> > >> the bridge adapter if I use the virtio driver, as otherwise you
> > >> experience (sometimes?) performance problems or even lost packets.
> > > 
> > > I can't parse this sentence.  In what cases do you think it's a "known
> > > best practice" to disable tso and gso?  Maybe a table would be a clearer
> > > way to communicate this.
> > > 
> > > Can you provide a link to the source claiming tso and gso should be
> > > disabled?
> > 
> > Sorry for that long sentence. The consequence seems to be that it is most
> > stable to turn off TSO and GSO for host bridges and for adapters in virtual
> > machines.
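> > 
> > For concreteness, the commands usually meant by this advice are a
> > sketch along the following lines (br0, tap0 and eth0 are placeholder
> > names; adjust them to your setup):
> > 
> >     # on the host: the bridge and the tap device
> >     ethtool -K br0 tso off gso off
> >     ethtool -K tap0 tso off gso off
> >     # in the guest: the emulated or paravirtual NIC
> >     ethtool -K eth0 tso off gso off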
> > 
> > One of the most comprehensive collections of arguments is this article
> > 
> >     https://kris.io/2015/10/01/kvm-network-performance-tso-and-gso-turn-it-off/
> > 
> > while I also found a documentation for Centos 6
> > 
> >     https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Host_Configuration_and_Guest_Installation_Guide/ch10s04.html
> 
> This documentation is about (ancient) RHEL 3.9 guests.  I would not
> apply anything on that page to modern Linux distro releases without
> re-checking.

I think this refers to
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.5_Technical_Notes/kernel.html

which lists a couple of TSO bugs.

These should have been addressed by now, and I don't see anything
like this in RHEL7 docs.


> > 
> > On Google Code this one is discussed
> > 
> >     https://code.google.com/p/ganeti/wiki/PerformanceTuning
> > 
> > Of course, the same advice is found for Xen machines
> > 
> >     http://cloudnull.io/2012/07/xenserver-network-tuning/
> > 
> > You see, there are several links on the internet, and my first question is:
> > why can't I find this discussion on the QEMU wiki?
> > 
> > I think the bug
> > 
> >     https://bugs.launchpad.net/bugs/1202289
> > 
> > is related.
> 
> Thanks for posting all the links!
> 
> I hope Michael and/or Jason explain the current status for RHEL 6/7 and
> other modern distros.  Maybe they can also follow up with the kris.io
> blog author if an update to the post is necessary.
> 
> TSO/GSO is enabled by default on my Fedora and RHEL host/guests.  If it
> was a best practice for those distros I'd expect the default settings to
> reflect that.  Also, I would be surprised if the offload features were
> bad since work was put into supporting and extending them in virtio-net
> over the years.

The unfortunate side-effect of documenting work-arounds is that people
get used to using them. TSO, s/g and checksum offloads are advanced
features; as such, there's always a chance that using them makes you hit
a bug.  Enabling them gives better performance for most users, so I think
our defaults are good.
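
For anyone who wants to check what a given distro actually enables by
default, the current offload state can be queried read-only; eth0 below
is a placeholder for the interface in question:

    # ethtool -k eth0 | grep -E 'segmentation|scatter-gather|checksum'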


> > >> I haven't found a complete analysis of the background of these problems,
> > >> but there seem to be some effects on MTU-based fragmentation and UDP
> > >> checksums.
> > >> 
> > >> There is a TSO-related bug on Launchpad, but the context of that bug is
> > >> too narrow for the generality of the problem.
> > >> 
> > >> Also, it seems there is a problem in LXC contexts too (I found such a
> > >> reference, without a detailed description, in a post about a Xen setup).
> > >> 
> > >> My question now is: is there a bug in the driver code, and shouldn't this
> > >> be documented somewhere on wiki.qemu.org? Were there developments on this
> > >> topic in the past, or is there any planned/ongoing work to do on the QEMU
> > >> drivers?
> > >> 
> > >> Most problem reports I found relate to deprecated CentOS 6 qemu-kvm
> > >> packages.
> > >> 
> > >> In our company we have similar or even worse problems with CentOS 7 hosts
> > >> and guest machines.
> > > 
> > > You haven't explained what problem you are experiencing.  If you want
> > > help with your setup please include your QEMU command-line (ps aux |
> > > grep qemu), the traffic pattern (ideally how to reproduce it with a
> > > benchmarking tool), and what observation you are making (e.g. netstat
> > > counters showing dropped packets).
> > 
> > I was quite astonished by the many hints about virtio drivers, as we had
> > this problem with the e1000 driver in a CentOS 7 guest on a CentOS 6 host.
> > 
> >     e1000 0000:00:03.0 ens3: Detected Tx Unit Hang
> >       Tx Queue             <0>
> >       TDH                  <42>
> >       TDT                  <42>
> >       next_to_use          <2e>
> >       next_to_clean        <42>
> >     buffer_info[next_to_clean]
> >       time_stamp           <104aff1b8>
> >       next_to_watch        <44>
> >       jiffies              <104b00ee9>
> >       next_to_watch.status <0>
> >     Apr 25 21:08:48 db03 kernel: ------------[ cut here ]------------
> >     Apr 25 21:08:48 db03 kernel: WARNING: at net/sched/sch_generic.c:297 dev_watchdog+0x270/0x280()
> >     Apr 25 21:08:48 db03 kernel: NETDEV WATCHDOG: ens3 (e1000): transmit queue 0 timed out
> >     Apr 25 21:08:48 db03 kernel: Modules linked in: binfmt_misc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables btrfs zlib_deflate raid6_pq xor ext4 mbcache jbd2 crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper i2c_piix4 ppdev cryptd pcspkr virtio_balloon parport_pc parport sg nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi virtio_scsi cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm drm crct10dif_pclmul crct10dif_common ata_piix crc32c_intel virtio_pci e1000 i2c_core virtio_ring libata serio_raw virtio floppy dm_mirror dm_region_hash dm_log dm_mod
> >     Apr 25 21:08:48 db03 kernel: CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.10.0-327.13.1.el7.x86_64 #1
> >     Apr 25 21:08:48 db03 kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
> >     Apr 25 21:08:48 db03 kernel: ffff88126f483d88 685d892e8a452abb ffff88126f483d40 ffffffff8163571c
> >     Apr 25 21:08:48 db03 kernel: ffff88126f483d78 ffffffff8107b200 0000000000000000 ffff881203b9a000
> >     Apr 25 21:08:48 db03 kernel: ffff881201c3e080 0000000000000001 0000000000000002 ffff88126f483de0
> >     Apr 25 21:08:48 db03 kernel: Call Trace:
> >     Apr 25 21:08:48 db03 kernel: <IRQ>  [<ffffffff8163571c>] dump_stack+0x19/0x1b
> >     Apr 25 21:08:48 db03 kernel: [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0
> >     Apr 25 21:08:48 db03 kernel: [<ffffffff8107b29c>] warn_slowpath_fmt+0x5c/0x80
> >     Apr 25 21:08:48 db03 kernel: [<ffffffff8154cd40>] dev_watchdog+0x270/0x280
> >     Apr 25 21:08:48 db03 kernel: [<ffffffff8154cad0>] ? dev_graft_qdisc+0x80/0x80
> >     Apr 25 21:08:48 db03 kernel: [<ffffffff8108b0a6>] call_timer_fn+0x36/0x110
> >     Apr 25 21:08:48 db03 kernel: [<ffffffff8154cad0>] ? dev_graft_qdisc+0x80/0x80
> >     Apr 25 21:08:48 db03 kernel: [<ffffffff8108dd97>] run_timer_softirq+0x237/0x340
> >     Apr 25 21:08:48 db03 kernel: [<ffffffff81084b0f>] __do_softirq+0xef/0x280
> >     Apr 25 21:08:48 db03 kernel: [<ffffffff816477dc>] call_softirq+0x1c/0x30
> >     Apr 25 21:08:48 db03 kernel: [<ffffffff81016fc5>] do_softirq+0x65/0xa0
> >     Apr 25 21:08:48 db03 kernel: [<ffffffff81084ea5>] irq_exit+0x115/0x120
> >     Apr 25 21:08:48 db03 kernel: [<ffffffff81648455>] smp_apic_timer_interrupt+0x45/0x60
> >     Apr 25 21:08:48 db03 kernel: [<ffffffff81646b1d>] apic_timer_interrupt+0x6d/0x80
> >     Apr 25 21:08:48 db03 kernel: <EOI>  [<ffffffff81058e96>] ? native_safe_halt+0x6/0x10
> >     Apr 25 21:08:48 db03 kernel: [<ffffffff8101dbcf>] default_idle+0x1f/0xc0
> >     Apr 25 21:08:48 db03 kernel: [<ffffffff8101e4d6>] arch_cpu_idle+0x26/0x30
> >     Apr 25 21:08:48 db03 kernel: [<ffffffff810d6325>] cpu_startup_entry+0x245/0x290
> >     Apr 25 21:08:48 db03 kernel: [<ffffffff810475fa>] start_secondary+0x1ba/0x230
> >     Apr 25 21:08:48 db03 kernel: ---[ end trace 71ac4360272e207e ]---
> >     Apr 25 21:08:48 db03 kernel: e1000 0000:00:03.0 ens3: Reset adapter
> > 
> > 
> > I'm still not sure why this happens on this host "db03", while db02 and
> > db01 are not affected. All guests are running on different hosts and the
> > network is controlled by Open vSwitch.
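> > 
> > (To rule out configuration drift between the hosts, one thing worth
> > comparing is the offload state of the relevant interface on each host;
> > a sketch, with placeholder host and interface names:
> > 
> >     for h in host01 host02 host03; do
> >         echo "== $h =="
> >         ssh "$h" "ethtool -k eth0 | grep -E 'segmentation|scatter-gather'"
> >     done
> > 
> > )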
> 
> This looks interesting.  It could be a bug in QEMU's e1000 NIC
> emulation.  Maybe it has already been fixed in qemu.git but I didn't see
> any relevant commits.
> 
> Please post the RPM version numbers you are using (rpm -qa | grep qemu
> in host, rpm -qa | grep kernel in host).
> 
> The e1000 driver can print additional information (to dump the contents
> of the tx ring).  Please increase your kernel's log level to collect
> that information:
>  # echo 8 >/proc/sys/kernel/printk
> 
> The tx ring dump may allow someone to figure out why the packet caused
> tx to stall.
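> 
> Once the hang reproduces, the extended dump should land in the kernel
> ring buffer; one way to pull it out afterwards (the pattern matches the
> driver message quoted above):
> 
>  # dmesg | grep -A 20 'Detected Tx Unit Hang'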
> 
> Stefan
