
Re: [Qemu-devel] [PATCH 3/3] Unified Datagram Socket Transport - raw sup


From: Jason Wang
Subject: Re: [Qemu-devel] [PATCH 3/3] Unified Datagram Socket Transport - raw support
Date: Mon, 24 Jul 2017 12:03:06 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1



On 2017-07-22 02:50, Anton Ivanov wrote:

[snip]

+    "-netdev raw,id=str,ifname=ifname\n"
+    " configure a network backend with ID 'str' connected to\n"
+    " an Ethernet interface named ifname via raw socket.\n"
+    " This backend does not change the interface settings.\n"
+    " Most interfaces will require being set into promisc mode,\n"
+    " as well having most offloads (TSO, etc) turned off.\n"
+    " Some virtual interfaces like tap support only RX.\n"

Note that QEMU supports the vnet header. So is there any reason to turn off e.g. TSO here?

I am not aware of any means to get extra info such as checksums to show up on a raw socket read.

If you know a way to make them show up, this is worth investigating.

See packet_rcv_vnet(). But a known 'issue' with raw sockets is that they forbid changing the vnet header length after creation, so we may need some workaround in QEMU.
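For reference, a minimal sketch of how a raw (AF_PACKET) socket can be asked to deliver the vnet header, assuming Linux with <linux/if_packet.h> and <linux/virtio_net.h>. The helper names raw_enable_vnet_hdr(), raw_split_vnet_hdr() and raw_needs_csum() are hypothetical and not taken from the patch; this only illustrates the PACKET_VNET_HDR socket option, not how QEMU would wire it up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/if_packet.h>
#include <linux/virtio_net.h>

#ifndef SOL_PACKET
#define SOL_PACKET 263
#endif

/* Ask the kernel to prepend a struct virtio_net_hdr to every frame read
 * from this AF_PACKET socket (and to expect one on every frame written).
 * Returns 0 on success, -1 on error with errno set. */
int raw_enable_vnet_hdr(int fd)
{
    int on = 1;
    return setsockopt(fd, SOL_PACKET, PACKET_VNET_HDR, &on, sizeof(on));
}

/* Split a buffer returned by read()/recv() into vnet header and frame.
 * Returns a pointer to the Ethernet frame, or NULL if buf is too short. */
const uint8_t *raw_split_vnet_hdr(const uint8_t *buf, size_t len,
                                  struct virtio_net_hdr *hdr,
                                  size_t *frame_len)
{
    assert(buf != NULL && hdr != NULL && frame_len != NULL);
    if (len < sizeof(*hdr)) {
        return NULL;
    }
    memcpy(hdr, buf, sizeof(*hdr));
    *frame_len = len - sizeof(*hdr);
    return buf + sizeof(*hdr);
}

/* Non-zero if the frame carries a partial checksum that still has to be
 * completed, i.e. what a vNIC with checksum offload would consume. */
int raw_needs_csum(const struct virtio_net_hdr *hdr)
{
    return (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) != 0;
}
```

The gso_type and gso_size fields of the same header are what would, in principle, let TSO stay enabled on the host interface.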



  #endif
"-netdev socket,id=str[,fd=h][,listen=[host]:port][,connect=host:port]\n"
" configure a network backend to connect to another network\n"
@@ -2463,6 +2470,32 @@ qemu-system-i386 linux.img -net nic -net gre,src=4.2.3.1,dst=1.2.3.4
    @end example
  address@hidden -netdev raw,address@hidden,address@hidden
address@hidden -net raw[,address@hidden,address@hidden,address@hidden
+Connect VLAN @var{n} directly to an Ethernet interface using raw socket.
+
+This transport allows a VM to bypass most of the network stack which is
+extremely useful for tapping.
+
address@hidden address@hidden
+    interface name (mandatory)
+
address@hidden
+# set up the interface - put it in promiscuous mode and turn off offloads
+ifconfig eth0 up
+ifconfig eth0 promisc
+
+/sbin/ethtool -K eth0 gro off
+/sbin/ethtool -K eth0 tso off
+/sbin/ethtool -K eth0 gso off
+/sbin/ethtool -K eth0 tx off

Any reason to turn off tx here?

Yes: the checksum is already computed and the packet is written out whole, as is. You do not want it recomputed, because at least some adapters do silly things when you write raw frames that already carry a checksum.

This looks like a driver bug?

Turning off GRO is easier to understand, since the guest may not handle big packets with partial checksums. But turning off TSO, GSO and TX checksumming still looks questionable for a NIC that may want to offload them to the card (e.g. virtio-net).


Once again, this is one of the pros/cons of using tpacket vs recv/send (with or without mmsg) on a raw socket.

recv(m)msg/send(m)msg are brute force as far as offloads go, but things like scatter/gather work correctly, so there is little copying.
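As an illustration of the scatter/gather point, here is a minimal sketch using plain recvmsg() with two iovec entries, so the Ethernet header and the payload land in separate buffers without an extra copy. The function name recv_frame_sg() and the ETH_HDR_LEN constant are assumptions made for this example; the batched recvmmsg() variant takes an array of the same msghdr setup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define ETH_HDR_LEN 14  /* dst MAC + src MAC + EtherType */

/* Receive one frame, scattering it into a header buffer and a payload
 * buffer in a single syscall. Returns the payload length, or -1 on
 * error or if the frame is shorter than an Ethernet header. A payload
 * longer than payload_max is silently truncated in this sketch. */
ssize_t recv_frame_sg(int fd, uint8_t hdr[ETH_HDR_LEN],
                      uint8_t *payload, size_t payload_max)
{
    struct iovec iov[2];
    struct msghdr msg;
    ssize_t n;

    assert(hdr != NULL && payload != NULL);

    iov[0].iov_base = hdr;
    iov[0].iov_len = ETH_HDR_LEN;
    iov[1].iov_base = payload;
    iov[1].iov_len = payload_max;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = iov;
    msg.msg_iovlen = 2;

    n = recvmsg(fd, &msg, 0);
    if (n < ETH_HDR_LEN) {
        return -1;
    }
    return n - ETH_HDR_LEN;
}
```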

Compared to that, tpacket gives you some access to checksumming, which you can map onto checksum offload in a vNIC. The price is that you end up copying in more cases than with send/recvmmsg, and you pay a penalty for timestamping if the NIC has no hardware timestamp source.

The other issue I always had with tpacket is that you "see" your own packets, so you have to manage an RX-side BPF filter which removes them.

I don't follow here; it looks like I don't hit this 'issue'. Anyway, we can discuss this when I post the tpacket backend.

Thanks.

That can get quite interesting if you have a lot of MACs on a NIC (e.g. when there are multicast apps). Not sure if this is still the case, but it definitely was in mid-3.x Linux kernels. If you use raw send(m)msg there is no issue: the packets are not looped back when writing to physical interfaces.
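For context, the kernel marks locally originated frames that are looped back to a packet socket with sll_pkttype == PACKET_OUTGOING in the sockaddr_ll returned by recvfrom(). The sketch below shows one possible user-space check based on that field, as an alternative to the BPF filter mentioned above. The function names are hypothetical, and whether this catches the case described here is an assumption that has not been verified against tpacket.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/if_packet.h>

/* Non-zero if the frame originated on the local host and was looped
 * back to the packet socket rather than received from the wire. */
int frame_is_own_tx(const struct sockaddr_ll *from)
{
    assert(from != NULL);
    return from->sll_pkttype == PACKET_OUTGOING;
}

/* Read the next frame that was not sent by the local host.
 * Returns the frame length, or -1 on error with errno set. */
ssize_t recv_skip_outgoing(int fd, void *buf, size_t len)
{
    for (;;) {
        struct sockaddr_ll from;
        socklen_t from_len = sizeof(from);
        ssize_t n = recvfrom(fd, buf, len, 0,
                             (struct sockaddr *)&from, &from_len);
        if (n < 0) {
            return -1;
        }
        if (!frame_is_own_tx(&from)) {
            return n;
        }
        /* Looped-back local frame: drop it and read the next one. */
    }
}
```

This costs a wakeup per dropped frame, which is why an in-kernel filter is usually preferable on busy interfaces.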


+
+# launch QEMU instance - if your network has reorder or is very lossy add ,pincounter
+
+qemu-system-i386 linux.img -net nic -net raw,ifname=eth0

Can we switch to using -netdev here?

This is done in the new revisions.


Thanks

+
address@hidden example
+
@item -netdev vde,address@hidden,address@hidden,address@hidden,address@hidden,address@hidden @itemx -net vde[,address@hidden,address@hidden,address@hidden [,address@hidden,address@hidden,address@hidden Connect VLAN @var{n} to PORT @var{n} of a vde switch running on host and





