qemu-stable

Re: [Qemu-stable] [ovs-dev] If 1 KVM Guest loads the virtio-pci, on top


From: Martinx - ジェームズ
Subject: Re: [Qemu-stable] [ovs-dev] If 1 KVM Guest loads the virtio-pci, on top of dpdkvhostuser OVS socket interface, it slows down everything!
Date: Mon, 30 May 2016 13:29:48 -0400

Answers inline, as follows:

On 30 May 2016 at 12:44, Bodireddy, Bhanuprakash <address@hidden> wrote:

From: Martinx - ジェームズ [mailto:address@hidden]
Sent: Monday, May 30, 2016 5:01 PM
To: Bodireddy, Bhanuprakash <address@hidden>
Cc: Christian Ehrhardt <address@hidden>; <address@hidden> <address@hidden>; dev <address@hidden>; address@hidden
Subject: Re: [ovs-dev] If 1 KVM Guest loads the virtio-pci, on top of dpdkvhostuser OVS socket interface, it slows down everything!

 

Hello Bhanu,

 

 I'm a little bit confused: you said that the problem can be fixed but, later, you also said that:

 "On a multi-VM setup, even with the above patch applied, one might see an aggregate throughput difference when the vNIC is bound to igb_uio vs virtio-pci"...

 

 My idea is to use OVS with DPDK in a multi-VM environment but, based on your answer, this is not possible, because VM A can interfere with VM B... Is that true even with that patch applied? Can you confirm this?

 

[BHANU] With the patch applied the issue should be fixed. Without the patch, VM A can interfere with VM B when VM A isn’t processing its queues, which eventually triggers vhost send retries until timeout (100 ms in this case). This causes the pmd thread to slow down, which affects the other virtual machines (VM B) on the host, as happened in your case. The patch that I pointed to removes the retry logic completely.
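
To illustrate the mechanism described above, here is a rough toy model in Python. It is not OVS or DPDK code: the port names, the per-burst cost, and the single-threaded round-robin loop are assumptions made for illustration, and only the 100 ms retry timeout comes from the explanation above. It sketches why one guest that does not drain its queues can starve its neighbours when the polling thread retries until timeout.

```python
# Toy model only: this is NOT OVS or DPDK code. All names and numbers other
# than the 100 ms retry timeout quoted above are illustrative assumptions.

RETRY_TIMEOUT_S = 0.100   # retry-until-timeout figure quoted in the thread
BURST_COST_S = 0.00001    # assumed cost of handling one burst (10 us)


def simulate_pmd(ports, duration_s, retry_enabled):
    """Count bursts delivered per port by a single polling thread.

    ports maps a port name to True if its guest drains its virtqueues,
    or False if the guest is idle (its queues stay full).
    """
    delivered = {name: 0 for name in ports}
    clock = 0.0
    while clock < duration_s:
        for name, guest_drains in ports.items():
            if guest_drains:
                clock += BURST_COST_S
                delivered[name] += 1
            elif retry_enabled:
                # The thread keeps retrying the full queue until the timeout
                # expires, so every other port has to wait for it.
                clock += RETRY_TIMEOUT_S
            else:
                # Without retries the burst is dropped and the thread moves on.
                clock += BURST_COST_S
    return delivered


ports = {"vm_a": False, "vm_b": True, "vm_c": True}
with_retry = simulate_pmd(ports, duration_s=1.0, retry_enabled=True)
without_retry = simulate_pmd(ports, duration_s=1.0, retry_enabled=False)
print("with retries:   ", with_retry)
print("without retries:", without_retry)
```

In this model the healthy ports deliver only a handful of bursts per simulated second while retries are enabled, and several thousand times more once the retry branch is removed. The absolute numbers mean nothing; only the ratio is the point.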


Sounds cool! I'll try that.
 

 

Also, in your case, packets were sent to the idle VM (no packets drained from the virtqueues inside), which triggered the issue and affected the neighboring VMs. Consider sending traffic to the newly booted VM only after forwarding is enabled inside the guest.


This looks impossible to do in a dynamic environment... Especially if using OpenStack with OVS + DPDK at the Compute Nodes... Customers will launch their NFV (DPDK Apps) and the traffic will already be there... I can't see a way of "stopping traffic" if the l2 forward goes down, or if a customer reboots their Instance (which will momentarily bring back virtio-pci).

Even on KVM-Only environments, this looks very hard to do and a nightmare to manage...

How can I tell a customer that, before running "service my-l2-dpdk-app stop", they need to contact the "cloud provider", or access the KVM host as root to manage OVS+DPDK, to stop sending traffic to the VM?

 

 I don't think that diverting the traffic from a VM that loaded virtio-pci drivers is a doable solution (since you can't predict what the owners of the VMs will be doing), especially because in my env the DPDK App is an L2 bridge, so it receives traffic that is not destined to it (which might make this even harder to do)...

 

 I have all the required hardware to keep testing this, so, let me know when you guys (Intel / Canonical) have newer versions, I'll test it with pleasure!   :-)

[BHANU] Apply the patch from the thread and this should resolve the issue reported.


I'll definitely give it a try! Thank you for sharing that... I'll post the results soon.
 

 

Thanks!

Thiago

 

On 25 May 2016 at 11:00, Bodireddy, Bhanuprakash <address@hidden> wrote:

I could reproduce the issue and this can be fixed as below

Firstly, the throughput issues observed with other VMs when a new VM is started can be fixed using the patch in the thread http://openvswitch.org/pipermail/dev/2016-May/071615.html.  I have posted an explanation in that thread of the cause of the issue, especially with a multi-VM setup on OVS DPDK.

On a multi-VM setup, even with the above patch applied, one might see an aggregate throughput difference when the vNIC is bound to igb_uio vs virtio-pci. This is because the interrupt overhead is significantly higher when virtio-pci is in use.

More importantly, if you have set up explicit flows matching the VMs' MAC/IP, disabling the flows to the VMs that are idle would improve the aggregate throughput and lessen the burden on the pmd thread.   'watch -d ./utilities/ovs-appctl dpctl/show -s' will show the packet statistics.
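
As a rough illustration of spotting idle ports from packet statistics, the Python sketch below compares two counter snapshots and lists the ports whose counter did not advance. It does not parse real `ovs-appctl` output: the port names and counter values are hypothetical, and it assumes the counters have already been extracted into plain dictionaries. Which counter is meaningful for deciding that a VM is idle depends on the setup.

```python
def idle_ports(before, after):
    """Return the ports whose packet counter did not advance.

    before and after map a port name to a cumulative packet count taken at
    two points in time. A port missing from before is treated as starting
    at zero.
    """
    return sorted(p for p in after if after[p] - before.get(p, 0) == 0)


# Hypothetical snapshots, a few seconds apart.
snapshot_1 = {"vhost-user-1": 1200, "vhost-user-2": 5000, "vhost-user-3": 0}
snapshot_2 = {"vhost-user-1": 1200, "vhost-user-2": 9100, "vhost-user-3": 0}

print(idle_ports(snapshot_1, snapshot_2))
```

Ports reported here would be the candidates for having their flows disabled, as suggested above.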

Regards,
Bhanu Prakash.


>-----Original Message-----
>From: dev [mailto:address@hidden] On Behalf Of Christian
>Ehrhardt
>Sent: Wednesday, May 25, 2016 7:08 AM
>To: Martinx - ジェームズ <address@hidden>
>Cc: <address@hidden> <address@hidden>; dev <address@hidden>;
>address@hidden
>Subject: Re: [ovs-dev] If 1 KVM Guest loads the virtio-pci, on top of
>dpdkvhostuser OVS socket interface, it slows down everything!
>

>Hi again,
>another forgotten case.
>
>I currently lack the HW to fully reproduce this, but the video summary is
>pretty good and shows the issue in an impressive way.
>
>Also the description is good and here as well I wonder if anybody else could
>reproduce this.
>Any hints / insights are welcome.
>
>P.S. and also again - two list cross posting, but here as well it is yet unclear
>which it belongs to so I'll keep it as well
>
>Christian Ehrhardt
>Software Engineer, Ubuntu Server
>Canonical Ltd
>
>On Sun, May 22, 2016 at 6:35 PM, Martinx - ジェームズ
><address@hidden>
>wrote:
>
>> Guys,
>>
>>  I'm seeing a strange problem here, in my OVS+DPDK deployment, on top
>> of Ubuntu 16.04 (DPDK 2.2 and OVS 2.5).
>>
>>  Here is what I'm trying to do: run OVS with DPDK at the host, for KVM
>> Guests that also, will be running more DPDK Apps.
>>
>>  The host has 2 x 10G NICs, for OVS+DPDK and each KVM Guest receives
>> its own VLAN tagged traffic (or all tags).
>>
>>  There is an IXIA Traffic Generator sending 10G of traffic on both
>> directions (20G total).
>>
>>  For example, the problem is: let's say that I already have 2 VMs (or
>> 10) running DPDK Apps (on top of dpdkvhostuser), everything is working
>> as expected, then, if I boot the 3rd (or 11th) KVM Guest, the OVS+DPDK
>> bridge at the host slows down, a lot! The 3rd (or 11th) VM affects not
>> only the host, but also all the other neighbor VMs!!!
>>
>>  NOTE: This problem appears as early as the boot of VM 1.
>>
>>  As soon as you, inside of the 3rd VM, bind the VirtIO NIC to the
>> DPDK-compatible drivers, the speed comes back to normal. If you bind it
>> back to "virtio-pci", boom! The OVS+DPDK at the host and all VMs lose
>> too much speed.
>>
>>  This problem is detailed at the following bug report:
>>
>> --
>> The OVS+DPDK dpdkvhostuser socket bridge, only works as expected, if
>> the KVM Guest also have DPDK drivers loaded:
>>
>> https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/1577256
>> --
>>
>>  Also, I've recorded a ~15 min screen cast video about this problem,
>> so, you guys can see exactly what is happening here.
>>
>>
>> https://www.youtube.com/v/yHnaSikd9XY?version=3&vq=hd720&autoplay=1
>>
>>  * At 5:25, I'm starting a VM that will boot up and load a DPDK App;
>>
>>  * At 5:33, OVS+DPDK is messed up, it loses speed;
>>    The KVM running with virtio-pci drivers breaks OVS+DPDK at the
>> host;
>>
>>  * At 6:50, DPDK inside of the KVM guest loads up its drivers, kicking
>> "virtio-pci", speed back to normal at the host;
>>
>>  * At 7:43, started another KVM Guest, now, while virtio-pci driver is
>> running, the OVS+DPDK at the host and the other VM, are very, very
>> slow;
>>
>>  * At 8:52, the second VM loads up DPDK Drivers, kicking virtio-pci,
>> the speed is back to normal at the host, and on the other VM too;
>>
>>  * At 10:00, the Ubuntu VM loads up virtio-pci drivers on its boot,
>> the speed dropped at the hosts and on the other VMs;
>>
>>  * 11:57, I'm starting "service dpdk start" inside of the Ubuntu
>> guest, to kick up virtio-pci, and bang! Speed is back to normal
>> everywhere;
>>
>>  * 12:51, I'm trying to unbind the DPDK Drivers and return to
>> virtio-pci, I forgot the syntax while recording the video, which is:
>> "dpdk_nic_bind -b <ID> virtio-pci", so, I just rebooted it. But both
>> "reboot" and "rebind to virtio-pci" trigger the bug.
>>
>>
>> NOTE: I tried to subscribe to qemu-devel but it is not working, I'm
>> not receiving the confirmation e-mail, while qemu-stable worked. I
>> don't know if it is worth sending it to Linux Kernel too...
>>
>>
>> Regards,
>> Thiago
>>

>_______________________________________________
>dev mailing list
>address@hidden
>http://openvswitch.org/mailman/listinfo/dev

 


