From: Chris Friesen
Subject: Re: [Qemu-devel] high outage times for qemu virtio network links during live migration, trying to debug
Date: Tue, 26 Jan 2016 11:21:22 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0

On 01/26/2016 10:50 AM, Paolo Bonzini wrote:
> On 26/01/2016 17:41, Chris Friesen wrote:
>> I'm using libvirt (1.2.12) with qemu (2.2.0) in the context of OpenStack.
>>
>> If I live-migrate a guest with virtio network interfaces, I see a
>> ~1200 msec delay in processing the network packets, and several
>> hundred of them get dropped.  The dropped packets make sense to me,
>> but I'm not sure why the delay is there.
>>
>> I instrumented qemu and libvirt, and the strange thing is that this
>> delay seems to happen before qemu actually starts doing any
>> migration-related work (i.e. before qmp_migrate() is called).
>>
>> Looking at my timestamps, the start of the glitch seems to coincide
>> with libvirtd calling qemuDomainMigratePrepareTunnel3Params(), and
>> the end of the glitch occurs when the migration is complete and
>> we're up and running on the destination.
>>
>> My question is: why doesn't qemu continue processing virtio packets
>> while the dirty page scanning and memory transfer over the network
>> are proceeding?
>
> QEMU (or vhost) _is_ processing virtio traffic, because otherwise
> you'd have no delay, only dropped packets.  Or am I missing something?

I have separate timestamps embedded in each packet for when it was sent
and when it was echoed back by the target (which is the one being
migrated).  What I'm seeing is that packets to the guest are sent every
msec, but while the migration is in progress they get delayed somewhere
for over a second on the way to the destination VM.  Once the migration
is over, the queued packets get delivered to the app in the guest,
processed all at once, and echoed back to the sender in a big burst
(and many packets are dropped outright, presumably because a buffer
overflows somewhere).
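
For reference, the sender side of my probe is conceptually something
like the sketch below.  This is a simplified stand-in rather than the
actual tool: the guest address, the port, and the trivial UDP echo
service assumed to be running in the guest are all placeholders.

#!/usr/bin/env python3
# Simplified sketch of the sender-side probe, not the real tool.
# Assumes a trivial UDP echo service in the guest, e.g.
#   socat UDP-LISTEN:9000,fork EXEC:cat
# GUEST_ADDR and the port are placeholders.
import socket
import struct
import time

GUEST_ADDR = ("192.168.122.50", 9000)   # placeholder guest IP/port
INTERVAL = 0.001                        # one probe per millisecond

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setblocking(False)

seq = 0
outstanding = {}          # seq -> send timestamp; leftovers are drops
next_send = time.monotonic()

while True:
    now = time.monotonic()
    if now >= next_send:
        # 8-byte sequence number + 8-byte send timestamp, network order
        sock.sendto(struct.pack("!Qd", seq, now), GUEST_ADDR)
        outstanding[seq] = now
        seq += 1
        next_send += INTERVAL
    try:
        data, _ = sock.recvfrom(64)
        echo_seq, send_ts = struct.unpack("!Qd", data)
        outstanding.pop(echo_seq, None)
        rtt_ms = (time.monotonic() - send_ts) * 1000.0
        if rtt_ms > 10.0:  # well above the 1 msec cadence => a stall
            print("seq=%d rtt=%.1f ms" % (echo_seq, rtt_ms))
    except BlockingIOError:
        pass

Anything that comes back much later than the 1 msec send cadence shows
up as a stall, and whatever never comes back at all is counted as a
drop.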

For comparison, we have a DPDK-based fastpath NIC type that we added
(sort of like vhost-net), and it continues to process packets while the
dirty page scanning is going on.  Only the actual cutover affects it.

Chris



