[Qemu-devel] [PATCH v1 0/6] A migration performance testing framework

From:	Daniel P. Berrange
Subject:	[Qemu-devel] [PATCH v1 0/6] A migration performance testing framework
Date:	Thu, 5 May 2016 15:27:54 +0100

This series of patches provides a framework for testing migration performance
characteristics. The motivating factor for this is planning that is underway
in OpenStack wrt making use of QEMU migration features such as compression,
auto-converge and post-copy. The primary aim for OpenStack is to have Nova
autonomously manage migration features & tunables to maximise chances that
migration will complete. The problem faced is figuring out just which QEMU
migration features are "best" suited to our needs. This means we want data
on how well they are able to ensure completion of a migration, against the
host resources used and the impact on the guest workload performance.

The test framework produced here uses a pathological guest workload (every
CPU just burning 100% of its time xor'ing every byte of guest memory with
random data). This is quite a pessimistic test because most guest workloads
are not going to be this heavy on memory writes, and their data won't be
uniformly random and so will compress better than this test does.
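
To make the shape of that workload concrete, here is a minimal Python sketch
of the same idea. It is not the code in tests/migration/stress.c; the page
size, the buffer size and the function names are assumptions made purely for
illustration. It XORs a buffer with a random pattern and reports the cost in
the same ms/GB unit used in the charts.

```python
import os
import time

PAGE_SIZE = 4096  # assumed page size for this sketch


def xor_pass(memory: bytearray, pattern: bytes) -> None:
    """XOR every byte of `memory` in place with a repeating pattern."""
    step = len(pattern)
    for offset in range(0, len(memory), step):
        chunk = memory[offset:offset + step]
        mask = pattern[:len(chunk)]
        # XOR the whole chunk at once via arbitrary-precision integers.
        mixed = int.from_bytes(chunk, "little") ^ int.from_bytes(mask, "little")
        memory[offset:offset + len(chunk)] = mixed.to_bytes(len(chunk), "little")


def ms_per_gb(memory: bytearray, passes: int = 3) -> float:
    """Time repeated XOR passes and normalise to milliseconds per GiB."""
    pattern = os.urandom(PAGE_SIZE)
    start = time.perf_counter()
    for _ in range(passes):
        xor_pass(memory, pattern)
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 * (2**30) / (len(memory) * passes)


if __name__ == "__main__":
    guest_ram = bytearray(16 * 2**20)  # 16 MiB stand-in for guest RAM
    print(f"{ms_per_gb(guest_ram):.0f} ms/GB")
```

Every page is rewritten on every pass, which is what makes this a worst case
for any pre-copy migration strategy.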

With this worst case guest, I have produced a set of tests using UNIX socket,
TCP localhost, TCP remote and RDMA remote socket transports, with both a
1 GB RAM + 1 CPU guest and an 8 GB RAM + 4 CPU guest.

The TCP/RDMA remote host tests were run over a 10 GigE network interface.

I have put the results online to view here:

https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/

The charts here are showing two core sets of data:

- The guest CPU performance. The left axis is showing the time in milliseconds
required to xor 1 GB of memory. This is shown per guest CPU and combined
across all CPUs.

- The host CPU utilization. The right axis is showing the overall QEMU process
CPU utilization, and the per-VCPU utilization.

Note that the charts are interactive - you can turn on/off each plot line and
zoom in by selecting regions on the chart.

Some interesting things that I have observed with this:

- At the start of each iteration of migration there is a distinct drop in
guest CPU performance as shown by a spike in the guest CPU time lines.
Performance would drop from 200ms/GB to 400ms/GB. Presumably this is
related to QEMU recalculating the dirty bitmap for the guest RAM. See
the spikes in the green line in:

https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/post-copy-bandwidth/post-copy-bw-1gbs.html

- For the larger sized guests, the auto-converge code has to throttle the
guest by as much as 90% or more before it is able to meet the 500ms max
downtime value:

https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/auto-converge-bandwidth/auto-converge-bw-1gbs.html

Even then I often saw tests aborting as they hit the max number of
iterations I permitted (30 iters max):

https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-8gb-4cpu/auto-converge-bandwidth/auto-converge-bw-10gbs.html

- MT compression is actively harmful to chances of successful migration when
the guest RAM is not compression friendly. My workload is worst case since
it is splattering RAM with totally random bytes. The MT compression is
dramatically increasing the time for each iteration as we bottleneck on CPU
compression speed, leaving the network largely idle. This causes migrations
that would have completed without compression to fail. It also burns huge
amounts of host CPU time.

https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/compr-mt/compr-mt-threads-4.html

- XBZRLE compression did not have as much of a CPU performance penalty on the
host as MT compression, but also did not help migration to actually complete.
Again this is largely due to the workload being the worst case scenario with
random bytes. The downside is obviously the potentially significant memory
overhead on the host due to the cache sizing.

https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/compr-xbzrle/compr-xbzrle-cache-50.html

- Post-copy, by its very nature, obviously ensured that the migration would
complete. While post-copy was running in pre-copy mode there was a somewhat
chaotic small impact on guest CPU performance, causing performance to
periodically oscillate between 400ms/GB and 800ms/GB. This is less than
the impact at the start of each migration iteration which was 1000ms/GB
in this test. There was also a massive penalty at time of switchover from
pre to post copy, as to be expected. The migration completed in post-copy
phase quite quickly though. For this workload, number of iterations in
pre-copy mode before switching to post-copy did not have much impact. I
expect a less extreme workload would have shown more interesting results
wrt number of iterations of pre-copy:

https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-8gb-4cpu/post-copy-iters.html
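
The auto-converge observations above can be reasoned about with a very
simplified model of pre-copy. This is a sketch under stated assumptions, not
QEMU's actual algorithm: each iteration sends whatever is outstanding, the
guest dirties memory at a constant rate while that happens (capped at the RAM
size), and migration may finish once the outstanding data fits in the max
downtime window. The function name and the numbers used are illustrative only
and are not taken from the test results.

```python
from typing import Optional


def precopy_iterations(
    ram_bytes: float,
    dirty_rate: float,       # bytes/sec dirtied by the unthrottled guest
    bandwidth: float,        # bytes/sec available for migration
    max_downtime: float = 0.5,
    max_iters: int = 30,
    throttle: float = 0.0,   # fraction of guest CPU time taken away
) -> Optional[int]:
    """Return the iteration after which switchover is possible, else None."""
    effective_dirty = dirty_rate * (1.0 - throttle)
    remaining = float(ram_bytes)
    for iteration in range(1, max_iters + 1):
        transfer_time = remaining / bandwidth
        # Memory dirtied while this iteration was on the wire.
        remaining = min(float(ram_bytes), effective_dirty * transfer_time)
        if remaining / bandwidth <= max_downtime:
            return iteration
    return None


if __name__ == "__main__":
    ram = 8 * 2**30
    for throttle in (0.0, 0.5, 0.9):
        result = precopy_iterations(ram, 4.0e9, 1.0e9, throttle=throttle)
        print(f"throttle={throttle:.0%}: {result}")
```

In this model a guest that dirties memory faster than the link can carry it
never converges, however many iterations are allowed, and throttling only
helps once the effective dirty rate drops below the bandwidth.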

Overall, if we're looking for a solution that can guarantee completion under
the most extreme guest workload, then only post-copy & auto-converge appear
up to the job.
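
For reference, these features are turned on through QMP before the migration
is started. The sketch below only builds the JSON payloads; the helper name is
hypothetical, and while migrate-set-capabilities, the auto-converge and
postcopy-ram capabilities and migrate-start-postcopy are the names I believe
QEMU uses, they should be checked against the QMP reference for the QEMU
version in question.

```python
import json


def capability_command(**states: bool) -> dict:
    """Build a migrate-set-capabilities payload from keyword arguments.

    Underscores in the keyword names are mapped to the dashes used by QMP,
    e.g. auto_converge=True becomes the "auto-converge" capability.
    """
    return {
        "execute": "migrate-set-capabilities",
        "arguments": {
            "capabilities": [
                {"capability": name.replace("_", "-"), "state": bool(state)}
                for name, state in sorted(states.items())
            ]
        },
    }


# Issued later, while migration is running, to switch from pre- to post-copy.
START_POSTCOPY = {"execute": "migrate-start-postcopy"}


if __name__ == "__main__":
    print(json.dumps(capability_command(auto_converge=True)))
    print(json.dumps(capability_command(postcopy_ram=True)))
    print(json.dumps(START_POSTCOPY))
```

Post-copy has to be enabled on both the source and the destination before the
migration begins.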

The MT compression is seriously harmful to migration and has severe CPU
overhead. The XBZRLE compression is moderately harmful to migration and has
potentially severe memory overhead, since large cache sizes are needed to
make it useful.
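
XBZRLE works on the difference between a page and a cached earlier copy of
it, so its benefit depends on how much of the page is unchanged. The sketch
below is not the XBZRLE encoder, only a rough measure of what such a delta
scheme has to work with: it XORs the old and new page and reports the
fraction of bytes that did not change. The function names and the page size
are assumptions for illustration.

```python
import os

PAGE_SIZE = 4096  # assumed page size for this sketch


def xor_delta(old_page: bytes, new_page: bytes) -> bytes:
    """Byte-wise XOR of two equally sized pages; zero means unchanged."""
    return bytes(a ^ b for a, b in zip(old_page, new_page))


def unchanged_fraction(old_page: bytes, new_page: bytes) -> float:
    """Fraction of bytes a delta encoder could skip as unchanged."""
    delta = xor_delta(old_page, new_page)
    return delta.count(0) / len(delta)


if __name__ == "__main__":
    cached = os.urandom(PAGE_SIZE)

    # A typical small update: one byte of the page changes.
    small_edit = bytearray(cached)
    small_edit[100] ^= 0xFF

    # The stress workload: the whole page is XOR'd with random data.
    mask = os.urandom(PAGE_SIZE)
    rewritten = xor_delta(cached, mask)

    print(f"small edit: {unchanged_fraction(cached, bytes(small_edit)):.4f}")
    print(f"rewritten:  {unchanged_fraction(cached, rewritten):.4f}")
```

With the whole page rewritten there is almost nothing left unchanged, so the
cache costs memory without saving any bandwidth.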

While auto-converge can ensure that guest migration completes, it has a
significant long-term impact on guest CPU performance to achieve this, i.e.
the guest spends a long time in pre-copy mode with its CPUs very
dramatically throttled down. The level of throttling required makes one
wonder whether it is worth using, as opposed to simply pausing the guest
workload. The latter has a hard blackout period, but over quite a short time
frame if network speed is fast.
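
As a back-of-the-envelope check on that blackout period, the time to send all
of guest RAM with the guest paused is simply RAM size divided by link speed.
The sketch below assumes the full nominal line rate is available and ignores
protocol overhead and any zero-page optimisation, so real figures will
differ; the function name is hypothetical.

```python
GIB = 2**30


def blackout_seconds(ram_bytes: int, link_bits_per_sec: float) -> float:
    """Time to transfer all of RAM over the link with the guest paused."""
    return ram_bytes * 8 / link_bits_per_sec


if __name__ == "__main__":
    ten_gige = 10e9  # nominal 10 Gbit/s line rate
    for ram_gib in (1, 8):
        secs = blackout_seconds(ram_gib * GIB, ten_gige)
        print(f"{ram_gib} GiB guest: about {secs:.1f} s paused")
```

For the two guest sizes tested this gives under a second for 1 GB and roughly
seven seconds for 8 GB, to be weighed against a long period of heavy
throttling.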

The post-copy code does have an impact on guest performance while in pre
copy mode, vs a plain migration. It also has a fairly high spike when in
post-copy mode, but this lasts for a pretty short time. As compared to
auto-converge, it is able to ensure the migration completes in a finite
time without having a prolonged impact on guest CPU performance. The
penalty during the post-copy phase is on a par with the penalty imposed
by auto-converge when it has to throttle to 90%+.

Overall, in the context of a worst case guest workload, it appears that
post-copy is the clear winning strategy, ensuring completion of migration
without imposing a long duration penalty on guest performance. If the
risk of failure from post-copy is unacceptable then auto-converge is a
good fallback option, if the long duration guest CPU penalty can be
accepted.

The compression options are only worth using if the host has free CPU
resources, and the guest RAM is believed to be compression friendly,
as they steal significant CPU time away from guests in order to run
compression, often with a negative impact on migration completion
chances.
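
One rough way to judge whether guest RAM is "compression friendly" is to
compress a few sample pages and look at the ratio. The sketch below uses
Python's zlib at a low compression level as a stand-in; the sample pages and
the helper name are made up for illustration and say nothing about any real
guest.

```python
import os
import zlib

PAGE_SIZE = 4096  # assumed page size for this sketch


def compression_ratio(page: bytes, level: int = 1) -> float:
    """Compressed size divided by original size; lower is better."""
    return len(zlib.compress(page, level)) / len(page)


if __name__ == "__main__":
    samples = {
        "random (stress workload)": os.urandom(PAGE_SIZE),
        "zero page": bytes(PAGE_SIZE),
        "repetitive text": (b"GET /index.html HTTP/1.1\r\n" * 200)[:PAGE_SIZE],
    }
    for name, page in samples.items():
        print(f"{name}: ratio {compression_ratio(page):.3f}")
```

Random pages come out slightly larger than they went in, so all the CPU time
spent compressing them is wasted, whereas zero-filled or repetitive pages
shrink to a small fraction of their size.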

Looking at migration in general, even with a 10 GigE NIC and RDMA
transport it is possible for a single guest to provide a workload that
will saturate the network during migration & thus prevent completion.
Based on this, there is little point in attempting to run migrations
in parallel on the same host, unless multiple NICs are available,
as parallel migrations would reduce the chances of either one ever
completing. Better reliability & faster overall completion would
likely be achieved by fully serializing migration operations per
host.
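
The argument for serializing can be put in numbers with a small sketch. It
assumes parallel migrations split the link evenly and that a migration can
only converge if the guest dirties memory more slowly than its share of the
link; both are simplifications, and the function names and rates are
illustrative rather than measured.

```python
def per_migration_bandwidth(link_bytes_per_sec: float, parallel: int) -> float:
    """Even share of the link for each of `parallel` concurrent migrations."""
    return link_bytes_per_sec / parallel


def can_converge(dirty_rate: float, link_bytes_per_sec: float,
                 parallel: int = 1) -> bool:
    """True if the guest dirties memory more slowly than its link share."""
    return dirty_rate < per_migration_bandwidth(link_bytes_per_sec, parallel)


if __name__ == "__main__":
    link = 1.25e9         # nominal 10 Gbit/s expressed in bytes/sec
    guest_dirty = 0.8e9   # illustrative dirty rate, bytes/sec
    for parallel in (1, 2, 4):
        ok = can_converge(guest_dirty, link, parallel)
        print(f"{parallel} concurrent migration(s): converges={ok}")
```

A guest that would converge with the whole link to itself stops converging as
soon as it has to share, which is the case for running migrations one at a
time per host.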

There is clearly scope for more investigation here, in particular

- Produce some alternative guest workloads that try to present
a more "average" scenario workload, instead of the worst-case.
These would likely allow compression to have some positive
impact.

- Try various combinations of strategies. For example, combining
post-copy and auto-converge at the same time, or compression
combined with either post-copy or auto-converge.

- Investigate block migration performance too, with NBD migration
server.

- Investigate effect of dynamically changing max downtime value
during migration, rather than using a fixed 500ms value.

Daniel P. Berrange (6):
  scripts: add __init__.py file to scripts/qmp/
  scripts: add a 'debug' parameter to QEMUMonitorProtocol
  scripts: refactor the VM class in iotests for reuse
  scripts: set timeout when waiting for qemu monitor connection
  scripts: ensure monitor socket has SO_REUSEADDR set
  tests: introduce a framework for testing migration performance

 configure                               |   2 +
 scripts/qemu.py                         | 202 +++++++++++
 scripts/qmp/__init__.py                 |   0
 scripts/qmp/qmp.py                      |  15 +-
 scripts/qtest.py                        |  34 ++
 tests/Makefile                          |  12 +
 tests/migration/.gitignore              |   2 +
 tests/migration/guestperf-batch.py      |  26 ++
 tests/migration/guestperf-plot.py       |  26 ++
 tests/migration/guestperf.py            |  27 ++
 tests/migration/guestperf/__init__.py   |   0
 tests/migration/guestperf/comparison.py | 124 +++++++
 tests/migration/guestperf/engine.py     | 439 ++++++++++++++++++++++
 tests/migration/guestperf/hardware.py   |  62 ++++
 tests/migration/guestperf/plot.py       | 623 ++++++++++++++++++++++++++++++++
 tests/migration/guestperf/progress.py   | 117 ++++++
 tests/migration/guestperf/report.py     |  98 +++++
 tests/migration/guestperf/scenario.py   |  95 +++++
 tests/migration/guestperf/shell.py      | 255 +++++++++++++
 tests/migration/guestperf/timings.py    |  55 +++
 tests/migration/stress.c                | 367 +++++++++++++++++++
 tests/qemu-iotests/iotests.py           | 135 +------
 22 files changed, 2583 insertions(+), 133 deletions(-)
create mode 100644 scripts/qemu.py
create mode 100644 scripts/qmp/__init__.py
create mode 100644 tests/migration/.gitignore
create mode 100755 tests/migration/guestperf-batch.py
create mode 100755 tests/migration/guestperf-plot.py
create mode 100755 tests/migration/guestperf.py
create mode 100644 tests/migration/guestperf/__init__.py
create mode 100644 tests/migration/guestperf/comparison.py
create mode 100644 tests/migration/guestperf/engine.py
create mode 100644 tests/migration/guestperf/hardware.py
create mode 100644 tests/migration/guestperf/plot.py
create mode 100644 tests/migration/guestperf/progress.py
create mode 100644 tests/migration/guestperf/report.py
create mode 100644 tests/migration/guestperf/scenario.py
create mode 100644 tests/migration/guestperf/shell.py
create mode 100644 tests/migration/guestperf/timings.py
create mode 100644 tests/migration/stress.c

--
2.5.5

Current Thread

[Qemu-devel] [PATCH v1 0/6] A migration performance testing framework, Daniel P. Berrange <=
- [Qemu-devel] [PATCH v1 1/6] scripts: add __init__.py file to scripts/qmp/, Daniel P. Berrange, 2016/05/05
- [Qemu-devel] [PATCH v1 2/6] scripts: add a 'debug' parameter to QEMUMonitorProtocol, Daniel P. Berrange, 2016/05/05
- [Qemu-devel] [PATCH v1 3/6] scripts: refactor the VM class in iotests for reuse, Daniel P. Berrange, 2016/05/05
- [Qemu-devel] [PATCH v1 4/6] scripts: set timeout when waiting for qemu monitor connection, Daniel P. Berrange, 2016/05/05
- [Qemu-devel] [PATCH v1 5/6] scripts: ensure monitor socket has SO_REUSEADDR set, Daniel P. Berrange, 2016/05/05
  - Re: [Qemu-devel] [PATCH v1 5/6] scripts: ensure monitor socket has SO_REUSEADDR set, Amit Shah, 2016/05/23
- [Qemu-devel] [PATCH v1 6/6] tests: introduce a framework for testing migration performance, Daniel P. Berrange, 2016/05/05
- Re: [Qemu-devel] [PATCH v1 0/6] A migration performance testing framework, Dr. David Alan Gilbert, 2016/05/05
  - Re: [Qemu-devel] [PATCH v1 0/6] A migration performance testing framework, Daniel P. Berrange, 2016/05/09
    - Re: [Qemu-devel] [PATCH v1 0/6] A migration performance testing framework, Dr. David Alan Gilbert, 2016/05/09
