From: Ming Lei
Subject: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support
Date: Tue, 5 Aug 2014 11:33:01 +0800
Hi,
This patchset makes the following four changes:
- introduce an object allocation pool and apply it to
virtio-blk dataplane to improve its performance
- introduce a selective coroutine bypass mechanism
to improve the performance of virtio-blk dataplane with
raw format images
- linux-aio changes: fix the -EAGAIN and partial
completion cases, increase max events to 256, and remove one unused
field from 'struct qemu_laiocb'
- support multiple virtqueues for virtio-blk
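To illustrate the object allocation pool idea from the list above, here is a minimal sketch of a fixed-size pool that hands out preallocated request objects from a free stack instead of calling malloc()/free() per request. It is not the code from include/qemu/obj_pool.h in this patchset; the names Req, ReqPool, req_pool_get() and req_pool_put() and the capacity are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capacity; a real pool would be sized to the queue depth. */
#define POOL_CAPACITY 4

/* Stand-in for a per-request structure (illustrative only). */
typedef struct Req {
    int id;
} Req;

typedef struct ReqPool {
    Req objs[POOL_CAPACITY];        /* storage allocated once, up front */
    Req *free_list[POOL_CAPACITY];  /* stack of currently free objects */
    size_t nr_free;                 /* number of entries on the stack */
} ReqPool;

/* Put every preallocated object on the free stack. */
static void req_pool_init(ReqPool *p)
{
    for (size_t i = 0; i < POOL_CAPACITY; i++) {
        p->free_list[i] = &p->objs[i];
    }
    p->nr_free = POOL_CAPACITY;
}

/* Pop a free object, or return NULL when the pool is exhausted
 * (a caller could then fall back to a normal heap allocation). */
static Req *req_pool_get(ReqPool *p)
{
    return p->nr_free ? p->free_list[--p->nr_free] : NULL;
}

/* Push an object back so that a later request can reuse it. */
static void req_pool_put(ReqPool *p, Req *r)
{
    assert(p->nr_free < POOL_CAPACITY);
    p->free_list[p->nr_free++] = r;
}
```

Both get and put are O(1) and touch no allocator state, which is the kind of saving a pool is meant to give on a hot submission path. The sketch does no locking, so it assumes a single thread owns the pool.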
The virtio-blk multi virtqueue feature will be added to virtio spec 1.1[1],
and the 3.17 Linux kernel[2] will support the feature in the virtio-blk driver.
For those who want to try it out, the kernel side patches can be found
in either Jens's block tree[3] or linux-next[4].
The fio script below, run inside the VM, is used to measure the improvement from these patches:
[global]
direct=1
size=128G
bsrange=4k-4k
timeout=120
numjobs=${JOBS}
ioengine=libaio
iodepth=64
filename=/dev/vdc
group_reporting=1
[f]
rw=randread
One quad-core VM (8G RAM) is created on the host below to run the above fio test:
- server (16 cores: 8 physical cores, 2 threads per physical core)
The throughput (IOPS) results with this patchset (4 virtqueues per
virtio-blk device, 4 JOBS) compared against QEMU 2.1.0 are as
follows: a 53% throughput improvement can be observed, and
scalability for parallel I/O improves even more (>100% throughput
improvement is observed in the case of 4 JOBS).
From the above results, we can see that both scalability and performance
improve a lot.
After commit 580b6b2aa2 (dataplane: use the QEMU block
layer for I/O), the average time for submitting a single
request has increased a lot. According to my trace, the average
time taken to submit one request has doubled, even
though the block plug & unplug mechanism was introduced to
ease the effect. That is why this patchset first introduces
the selective coroutine bypass mechanism and the object allocation
pool to save that time. Based on QEMU 2.0, the
single virtio-blk dataplane multi virtqueue patch alone gets
a better improvement than the current result[5].
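As a rough illustration of the bypass idea described above, the toy model below takes a fast path for requests that do not need a coroutine and a slow path for those that do, and in both cases defers the completion callback rather than calling it inline from the submit path (in the spirit of running acb->cb() via a BH). This is only a sketch under stated assumptions: it does not use QEMU's coroutine or BH APIs, and every name in it (Request, needs_coroutine, submit(), run_deferred(), count_done()) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void CompletionFn(void *opaque, int ret);

typedef struct Request {
    bool needs_coroutine;  /* hypothetical hint: the driver may need to yield */
    CompletionFn *cb;      /* completion callback */
    void *opaque;          /* argument passed to cb */
    int ret;               /* result recorded for the deferred callback */
} Request;

/* Stand-in for a bottom half: completions queued here run later. */
#define MAX_DEFERRED 16
static Request *deferred[MAX_DEFERRED];
static size_t nr_deferred;
static int coroutine_entries;  /* how many requests took the slow path */

static void defer_completion(Request *req, int ret)
{
    assert(nr_deferred < MAX_DEFERRED);
    req->ret = ret;
    deferred[nr_deferred++] = req;
}

/* Submit one request. Only requests that need it pay for the coroutine
 * path; the callback is never invoked from inside submit(). */
static void submit(Request *req)
{
    if (req->needs_coroutine) {
        coroutine_entries++;  /* slow path: would create and enter a coroutine */
    }
    defer_completion(req, 0);  /* pretend the I/O completed successfully */
}

/* Run queued completions, including any queued by the callbacks themselves. */
static void run_deferred(void)
{
    size_t i = 0;
    while (i < nr_deferred) {
        Request *req = deferred[i++];
        req->cb(req->opaque, req->ret);
    }
    nr_deferred = 0;
}

/* Example callback: count successful completions. */
static void count_done(void *opaque, int ret)
{
    if (ret == 0) {
        ++*(int *)opaque;
    }
}
```

Deferring the callback keeps the submitter from being re-entered while it is still in the middle of queuing requests, which is the usual reason for routing completions through a bottom half.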
V1:
- bypass co: add a check for making the bypass decision, to help
remove the hint from the device in the future
- bypass co: run acb->cb() via BH as pointed out by Paolo and Stefan
- virtio: remove the patch for decreasing the size of VirtQueueElement,
which would break migration between different QEMU versions;
another standalone patchset might do that
- linux-aio: retry io_submit in the following completion cb for -EAGAIN
as suggested by Paolo
- linux-aio: handle -EAGAIN for the non-plugged case as suggested by Paolo
- mq conversion: support multi virtqueue for non-dataplane as required
by Paolo
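The -EAGAIN handling mentioned in the changelog above can be sketched as follows: requests that the kernel does not accept stay on a pending queue, and the flush is attempted again later, for example from a completion callback once slots have been freed. This is a simplified model and not the block/linux-aio.c code; PendingQueue, queue_flush() and fake_submit() are hypothetical names, and fake_submit() merely imitates the io_submit() convention of returning the number of requests accepted or a negative errno.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define QUEUE_MAX 8

/* Hypothetical queue of requests that have not been submitted yet. */
typedef struct PendingQueue {
    int reqs[QUEUE_MAX];
    size_t head;   /* index of the first request not yet submitted */
    size_t count;  /* total number of queued requests */
} PendingQueue;

/* Stand-in for io_submit(): returns the number of requests accepted,
 * or a negative errno such as -EAGAIN. */
typedef int SubmitFn(const int *reqs, size_t n, void *opaque);

/* Submit as much as possible. Returns the number of requests still
 * pending (to be retried later), or a negative errno on a hard error. */
static int queue_flush(PendingQueue *q, SubmitFn *submit, void *opaque)
{
    while (q->head < q->count) {
        int ret = submit(&q->reqs[q->head], q->count - q->head, opaque);
        if (ret == -EAGAIN || ret == 0) {
            break;               /* no room now: keep the rest queued */
        }
        if (ret < 0) {
            return ret;          /* other errors: caller fails the requests */
        }
        q->head += (size_t)ret;  /* partial submission: advance and go on */
    }
    return (int)(q->count - q->head);
}

/* Toy backend that accepts at most *free_slots requests in total. */
static int fake_submit(const int *reqs, size_t n, void *opaque)
{
    size_t *free_slots = opaque;
    (void)reqs;
    if (*free_slots == 0) {
        return -EAGAIN;
    }
    size_t accepted = n < *free_slots ? n : *free_slots;
    *free_slots -= accepted;
    return (int)accepted;
}
```

With five queued requests and three free slots, the first flush submits three and leaves two pending; calling queue_flush() again after completions have freed slots submits the remainder.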
TODO:
- optimize the block layer for linux-aio, so that
more time can be saved when submitting requests
- support more than one aio-context to improve
virtio-blk performance
[1], http://marc.info/?l=linux-api&m=140486843317107&w=2
[2], http://marc.info/?l=linux-api&m=140418368421229&w=2
[3], http://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git/ #for-3.17/drivers
[4], https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/
[5], http://marc.info/?l=linux-api&m=140377573830230&w=2
block.c | 233 ++++++++++++++++++++++++++++++++++-----
block/linux-aio.c | 124 ++++++++++++++++-----
block/raw-posix.c | 34 ++++++
hw/block/dataplane/virtio-blk.c | 221 ++++++++++++++++++++++++++++---------
hw/block/virtio-blk.c | 39 +++++--
include/block/block.h | 12 ++
include/block/block_int.h | 3 +
include/block/coroutine.h | 8 ++
include/block/coroutine_int.h | 5 +
include/hw/virtio/virtio-blk.h | 14 ++-
include/qemu/gc.h | 56 ++++++++++
include/qemu/obj_pool.h | 64 +++++++++++
qemu-coroutine-lock.c | 4 +-
qemu-coroutine.c | 33 ++++++
14 files changed, 734 insertions(+), 116 deletions(-)
Thanks,