
From: Christian Borntraeger
Subject: Re: [Qemu-devel] [PATCH 00/14] dataplane: optimization and multi virtqueue support
Date: Wed, 30 Jul 2014 14:42:10 +0200
User-agent: Mozilla/5.0 (X11; Linux i686; rv:31.0) Gecko/20100101 Thunderbird/31.0

On 30/07/14 13:39, Ming Lei wrote:
> This series brings the following four changes:
> 
>         - introduce a selective coroutine bypass mechanism
>         to improve the performance of virtio-blk dataplane with
>         raw format images
> 
>         - introduce an object allocation pool and apply it to
>         virtio-blk dataplane to improve its performance
> 
>         - linux-aio changes: fix the -EAGAIN and partial
>         completion cases, increase max events to 256, and remove an
>         unused field from 'struct qemu_laiocb'
> 
>         - support multiple virtqueues in virtio-blk dataplane
> 
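[Editor's note: the object allocation pool in the second item is, in general terms, a free list of pre-allocated fixed-size objects that keeps malloc()/free() off the I/O hot path. A minimal sketch of that technique follows; all names and the heap-fallback behaviour are illustrative, not QEMU's actual API.]

```c
#include <stdlib.h>

/* Fixed-size object pool: free slots are kept on a singly linked list
 * threaded through the objects themselves.  Illustrative sketch only. */
typedef struct ObjPool {
    void *free_list;    /* head of the free list */
    size_t obj_size;    /* per-object size, at least sizeof(void *) */
    void *buf;          /* contiguous backing storage */
} ObjPool;

static void obj_pool_init(ObjPool *p, size_t obj_size, size_t count)
{
    size_t i;

    p->obj_size = obj_size < sizeof(void *) ? sizeof(void *) : obj_size;
    p->buf = malloc(p->obj_size * count);
    p->free_list = NULL;
    for (i = 0; i < count; i++) {        /* thread every slot onto the list */
        void *obj = (char *)p->buf + i * p->obj_size;
        *(void **)obj = p->free_list;
        p->free_list = obj;
    }
}

static void *obj_pool_get(ObjPool *p)
{
    void *obj = p->free_list;

    if (obj) {
        p->free_list = *(void **)obj;    /* pop a recycled object */
        return obj;
    }
    return malloc(p->obj_size);          /* pool empty: fall back to the heap */
}

static void obj_pool_put(ObjPool *p, void *obj)
{
    /* A real pool would track which objects came from the heap fallback;
     * this sketch simply recycles everything onto the free list. */
    *(void **)obj = p->free_list;
    p->free_list = obj;
}
```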
> The virtio-blk multi virtqueue feature will be added to virtio spec 1.1[1],
> and the 3.17 Linux kernel[2] will support it in the virtio-blk driver.
> For those who want to try it, the kernel-side patches can be found
> in either Jens's block tree[3] or linux-next[4].
> 
> The fio script below, run from inside the VM, is used to measure the
> improvement from these patches:
> 
>         [global]
>         direct=1
>         size=128G
>         bsrange=4k-4k
>         timeout=120
>         numjobs=${JOBS}
>         ioengine=libaio
>         iodepth=64
>         filename=/dev/vdc
>         group_reporting=1
> 
>         [f]
>         rw=randread
> 
> One quad-core VM (8G RAM) is created on the host below to run the above fio test:
> 
>         - server(16cores: 8 physical cores, 2 threads per physical core)
> 
> Compared with QEMU 2.1.0-rc5, this patchset (4 virtqueues per
> virtio-blk device) shows a 30% throughput improvement (IOPS), and
> scalability for parallel I/O improves even more: an 80% throughput
> improvement is observed in the 4-jobs case.
> 
> From the above results, both scalability and performance
> improve considerably.
> 
> After commit 580b6b2aa2 (dataplane: use the QEMU block
> layer for I/O), the average time for submitting a single
> request increased a lot: according to my traces, it
> roughly doubled, even though the block plug & unplug
> mechanism was introduced to ease the effect. That is why
> this patchset first introduces the selective coroutine
> bypass mechanism and the object allocation pool to save
> that time. Based on QEMU 2.0, the single virtio-blk
> dataplane multi virtqueue patch alone achieved a bigger
> improvement than the current result[5].
> 
> TODO:
>         - optimize the block layer for linux-aio, so that
>         more time can be saved when submitting requests
>         - support more than one AioContext to improve
>         virtio-blk performance
[...]
> 
> [1], http://marc.info/?l=linux-api&m=140486843317107&w=2
> [2], http://marc.info/?l=linux-api&m=140418368421229&w=2
> [3], http://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git/ #for-3.17/drivers
> [4], https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/
> [5], http://marc.info/?l=linux-api&m=140377573830230&w=2
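[Editor's note: the -EAGAIN/partial-completion fix in the linux-aio item above amounts to retrying the unsubmitted tail of the request array. A rough sketch of that logic follows, with a function pointer standing in for io_submit() and a demo stub; everything here is hypothetical and not the actual patch.]

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for io_submit(): queues up to nr requests and returns how many
 * were accepted, or a negative errno.  Hypothetical signature. */
typedef int (*submit_fn)(void **reqs, int nr);

/* Submit all nr requests, advancing past partial submissions and
 * stopping cleanly on -EAGAIN so the caller can retry later. */
static int submit_all(submit_fn submit, void **reqs, int nr)
{
    int done = 0;

    while (done < nr) {
        int ret = submit(reqs + done, nr - done);

        if (ret == -EAGAIN || ret == 0) {
            break;                      /* ring full: retry the rest later */
        }
        if (ret < 0) {
            return done ? done : ret;   /* hard error */
        }
        done += ret;                    /* partial success: keep going */
    }
    return done;
}

/* Demo stub that accepts at most two requests per call, forcing the
 * partial-submission path above to loop. */
static int stub_submit(void **reqs, int nr)
{
    (void)reqs;
    return nr > 2 ? 2 : nr;
}
```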

FYI, I just tested with one virtqueue on s390 (3.15 as guest). It was just a
quick sniff, but we are getting closer to the fio results that we had before
commit 580b6b2aa2 (dataplane: use the QEMU block
layer for I/O). I can't give proper numbers right now, as I am on a shared
storage subsystem, but this looks like we are on the right track. I have not
looked at the code, though.

Christian



