Re: [Qemu-devel] [PATCH v2 0/3] linux-aio: reduce completion latency
From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH v2 0/3] linux-aio: reduce completion latency
Date: Wed, 20 Jul 2016 10:08:11 +0100
User-agent: Mutt/1.6.1 (2016-04-27)
On Tue, Jul 19, 2016 at 02:27:40PM +0200, Roman Pen wrote:
> v2:
> o For the third patch, do not introduce an extra member in the
> LinuxAioState structure; reuse ret == -EINPROGRESS instead.
>
> o Add an explicit comment explaining why we do not hang if requests
> are still pending.
>
>
> This series is intended to reduce completion latency through two changes:
>
> 1. QEMU does not use any timeout value for harvesting completed AIO
> requests from the ring buffer, thus io_getevents() can be implemented
> in userspace (first patch).
>
> 2. To reduce completion latency it makes sense to harvest completed
> requests ASAP. A very fast backend device can complete requests right after
> submission, so it is worth checking the ring buffer and reaping completed
> requests directly after io_submit() has been called (third patch).
>
> Indeed, the series reduces completion latency and increases overall
> throughput. For example, these are the percentiles of the number of
> requests completed at once:
>
>          1st  10th  20th  30th  40th  50th  60th  70th  80th  90th  99.99th
> Before     2     4    42   112   128   128   128   128   128   128      128
> After      1     1     4    14    33    45    47    48    50    51      108
>
> That means that before the third patch is applied, the ring buffer is
> observed as full (128 requests were consumed at once) in 60% of calls.
>
> After the third patch is applied the distribution of number of completed
> requests is "smoother" and the queue (requests in-flight) is almost never
> full.
>
> The fio read results are the following (write results are almost the
> same and are not shown here):
>
> Before
> ------
> job: (groupid=0, jobs=8): err= 0: pid=2227: Tue Jul 19 11:29:50 2016
> Description : [Emulation of Storage Server Access Pattern]
> read : io=54681MB, bw=1822.7MB/s, iops=179779, runt= 30001msec
> slat (usec): min=172, max=16883, avg=338.35, stdev=109.66
> clat (usec): min=1, max=21977, avg=1051.45, stdev=299.29
> lat (usec): min=317, max=22521, avg=1389.83, stdev=300.73
> clat percentiles (usec):
> | 1.00th=[ 346], 5.00th=[ 596], 10.00th=[ 708], 20.00th=[ 852],
> | 30.00th=[ 932], 40.00th=[ 996], 50.00th=[ 1048], 60.00th=[ 1112],
> | 70.00th=[ 1176], 80.00th=[ 1256], 90.00th=[ 1384], 95.00th=[ 1496],
> | 99.00th=[ 1800], 99.50th=[ 1928], 99.90th=[ 2320], 99.95th=[ 2672],
> | 99.99th=[ 4704]
> bw (KB /s): min=205229, max=553181, per=12.50%, avg=233278.26,
> stdev=18383.51
>
> After
> ------
> job: (groupid=0, jobs=8): err= 0: pid=2220: Tue Jul 19 11:31:51 2016
> Description : [Emulation of Storage Server Access Pattern]
> read : io=57637MB, bw=1921.2MB/s, iops=189529, runt= 30002msec
> slat (usec): min=169, max=20636, avg=329.61, stdev=124.18
> clat (usec): min=2, max=19592, avg=988.78, stdev=251.04
> lat (usec): min=381, max=21067, avg=1318.42, stdev=243.58
> clat percentiles (usec):
> | 1.00th=[ 310], 5.00th=[ 580], 10.00th=[ 748], 20.00th=[ 876],
> | 30.00th=[ 908], 40.00th=[ 948], 50.00th=[ 1012], 60.00th=[ 1064],
> | 70.00th=[ 1080], 80.00th=[ 1128], 90.00th=[ 1224], 95.00th=[ 1288],
> | 99.00th=[ 1496], 99.50th=[ 1608], 99.90th=[ 1960], 99.95th=[ 2256],
> | 99.99th=[ 5408]
> bw (KB /s): min=212149, max=390160, per=12.49%, avg=245746.04,
> stdev=11606.75
>
> Throughput increased from 1822MB/s to 1921MB/s, and average completion
> latency decreased from 1051us to 988us.
>
> Roman Pen (3):
> linux-aio: consume events in userspace instead of calling io_getevents
> linux-aio: split processing events function
> linux-aio: process completions from ioq_submit()
>
> block/linux-aio.c | 178 ++++++++++++++++++++++++++++++++++++++++++------------
> 1 file changed, 141 insertions(+), 37 deletions(-)
>
> Signed-off-by: Roman Pen <address@hidden>
> Cc: Stefan Hajnoczi <address@hidden>
> Cc: Paolo Bonzini <address@hidden>
> Cc: address@hidden
Thanks, applied to my block-next tree for QEMU 2.8:
https://github.com/stefanha/qemu/commits/block-next
Stefan
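
The first point of the quoted cover letter (harvesting completed requests from the ring buffer in userspace, with no syscall and no timeout) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the code from the patch: the sketch_* names and the fixed 8-slot ring are hypothetical, the ring lives in ordinary memory rather than being mapped from the kernel, and a real consumer would also need memory barriers and the kernel's actual ring layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a completion event (not the kernel's io_event). */
struct sketch_event {
    uint64_t data;  /* caller's tag identifying the request */
    int64_t  res;   /* result: bytes transferred or negative errno */
};

/* Hypothetical stand-in for a completion ring shared by producer and consumer. */
struct sketch_ring {
    unsigned nr;    /* number of slots in events[] (at most 8 here) */
    unsigned head;  /* next slot the consumer reads */
    unsigned tail;  /* next slot the producer writes */
    struct sketch_event events[8];
};

/* Number of completed events waiting in the ring. */
static unsigned sketch_pending(const struct sketch_ring *r)
{
    assert(r->nr > 0 && r->nr <= 8);
    return (r->tail + r->nr - r->head) % r->nr;
}

/*
 * Copy up to max completed events into out and advance head.
 * Never blocks: returns 0 immediately when the ring is empty.
 * A real implementation needs a read barrier between loading tail
 * and reading the event slots; that is omitted in this simulation.
 */
static unsigned sketch_harvest(struct sketch_ring *r,
                               struct sketch_event *out, unsigned max)
{
    unsigned n = 0;

    assert(r->nr > 0 && r->nr <= 8);
    while (n < max && r->head != r->tail) {
        out[n++] = r->events[r->head];
        r->head = (r->head + 1) % r->nr;
    }
    return n;
}

/* Demo-only producer: posts one completion, or returns -1 if the ring is full. */
static int sketch_complete(struct sketch_ring *r, uint64_t data, int64_t res)
{
    unsigned next = (r->tail + 1) % r->nr;

    if (next == r->head) {
        return -1;  /* one slot stays unused to tell full from empty */
    }
    r->events[r->tail].data = data;
    r->events[r->tail].res = res;
    r->tail = next;
    return 0;
}
```

Calling sketch_harvest() right after submitting requests, as well as from the completion notification path, corresponds to the second point of the cover letter: a fast backend may already have posted completions by the time submission returns, so checking the ring at that moment costs little and can shorten the time a completed request waits to be noticed.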