qemu-devel
Re: [Qemu-devel] [PATCH v2 0/8] virtio-blk: multiqueue support


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH v2 0/8] virtio-blk: multiqueue support
Date: Fri, 3 Jun 2016 15:26:27 -0700
User-agent: Mutt/1.6.1 (2016-04-27)

On Thu, Jun 02, 2016 at 05:19:41PM -0700, Stefan Hajnoczi wrote:
> On Mon, May 30, 2016 at 06:25:58PM -0700, Stefan Hajnoczi wrote:
> > v2:
> >  * Simplify s->rq live migration [Paolo]
> >  * Use more efficient bitmap ops for batch notification [Paolo]
> >  * Fix perf regression due to batch notify BH in wrong AioContext [Christian]
> > 
> > The virtio_blk guest driver has supported multiple virtqueues since Linux
> > 3.17.  This patch series adds multiple virtqueues to QEMU's virtio-blk
> > emulated device.
> > 
> > Ming Lei sent patches previously but these were not merged.  This series
> > implements virtio-blk multiqueue for QEMU from scratch since the codebase
> > has changed.  Live migration support for s->rq was also missing from the
> > previous series and has been added.
> > 
> > It's important to note that QEMU's block layer does not support multiqueue
> > yet.  Therefore the virtio-blk device processes all virtqueues in the same
> > AioContext (IOThread).  Further work is necessary to take advantage of
> > multiqueue support in QEMU's block layer once it becomes available.
> > 
> > I will post performance results once they are ready.
> > 
> > Stefan Hajnoczi (8):
> >   virtio-blk: use batch notify in non-dataplane case
> >   virtio-blk: tell dataplane which vq to notify
> >   virtio-blk: associate request with a virtqueue
> >   virtio-blk: add VirtIOBlockConf->num_queues
> >   virtio-blk: multiqueue batch notify
> >   virtio-blk: live migration s->rq with multiqueue
> >   virtio-blk: dataplane multiqueue support
> >   virtio-blk: add num-queues device property
> > 
> >  hw/block/dataplane/virtio-blk.c |  68 +++++++++++----------
> >  hw/block/dataplane/virtio-blk.h |   2 +-
> >  hw/block/virtio-blk.c           | 129 +++++++++++++++++++++++++++++++++++-----
> >  include/hw/virtio/virtio-blk.h  |   8 ++-
> >  4 files changed, 159 insertions(+), 48 deletions(-)
> 
> There is a significant performance regression due to batch notify:
> 
> $ ./analyze.py runs/
> Name                                   IOPS   Error
> unpatched-d6550e9ed2             19269820.2 ± 1.36%
> unpatched-d6550e9ed2-2           19567358.4 ± 2.42%
> v2-batch-only-f27ed9a4d9         16252227.2 ± 6.09%
> v2-no-dataplane                  14560225.4 ± 5.16%
> v2-no-dataplane-2                14622535.6 ± 10.08%
> v2-no-dataplane-3                13960670.8 ± 7.11%
> 
> unpatched-d6550e9ed2 is without this patch series.
> v2-batch-only-f27ed9a4d9 is with Patch 1 only.  v2-no-dataplane is with
> the patch series (dataplane is not enabled in any of these tests).
> 
> Next I will compare unpatched dataplane against patched dataplane.  I
> want to make sure Patch 1 faithfully moved batch notify from dataplane
> code to generic virtio-blk code without affecting performance.
> 
> If there is no difference then it means batch notify decreases
> performance for some workloads (obviously not the same workload that
> Ming Lei was running).
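
To put the quoted numbers in relative terms, here is a small sketch that
computes the change in mean IOPS against the first unpatched run.  It is
not part of analyze.py; the names BASELINE, RUNS, CHANGES and
relative_change() are made up for illustration, and the calculation
ignores the error column entirely.

```python
# Mean IOPS copied from the analyze.py output quoted above.
BASELINE = 19269820.2  # unpatched-d6550e9ed2
RUNS = {
    "v2-batch-only-f27ed9a4d9": 16252227.2,
    "v2-no-dataplane": 14560225.4,
}


def relative_change(baseline, value):
    """Fractional change of value relative to baseline (negative = slower)."""
    return (value - baseline) / baseline


# Hypothetical summary table: run name -> fractional change in mean IOPS.
CHANGES = {name: relative_change(BASELINE, iops) for name, iops in RUNS.items()}

for name, change in CHANGES.items():
    print(f"{name:<28} {change:+.1%}")
```

Since the per-run error is several percent, the result is only a rough
indication of the size of the regression.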

It turns out that Patch 1 slows down dataplane even though the code
looks equivalent.  After a lot of poking, the cause proved to be a
subtle issue:

The order of BHs in the AioContext->first_bh list affects performance.
Linux AIO (block/linux-aio.c) invokes completion callbacks from a BH.
Performance is much better if virtio-blk.c's batch BH is after the
completion BH.
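
A toy model of why the order matters, as a sketch only.  This is not QEMU
code: BH, bh_poll() and passes_until_notify() are hypothetical names, and
the model assumes that one pass walks the list once in order and runs
each BH that is scheduled at the moment it is reached.

```python
class BH:
    """A bottom half: a callback plus a 'scheduled' flag."""

    def __init__(self, name, cb):
        self.name = name
        self.cb = cb
        self.scheduled = False


def bh_poll(bh_list, log):
    """One pass over the list, running each BH that is scheduled when reached."""
    for bh in bh_list:
        if bh.scheduled:
            bh.scheduled = False
            bh.cb()
            log.append(bh.name)


def passes_until_notify(notify_first):
    """Count the passes needed before the notify BH runs after a completion."""
    log = []
    notify = BH("notify", lambda: None)

    def complete():
        # The completion callback finishes requests and schedules the
        # batch notify BH.
        notify.scheduled = True

    completion = BH("completion", complete)
    bh_list = [notify, completion] if notify_first else [completion, notify]

    completion.scheduled = True  # Linux AIO has completions to deliver
    passes = 0
    while "notify" not in log:
        bh_poll(bh_list, log)
        passes += 1
    return passes
```

With the notify BH after the completion BH, the notify BH runs in the
same pass.  With the notify BH first, it has already been skipped by the
time it is scheduled, so it has to wait for another pass of the event
loop.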

The "fast" ordering notifies the guest ~300 nanoseconds after the last
request completion.

The "slow" ordering sometimes takes 100 microseconds after the last
request completion before the guest is notified.  It probably depends on
whether the event loop is kicked by another source.

I'm thinking of scrapping the batch BH and instead using a notify
plug/unplug callback to suppress notification until the last request has
been processed.
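
Roughly what I have in mind, as a sketch under assumptions rather than a
design: NotifyPlug and its plug()/notify()/unplug() methods are
hypothetical names, not an existing QEMU API, and the nesting counter and
per-virtqueue deduplication are my guesses at what would be needed.

```python
class NotifyPlug:
    """Suppress guest notifications between plug() and the matching unplug()."""

    def __init__(self, send):
        self.send = send   # function that actually notifies the guest for a vq
        self.depth = 0     # nesting level of plug() calls
        self.pending = []  # virtqueues with a suppressed notification

    def plug(self):
        self.depth += 1

    def notify(self, vq):
        if self.depth == 0:
            # Not plugged: notify immediately.
            self.send(vq)
        elif vq not in self.pending:
            # Plugged: remember the virtqueue once, notify on unplug.
            self.pending.append(vq)

    def unplug(self):
        assert self.depth > 0, "unplug() without matching plug()"
        self.depth -= 1
        if self.depth == 0:
            pending, self.pending = self.pending, []
            for vq in pending:
                self.send(vq)
```

The completion code would call plug() before processing a batch of
completed requests and unplug() after the last one, so each virtqueue is
notified at most once per batch and no separate BH is involved.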

I also checked that batch notification does indeed improve performance
compared to no batching.  It offers a nice boost so we do want to port
the feature from dataplane to non-dataplane.

For the time being, consider this patch series broken due to the
performance regression.

Stefan
