From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH 00/16] aio_context_acquire/release pushdown, part 2
Date: Wed, 18 Jan 2017 17:31:03 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1


On 18/01/2017 17:03, Stefan Hajnoczi wrote:
> On Fri, Jan 13, 2017 at 02:17:15PM +0100, Paolo Bonzini wrote:
>> This series pushes down aio_context_acquire/release to the point
>> where we can actually reason on using different fine-grained mutexes.
>>
>> The main infrastructure is introduced in patch 1.  The new API aio_co_wake
>> starts a coroutine with aio_context_acquire/release protection, which
>> requires tracking each coroutine's "home" AioContext.  aio_co_schedule
>> instead takes care of moving a sleeping coroutine to a different
>> AioContext, also ensuring that it runs under aio_context_acquire/release.
>> This is useful to implement bdrv_set_aio_context, as a simpler alternative
>> to bottom halves.  Even though one-shot BHs are already simpler than
>> what we had before, after this patch aio_co_wake and aio_co_schedule
>> save you from having to do aio_context_acquire/release explicitly.
>>
>> After patch 2 and 3, which are just small preparatory changes, patches
>> 4 to 7 provide an example of how to use the new API.  In particular patch
>> 4 to 6 implement a new organization of coroutines in the NBD client,
>> which allows not blocking on partial reply header reads.
>>
>> Patch 8 introduces helpers for AioContext locking in QED, which is
>> the most complex AIO-based driver left.  Then the actual meat of the
>> series runs from patch 9 to patch 13, followed by small optimizations
>> in patches 14 and 15.
>>
>> The patches do some back and forth in adding/removing
>> aio_context_acquire/release calls in block/*.c but ultimately a small
>> number of aio_context_acquire/release pairs are added after the pushdown.
>> These are mostly in drivers that use external libraries (where they
>> actually could already be replaced by QemuMutex) and in device models.
>>
>> Notably, coroutines need not care about aio_context_acquire/release.
>> The device models ensure that coroutines are first entered with the
>> AioContext held, while aio_co_wake/aio_co_schedule do the same after
>> they yield.  Therefore, most of the files only need to use those two
>> functions instead of, respectively, qemu_coroutine_enter and
>> aio_bh_schedule_oneshot.
>>
>> However, this is only an intermediate step which is needed because the
>> block layer and qemu-coroutine locks are thread-unsafe.  So the next
>> part will add separate locking, independent of AioContext, to block.c and
>> mostly block/io.c---this includes making CoMutex thread-safe.  Patch 16
>> therefore already documents the current locking policies in block.h to
>> prepare for the next series.
>>
>> Paolo
>>
>> Paolo Bonzini (16):
>>   aio: introduce aio_co_schedule and aio_co_wake
>>   block-backend: allow blk_prw from coroutine context
>>   test-thread-pool: use generic AioContext infrastructure
>>   io: add methods to set I/O handlers on AioContext
>>   io: make qio_channel_yield aware of AioContexts
>>   nbd: do not block on partial reply header reads
>>   coroutine-lock: reschedule coroutine on the AioContext it was running
>>     on
>>   qed: introduce qed_aio_start_io and qed_aio_next_io_cb
>>   aio: push aio_context_acquire/release down to dispatching
>>   block: explicitly acquire aiocontext in timers that need it
>>   block: explicitly acquire aiocontext in callbacks that need it
>>   block: explicitly acquire aiocontext in bottom halves that need it
>>   block: explicitly acquire aiocontext in aio callbacks that need it
>>   aio-posix: partially inline aio_dispatch into aio_poll
>>   async: remove unnecessary inc/dec pairs
>>   block: document fields protected by AioContext lock
>>
>>  aio-posix.c                    |  60 +++---------
>>  aio-win32.c                    |  30 ++----
>>  async.c                        |  81 ++++++++++++++--
>>  block/blkdebug.c               |   9 +-
>>  block/blkreplay.c              |   2 +-
>>  block/block-backend.c          |  13 ++-
>>  block/curl.c                   |  44 ++++++---
>>  block/gluster.c                |   9 +-
>>  block/io.c                     |   4 +-
>>  block/iscsi.c                  |  15 ++-
>>  block/linux-aio.c              |  10 +-
>>  block/mirror.c                 |  12 ++-
>>  block/nbd-client.c             | 108 ++++++++-------------
>>  block/nbd-client.h             |   2 +-
>>  block/nfs.c                    |   9 +-
>>  block/qed-cluster.c            |   2 +
>>  block/qed-table.c              |  12 ++-
>>  block/qed.c                    |  58 +++++++----
>>  block/qed.h                    |   3 +
>>  block/sheepdog.c               |  29 +++---
>>  block/ssh.c                    |  29 ++----
>>  block/throttle-groups.c        |   2 +
>>  block/win32-aio.c              |   9 +-
>>  dma-helpers.c                  |   2 +
>>  hw/block/virtio-blk.c          |  19 +++-
>>  hw/scsi/scsi-bus.c             |   2 +
>>  hw/scsi/scsi-disk.c            |  15 +++
>>  hw/scsi/scsi-generic.c         |  20 +++-
>>  hw/scsi/virtio-scsi.c          |   6 ++
>>  include/block/aio.h            |  38 +++++++-
>>  include/block/block_int.h      |  64 ++++++++-----
>>  include/io/channel.h           |  59 +++++++++++-
>>  include/qemu/coroutine_int.h   |  10 +-
>>  include/sysemu/block-backend.h |  14 ++-
>>  io/channel-command.c           |  13 +++
>>  io/channel-file.c              |  11 +++
>>  io/channel-socket.c            |  16 +++-
>>  io/channel-tls.c               |  12 +++
>>  io/channel-watch.c             |   6 ++
>>  io/channel.c                   |  97 +++++++++++++++----
>>  nbd/client.c                   |   2 +-
>>  nbd/common.c                   |   9 +-
>>  nbd/server.c                   |   4 +
>>  tests/Makefile.include         |  15 ++-
>>  tests/iothread.c               |  91 ++++++++++++++++++
>>  tests/iothread.h               |  25 +++++
>>  tests/test-aio-multithread.c   | 213 +++++++++++++++++++++++++++++++++++++++++
>>  tests/test-thread-pool.c       |  12 +--
>>  tests/test-vmstate.c           |  11 ---
>>  thread-pool.c                  |   6 +-
>>  trace-events                   |   4 +
>>  util/qemu-coroutine-lock.c     |   5 +-
>>  util/qemu-coroutine-sleep.c    |   2 +-
>>  util/qemu-coroutine.c          |   8 ++
>>  util/trace-events              |   1 -
>>  55 files changed, 1012 insertions(+), 352 deletions(-)
>>  create mode 100644 tests/iothread.c
>>  create mode 100644 tests/iothread.h
>>  create mode 100644 tests/test-aio-multithread.c
> 
> This is a big and somewhat risky change.  Have you run any performance
> benchmarks?

Not recently; I ran them a year ago and there was no measurable difference.

However, this is also an intermediate state; the design is such that in
the end the performance-critical case (virtio-blk + linux-aio) will run
without taking any mutex at all(*).  It will only use thread-local data,
or simple BDS statistics that can be updated with atomic_add or
atomic_cmpxchg.  Fam's NVMe driver could also be extended to use
multiple hardware queues, one per QEMU thread, and avoid synchronization
completely.

Paolo

(*) Well, almost.  tracked_request_begin/tracked_request_end still need
    protection in the multiqueue case.  But they can use a spinlock
    since the critical section is very short; the expensive
    CoMutex/CoQueue synchronization only happens for copy-on-read and
    misaligned writes.
