Re: [Qemu-devel] [PATCH v5 00/11] virtio: virtio-blk data plane
From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH v5 00/11] virtio: virtio-blk data plane
Date: Thu, 6 Dec 2012 13:38:28 +0200

On Wed, Dec 05, 2012 at 09:46:59PM +0100, Stefan Hajnoczi wrote:
> This series adds the -device virtio-blk-pci,x-data-plane=on property that
> enables a high performance I/O codepath. A dedicated thread is used to process
> virtio-blk requests outside the global mutex and without going through the QEMU
> block layer.
>
> Khoa Huynh <address@hidden> reported an increase from 140,000 IOPS to 600,000
> IOPS for a single VM using virtio-blk-data-plane in July:
>
> http://comments.gmane.org/gmane.comp.emulators.kvm.devel/94580
>
> The virtio-blk-data-plane approach was originally presented at Linux Plumbers
> Conference 2010. The following slides contain a brief overview:
>
>
> http://linuxplumbersconf.org/2010/ocw/system/presentations/651/original/Optimizing_the_QEMU_Storage_Stack.pdf
>
> The basic approach is:
> 1. Each virtio-blk device has a thread dedicated to handling ioeventfd
> signalling when the guest kicks the virtqueue.
> 2. Requests are processed without going through the QEMU block layer using
> Linux AIO directly.
> 3. Completion interrupts are injected via irqfd from the dedicated thread.
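
The three steps above can be sketched outside QEMU with plain eventfds. This is only an illustration under stated assumptions: `DataPlaneSketch`, `data_plane_kick()` and the `STOP_VALUE` shutdown trick are hypothetical names invented here, the "guest kick" is an ordinary `write()` rather than a KVM ioeventfd, and the completion eventfd is read by the caller rather than consumed by KVM as an irqfd.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Sketch-only shutdown request: a value too large to be a real kick count. */
#define STOP_VALUE (1ULL << 32)

/* Hypothetical stand-in for one data plane instance; not a QEMU structure. */
typedef struct {
    int kick_fd;          /* plays the role of the ioeventfd */
    int irq_fd;           /* plays the role of the irqfd */
    uint64_t kicks_seen;  /* kicks consumed by the dedicated thread */
    pthread_t thread;
} DataPlaneSketch;

static void *data_plane_thread(void *opaque)
{
    DataPlaneSketch *s = opaque;
    const uint64_t one = 1;
    uint64_t count;

    /* read() blocks until the eventfd counter is non-zero, then resets it. */
    while (read(s->kick_fd, &count, sizeof(count)) == (ssize_t)sizeof(count)) {
        uint64_t kicks = count & (STOP_VALUE - 1);

        if (kicks > 0) {
            s->kicks_seen += kicks;
            /* A real device would pop requests and submit Linux AIO here. */
            if (write(s->irq_fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
                break;
            }
        }
        if (count >= STOP_VALUE) {
            break;  /* stop was requested; pending kicks were handled above */
        }
    }
    return NULL;
}

static int data_plane_start(DataPlaneSketch *s)
{
    s->kicks_seen = 0;
    s->kick_fd = eventfd(0, 0);
    s->irq_fd = eventfd(0, 0);
    if (s->kick_fd < 0 || s->irq_fd < 0) {
        return -1;
    }
    return pthread_create(&s->thread, NULL, data_plane_thread, s) == 0 ? 0 : -1;
}

/* What the guest's virtqueue notification boils down to in this sketch. */
static int data_plane_kick(DataPlaneSketch *s)
{
    const uint64_t one = 1;
    return write(s->kick_fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Ask the thread to exit, wait for it, and close the kick eventfd. */
static void data_plane_stop(DataPlaneSketch *s)
{
    const uint64_t stop = STOP_VALUE;
    ssize_t n = write(s->kick_fd, &stop, sizeof(stop));

    assert(n == (ssize_t)sizeof(stop));
    (void)n;
    pthread_join(s->thread, NULL);
    close(s->kick_fd);
}
```

Several kicks may be coalesced into one wake-up because the eventfd counter accumulates, so the number of completion signals can be smaller than the number of kicks.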
>
> To try it out:
>
> qemu -drive if=none,id=drive0,cache=none,aio=native,format=raw,file=... \
>      -device virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on
>
> Limitations:
> * Only format=raw is supported
> * Live migration is not supported
> * Block jobs, hot unplug, and other operations fail with -EBUSY
> * I/O throttling limits are ignored
> * Only Linux hosts are supported due to Linux AIO usage
>
> The code has reached a stage where I feel it is ready to merge. Users have
> been playing with it for some time and want the significant performance boost.
>
> We are refactoring QEMU to get rid of the global mutex. I believe that
> virtio-blk-data-plane can eventually become the default mode of operation.
>
> Instead of waiting for global mutex removal efforts to finish, I want to use
> virtio-blk-data-plane as an example device for AioContext and threaded hw
> dispatch refactoring. This means:
>
> 1. When the block layer can bind to an AioContext and execute I/O outside the
> global mutex, virtio-blk-data-plane can use this (and gain image format
> support).
>
> 2. When hw dispatch no longer needs the global mutex we can use hw/virtio.c
> again and perhaps run a pool of iothreads instead of dedicated data plane
> threads.
>
> But in the meantime, I have cleaned up the virtio-blk-data-plane code so that
> it can be merged as an experimental feature.

I mostly looked at the virtio side of the patchset and
I don't see any bugs there. I sent some improvement suggestions, but
those can be addressed in tree as well.

> v5:
> * Omit memory regions with dirty logging enabled from hostmem [Michael]
> * Add doc comment about quiescing requests across memory hot unplug [Michael]
> * Clarify which Linux vhost version the vring code originates from [Michael]
> * Break up indirect vring buffer into 1 hostmem_lookup() per descriptor [Michael]
> * Barriers in hw/dataplane/vring.c to force fields to be loaded [Michael]
> * Split vring_set_notification() into enable/disable [Paolo]
> * Barriers in vring.c instead of virtio-blk.c [Michael]
> * Move setup code from hw/virtio-blk.c into hw/dataplane/virtio-blk.c [Michael]
>
> * Note I did not get rid of the mutex+condvar approach to draining requests.
> I've had good feedback on the performance of the patch series so I'm not
> worried about eliminating the lock (it's very rarely contended). Hope
> Michael and Paolo are okay with this approach.
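
For readers unfamiliar with the mutex+condvar draining pattern mentioned in the note above, here is a minimal sketch of the general idea. It is not the code from the series: `DrainSketch` and all function names are hypothetical, and the completion thread is a stand-in for whatever reaps finished I/O.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical in-flight request tracker; not a QEMU structure. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t idle;       /* signalled when num_inflight reaches zero */
    unsigned int num_inflight;
} DrainSketch;

static void drain_init(DrainSketch *d)
{
    pthread_mutex_init(&d->lock, NULL);
    pthread_cond_init(&d->idle, NULL);
    d->num_inflight = 0;
}

/* Called when a request is submitted. */
static void request_begin(DrainSketch *d)
{
    pthread_mutex_lock(&d->lock);
    d->num_inflight++;
    pthread_mutex_unlock(&d->lock);
}

/* Called when a request completes; wakes drainers once nothing is left. */
static void request_complete(DrainSketch *d)
{
    pthread_mutex_lock(&d->lock);
    assert(d->num_inflight > 0);
    if (--d->num_inflight == 0) {
        pthread_cond_broadcast(&d->idle);
    }
    pthread_mutex_unlock(&d->lock);
}

/* Block until every in-flight request has completed. */
static void drain(DrainSketch *d)
{
    pthread_mutex_lock(&d->lock);
    while (d->num_inflight > 0) {   /* loop guards against spurious wakeups */
        pthread_cond_wait(&d->idle, &d->lock);
    }
    pthread_mutex_unlock(&d->lock);
}

/* Stand-in completion thread: completes requests until none remain. */
static void *completion_thread(void *opaque)
{
    DrainSketch *d = opaque;

    for (;;) {
        pthread_mutex_lock(&d->lock);
        int done = (d->num_inflight == 0);
        pthread_mutex_unlock(&d->lock);
        if (done) {
            break;
        }
        request_complete(d);
    }
    return NULL;
}
```

The lock is only taken briefly on submit and complete, which is consistent with the observation above that it is rarely contended; a lock-free counter would avoid it at the cost of more subtle code.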
>
> v4:
> * Add qemu_iovec_concat_iov() [Paolo]
> * Use QEMUIOVector to copy out virtio_blk_inhdr [Michael, Paolo]
>
> v3:
> * Don't assume iovec layout [Michael]
> * Better naming for hostmem.c MemoryListener callbacks [Don]
> * More vring quarantining if commands are bogus instead of exiting [Blue]
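
As an illustration of quarantining a vring rather than exiting, the following sketch marks the ring as broken when the guest supplies an out-of-range descriptor index and refuses all later requests. `VringSketch`, its fields, and the choice of `-EFAULT` as the error value are assumptions made for this example, not taken from the patches.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced vring state. */
typedef struct {
    uint16_t num;   /* ring size, i.e. number of descriptors */
    bool broken;    /* set once the guest has supplied bogus data */
} VringSketch;

/*
 * Validate a descriptor head index taken from guest memory.
 * Returns 0 if usable, or a negative errno if the ring is (now) quarantined.
 */
static int vring_sketch_check_head(VringSketch *vring, uint16_t head)
{
    assert(vring != NULL);

    if (vring->broken) {
        return -EFAULT;         /* stay quarantined until the device is reset */
    }
    if (head >= vring->num) {
        vring->broken = true;   /* bogus index: quarantine instead of exit() */
        return -EFAULT;
    }
    return 0;
}
```

The point of the design is that a misbehaving guest only loses its own device instead of taking down the whole QEMU process.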
>
> v2:
> * Use MemoryListener for thread-safe memory mapping [Paolo, Anthony, and
> everyone else pointed this out ;-)]
> * Quarantine invalid vring instead of exiting [Blue]
> * Replace __u16 kernel types with uint16_t [Blue]
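
To make the host memory mapping idea concrete, here is a sketch of the lookup side only: translating a guest physical range into a host pointer by binary search over a sorted region table. The structure and function names are hypothetical, and the parts that matter for thread safety (updating the table from MemoryListener callbacks and locking around lookups) are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor; not the layout used in hostmem.c. */
typedef struct {
    uint64_t guest_addr;  /* guest physical start address */
    uint64_t size;        /* length in bytes */
    uint8_t *host_addr;   /* where the region is mapped in this process */
    int readonly;         /* non-zero if writes must be rejected */
} RegionSketch;

/*
 * Translate [phys, phys + len) to a host pointer.
 * regions[] must be sorted by guest_addr and must not overlap.
 * Returns NULL if the range is unmapped, spans more than one region,
 * or is_write is set for a read-only region.
 */
static void *sketch_hostmem_lookup(const RegionSketch *regions, size_t n,
                                   uint64_t phys, uint64_t len, int is_write)
{
    size_t lo = 0, hi = n;

    assert(regions != NULL || n == 0);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const RegionSketch *r = &regions[mid];

        if (phys < r->guest_addr) {
            hi = mid;
        } else if (phys - r->guest_addr >= r->size) {
            lo = mid + 1;
        } else {
            uint64_t offset = phys - r->guest_addr;

            if (len > r->size - offset) {
                return NULL;    /* range runs past the end of the region */
            }
            if (is_write && r->readonly) {
                return NULL;
            }
            return r->host_addr + offset;
        }
    }
    return NULL;
}
```

Because a lookup fails when a range crosses a region boundary, callers need to translate each buffer separately rather than assume that adjacent guest addresses are adjacent in host memory.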
>
> Changes from the RFC v9:
> * Add x-data-plane=on|off option and coexist with regular virtio-blk code
> * Create thread from BH so it inherits iothread cpusets
> * Drain requests on vm_stop() so stopped guest does not access image file
> * Add migration blocker
> * Add bdrv_in_use() to prevent block jobs and other operations that can interfere
> * Drop IOQueue request merging for simplicity
> * Drop ioctl interrupt injection and always use irqfd for simplicity
> * Major cleanup to split up source files
> * Rebase from qemu-kvm.git onto qemu.git
> * Address Michael Tsirkin's review comments
>
> Stefan Hajnoczi (11):
> raw-posix: add raw_get_aio_fd() for virtio-blk-data-plane
> configure: add CONFIG_VIRTIO_BLK_DATA_PLANE
> dataplane: add host memory mapping code
> dataplane: add virtqueue vring code
> dataplane: add event loop
> dataplane: add Linux AIO request queue
> iov: add iov_discard() to remove data
> test-iov: add iov_discard() testcase
> iov: add qemu_iovec_concat_iov()
> dataplane: add virtio-blk data plane code
> virtio-blk: add x-data-plane=on|off performance feature
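
The patch titles above only say that iov_discard() removes data from an iovec. As a rough idea of what such a helper can look like, the sketch below trims bytes from the end of an iovec array. The name `sketch_iov_discard_back()`, its signature, and the choice to trim from the back are assumptions for illustration; the helper in the series may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Remove up to `bytes` bytes from the end of iov[0..*iov_cnt).
 * Elements that become empty are dropped by decrementing *iov_cnt.
 * Returns the number of bytes actually removed, which is smaller than
 * `bytes` if the iovec was shorter than requested.
 */
static size_t sketch_iov_discard_back(struct iovec *iov, unsigned int *iov_cnt,
                                      size_t bytes)
{
    size_t total = 0;

    assert(iov_cnt != NULL);

    while (*iov_cnt > 0 && bytes > 0) {
        struct iovec *last = &iov[*iov_cnt - 1];

        if (last->iov_len > bytes) {
            last->iov_len -= bytes;   /* partial trim of the last element */
            total += bytes;
            break;
        }
        bytes -= last->iov_len;       /* element is consumed entirely */
        total += last->iov_len;
        (*iov_cnt)--;
    }
    return total;
}
```

A helper like this lets a device strip a fixed-size trailer from a guest-supplied scatter-gather list without assuming how the guest split its buffers.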
>
>  block.h                    |   9 +
>  block/raw-posix.c          |  34 ++++
>  configure                  |  21 ++
>  hw/Makefile.objs           |   2 +-
>  hw/dataplane/Makefile.objs |   3 +
>  hw/dataplane/event-poll.c  | 109 +++++++++++
>  hw/dataplane/event-poll.h  |  40 ++++
>  hw/dataplane/hostmem.c     | 173 +++++++++++++++++
>  hw/dataplane/hostmem.h     |  57 ++++++
>  hw/dataplane/ioq.c         | 118 ++++++++++++
>  hw/dataplane/ioq.h         |  57 ++++++
>  hw/dataplane/virtio-blk.c  | 463 +++++++++++++++++++++++++++++++++++++++++++++
>  hw/dataplane/virtio-blk.h  |  43 +++++
>  hw/dataplane/vring.c       | 361 +++++++++++++++++++++++++++++++++++
>  hw/dataplane/vring.h       |  63 ++++++
>  hw/virtio-blk.c            |  28 ++-
>  hw/virtio-blk.h            |   1 +
>  hw/virtio-pci.c            |   3 +
>  iov.c                      |  80 ++++++--
>  iov.h                      |  13 ++
>  qemu-common.h              |   3 +
>  tests/test-iov.c           | 129 +++++++++++++
>  trace-events               |   9 +
>  23 files changed, 1805 insertions(+), 14 deletions(-)
> create mode 100644 hw/dataplane/Makefile.objs
> create mode 100644 hw/dataplane/event-poll.c
> create mode 100644 hw/dataplane/event-poll.h
> create mode 100644 hw/dataplane/hostmem.c
> create mode 100644 hw/dataplane/hostmem.h
> create mode 100644 hw/dataplane/ioq.c
> create mode 100644 hw/dataplane/ioq.h
> create mode 100644 hw/dataplane/virtio-blk.c
> create mode 100644 hw/dataplane/virtio-blk.h
> create mode 100644 hw/dataplane/vring.c
> create mode 100644 hw/dataplane/vring.h
>
> --
> 1.8.0.1