qemu-devel

Re: [Qemu-devel] Linux multiqueue block layer thoughts


From: Jens Axboe
Subject: Re: [Qemu-devel] Linux multiqueue block layer thoughts
Date: Wed, 27 Nov 2013 19:15:13 -0700

On Wed, Nov 27 2013, Stefan Hajnoczi wrote:
> I finally got around to reading the Linux multiqueue block layer paper
> and wanted to share some thoughts about how it relates to QEMU and
> dataplane/QContext:
> http://kernel.dk/blk-mq.pdf
> 
> I think Jens has virtio-blk multiqueue patches.  So let's imagine that
> the virtio-blk device has multiple virtqueues.  (virtio-scsi is
> already multiqueue BTW.)
> 
> The paper focusses on two queue mappings: 1 queue per core and 1 queue
> per node.  In both cases the idea is to keep the block I/O code path
> localized.  This makes block I/O scale as the number of CPUs
> increases.
> 
> In QEMU we'd want to set up a mapping for the virtio-blk mq device:
> each guest vcpu or guest node has a virtio-blk virtqueue which is
> serviced by a dataplane/QContext thread.
> 
> QEMU would then process requests across these queues in parallel,
> although currently BlockDriverState is not thread-safe.  At least for
> raw we should be able to submit requests in parallel from QEMU.
> 
> Unfortunately there are some complications in the QEMU block layer:
> QEMU's own accounting, request tracking, and throttling features are
> global.  We'd need to eventually do something similar to the
> multiqueue block layer changes in the kernel to detangle this state.
> 
> Doing multiqueue for image formats is much more challenging - we'd
> have to tackle thread-safety in qcow2 and friends.  For network block
> drivers like Gluster or NBD it's also not 100% clear what the best
> approach is.  But I think the target here is local SSDs that are
> capable of high IOPS together with an SMP guest.
> 
> At the end of all this we'd arrive at the following architecture:
> 1. Guest virtio device has multiple queues (1 per node or vcpu).
> 2. QEMU has multiple dataplane/QContext threads that process virtqueue
> kicks; they are bound to host CPUs/nodes.
> 3. Linux kernel has multiqueue block I/O.
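
[Illustration] The two queue mappings described above (one virtqueue per
guest vcpu, or one per guest node, each queue serviced by its own thread)
can be sketched roughly as follows. This is only a toy model in Python:
the function names and the vcpu-to-node table are hypothetical and do not
correspond to any QEMU or kernel API.

```python
# Toy model of the two virtqueue mappings; all names are hypothetical.

def map_per_vcpu(num_vcpus):
    """One virtqueue per guest vcpu: vcpu i submits on queue i."""
    return {vcpu: vcpu for vcpu in range(num_vcpus)}


def map_per_node(vcpu_node):
    """One virtqueue per guest node.

    vcpu_node maps each vcpu index to its (assumed) guest NUMA node id.
    Nodes are numbered densely so queue indexes start at 0.
    """
    nodes = sorted(set(vcpu_node.values()))
    queue_of_node = {node: queue for queue, node in enumerate(nodes)}
    return {vcpu: queue_of_node[node] for vcpu, node in vcpu_node.items()}


def vcpus_by_queue(vcpu_to_queue):
    """Group vcpus by queue; one servicing thread per queue is assumed."""
    groups = {}
    for vcpu, queue in sorted(vcpu_to_queue.items()):
        groups.setdefault(queue, []).append(vcpu)
    return groups


if __name__ == "__main__":
    topology = {0: 0, 1: 0, 2: 1, 3: 1}  # assumed: 4 vcpus on 2 nodes
    print(vcpus_by_queue(map_per_vcpu(4)))
    print(vcpus_by_queue(map_per_node(topology)))
```

With the per-vcpu mapping every queue has exactly one submitter, while
the per-node mapping trades some sharing within a node for fewer queues
and threads.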

I think that sounds very reasonable. Let me know if there's anything you
need help or advice with.

> Jens: when experimenting with multiqueue virtio-blk, how far did you
> modify QEMU to eliminate global request processing state from block.c?

I did very little scaling testing on virtio-blk; it was more a demo case
for conversion than anything else. So probably not of much use to what
you are looking for...

-- 
Jens Axboe



