Re: [Qemu-devel] Linux kernel polling for QEMU


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] Linux kernel polling for QEMU
Date: Wed, 30 Nov 2016 09:38:00 +0000
User-agent: Mutt/1.7.1 (2016-10-04)

On Wed, Nov 30, 2016 at 01:42:14PM +0800, Fam Zheng wrote:
> On Tue, 11/29 20:43, Stefan Hajnoczi wrote:
> > On Tue, Nov 29, 2016 at 1:24 PM, Fam Zheng <address@hidden> wrote:
> > > On Tue, 11/29 12:17, Paolo Bonzini wrote:
> > >> On 29/11/2016 11:32, Fam Zheng wrote:
> > >> * it still needs a system call before polling is entered.  Ideally, QEMU
> > >> could run without any system call while in polling mode.
> > >>
> > >> Another possibility is to add a system call for single_task_running().
> > >> It should be simple enough that you can implement it in the vDSO and
> > >> avoid a context switch.  There are convenient hooking points in
> > >> add_nr_running and sub_nr_running.
> > >
> > > That sounds good!
> > 
> > With this solution QEMU can either poll virtqueues or the host kernel
> > can poll NIC and storage controller descriptor rings, but not both at
> > the same time in one thread.  This is one of the reasons why I think
> > exploring polling in the kernel makes more sense.
> 
> That's true. I have one question though: controller rings are in a different
> layer in the kernel, I wonder what the syscall interface looks like to ask
> kernel to poll both hardware rings and memory locations in the same loop? It's
> not obvious to me after reading your eventfd patch.

Currently, descriptor ring polling in select(2)/poll(2) is supported for
network sockets.  Take a look at the POLL_BUSY_LOOP flag in
fs/select.c:do_poll().  A .poll() callback sets this flag to indicate
that the fd supports busy loop polling.

The way this is implemented for network sockets is that the socket looks
up the napi index and is able to use the NIC driver to poll the rx ring.
Then it checks whether the socket's receive queue contains data after
the rx ring was processed.
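
To make the control flow concrete, here is a simplified userspace model
of that loop.  It is only a sketch, not the kernel code: every name in it
(model_fd, MODEL_BUSY_LOOP, model_do_poll, and so on) is made up for
illustration, and the busy-loop budget is counted in passes rather than
in time so that the behaviour is deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event bits for this model (not the kernel's values). */
#define MODEL_POLLIN    0x1u  /* data is ready to read */
#define MODEL_BUSY_LOOP 0x2u  /* fd supports busy loop polling */

struct model_fd {
    /* Stand-in for the file's .poll() callback. */
    unsigned (*poll)(struct model_fd *fd);
    int ready_after;  /* callback reports data after this many calls */
    int calls;        /* how many times the callback has run */
};

/* Models a socket: each call "processes the rx ring", then checks for data. */
static unsigned model_socket_poll(struct model_fd *fd)
{
    unsigned mask = MODEL_BUSY_LOOP;

    if (++fd->calls > fd->ready_after)
        mask |= MODEL_POLLIN;
    return mask;
}

/* Models an fd that never advertises busy loop support. */
static unsigned model_plain_poll(struct model_fd *fd)
{
    fd->calls++;
    return 0;
}

/*
 * Returns the number of ready fds, or 0 at the point where the real
 * code would stop spinning and go to sleep.
 */
static size_t model_do_poll(struct model_fd *fds, size_t nfds, int busy_budget)
{
    assert(fds != NULL || nfds == 0);

    for (;;) {
        size_t count = 0;
        bool can_busy_loop = false;

        for (size_t i = 0; i < nfds; i++) {
            unsigned mask = fds[i].poll(&fds[i]);

            if (mask & MODEL_BUSY_LOOP)
                can_busy_loop = true;
            if (mask & MODEL_POLLIN)
                count++;
        }

        if (count > 0)
            return count;

        /* Spin again only if some fd asked for it and budget remains. */
        if (can_busy_loop && busy_budget-- > 0)
            continue;

        return 0;  /* the real loop would block here */
    }
}
```

The point of the model is that the decision to spin is driven entirely
by the per-fd callback: an fd that never sets the flag makes the loop
fall through to sleeping after a single pass.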

The virtio_net.ko driver supports this interface, for example.  See
drivers/net/virtio_net.c:virtnet_busy_poll().

Busy loop polling isn't supported for block I/O yet.  There is currently
a completely independent code path for O_DIRECT synchronous I/O where
NVMe can poll for request completion.  But it doesn't work together with
asynchronous I/O (e.g. Linux AIO using eventfd with select(2)/poll(2)).

> > The disadvantage of the kernel approach is that you must make the
> > ppoll(2)/epoll_wait(2) syscall even for polling, and you probably need
> > to do eventfd reads afterwards so the minimum event loop iteration
> > latency is higher than doing polling in userspace.
> 
> And userspace drivers powered by dpdk or vfio will still want to do polling in
> userspace anyway, we may want to take that into account as well.

vfio supports interrupts so it can definitely be integrated with
adaptive kernel polling (i.e. poll for a little while and then wait for
an interrupt if there was no event).
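
The "poll for a little while, then wait" pattern can be sketched in
userspace as follows.  This is an illustration under assumptions, not
vfio or QEMU code: it assumes the interrupt arrives as a non-blocking
eventfd, the helper name adaptive_wait() and its parameters are
hypothetical, and the spin phase still makes a read(2) call per
iteration, so it only shows the structure rather than a syscall-free
poll.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>  /* eventfd(), used by callers to create efd */
#include <time.h>
#include <unistd.h>

/* Monotonic clock in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Hypothetical helper: wait for an event on a non-blocking eventfd.
 * Returns 1 if the event was seen while spinning, 2 if it was seen
 * after blocking, 0 on timeout, and -1 on error.
 */
static int adaptive_wait(int efd, int64_t spin_ns, int block_timeout_ms)
{
    struct pollfd pfd = { .fd = efd, .events = POLLIN };
    uint64_t count;
    int64_t deadline;
    int ret;

    assert(spin_ns >= 0);
    deadline = now_ns() + spin_ns;

    /* Phase 1: spin for at most spin_ns, checking without sleeping. */
    do {
        ssize_t n = read(efd, &count, sizeof(count));

        if (n == (ssize_t)sizeof(count))
            return 1;
        if (n < 0 && errno != EAGAIN)
            return -1;
    } while (now_ns() < deadline);

    /* Phase 2: nothing arrived, so block until signalled or timed out. */
    ret = poll(&pfd, 1, block_timeout_ms);
    if (ret <= 0)
        return ret;
    if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return -1;
    return 2;
}
```

A caller would create the fd with eventfd(0, EFD_NONBLOCK) and pick
spin_ns to be roughly the latency it is willing to pay in CPU time
before giving the core back.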

Does dpdk ever use interrupts?

Stefan
