Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify


From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH v5 2/4] virtio-pci: Use ioeventfd for virtqueue notify
Date: Tue, 25 Jan 2011 13:51:04 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.15) Gecko/20101027 Lightning/1.0b1 Thunderbird/3.0.10

On 01/25/2011 01:45 PM, Stefan Hajnoczi wrote:
On Tue, Jan 25, 2011 at 7:18 PM, Anthony Liguori <address@hidden> wrote:
On 01/25/2011 03:49 AM, Stefan Hajnoczi wrote:
On Tue, Jan 25, 2011 at 7:12 AM, Stefan Hajnoczi <address@hidden> wrote:

On Mon, Jan 24, 2011 at 8:05 PM, Kevin Wolf <address@hidden> wrote:

On 24.01.2011 20:47, Michael S. Tsirkin wrote:

On Mon, Jan 24, 2011 at 08:48:05PM +0100, Kevin Wolf wrote:

On 24.01.2011 20:36, Michael S. Tsirkin wrote:

On Mon, Jan 24, 2011 at 07:54:20PM +0100, Kevin Wolf wrote:

On 12.12.2010 16:02, Stefan Hajnoczi wrote:

Virtqueue notify is currently handled synchronously in userspace virtio.
This prevents the vcpu from executing guest code while hardware emulation
code handles the notify.

On systems that support KVM, the ioeventfd mechanism can be used to make
virtqueue notify a lightweight exit by deferring hardware emulation to the
iothread and allowing the VM to continue execution.  This model is similar
to how vhost receives virtqueue notifies.
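
A minimal sketch of the mechanism, for readers who have not seen it (this
is not the patch itself, and vm_fd, notify_addr and vq_index are assumed
inputs): an eventfd is bound to a guest PIO address with the KVM_IOEVENTFD
ioctl, so a virtqueue notify write completes inside the kernel and merely
signals the fd.

#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int assign_virtqueue_ioeventfd(int vm_fd, uint64_t notify_addr,
                                      uint16_t vq_index)
{
    int efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    if (efd < 0) {
        return -1;
    }

    struct kvm_ioeventfd args;
    memset(&args, 0, sizeof(args));
    args.addr      = notify_addr;  /* guest PIO address of the notify register */
    args.len       = 2;            /* the guest writes a 16-bit queue index */
    args.datamatch = vq_index;     /* only match this virtqueue's notifies */
    args.fd        = efd;
    args.flags     = KVM_IOEVENTFD_FLAG_PIO | KVM_IOEVENTFD_FLAG_DATAMATCH;

    if (ioctl(vm_fd, KVM_IOEVENTFD, &args) < 0) {
        close(efd);
        return -1;
    }

    /* The iothread polls efd and runs virtio emulation when it becomes
     * readable; the vcpu takes only a lightweight in-kernel exit instead
     * of returning to userspace for the notify. */
    return efd;
}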

The result of this change is improved performance for userspace virtio
devices.  Virtio-blk throughput increases especially for multithreaded
scenarios and virtio-net transmit throughput increases substantially.

Some virtio devices are known to have guest drivers which expect a notify
to be processed synchronously and spin waiting for completion.  Only
enable ioeventfd for virtio-blk and virtio-net for now.

Care must be taken not to interfere with vhost-net, which uses host
notifiers.  If the set_host_notifier() API is used by a device,
virtio-pci will disable virtio-ioeventfd and let the device deal with
host notifiers as it wishes.

After migration and on VM change state (running/paused), virtio-ioeventfd
will enable/disable itself:

 * VIRTIO_CONFIG_S_DRIVER_OK -> enable virtio-ioeventfd
 * !VIRTIO_CONFIG_S_DRIVER_OK -> disable virtio-ioeventfd
 * virtio_pci_set_host_notifier() -> disable virtio-ioeventfd
 * vm_change_state(running=0) -> disable virtio-ioeventfd
 * vm_change_state(running=1) -> enable virtio-ioeventfd

Signed-off-by: Stefan Hajnoczi <address@hidden>
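
The enable/disable rules above boil down to a single predicate.  A rough
sketch, using hypothetical helper names rather than the patch's actual
functions:

#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_CONFIG_S_DRIVER_OK 4   /* from the virtio spec */

struct vq_ioeventfd_state {
    uint8_t status;               /* virtio device status byte */
    bool    vm_running;           /* tracked via vm_change_state(running=...) */
    bool    host_notifier_taken;  /* a set_host_notifier() user, e.g. vhost-net */
    bool    started;
};

/* Hypothetical start/stop primitives standing in for the real ones. */
static void start_ioeventfd(struct vq_ioeventfd_state *s) { s->started = true; }
static void stop_ioeventfd(struct vq_ioeventfd_state *s)  { s->started = false; }

/* Re-evaluate whether ioeventfd should be active, per the rules above. */
static void update_ioeventfd(struct vq_ioeventfd_state *s)
{
    bool want = (s->status & VIRTIO_CONFIG_S_DRIVER_OK) &&
                s->vm_running &&
                !s->host_notifier_taken;

    if (want && !s->started) {
        start_ioeventfd(s);
    } else if (!want && s->started) {
        stop_ioeventfd(s);
    }
}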

On current git master I'm getting hangs when running iozone on a
virtio-blk disk. "Hang" means that it's not responsive any more and has
100% CPU consumption.

I bisected the problem to this patch. Any ideas?

Kevin

Does it help if you set ioeventfd=off on the command line?

Yes, with ioeventfd=off it seems to work fine.

Kevin

Then it's the ioeventfd that is to blame.
Is it the io thread that consumes 100% CPU?
Or the vcpu thread?

I was building with the default options, i.e. there is no IO thread.

Now I'm just running the test with IO threads enabled, and so far
everything looks good. So I can only reproduce the problem with IO
threads disabled.

Hrm...aio uses SIGUSR2 to force the vcpu to process aio completions
(relevant when --enable-io-thread is not used).  I will take a look at
that again and see why we're spinning without checking for ioeventfd
completion.

Here's my understanding of --disable-io-thread.  Added Anthony on CC,
please correct me.

When the I/O thread is disabled, our only thread runs guest code until an
exit request is made.  There are synchronous exit cases, like a halt
instruction or single-stepping.  There are also asynchronous exit cases
where signal handlers use qemu_notify_event(), which does cpu_exit(), to
set env->exit_request = 1 and unlink the current TB.

Correct.

Note that this is a problem today.  If you have a tight loop in TCG and
you have nothing that would generate a signal (no pending disk I/O and no
periodic timer) then the main loop is starved.

Even with KVM we can spin inside the guest and get a softlockup due to
the dynticks race condition shown above.  In a CPU-bound guest that's
doing no I/O, it's possible to go AWOL for extended periods of time.

This is a different race.  I need to look more deeply into the code.

I can think of two solutions:
1. Block SIGALRM during critical regions (sketched below); not sure if the
necessary atomic signal mask capabilities are there in KVM.  Haven't
looked at TCG yet either.
2. Make a portion of the timer code signal-safe and rearm the timer from
within the SIGALRM handler.
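
Option 1 needs nothing beyond plain POSIX signal masking.  A minimal
sketch, assuming the critical region is passed in as a callback:

#include <signal.h>

/* Run a non-reentrant piece of timer bookkeeping with SIGALRM blocked; any
 * alarm that fires meanwhile is delivered when the old mask is restored. */
static void with_sigalrm_blocked(void (*body)(void *), void *opaque)
{
    sigset_t block, old;

    sigemptyset(&block);
    sigaddset(&block, SIGALRM);

    sigprocmask(SIG_BLOCK, &block, &old);
    body(opaque);
    sigprocmask(SIG_SETMASK, &old, NULL);
}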

Or, switch to timerfd and stop using a signal-based alarm timer.

Regards,

Anthony Liguori
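
The timerfd approach looks roughly like this (standard Linux calls only,
not QEMU's timer code): expirations become a readable fd that the main
loop can poll alongside ioeventfds, so no signal handler is needed at all.

#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/timerfd.h>

/* Create a periodic timer whose expirations show up as a readable fd.
 * interval_ns is assumed to be less than one second for simplicity. */
static int create_alarm_timerfd(long interval_ns)
{
    int tfd = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK | TFD_CLOEXEC);
    if (tfd < 0) {
        return -1;
    }

    struct itimerspec its = {
        .it_value    = { .tv_sec = 0, .tv_nsec = interval_ns },  /* first expiry */
        .it_interval = { .tv_sec = 0, .tv_nsec = interval_ns },  /* then periodic */
    };
    if (timerfd_settime(tfd, 0, &its, NULL) < 0) {
        close(tfd);
        return -1;
    }
    return tfd;   /* add this fd to the main loop's poll/select set */
}

/* When the fd becomes readable, drain the 8-byte expiration counter and
 * run any expired timers. */
static void on_timerfd_readable(int tfd)
{
    uint64_t expirations;
    ssize_t r = read(tfd, &expirations, sizeof(expirations));
    (void)r;
}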



Stefan



