From: Christian Borntraeger
Subject: Re: [Qemu-devel] [PATCH v4 04/13] virtio: poll virtqueues for new buffers
Date: Mon, 5 Dec 2016 12:12:49 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0

On 12/05/2016 11:14 AM, Stefan Hajnoczi wrote:
> On Mon, Dec 05, 2016 at 08:54:59AM +0800, Fam Zheng wrote:
>> On Thu, 12/01 19:26, Stefan Hajnoczi wrote:
>>> Add an AioContext poll handler to detect new virtqueue buffers without
>>> waiting for a guest->host notification.
>>>
>>> Signed-off-by: Stefan Hajnoczi <address@hidden>
>>> ---
>>>  hw/virtio/virtio.c | 16 +++++++++++++++-
>>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>>> index 6f8ca25..3870411 100644
>>> --- a/hw/virtio/virtio.c
>>> +++ b/hw/virtio/virtio.c
>>> @@ -2047,13 +2047,27 @@ static void virtio_queue_host_notifier_aio_read(EventNotifier *n)
>>>      }
>>>  }
>>>  
>>> +static bool virtio_queue_host_notifier_aio_poll(void *opaque)
>>> +{
>>> +    EventNotifier *n = opaque;
>>> +    VirtQueue *vq = container_of(n, VirtQueue, host_notifier);
>>> +
>>> +    if (virtio_queue_empty(vq)) {
>>
>> I notice this call path gets very hot in the profile:
>>
>> #0  address_space_translate_internal (d=0x7f853c22e970, addr=2140788738, address@hidden, address@hidden, address@hidden) at /root/qemu-nvme/exec.c:419
>> #1  0x00007f854d3e8449 in address_space_translate (as=<optimized out>, addr=2140788738, address@hidden, address@hidden, address@hidden) at /root/qemu-nvme/exec.c:462
>> #2  0x00007f854d3ee2fd in address_space_lduw_internal (endian=DEVICE_LITTLE_ENDIAN, result=0x0, attrs=..., addr=<optimized out>, as=<optimized out>) at /root/qemu-nvme/exec.c:3299
>> #3  address_space_lduw_le (result=0x0, attrs=..., addr=<optimized out>, as=<optimized out>) at /root/qemu-nvme/exec.c:3351
>> #4  lduw_le_phys (as=<optimized out>, addr=<optimized out>) at /root/qemu-nvme/exec.c:3369
>> #5  0x00007f854d475978 in virtio_lduw_phys (vdev=<optimized out>, pa=<optimized out>) at /root/qemu-nvme/include/hw/virtio/virtio-access.h:46
>> #6  vring_avail_idx (vq=<optimized out>, vq=<optimized out>) at /root/qemu-nvme/hw/virtio/virtio.c:143
>> #7  virtio_queue_empty (address@hidden) at /root/qemu-nvme/hw/virtio/virtio.c:246
>> #8  0x00007f854d475d20 in virtio_queue_empty (vq=0x7f854fd0c9e0) at /root/qemu-nvme/hw/virtio/virtio.c:2074
>> #9  virtio_queue_host_notifier_aio_poll (opaque=0x7f854fd0ca48) at /root/qemu-nvme/hw/virtio/virtio.c:2068
>> #10 0x00007f854d69116e in run_poll_handlers_once (ctx=<optimized out>) at aio-posix.c:493
>> #11 0x00007f854d691b08 in run_poll_handlers (max_ns=<optimized out>, ctx=0x7f854e908850) at aio-posix.c:530
>> #12 try_poll_mode (blocking=true, ctx=0x7f854e908850) at aio-posix.c:558
>> #13 aio_poll (ctx=0x7f854e908850, address@hidden) at aio-posix.c:601
>> #14 0x00007f854d4ff12e in iothread_run (opaque=0x7f854e9082f0) at iothread.c:53
>> #15 0x00007f854c249dc5 in start_thread () from /lib64/libpthread.so.0
>> #16 0x00007f854acd973d in clone () from /lib64/libc.so.6
>>
>> Is it too costly to do an address_space_translate per poll?
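
To make the profile concrete, here is a rough sketch (pieced together from
the frames above as an assumption, not the literal QEMU code) of what every
poll iteration does just to find out whether avail->idx has moved:

    /* Sketch only: simplified shape of the per-poll read implied by the
     * backtrace, reusing QEMU-internal types (VirtQueue, VRingAvail,
     * hwaddr) from hw/virtio/virtio.c.  The helper name is made up. */
    static uint16_t vring_avail_idx_sketch(VirtQueue *vq)
    {
        /* avail->idx sits at a fixed guest-physical address ... */
        hwaddr pa = vq->vring.avail + offsetof(VRingAvail, idx);

        /* ... but each load goes virtio_lduw_phys() -> lduw_le_phys() ->
         * address_space_lduw_le(), so every poll pays for a fresh
         * address_space_translate() of an address that never changes. */
        return virtio_lduw_phys(vq->vdev, pa);
    }
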
> 
> Good insight.
> 
> If we map guest RAM we should consider mapping the vring in
> hw/virtio/virtio.c.  That way it's not just polling that benefits.
> 
> The map/unmap API was not designed for repeated memory accesses.  In the
> bounce-buffer case you have a stale copy: repeated reads will not
> update it.  Vrings should be in guest RAM, so we don't really need to
> support bounce buffering, but it's still a hack to use the map/unmap API.
> 
> Paolo: Thoughts about the memory API?


Yes, I see the same on s390. This path is the main CPU eater in the host,
even in the non-polling case, as soon as you have fast enough disks. If we
could "abuse" polling to actually lower this overhead, that would be
interesting.

On the other hand, reducing the overhead of the memory API would be even
better.
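
To make the "map the vring" idea above concrete, a minimal sketch, assuming
the address_space_map()/address_space_unmap() signatures from exec.c at the
time of this series; vring_avail_hva is a hypothetical VirtQueue field,
address_space_memory sidesteps any vIOMMU, and the helper names are made up:

    /* Sketch only: map the avail ring once and cache a host pointer so the
     * poll fast path becomes a plain load instead of a memory-API walk. */
    static bool vring_map_avail_sketch(VirtQueue *vq)
    {
        hwaddr want = offsetof(VRingAvail, ring) +
                      vq->vring.num * sizeof(uint16_t);
        hwaddr len = want;
        void *hva = address_space_map(&address_space_memory,
                                      vq->vring.avail, &len, false);

        if (!hva || len < want) {
            /* Split mapping; a real version must also refuse a bounce
             * buffer, which would be exactly the stale copy mentioned
             * above.  Fall back to the lduw_phys path instead. */
            if (hva) {
                address_space_unmap(&address_space_memory, hva, len,
                                    false, 0);
            }
            return false;
        }
        vq->vring_avail_hva = hva;      /* hypothetical cached pointer */
        return true;
    }

    /* Poll fast path once mapped: no address_space_translate() per call. */
    static inline uint16_t vring_avail_idx_cached(VirtQueue *vq)
    {
        VRingAvail *avail = vq->vring_avail_hva;
        return lduw_le_p(&avail->idx);
    }

The fallback and staleness handling is what makes this feel like a hack on
top of map/unmap, which is why an improvement in the memory API itself would
be the nicer long-term answer.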


Christian



