From: Vincenzo Maffione
Subject: Re: [Qemu-devel] [PATCH v2 0/3] virtio: proposal to optimize accesses to VQs
Date: Wed, 16 Dec 2015 11:39:46 +0100

2015-12-16 10:34 GMT+01:00 Paolo Bonzini <address@hidden>:
>
>
> On 16/12/2015 10:28, Vincenzo Maffione wrote:
>> Considering my TX experiments with a disconnected backend (and with
>> dynamic CPU performance scaling disabled, etc.):
>>   1) after patches 1 and 2, the virtio bottleneck jumps from ~1 Mpps to 1.910 Mpps;
>>   2) after patches 1, 2 and 3, the virtio bottleneck jumps to 2.039 Mpps.
>>
>> So I see an improvement for patch 3, and I guess it's because we avoid
>> an additional memory translation and related overhead. I believe that
>> avoiding the memory translation is more beneficial than avoiding the
>> variable-sized memcpy.
>> I'm not surprised by that, because a brief look at what happens
>> under the hood when you call an access_memory() function shows that
>> it involves a lot of operations.
>
> Great, thanks for confirming!
>
> Paolo

No problem.

I have some additional (orthogonal) curiosities:

  1) Assuming "hw/virtio/dataplane/vring.c" is what I think it is (the VQ
data structures made directly accessible in host virtual memory, with the
guest-physical-to-host-virtual mapping done statically at setup time),
why isn't QEMU also using this approach for virtio-net? I see it is
used by virtio-blk only. A rough sketch of the idea is just below.
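
Just to make the idea concrete, here is a minimal sketch of what I mean
(names are made up, this is not the actual vring.c API): the ring is
translated to host virtual memory once at setup, and the datapath then
dereferences plain host pointers.

    /* Sketch only: hypothetical names, not the actual vring.c code.
     * The ring is translated guest-physical -> host-virtual once at
     * setup; the datapath then uses plain host pointers. */
    #include <stdint.h>

    struct vring_desc_sketch {            /* simplified virtio descriptor */
        uint64_t addr;
        uint32_t len;
        uint16_t flags;
        uint16_t next;
    };

    struct vring_sketch {
        struct vring_desc_sketch *desc;   /* host virtual pointer, valid for
                                             the lifetime of the mapping */
        uint16_t num;
    };

    /* Called once at setup time; 'ring_hva' is the guest ring already
     * mapped into host virtual memory (e.g. through the memory API). */
    static void vring_sketch_init(struct vring_sketch *vr,
                                  void *ring_hva, uint16_t num)
    {
        vr->desc = ring_hva;
        vr->num  = num;
    }

    /* Datapath access: no per-descriptor address translation left. */
    static struct vring_desc_sketch *
    vring_sketch_desc(struct vring_sketch *vr, uint16_t i)
    {
        return &vr->desc[i % vr->num];
    }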

  2) In any case (vring or not), QEMU dynamically maps data buffers
from guest physical memory for each descriptor to be processed: e1000
uses pci_dma_read()/pci_dma_write(), virtio uses
cpu_physical_memory_map()/cpu_physical_memory_unmap(), and vring uses the
more specialized vring_map()/vring_unmap(). All of these go through
expensive lookups and related operations to do the address
translation.
Have you considered caching the translation result to remove this
bottleneck (maybe just for virtio devices)? Or is there some
consistency or migration-related problem that would create issues?
Just to give an example of what I'm talking about (a sketch of the same
idea also follows below):
https://github.com/vmaffione/qemu/blob/master/hw/net/e1000.c#L349-L423
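
The kind of caching I have in mind looks roughly like this (again just a
sketch with made-up names, not an existing QEMU API; full_translate()
stands in for the expensive path such as cpu_physical_memory_map(), and
the cache would of course have to be invalidated whenever the guest
memory map changes):

    /* Sketch only: remember the last guest-physical -> host-virtual
     * translation and reuse it while accesses stay inside the same
     * contiguous RAM region. */
    #include <stdbool.h>
    #include <stdint.h>

    typedef uint64_t gpa_t;              /* stand-in for QEMU's hwaddr */

    struct xlat_cache {
        gpa_t    base;                   /* start of the cached guest region */
        uint64_t len;                    /* length of the cached region */
        uint8_t *hva;                    /* host virtual address of 'base' */
        bool     valid;
    };

    /* Hypothetical slow path: does the full translation (think
     * cpu_physical_memory_map()) and returns the host address of the
     * containing RAM region, filling in its bounds. */
    uint8_t *full_translate(gpa_t gpa, gpa_t *region_base, uint64_t *region_len);

    static inline void *gpa_to_hva(struct xlat_cache *c, gpa_t gpa, uint64_t len)
    {
        if (!c->valid || gpa < c->base || gpa + len > c->base + c->len) {
            c->hva   = full_translate(gpa, &c->base, &c->len);  /* slow path */
            c->valid = (c->hva != NULL);
            if (!c->valid) {
                return NULL;
            }
        }
        return c->hva + (gpa - c->base);                        /* fast path */
    }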

At very high packet rates, once notifications (kicks and interrupts)
have been amortized in some way, memory translation becomes the major
bottleneck. And this (points 1 and 2) is why the QEMU virtio
implementation cannot achieve the same throughput as bhyve does
(5-6 Mpps or more, IIRC).

Cheers,
  Vincenzo



-- 
Vincenzo Maffione


