[Qemu-devel] [PATCH 0/2] virtio-scsi: Optimizing request allocation
From: Fam Zheng
Subject: [Qemu-devel] [PATCH 0/2] virtio-scsi: Optimizing request allocation
Date: Thu, 11 Sep 2014 18:16:37 +0800
Zeroing is relatively expensive on each request allocation since we have big
request structures. VirtQueueElement (> 4k) and sense_buf (256 bytes) are the
two fields to look at.
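To illustrate the general idea (not the actual patches), here is a minimal
sketch in C. The structure DemoReq, its fields, and both allocator functions
are hypothetical and only mimic the shape of a request with a few small
bookkeeping fields followed by large buffers. It assumes the large buffers are
always written before they are read, so only the small fields need zeroing.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request layout; these are NOT the real QEMU structures. */
typedef struct DemoReq {
    /* Small bookkeeping fields that must start out as zero. */
    uint32_t tag;
    uint32_t sense_len;   /* 0 means "sense_buf holds no valid data" */
    void    *opaque;

    /* Large buffers, assumed to be written before they are ever read. */
    uint8_t  sense_buf[256];
    uint8_t  elem[4096];  /* stand-in for a big queue element */
} DemoReq;

/* Baseline: zero the whole structure on every allocation. */
static DemoReq *demo_req_alloc_zeroed(void)
{
    return calloc(1, sizeof(DemoReq));
}

/* Alternative: zero only the fields placed before the large buffers. */
static DemoReq *demo_req_alloc_partial(void)
{
    DemoReq *req = malloc(sizeof(*req));
    if (req == NULL) {
        return NULL;
    }
    /* Everything from sense_buf onwards is left uninitialized. */
    memset(req, 0, offsetof(DemoReq, sense_buf));
    return req;
}
```

With this layout, the partial variant clears only the bytes in front of
sense_buf instead of the whole structure. The trade-off is that callers must
treat sense_len as the only indication of whether sense_buf is valid, and must
fill elem before using it.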
This series visibly reduces the overhead of request handling when testing with
the unmerged "null" driver and virtio-scsi dataplane. Before the change, the
issue is very obvious with perf top:
perf top -G -p `pidof qemu-system-x86_64`
-----------------------------------------
+ 16.50% libc-2.17.so [.] __memset_sse2
+ 2.28% libc-2.17.so [.] _int_malloc
+ 2.25% [vdso] [.] 0x0000000000000cd1
+ 2.02% [kernel] [k] _raw_spin_lock_irqsave
+ 1.97% libpthread-2.17.so [.] pthread_mutex_lock
+ 1.87% libpthread-2.17.so [.] pthread_mutex_unlock
+ 1.81% [kernel] [k] fget_light
+ 1.70% libc-2.17.so [.] malloc
After the change, the high __memset_sse2 and _int_malloc overhead is gone:
perf top -G -p `pidof qemu-system-x86_64`
-----------------------------------------
+ 4.20% [kernel] [k] vcpu_enter_guest
+ 3.97% [kernel] [k] vmx_vcpu_run
+ 2.63% [kernel] [k] _raw_spin_lock_irqsave
+ 1.72% [kernel] [k] native_read_msr_safe
+ 1.65% [kernel] [k] __srcu_read_lock
+ 1.64% [kernel] [k] _raw_spin_unlock_irqrestore
+ 1.57% [vdso] [.] 0x00000000000008d8
+ 1.49% libc-2.17.so [.] _int_malloc
+ 1.29% libpthread-2.17.so [.] pthread_mutex_unlock
+ 1.26% [kernel] [k] native_write_msr_safe
See the commit message of patch 2 for some fio test data.
Thanks,
Fam
Fam Zheng (2):
scsi: Optimize scsi_req_alloc
virtio-scsi: Optimize virtio_scsi_init_req
hw/scsi/scsi-bus.c | 7 +++++--
hw/scsi/virtio-scsi.c | 17 ++++++++++-------
include/hw/scsi/scsi.h | 21 ++++++++++++++-------
include/hw/virtio/virtio-scsi.h | 1 +
4 files changed, 30 insertions(+), 16 deletions(-)
--
1.9.3