
Re: [Qemu-devel] IO performance test on the tcm-vhost scsi


From: Asias He
Subject: Re: [Qemu-devel] IO performance test on the tcm-vhost scsi
Date: Fri, 15 Jun 2012 11:28:20 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120605 Thunderbird/13.0

On 06/14/2012 08:07 PM, Stefan Hajnoczi wrote:
On Thu, Jun 14, 2012 at 05:45:22PM +0800, Cong Meng wrote:
On Thu, 2012-06-14 at 09:30 +0100, Stefan Hajnoczi wrote:
On Wed, Jun 13, 2012 at 11:13 AM, mengcong <address@hidden> wrote:
                    seq-read        seq-write       rand-read     rand-write
                    8k     256k     8k     256k     8k   256k     8k   256k
----------------------------------------------------------------------------
bare-metal          67951  69802    67064  67075    1758 29284    1969 26360
tcm-vhost-iblock    61501  66575    51775  67872    1011 22533    1851 28216
tcm-vhost-pscsi     66479  68191    50873  67547    1008 22523    1818 28304
virtio-blk          26284  66737    23373  65735    1724 28962    1805 27774
scsi-disk           36013  60289    46222  62527    1663 12992    1804 27670


unit: KB/s
seq-read/write = sequential read/write
rand-read/write = random read/write
8k and 256k are the I/O block sizes
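
(The exact benchmark invocation is not shown in this thread; a job along these lines, sketched here with fio against /dev/sdb, would exercise one of the eight cases above. The parameters are illustrative only.)

fio --name=seq-read-8k --filename=/dev/sdb --rw=read --bs=8k \
    --direct=1 --ioengine=libaio --iodepth=32 \
    --runtime=60 --time_based --group_reporting
# vary --rw (read/write/randread/randwrite) and --bs (8k/256k)
# to cover the other combinations in the table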

What strikes me is how virtio-blk performs significantly worse than
bare metal and tcm_vhost for seq-read/seq-write 8k.  The good
tcm_vhost results suggest that the overhead is not the virtio
interface itself, since tcm_vhost implements virtio-scsi.

To drill down on the tcm_vhost vs userspace performance gap we need
virtio-scsi userspace results.  QEMU needs to use the same block
device as the tcm-vhost-iblock benchmark.

Cong: Is it possible to collect the virtio-scsi userspace results
using the same block device as tcm-vhost-iblock and -drive
format=raw,aio=native,cache=none?


virtio-scsi-raw     43065  69729    52052  67378    1757 29419    2024 28135

qemu ....\
-drive file=/dev/sdb,format=raw,if=none,id=sdb,cache=none,aio=native \
-device virtio-scsi-pci,id=mcbus \
-device scsi-disk,drive=sdb

There is only one SCSI HBA.
/dev/sdb is the disk on which all the tests have been run.

Is this what you want?

Perfect, thanks.  virtio-scsi userspace is much better than virtio-blk
here.  That's unexpected since they both use the QEMU block layer.  If
anything, I would have expected virtio-blk to be faster!

I wonder if the request patterns being sent through virtio-blk and
virtio-scsi are different.  Asias discovered that the guest I/O
scheduler and request merging makes a big difference between QEMU and
native KVM tool performance.  It could be the same thing here which
causes virtio-blk and virtio-scsi userspace to produce quite different
results.

Yes. Cong, can you try this:

echo noop > /sys/block/$disk/queue/scheduler
echo 2 > /sys/block/$disk/queue/nomerges

This disables request merging in the guest kernel. The host-side I/O processing speed has a large impact on the guest's request pattern, especially for sequential reads and writes.
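
You can confirm the settings took effect with something like:

cat /sys/block/$disk/queue/scheduler   # active scheduler is shown in brackets, e.g. [noop]
cat /sys/block/$disk/queue/nomerges    # 2 means merging is fully disabled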

The second question is why tcm_vhost is faster than virtio-scsi
userspace.

Stefan



--
Asias




