From: | Denis V. Lunev |
Subject: | Re: [Qemu-block] [PATCH v5 0/2] block: enforce minimal 4096 alignment in qemu_blockalign |
Date: | Mon, 11 May 2015 19:07:17 +0300 |
User-agent: | Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 |
On 11/05/15 18:08, Stefan Hajnoczi wrote:
> On Mon, May 04, 2015 at 04:42:22PM +0300, Denis V. Lunev wrote:
>> The difference is quite reliable and the same 5%.
>>   qemu-io -n -c 'write -P 0xaa 0 1G' 1.img
>> for image in qcow2 format is 1% faster.
>
> I looked a little at the qemu-io invocation but am not clear why there
> would be a measurable performance difference. Can you explain?
>
> What about real qemu-img or QEMU use cases?
>
> I'm okay with the patches themselves, but I don't really understand why
> this code change is justified.
>
> Stefan
There is a problem in the Linux kernel when the buffer is not aligned to
the page size. Strictly speaking, the hard requirement is only alignment
to 512 bytes (one physical sector).

This matters for qemu-img and qemu-io, where the buffers are allocated
inside the application. QEMU itself is free of this problem, as the guest
already sends page-aligned buffers.

Below are the results for qemu-img; they are exactly the same as for
qemu-io.

  qemu-img create -f qcow2 1.img 64G
  qemu-io -n -c 'write -P 0xaa 0 1G' 1.img
  time for i in `seq 1 30` ; do /home/den/src/qemu/qemu-img convert 1.img -t none -O raw 2.img ; rm -rf 2.img ; done

==== without patches ====:
real    2m6.287s
user    0m1.322s
sys     0m8.819s

real    2m7.483s
user    0m1.614s
sys     0m9.096s

==== with patches ====:
real    1m59.715s
user    0m1.453s
sys     0m9.365s

real    1m58.739s
user    0m1.419s
sys     0m8.530s

I cannot say exactly where the difference comes from, but the problem
stems from the fact that a real IO operation on the block device should be
  a) page aligned for the buffer
  b) page aligned for the offset
This is how the buffer cache works in the kernel.

With a non-aligned buffer in userspace, the kernel has to assemble a
kernel page for IO from 2 userspace pages instead of one. Something is
not optimal here, I presume. My guess is that the user page can be sent
to the controller immediately if the buffer is aligned, with no
additional memory allocation needed. Though I don't know exactly.

Regards,
    Den
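
To illustrate the "2 userspace pages instead of one" point, here is a
minimal sketch that counts how many pages a buffer touches and allocates
a buffer with 4096-byte alignment. The names pages_spanned() and
alloc_io_buffer() are hypothetical and are not taken from the patches;
the fixed 4096 value mirrors the minimum from the patch subject and is
assumed to match the page size. This only shows the arithmetic, it does
not reproduce what the kernel does internally.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed minimal alignment, taken from the patch subject line. */
#define MIN_IO_ALIGN 4096

/*
 * Number of pages touched by the byte range [addr, addr + len).
 * Hypothetical helper, for illustration only.
 */
static size_t pages_spanned(uintptr_t addr, size_t len, size_t page_size)
{
    assert(page_size != 0);
    if (len == 0) {
        return 0;
    }
    uintptr_t first = addr / page_size;
    uintptr_t last = (addr + len - 1) / page_size;
    return (size_t)(last - first + 1);
}

/*
 * Allocate len bytes aligned to MIN_IO_ALIGN using posix_memalign().
 * Returns NULL on failure; release the buffer with free().
 */
static void *alloc_io_buffer(size_t len)
{
    void *buf = NULL;
    if (posix_memalign(&buf, MIN_IO_ALIGN, len) != 0) {
        return NULL;
    }
    return buf;
}
```

A 4096-byte buffer that starts on a page boundary touches one page,
while the same buffer starting 512 bytes into a page touches two, which
is the case described above.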