Re: [Qemu-block] [PATCH v5 0/2] block: enforce minimal 4096 alignment in

qemu-block

From:	Denis V. Lunev
Subject:	Re: [Qemu-block] [PATCH v5 0/2] block: enforce minimal 4096 alignment in qemu_blockalign
Date:	Mon, 11 May 2015 19:38:58 +0300
User-agent:	Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.6.0

On 11/05/15 19:07, Denis V. Lunev wrote:

On 11/05/15 18:08, Stefan Hajnoczi wrote:

On Mon, May 04, 2015 at 04:42:22PM +0300, Denis V. Lunev wrote:

The difference is quite reliable and the same 5%.
   qemu-io -n -c 'write -P 0xaa 0 1G' 1.img
for image in qcow2 format is 1% faster.

I looked a little at the qemu-io invocation but am not clear why there
would be a measurable performance difference.  Can you explain?

What about real qemu-img or QEMU use cases?

I'm okay with the patches themselves, but I don't really understand why
this code change is justified.

Stefan

There is a problem in the Linux kernel when the buffer
is not aligned to the page size. Strictly speaking, the only hard
requirement is alignment to 512 bytes (one physical sector).

This comes into play in qemu-img and qemu-io,
where buffers are allocated inside the application. QEMU itself
is free of this problem because the guest already sends
page-aligned buffers.
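To make the two alignment levels concrete, here is a minimal Python sketch. It is not QEMU code: the helper names is_aligned and buffer_address are made up for illustration, and the anonymous mmap is only one convenient way to get a page-aligned buffer in userspace.

```python
import ctypes
import mmap

SECTOR = 512            # hard minimum mentioned above (one sector)
PAGE = mmap.PAGESIZE    # page size of this machine, typically 4096 on x86


def is_aligned(addr: int, boundary: int) -> bool:
    """True if addr is a multiple of boundary (hypothetical helper)."""
    return addr % boundary == 0


def buffer_address(buf) -> int:
    """Address of the first byte of a writable buffer (hypothetical helper)."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf))


# An anonymous mapping starts on a page boundary by construction.
page_buf = mmap.mmap(-1, 1024 * 1024)
addr = buffer_address(page_buf)

print("page aligned:", is_aligned(addr, PAGE))
# A buffer starting one sector later still meets the 512-byte minimum,
# but it is no longer page aligned.
print("sector aligned only:",
      is_aligned(addr + SECTOR, SECTOR) and not is_aligned(addr + SECTOR, PAGE))
```

The second case is the situation described above for buffers allocated inside qemu-img and qemu-io: valid for the kernel, but not page aligned.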

You can see the qemu-img results below; they are exactly
the same as for qemu-io.

qemu-img create -f qcow2 1.img 64G
qemu-io -n -c 'write -P 0xaa 0 1G' 1.img

time for i in `seq 1 30` ; do /home/den/src/qemu/qemu-img convert 1.img -t none -O raw 2.img ; rm -rf 2.img ; done


==== without patches ====:
real    2m6.287s
user    0m1.322s
sys    0m8.819s

real    2m7.483s
user    0m1.614s
sys    0m9.096s

==== with patches ====:
real    1m59.715s
user    0m1.453s
sys    0m9.365s

real    1m58.739s
user    0m1.419s
sys    0m8.530s
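For reference, the relative gain can be worked out from the four "real" values above. This is just arithmetic on the numbers in this mail; the helper names to_seconds and mean_seconds are made up for the sketch.

```python
def to_seconds(t: str) -> float:
    """Convert the shell's 'XmY.YYYs' time format into seconds."""
    minutes, rest = t.split("m")
    return int(minutes) * 60 + float(rest.rstrip("s"))


def mean_seconds(samples) -> float:
    """Average of several 'XmY.YYYs' samples, in seconds."""
    return sum(to_seconds(s) for s in samples) / len(samples)


# "real" times for 30 convert runs each, copied from the results above
without_patches = ["2m6.287s", "2m7.483s"]
with_patches = ["1m59.715s", "1m58.739s"]

base = mean_seconds(without_patches)
patched = mean_seconds(with_patches)
gain = (base - patched) / base * 100

print(f"{base:.3f}s -> {patched:.3f}s, {gain:.1f}% less wall-clock time")
```

With only two samples per configuration this is a rough estimate (about 6% here), in the same range as the 5% seen with qemu-io.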

I cannot say exactly where the difference comes from, but
the root of the problem is that a real IO operation
on the block device should be
  a) page aligned for the buffer
  b) page aligned for the offset
This is how the buffer cache works in the kernel. With
a non-aligned buffer in userspace, the kernel has to assemble
each kernel page for IO from 2 userspace pages instead of one.
I presume something is not optimal there. My guess is
that the user page can be sent to the controller immediately
if the buffer is aligned, so no additional memory
allocation is needed. I do not know this for certain, though.
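The "2 userspace pages instead of one" point can be illustrated with plain address arithmetic. This is only a sketch of the page math, assuming a 4096-byte page and made-up buffer addresses; it says nothing about what the kernel actually does internally.

```python
PAGE = 4096  # assumed page size


def pages_touched(addr: int, length: int, page: int = PAGE) -> int:
    """Number of distinct pages covered by [addr, addr + length)."""
    first = addr // page
    last = (addr + length - 1) // page
    return last - first + 1


def straddling_chunks(addr: int, total: int, page: int = PAGE) -> int:
    """How many page-sized chunks of a buffer cross a page boundary."""
    return sum(
        pages_touched(addr + off, page, page) > 1
        for off in range(0, total, page)
    )


aligned_buf = 0x10000      # hypothetical page-aligned address
unaligned_buf = 0x10200    # hypothetical address, aligned to 512 only

print(pages_touched(aligned_buf, PAGE))      # one page per 4096-byte chunk
print(pages_touched(unaligned_buf, PAGE))    # chunk spans two pages
print(straddling_chunks(aligned_buf, 1 << 20),
      straddling_chunks(unaligned_buf, 1 << 20))
```

For a 1 MiB buffer, none of the 256 chunks straddles a boundary in the aligned case, while all 256 do when the buffer starts 512 bytes into a page.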

Regards,
    Den

Please see the attached BLK traces, they describe everything!

Test command:
/home/den/src/qemu/qemu-img convert 1.img -t none -O raw 2.img

In general, the IO pattern of unpatched qemu-img looks like this:

  9,0   11        1     0.000000000 11151  Q  WS 312737792 + 1023 [qemu-img]
  9,0   11        2     0.000007938 11151  Q  WS 312738815 + 8 [qemu-img]
  9,0   11        3     0.000030735 11151  Q  WS 312738823 + 1016 [qemu-img]
  9,0   11        4     0.000032482 11151  Q  WS 312739839 + 8 [qemu-img]
  9,0   11        5     0.000041379 11151  Q  WS 312739847 + 1016 [qemu-img]
  9,0   11        6     0.000042818 11151  Q  WS 312740863 + 8 [qemu-img]
  9,0   11        7     0.000051236 11151  Q  WS 312740871 + 1017 [qemu-img]
  9,0    5        1     0.169071519 11151  Q  WS 312741888 + 1023 [qemu-img]
  9,0    5        2     0.169075331 11151  Q  WS 312742911 + 8 [qemu-img]
  9,0    5        3     0.169085244 11151  Q  WS 312742919 + 1016 [qemu-img]
  9,0    5        4     0.169086786 11151  Q  WS 312743935 + 8 [qemu-img]
  9,0    5        5     0.169095740 11151  Q  WS 312743943 + 1016 [qemu-img]


and patched one:

  9,0    6        1     0.000000000 12422  Q  WS 314834944 + 1024 [qemu-img]
  9,0    6        2     0.000038527 12422  Q  WS 314835968 + 1024 [qemu-img]
  9,0    6        3     0.000072849 12422  Q  WS 314836992 + 1024 [qemu-img]
  9,0    6        4     0.000106276 12422  Q  WS 314838016 + 1024 [qemu-img]
  9,0    2        1     0.171038202 12422  Q  WS 314839040 + 1024 [qemu-img]
  9,0    2        2     0.171073156 12422  Q  WS 314840064 + 1024 [qemu-img]


Thus the load on the disk is MUCH higher without the patch!
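The traces can also be summarized mechanically. Below is a small Python sketch that parses the "Q WS <sector> + <count>" part of the lines quoted above and counts requests, total sectors, and requests that are 4096-byte aligned in both start and length. The regular expression and the helper names parse and summarize are made up for this mail and only cover this one record type; they are not part of blktrace or blkparse.

```python
import re

SECTORS_PER_4K = 8  # 4096 bytes / 512-byte sectors

# Matches queued sync writes such as "Q WS 312737792 + 1023"
REQ = re.compile(r"\bQ\s+WS\s+(\d+)\s*\+\s*(\d+)")


def parse(trace: str):
    """Return (start_sector, sector_count) for every queued sync write."""
    return [(int(s), int(n)) for s, n in REQ.findall(trace)]


def summarize(reqs):
    """Count requests, total sectors and fully 4K-aligned requests."""
    aligned = sum(
        1 for s, n in reqs
        if s % SECTORS_PER_4K == 0 and n % SECTORS_PER_4K == 0
    )
    return {"requests": len(reqs),
            "sectors": sum(n for _, n in reqs),
            "aligned_4k": aligned}


# First burst of each trace, copied from the excerpts above
unpatched = """
9,0 11 1 0.000000000 11151 Q WS 312737792 + 1023 [qemu-img]
9,0 11 2 0.000007938 11151 Q WS 312738815 + 8 [qemu-img]
9,0 11 3 0.000030735 11151 Q WS 312738823 + 1016 [qemu-img]
9,0 11 4 0.000032482 11151 Q WS 312739839 + 8 [qemu-img]
9,0 11 5 0.000041379 11151 Q WS 312739847 + 1016 [qemu-img]
9,0 11 6 0.000042818 11151 Q WS 312740863 + 8 [qemu-img]
9,0 11 7 0.000051236 11151 Q WS 312740871 + 1017 [qemu-img]
"""
patched = """
9,0 6 1 0.000000000 12422 Q WS 314834944 + 1024 [qemu-img]
9,0 6 2 0.000038527 12422 Q WS 314835968 + 1024 [qemu-img]
9,0 6 3 0.000072849 12422 Q WS 314836992 + 1024 [qemu-img]
9,0 6 4 0.000106276 12422 Q WS 314838016 + 1024 [qemu-img]
"""

print("unpatched:", summarize(parse(unpatched)))
print("patched:  ", summarize(parse(patched)))
```

Both bursts cover the same 4096 sectors (2 MiB), but the unpatched run needs 7 requests, none of them 4K aligned, while the patched run needs 4 aligned requests of 1024 sectors each.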

Regards,
    Den

non-patched.blk
Description: Text document

patched.blk
Description: Text document
