qemu-devel



From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH 00/17] Support mismatched host and guest logical block sizes
Date: Wed, 14 Dec 2011 13:40:22 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:7.0.1) Gecko/20110930 Thunderbird/7.0.1

On 12/14/2011 01:05 PM, Kevin Wolf wrote:
> On 14.12.2011 12:47, Paolo Bonzini wrote:
>> On 12/14/2011 12:13 PM, Kevin Wolf wrote:
>>> As we discussed before, the really interesting point here is defaults,
>>> and whatever you choose to do is wrong in some respect.
>>>
>>> So it looks like you chose to make the virtual device default to the
>>> host block size.
>>
>> ... wait wait, I default to 512. :)
>>
>> Here is the rationale.  512-over-4k may be slow, but is safe (but it is
>> not slow if you align partitions properly).  4k-over-512 is unsafe.  So,
>> defaulting to 512 seemed the right thing after all.
>
> Which means bounce buffers by default on 4k hosts.

In practice it doesn't (quite surprisingly).  The patches do the following:

- if the initial and ending sectors are both aligned, submit directly to paio. Otherwise, bounce only the extra host sectors (up to 2) required to align the operation: the bulk of the request reuses the guest's data buffer.

- if the buffer is not 4k-aligned, paio will linearize the request with a bounce buffer. This will almost always happen if the initial sector is misaligned, but not if only the ending sector is misaligned.
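
To make the first point concrete, here is a rough sketch of how a guest request could be split into a bounced head, a direct body and a bounced tail. It is written in Python for brevity and is not the code from the patches: the function name `split_request`, the byte-offset interface and the 4096-byte host block size are all assumptions made for illustration.

```python
GUEST_BLOCK = 512    # guest logical block size in bytes
HOST_BLOCK = 4096    # host logical block size in bytes (assumed for this sketch)


def split_request(offset, length, host_block=HOST_BLOCK):
    """Split a guest request into (head, body, tail) byte ranges.

    head and tail are whole host blocks that would need a bounce buffer;
    body is the host-aligned middle part that could reuse the guest buffer.
    Each element is a (start, end) tuple, or None if that part is absent.
    Hypothetical helper, not a QEMU function.
    """
    end = offset + length
    first_aligned = -(-offset // host_block) * host_block  # round offset up
    last_aligned = end // host_block * host_block          # round end down

    if first_aligned > last_aligned:
        # The whole request lies strictly inside one host block.
        block_start = offset // host_block * host_block
        return (block_start, block_start + host_block), None, None

    head = (first_aligned - host_block, first_aligned) if offset != first_aligned else None
    tail = (last_aligned, last_aligned + host_block) if end != last_aligned else None
    body = (first_aligned, last_aligned) if last_aligned > first_aligned else None
    return head, body, tail


# Aligned at both ends: nothing is bounced.
print(split_request(0, 8192))
# Misaligned at both ends: two host blocks are bounced, the middle goes direct.
print(split_request(1 * GUEST_BLOCK, 8192))
```

In the second call only the first and last host blocks are bounced, which matches the "up to 2" extra host sectors mentioned above.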

If the partitions are aligned, the OS will always issue aligned requests, because file system blocks are already 4k. And kernel buffers will usually be page-aligned rather than block-aligned, so paio will also let the request through. You'll see perhaps half a dozen misaligned requests in the whole guest lifetime, for example to read the partition table. So for aligned partitions, the performance difference on iozone was well within statistical noise.
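
As a small illustration of why partition alignment decides this, the sketch below checks whether a file system block inside a partition lands on a host block boundary. The helper names are hypothetical, and the start sectors 63 and 2048 are only the usual examples of a legacy and a modern partition layout, not values taken from the patches.

```python
GUEST_BLOCK = 512    # guest logical block size in bytes
HOST_BLOCK = 4096    # host logical block size in bytes (assumed for this sketch)


def partition_is_aligned(start_lba, guest_block=GUEST_BLOCK, host_block=HOST_BLOCK):
    """True if the partition's first byte sits on a host block boundary."""
    return (start_lba * guest_block) % host_block == 0


def fs_block_is_aligned(start_lba, fs_block_index, fs_block_size=4096,
                        guest_block=GUEST_BLOCK, host_block=HOST_BLOCK):
    """True if the given file system block starts on a host block boundary."""
    offset = start_lba * guest_block + fs_block_index * fs_block_size
    return offset % host_block == 0


# Partition starting at sector 2048: every 4k file system block is aligned.
print(partition_is_aligned(2048), fs_block_is_aligned(2048, 10))
# Partition starting at sector 63: every 4k file system block is misaligned.
print(partition_is_aligned(63), fs_block_is_aligned(63, 10))
```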

> Is this going to
> become our next cache=writethrough? At some point 4k disks will be in
> wide use, but we'll still be stuck with a slow default of 512.

Unless you switch to EFI, the boot disk has to stay on 512-byte blocks anyway.

> No matter what we decide here, I think it might really be a good idea to
> save the block size in the image and use that as the default if nothing
> else is specified on the command line.

Yeah, that's sensible to do (though it can be a follow-up).

Paolo


