qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Geometry and blocksize support for backing devices


From: Ekaterina Tumanova
Subject: Re: [Qemu-devel] Geometry and blocksize support for backing devices
Date: Fri, 07 Nov 2014 16:39:34 +0300
User-agent: Mozilla/5.0 (X11; Linux i686; rv:31.0) Gecko/20100101 Thunderbird/31.2.0

On 11/07/2014 12:17 PM, Markus Armbruster wrote:
Christian Borntraeger <address@hidden> writes:

Markus, Kevin, Stefan,

here is a (somewhat late) followup of some KVM forum discussions regarding
block size and geometry of pass-through block devices. Let's just do a quick
wrap-up (as of my understanding) and a proposal at the end of the mail





- DASD/ECKD disk devices have several special properties, e.g. the geometry
   still has an influence on the partitions and each track can have a different
   format and the format of the disk actually follows what z/OS has as basic
   data structures. Linux does a low-level format of the disk to look like a
   block device (most common case is formatted with 4K blocks). There are
   still some smalls warts for z/OS compatibility (block0 has 28 bytes of data,
   block1 has 148 bytes and block2 has 84 bytes of data. everything else has
   4k. the dasd device driver will fake 4k blocks for all blocks by ignoring
   writes beyond the data for blocks 0-2 and filling the gaps with 0xe5).

- Since Linux uses DASDs as normal block device, we actually want to use
   virtio-blk to pass those to KVM guests. Due to the warts mentioned above,
   we have to have the proper block size and geometry in the guest. Otherwise
   things will fail in certain cases (linux partition detection for example)

- we have libvirt support to provide this information in the XML. This is far
   from user friendly, though, as the admin has to manually look up the
   properties of the host disks and then modify the guest definition 
accordingly.

- Kate came up with patches (based on initial patches from Einar Lueck) for
   auto-detection of geometry and block size for host block devices

- Stefan and Paolo had some concerns:
1. if the geometry etc is important, then make it part of the guest definition
2. what about migration and the target disk differs
3. is that issue system z specific or generic?

Regarding 1: this does work as of today, but it is pretty complicated for an
admin to do so

Regarding 2: System z system do not have built-in disks, they are always
accessed via fibre channel (either FICON protocol for DASDs or FCP protocol
for scsi). So its quite common to have shared access to the same disk from
different System z boxes. system z admins should be able to setup this
properly. Question is, is it ok to assume that and fail if not?

Regarding 3: No idea.

At KVM forum I talked to several people regarding a solution:

a) Stefan suggested to make the auto detection explicit, e.g.: provide a
"autodetect" tag for the secs, cylc, heads and logical_sector_size properties
This would require changes in qemu, but also in libvirt and its domain
configuration
b) Markus suggested that there are already some cases in QEMU, where we rely
on the admin to provide a proper setup on the target, e.g. an .iso
file as image.
If the target has a different content in the .iso file things will break.

Markus said, there are two classes of this.
a) problems that can be detected by QEMU. Here QEMU will abort migration

Example: device missing on target, target rejects migration when it
receives the device's state.

b) problems that cannot be detected by QEMU (e.g. different iso content). this
will trigger failures later on

a is preferred

Note that the geometry is currently in class (b).  Configuration
generally is.

A perpetual long-term goal of migration is embedding configuration in
the migration stream, to move it from (b) to (a).  Just frontend
configuration, because backend configuration is generally host-specific.

Geometry is a property of the device, thus frontend configuration.

For historical reasons, device geometry properties default to backend
values set with -drive cyls=... or -hdachs.  This is deprecated.

If the user doesn't specify geometry, QEMU makes one up, usually based
on device size.  In certain circumstances, it bases on a DOS partition
table instead.  Misfeature, in my opinion.  Partitioning a disk can give
you a different geometry on the next restart, including migration,
unless you specify the geometry explicitly.  Fortunately, most guests
don't care for geometry at all.  This is entirely undocumented, as far
as I can tell.

Now here comes my proposal:
Markus statement brought up an idea of special casing DASDs support. We can
call an ioctl BIODASDINFO on the block device that will only succeed if the host
disk is really a dasd. We could enable the auto detection for that case.

If BIODASDINFO succeeds, QEMU uses that instead of making up device
geometry as described above.  Correct?

Let's spell out when exactly BIODASDINFO succeeds, to avoid
misunderstandings.  It does when the backend is a DASD (/dev/dasd*).
What about a partition on a DASD?  A file in a filesystem on a DASD?

Auto-detecting geometry on DASDs adds an irregularity to the user
interface.  I guess DASDs are special enough to tolerate that.

In addition, QEMU will check if geometry and block size match during
migration, if
not, migration will fail. That would work with the following cases

(manual override == secs, cyls, headers, blocksize given by admin)

HOST A                          HOST B
dasd (auto)            ----->   dasd (auto)
dasd (auto)            ----->   image file (manual override)
image file (manual)    ----->   dasd (auto)
image file (manual)    ----->   image file (manual)
dasd (auto)            ----->   other host block device with manual override

if there are different dasds or different value migration will fail (and that is
what we want)

I guess I helped plant this idea.  Thinking about it again, I fear doing
it just for geometry could be shortsighted.  When we do it for all
device configuration later on, a geometry special case could get in the
way.

While you're certainly welcome to take on one of migration's long-term
projects, I don't want to make it a prerequisite for getting DASDs more
usable ;)

Without this sanity check, we gain another way to mess up geometry by
changing the default geometry (see "DOS partition table" above for the
existing way).  We reuse the same answer: don't do that then.

Good enough for Paolo, if I understand him correctly.  I guess it's good
enough for me too, but please document it.  Covering the whole default
geometry machinery would be nice.

In addition we could also implement Stefans proposal to add a
"autodetect" statement
for secs, cyls, head.... but I am not sure about libvirt support, though

So 3 parts:
1. autodetect on real DASDs
2. geometry and sector size checking in generic code
3. maybe an autodetect flag

makes sense? Any guidance how to proceed?

Try posting a patch just for 1., and see if anyone screams :)


Thanks a lot for the advise!
A small follow-up question...

Since the whole thing becomes arch-specific again (or even DASD-specific), I suppose I no longer need to change the driver code.

In my most recent patchset geometry was detected by HDIO_GETGEO inside
hw/block/hd-geometry.c (via arch-specific hook)
But... for the blocksize the control was passed to the driver functions (probe_logical_blocksize, probe_physical_blocksize) which called appropriate ioctls (BLKPBSZGET, BLKSSZGET) for "raw" and "host devices"

I suppose, that now there's no need for that anymore I can move the block ioctl calls back to hw/block/block.c and remove the introduced driver functions.

What do you think?

Kate.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]