qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v5 0/5] Geometry and blocksize detection for bac


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH v5 0/5] Geometry and blocksize detection for backing devices.
Date: Tue, 13 Jan 2015 16:04:03 +0000
User-agent: Mutt/1.5.23 (2014-03-12)

On Tue, Jan 13, 2015 at 11:51:51AM +0100, Markus Armbruster wrote:
> Christian Borntraeger <address@hidden> writes:
> 
> > Am 02.01.2015 um 13:57 schrieb Stefan Hajnoczi:
> >> On Thu, Dec 18, 2014 at 04:59:50PM +0100, Christian Borntraeger wrote:
> >>> Are you ok with the patches? If yes, can you take care of these
> >>> patches in the block tree?
> >> 
> >> This series looks close, I've left comments on the patches.
> >
> > OK, so you would take care of the series in your tree if the next spin
> > addresses your comments, right?
> >> 
> >> The series is fine for command-line QEMU users where probing makes the
> >> command-line more convenient, so we can merge it.  But the approach is
> >> fundamentally wrong for stacks where libvirt is in use.
> >> Libvirt is unaware of the guest geometry and block sizes that are probed
> >> in QEMU by this patch series.  This breaks non-shared storage migration
> >> and also means libvirt-based tools that manipulate drives on a guest may
> >> inadvertently change the guest-visible geometry and cause disk problems.
> >> 
> >> For example, what happens when you copy the disk image off a host DASD
> >> and onto NFS?  QEMU no longer probes the geometry and the disk geometry
> >> has changed.
> >> 
> >> The right place to tackle guest-visible geometry is in libvirt, not in
> >> QEMU, because it is guest state the needs to be captured in domain XML
> >> so that migration and tooling can preserve it when manipulating guests.
> >> 
> >> Stefan
> >> 
> >
> > I agree that this is not perfect and has obvious holes, but  it works 
> > just fine with libvirt stacks that are typical on s390 (*). scsi disks
> > (emulated and real) are not affected and work just fine. DASD disks are
> > special anyway - please consider this as some kind of real HW pass-through
> > even when we use virtio-blk. We can assume that admins can provide
> > shared access between source and target. If you look at the real HW (LPAR
> > and z/VM) DASDs are always attached via fibre channel (FICON protocol)
> > (often with SAN switches) so shared access is quite common even over longer
> > distances.
> 
> I wouldn't quite call it pass-through, but I agree that DASDs are
> special.
> 
> > And yes, using NFS or image file will break unless the user specifies the
> > geometry in libvirt. Setting these values is possible as of today in 
> > libvirts
> > XML. But what programmatic way should libvirt use to detect that 
> > information 
> > itself? On an image file libvirt doesnt know either. This would be somewhat
> > possible on an upper level like open stack and that upper level would then
> > need to talk with the DS8000 storage subsystems and the system z hardware 
> > but
> > even then it would fail when creating new DASD like images.  (This would 
> > boil
> > down to provide an interface like "create an image file that looks like as
> > DASD of type 3390/xx formatted with dasdfmt and the following parameters).
> 
> Disk geometry is really a device model property.  For historical
> reasons, we conjure up default geometries in the block layer.  For
> convenience, we special-case DASDs (this series).
> 
> Very few guests care about device geometry.
> 
> If your guest does, relying on default geometry has always exposed you
> to the risk of an unwanted geometry change.  So far largely theoretical,
> as the default geometry guessing algorithm has been unlikely to change
> incompatibily.  To eliminate the risk, specify the geometry explicitly.
> With libvirt, that means putting it into domain XML.
> 
> This series changes *does* change the guessing algorithm, but for DASDs
> only.  Thus, the risk becomes real, but for users of DASDs only.  Since
> said users of DASDs ask for it...  let them have rope, I say.
> 
> Libvirt could default the geometry to whatever QEMU comes up with, then
> fix it forever in domain XML.  Doubtful whether it's worth the bother,
> as very few guests care, and their users are probably (painfully) aware
> of geometry pitfalls anyway.
> 
> > If we talk about cloud/openstack etc we do not have pass through devices 
> > anyway
> > and only image files. If somebody asks me, I would create all image files 
> > as 
> > SCSI images anyway, all the DASD stuff makes sense if you have DASD 
> > hardware 
> > (or if you really know what you are going to do).
> >
> > What is your opinion on 
> > a) migrating disk properties like geometry
> > b) comparing the detected disk properties and reject migration if they
> >    dont match?
> > Both changes  should provide a safety net and could be added at a later
> > point in time.
> 
> If I remember correctly I suggested to investigate in that direction,
> but then we concluded it wasn't worth the bother.

Migrating disk geometry violates how live migration works:

1. Run-time device state like device registers and memory contents are
   live-migrated by QEMU.

2. Machine configuration like the list of emulated devices and
   read-only properties on those devices are not live migrated by QEMU.

Disk geometry is a property of the emulated storage device, so it is not
live migrated.

Why is migrating disk geometry incorrect?  Because if you live migrate
it in QEMU there is no way to persist it on the destination host.  You
will be unable to cold boot the guest on the destination because you
lose the machine configuration when the guest shuts down.

The machine configuration is persisted by the management tool or
manually by the user.  That is where the disk geometry needs to go
during migration.


I'm really starting to get worried that you are going to break things.
This DASD hack is a layering violation but okay, go ahead if you want.
But now you are also thinking about breaking live migration.

The proper thing to do is to introduce libvirt XML syntax for DASD.
That way the geometry can be handled as part of the machine
configuration.  Then live migration and storage management tools can do
the right thing.

I've said this should be done in libvirt repeatedly but you keep wanting
to hack QEMU instead of doing this cleanly :(.

If you have plans to expand on this hack, please scrap this series and
introduce libvirt XML syntax instead.

Does what I'm saying make sense?

Stefan

Attachment: pgpGjyQajsboZ.pgp
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]