Re: [Qemu-devel] [RFC PATCH 1/5] qdev: Create qdev_get_dev

Re: [Qemu-devel] [RFC PATCH 1/5] qdev: Create qdev_get_dev_path()

From:	Alex Williamson
Subject:	Re: [Qemu-devel] [RFC PATCH 1/5] qdev: Create qdev_get_dev_path()
Date:	Mon, 14 Jun 2010 19:14:01 -0600

On Mon, 2010-06-14 at 23:46 +0100, Paul Brook wrote:
> > > > Ok, I can get it down to something like:
> > > > 
> > > > /i440FX-pcihost/pci.0/virtio-blk-pci,09.0
> > > > 
> > > > The addr on the device is initially a little non-intuitive to me since
> > > > it's a property of the bus, but I guess it makes sense if we think of
> > > > that level as slot, which includes an address and driver.
> > > 
> > > That indicates you're thinking about things the wrong way.
> > > The point of this path is to uniquely identify an entity.
> > > 
> > > /i440FX-pcihost/pci0 identifies a PCI bus attached to the i440FX-pcihost
> > > device. To identify a device attached to that bus all you need to know is
> > > the devfn of the device.
> > 
> > Hmm, I think that identifies where the device is, not what the device
> > is.  It's more helpful to say "the e1000 in slot 7" than it is to say
> > "the device in slot 7".
> 
> Why is this more useful? Canonical addresses should not be helpful. They 
> should identify entities within a machine that is already known to be 
> consistent. Making them "helpful" just makes them more volatile.

Being able to check that device 09.0 is attached to the e1000 driver on
source and the rtl8139 driver on the target seems pretty useful to me.
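
A minimal sketch of that check, not QEMU code: the struct and function
names below are hypothetical, and the path format is assumed from the
examples quoted in this thread. With the driver name in the path, an
e1000 on the source and an rtl8139 on the target in the same slot
compare as different; with only the devfn they compare as equal.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one PCI device; not QEMU's DeviceState. */
struct dev_desc {
    const char *bus_path;  /* e.g. "/i440FX-pcihost/pci.0" */
    const char *driver;    /* e.g. "e1000" */
    unsigned slot;         /* PCI device number */
    unsigned fn;           /* PCI function number */
};

/* Format "<bus>/<driver>,<slot>.<fn>"; leave the driver out when
 * with_driver is 0, giving the "/,09.0" style path. */
static int format_dev_path(const struct dev_desc *d, int with_driver,
                           char *buf, size_t len)
{
    assert(d != NULL && buf != NULL && len > 0);
    return snprintf(buf, len, "%s/%s,%02x.%x", d->bus_path,
                    with_driver ? d->driver : "", d->slot, d->fn);
}

/* Return 1 when source and target produce the same path string. */
static int paths_match(const struct dev_desc *src,
                       const struct dev_desc *dst, int with_driver)
{
    char a[128], b[128];

    format_dev_path(src, with_driver, a, sizeof(a));
    format_dev_path(dst, with_driver, b, sizeof(b));
    return strcmp(a, b) == 0;
}
```

Calling paths_match() with with_driver set to 1 catches the driver
mismatch before any state is loaded; with with_driver set to 0 the two
devices are indistinguishable by path alone.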
 
> > > For an end-user it may be helpful to allow devices to be identified by
> > > the device type (assuming only one device of a particular type on that
> > > bus), or specify the device type as a crude error checking mechanism.
> > > However we're talking about canonical addresses. These need not include
> > > the device type. Verifying that the device is actually what you expect
> > > is a separate problem, and the device type is not sufficient for that.
> > > 
> > > i.e. /i440FX-pcihost/pci.0/,09.0 Is an appropriate canonical address.
> > 
> > We seem to keep introducing new problems, and I'm not sure this one
> > exists.  If I drop the device name/type and use only the devfn, then
> > what's to prevent the /e1000,09.0/rom (/,09.0/rom) from being stuffed
> > into the /rtl8139,09.0/rom (/,09.0/rom) on a migration?  (or we match it
> > to /09.0/savevm when trying to migrate state)  We can argue that "e1000"
> > isn't a sufficient identifier, but I can't think of a case where it'd
> > fail.
> 
> The migration code needs to check that the devices are actually compatible. 
> I'd expect this to require much more than just the device name. What you 
> actually need is more like "An e1000 with 64k eeprom, fast ethernet PHY, and 
> frobnitz B". 

No, that's savevm's problem, and it's perfectly capable of doing it via
the version ids.  We're not using the driver name to describe just any
random e1000, it's the one that the e1000 driver created, version foo.
If it added a frobnitz in version foo+1, either that savevm knows how to
import a version foo or rejects it.

> In fact what you really want to do is transfer the device tree 
> (including properties), and create the machine from scratch, not load state 
> into a pre-supplied device tree.

Well, I agree, but that's a lot more of an overhaul, and once again
we're changing the problem.

> > > > > > I started down that path, but it still breaks for hotplug.  If we
> > > > > > start a VM with two e1000 NICs, then remove the first, we can no
> > > > > > longer migrate because the target can't represent having a single
> > > > > > e1000 with a non-zero instance ID.
> > > > > 
> > > > > That's indeed a good point.
> > > 
> > > That's a feature. If you start with two NICs and remove the first, the
> > > chances are that the second will be in a different place to the NIC
> > > created in a single-nic config. Allowing migration between these two
> > > will cause a PCI device to suddenly change location. This is not
> > > physically or logically possible, and everyone loses.
> > 
> > If the BAR addresses don't follow the VM when it's migrated, that's
> > another bug that needs to be fixed, but we can't get there until we at
> > least have some infrastructure in place to make that bug possible.
> 
> Not BAR addresses, the actual PCI device addresses. Devices on the PCI bus 
> are addressed by device and function.  This is guest visible.  The device 
> part of this address corresponds to the physical slot, which typically 
> affects IRQ routing (amongst other things).  If you arbitrarily move a 
> device from slot A to slot B then this will have catastrophic effects on a 
> running machine.

Sorry, I jumped to BARs because the PCI device address mismatch is kinda
the point of where I'm going.  With these changes, we're at least
allowing that a smart enough management tool is actually able to create
a VM state to match a source instance that has done hotplug operations.
As it is today, I don't think it's possible to migrate a VM that has
gaps in its savevm instance ids.
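
To illustrate the instance id gap (a sketch under assumptions; the
table layout and lookup function are hypothetical, not the actual
savevm registration code): the source started with two e1000 NICs and
unplugged the first, so its remaining NIC carries instance id 1, while
a target started with one e1000 only has instance id 0.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical savevm registration entry: section name + instance id. */
struct section_entry {
    const char *idstr;
    int instance_id;
};

/* Find the target entry for an incoming (idstr, instance_id) pair.
 * Returns NULL when the target has no matching registration. */
static const struct section_entry *
find_section(const struct section_entry *tbl, size_t n,
             const char *idstr, int instance_id)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tbl[i].idstr, idstr) == 0 &&
            tbl[i].instance_id == instance_id) {
            return &tbl[i];
        }
    }
    return NULL;
}
```

Looking up ("e1000", 1) in a target table holding only ("e1000", 0)
returns NULL, so the incoming section has nowhere to go. If the idstr
were a device path instead, both sides could register the same key
regardless of the order in which devices were plugged or unplugged.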

Alex
