From: Jens Freimann
Subject: Re: [Qemu-devel] [PATCH 0/4] add failover feature for assigned network devices
Date: Wed, 12 Jun 2019 13:59:01 +0200
User-agent: NeoMutt/20180716-1376-5d6ed1

On Wed, Jun 12, 2019 at 11:11:23AM +0200, Daniel P. Berrangé wrote:
> On Tue, Jun 11, 2019 at 11:42:54AM -0400, Laine Stump wrote:
> > On 5/17/19 8:58 AM, Jens Freimann wrote:
> > > Command line example:
> > >
> > > qemu-system-x86_64 -enable-kvm -m 3072 -smp 3 \
> > >   -machine q35,kernel-irqchip=split -cpu host \
> > >   -k fr \
> > >   -serial stdio \
> > >   -net none \
> > >   -qmp unix:/tmp/qmp.socket,server,nowait \
> > >   -monitor telnet:127.0.0.1:5555,server,nowait \
> > >   -device pcie-root-port,id=root0,multifunction=on,chassis=0,addr=0xa \
> > >   -device pcie-root-port,id=root1,bus=pcie.0,chassis=1 \
> > >   -device pcie-root-port,id=root2,bus=pcie.0,chassis=2 \
> > >   -netdev tap,script=/root/bin/bridge.sh,downscript=no,id=hostnet1,vhost=on \
> > >   -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:6f:55:cc,bus=root2,failover=on \
> > >   /root/rhel-guest-image-8.0-1781.x86_64.qcow2
> > >
> > > Then the primary device can be hotplugged via
> > > (qemu) device_add vfio-pci,host=5e:00.2,id=hostdev0,bus=root1,standby=net1
> >
> > I guess this is the commandline on the migration destination, and as far as I understand from this example, on the destination we (meaning libvirt or higher level management application) must *not* include the assigned device on the qemu commandline, but must instead hotplug the device later after the guest CPUs have been restarted on the destination.
> >
> > So if I'm understanding correctly, the idea is that on the migration source, the device may have been hotplugged, or may have been included when qemu was originally started. Then qemu automatically handles the unplug of the device on the source, but it seems qemu does nothing on the destination, leaving that up to libvirt or a higher layer to implement.
> >
> > Then in order for this to work, libvirt (or OpenStack or oVirt or whoever) needs to understand that the device in the libvirt config (it will still be in the libvirt config, since from libvirt's POV it hasn't been unplugged):
> >
> > 1) shouldn't be included in the qemu commandline on the destination,
>
> I don't believe that's the case.
>
> The CLI args above are just illustrating that it is now possible to *optionally* not specify the VFIO device on the CLI. This is because previous versions of the patchset *always* required the device on the CLI due to a circular dependency in the CLI syntax. This patch series version fixed that limitation, so now the VFIO device can be cold plugged or hotplugged as desired.

I've mostly tested hotplugging, but cold plugging should work as well.
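
For illustration, the hotplug from the monitor example can also be issued over the QMP socket that the command line above opens at /tmp/qmp.socket. This is only a minimal sketch: the helper names are made up for this example, the `standby` property is the one used by this version of the series and may change, and the reply handling assumes QEMU sends one JSON object per line.

```python
import json
import socket


def build_device_add(host_bdf, dev_id, bus, standby_id):
    """Build the QMP form of the 'device_add vfio-pci,...' monitor command."""
    return {
        "execute": "device_add",
        "arguments": {
            "driver": "vfio-pci",
            "host": host_bdf,
            "id": dev_id,
            "bus": bus,
            # Property name as used in this patch series (assumption: it may
            # be renamed in later versions).
            "standby": standby_id,
        },
    }


def _read_reply(stream):
    """Read lines until a command reply arrives, skipping async events."""
    while True:
        msg = json.loads(stream.readline())
        if "return" in msg or "error" in msg:
            return msg


def send_qmp(sock_path, command):
    """Connect to a QMP UNIX socket, negotiate capabilities, run one command."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        stream = sock.makefile("rw")
        json.loads(stream.readline())  # QMP greeting banner
        stream.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
        stream.flush()
        _read_reply(stream)
        stream.write(json.dumps(command) + "\n")
        stream.flush()
        return _read_reply(stream)


cmd = build_device_add("5e:00.2", "hostdev0", "root1", "net1")
# With a running guest: send_qmp("/tmp/qmp.socket", cmd)
```

The same message with the `standby` argument left out would be an ordinary VFIO hotplug, so the pairing with the virtio-net device is carried entirely by that one property.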
> > 2) will almost surely need to be replaced with a different device on the destination (since it's almost certain that the destination won't have an available device at the same PCI address)
>
> Yes, the management application that triggers the migration will need to pass in a new XML document to libvirt when starting the migration so that we use the suitable new device on the target host.

Yes, that's how I expected it to work. In my tests the PCI address was the same on the destination and source hosts, but that was more by accident. I think the libvirt XML on the destination just needs to have the PCI address of a NIC of the same type for it to work.
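
To make the XML change concrete, here is a rough sketch of what the management application might do before starting the migration: take the domain XML and point the PCI hostdev at the address of the device available on the destination. The domain XML is trimmed to the hostdev element, and the function name and the destination address 0000:3b:00.4 are invented for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed-down domain XML with only the assigned PCI device (illustrative).
SOURCE_XML = """
<domain type='kvm'>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x5e' slot='0x00' function='0x2'/>
      </source>
    </hostdev>
  </devices>
</domain>
"""


def retarget_hostdev(domain_xml, bdf):
    """Return domain XML whose first PCI hostdev points at bdf ('dddd:bb:ss.f')."""
    domain, bus, rest = bdf.split(":")
    slot, function = rest.split(".")
    root = ET.fromstring(domain_xml.strip())
    addr = root.find("./devices/hostdev[@type='pci']/source/address")
    if addr is None:
        raise ValueError("no PCI hostdev found in domain XML")
    addr.set("domain", "0x" + domain)
    addr.set("bus", "0x" + bus)
    addr.set("slot", "0x" + slot)
    addr.set("function", "0x" + function)
    return ET.tostring(root, encoding="unicode")


# Hypothetical address of a NIC of the same type on the destination host.
dest_xml = retarget_hostdev(SOURCE_XML, "0000:3b:00.4")
```

Only the host-side source address changes here; everything the guest sees stays as it was in the source XML.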

> > 3) will probably need to be unbound from the VF net driver (does this need to happen before migration is finished? If we want to lower the probability of a failure after we're already committed to the migration, then I think we must, but libvirt isn't set up for that in any way).

Yes, so I think that's part of the 'partial' unplug I'm trying to figure out at the moment.

> > 4) will need to be hotplugged after the migration has finished *and* after the guest CPUs have been restarted on the destination.
>
> My understanding is that QEMU takes care of this.

So the re-plugging of the device on the destination is not in the v1 of the patches, which I failed to mention, my bad. I will send out a v2 that has this part as well shortly. I added a runstate change handler that is called on the destination when the run state changes from INMIGRATE to something else. When the new state is RUNNING I hotplug the primary device.
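
The handler logic described above can be modelled roughly as follows. This is a Python model of the decision only, not the QEMU C code from the series; the class, the state strings and the callback are hypothetical names chosen for the sketch.

```python
INMIGRATE = "inmigrate"
RUNNING = "running"
PAUSED = "paused"


class FailoverDestination:
    """Model of the destination side: replug the primary once the guest runs."""

    def __init__(self, hotplug_primary):
        self._hotplug_primary = hotplug_primary  # callback doing the hotplug
        self._state = INMIGRATE                  # destination starts here
        self.primary_plugged = False

    def on_runstate_change(self, new_state):
        prev, self._state = self._state, new_state
        # Act only when leaving INMIGRATE, and only if the guest is now running.
        if prev == INMIGRATE and new_state == RUNNING and not self.primary_plugged:
            self._hotplug_primary()
            self.primary_plugged = True


calls = []
dest = FailoverDestination(lambda: calls.append("device_add hostdev0"))
dest.on_runstate_change(RUNNING)  # migration finished, CPUs started
dest.on_runstate_change(PAUSED)   # later state changes must not replug
dest.on_runstate_change(RUNNING)
```

In this model a guest that leaves INMIGRATE into any state other than RUNNING never gets the primary device plugged, which is one of the cases a real implementation would have to decide about.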

> > a) there isn't anything in libvirt's XML grammar that allows us to signify a device that is "present in the config but shouldn't be included in the commandline"
>
> I don't think we need that.
>
> > b) someone will need to replace the device from the source with an equivalent device on the destination in the libvirt XML. There are other cases of management modifying the XML during migration (I think), but this does point out that putting the "auto-unplug code into qemu isn't turning this into a trivial
>
> The mgmt app should pass the new device details in the XML when starting migration. Shouldn't be a big deal as OpenStack already does that for quite a few other parts of the config.
>
> > c) there is nothing in libvirt's migration logic that can cause a device to be re-bound to vfio-pci prior to completion of a migration. Unless this is added to libvirt (or the re-bind operation is passed off to the management application), we will need to live with the possibility that hotplugging the device will fail due to failed re-bind *after* we've committed to the migration.
>
> IIUC, we should be binding to vfio-pci during the prepare phase of the migration, since that's when QEMU is started by libvirt on the target.
>
> > d) once the guest CPUs are restarted on the destination, [someone] (libvirt or management) needs to hotplug the new device on the destination. (I'm guessing that a hotplug can only be done while the guest CPUs are running; correct me if this is wrong!)
>
> I don't believe so, since we'll be able to cold plug it during prepare phase.

I think I don't understand what happens during the prepare phase on the destination. Need to look into that. But I think I had an error in my logic that I need to plug the device from QEMU on the destination side. You're saying we could just always cold plug it directly when the VM is started. I think an exception would be when the guest was migrated before we added the primary device on the source, so before virtio feature negotiation.

regards,
Jens