
From: Jens Freimann
Subject: Re: [Qemu-devel] [PATCH 0/4] add failover feature for assigned network devices
Date: Wed, 12 Jun 2019 13:59:01 +0200
User-agent: NeoMutt/20180716-1376-5d6ed1

On Wed, Jun 12, 2019 at 11:11:23AM +0200, Daniel P. Berrangé wrote:
> On Tue, Jun 11, 2019 at 11:42:54AM -0400, Laine Stump wrote:
> > On 5/17/19 8:58 AM, Jens Freimann wrote:
> > >
> > > Command line example:
> > >
> > > qemu-system-x86_64 -enable-kvm -m 3072 -smp 3 \
> > >          -machine q35,kernel-irqchip=split -cpu host   \
> > >          -k fr   \
> > >          -serial stdio   \
> > >          -net none \
> > >          -qmp unix:/tmp/qmp.socket,server,nowait \
> > >          -monitor telnet:127.0.0.1:5555,server,nowait \
> > >          -device pcie-root-port,id=root0,multifunction=on,chassis=0,addr=0xa \
> > >          -device pcie-root-port,id=root1,bus=pcie.0,chassis=1 \
> > >          -device pcie-root-port,id=root2,bus=pcie.0,chassis=2 \
> > >          -netdev tap,script=/root/bin/bridge.sh,downscript=no,id=hostnet1,vhost=on \
> > >          -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:6f:55:cc,bus=root2,failover=on \
> > >          /root/rhel-guest-image-8.0-1781.x86_64.qcow2
> > >
> > > Then the primary device can be hotplugged via
> > >   (qemu) device_add vfio-pci,host=5e:00.2,id=hostdev0,bus=root1,standby=net1
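As an editorial aside: the command line above also exposes a QMP socket (`-qmp unix:/tmp/qmp.socket`), so the same hotplug can be issued over QMP. The sketch below only builds the JSON payloads, assuming QMP `device_add` accepts the same properties as the HMP command shown above; actually sending them requires a running QEMU.

```python
import json

# Sketch: QMP messages corresponding to the HMP hotplug above, assuming
# QMP device_add takes the same properties as the HMP command. We only
# build the payloads here; sending them over /tmp/qmp.socket needs a
# running QEMU instance.

def qmp_device_add_primary():
    """Build the QMP command that hotplugs the primary (VFIO) device."""
    return json.dumps({
        "execute": "device_add",
        "arguments": {
            "driver": "vfio-pci",
            "host": "5e:00.2",     # host PCI address of the VF
            "id": "hostdev0",
            "bus": "root1",
            "standby": "net1",     # links the primary to the virtio-net standby
        },
    })

# A QMP session must begin with qmp_capabilities before other commands.
handshake = json.dumps({"execute": "qmp_capabilities"})
print(handshake)
print(qmp_device_add_primary())
```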


> > I guess this is the commandline on the migration destination, and as far as
> > I understand from this example, on the destination we (meaning libvirt or
> > higher level management application) must *not* include the assigned device
> > on the qemu commandline, but must instead hotplug the device later after the
> > guest CPUs have been restarted on the destination.
> >
> > So if I'm understanding correctly, the idea is that on the migration source,
> > the device may have been hotplugged, or may have been included when qemu was
> > originally started. Then qemu automatically handles the unplug of the device
> > on the source, but it seems qemu does nothing on the destination, leaving
> > that up to libvirt or a higher layer to implement.
> >
> > Then in order for this to work, libvirt (or OpenStack or oVirt or whoever)
> > needs to understand that the device in the libvirt config (it will still be
> > in the libvirt config, since from libvirt's POV it hasn't been unplugged):
> >
> > 1) shouldn't be included in the qemu commandline on the destination,

> I don't believe that's the case.  The CLI args above are just illustrating
> that it is now possible to *optionally* not specify the VFIO device on the
> CLI. This is because previous versions of the patchset *always* required
> the device on the CLI due to a circular dependency in the CLI syntax. This
> patch series version fixed that limitation, so now the VFIO device can be
> cold plugged or hotplugged as desired.

I've mostly tested hotplugging, but cold plug should work as well.

> > 2) will almost surely need to be replaced with a different device on the
> > destination (since it's almost certain that the destination won't have an
> > available device at the same PCI address)

> Yes, the management application that triggers the migration will need to
> pass in a new XML document to libvirt when starting the migration so that
> we use the suitable new device on the target host.

Yes, that's how I expected it to work. In my tests the PCI address was
the same on destination and source host, but that was more by accident. I
think the libvirt XML on the destination just needs to have the PCI
address of a NIC of the same type for it to work.
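To make the "new XML on the destination" point concrete, a hypothetical hostdev fragment is sketched below. The idea is that the host-side `<source>` address points at whatever VF is available on the target, while the guest-visible `<address>` can stay stable; all addresses here are illustrative, not taken from a real configuration.

```xml
<!-- Hypothetical destination-side libvirt hostdev entry. The <source>
     element names the VF on the *target* host (illustrative address);
     the trailing <address> is the guest-visible PCI slot, which can be
     kept identical to the source side. -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x5e' slot='0x00' function='0x2'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
</hostdev>
```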
> > 3) will probably need to be unbound from the VF net driver (does this need
> > to happen before migration is finished? If we want to lower the probability
> > of a failure after we're already committed to the migration, then I think we
> > must, but libvirt isn't set up for that in any way).

Yes, so I think that's part of the 'partial' unplug I'm trying to
figure out at the moment.
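For reference, one common sysfs sequence for moving a VF from its host net driver to vfio-pci is the `driver_override` mechanism. The dry-run sketch below only prints the writes it would perform (the PCI address is taken from the `host=5e:00.2` example, with an assumed `0000` domain); it is a sketch of the rebind step being discussed, not libvirt's actual implementation.

```python
# Dry-run sketch of rebinding a VF to vfio-pci via sysfs driver_override.
# Nothing is written; the steps are only printed so the sequence is visible.

PCI_ADDR = "0000:5e:00.2"  # from the example's host=5e:00.2, assumed domain 0000

def rebind_steps(pci_addr):
    """Return the (path, value) sysfs writes that move a device to vfio-pci."""
    dev = f"/sys/bus/pci/devices/{pci_addr}"
    return [
        (f"{dev}/driver_override", "vfio-pci"),    # pin the next probe to vfio-pci
        (f"{dev}/driver/unbind", pci_addr),        # detach from the current VF driver
        ("/sys/bus/pci/drivers_probe", pci_addr),  # re-probe -> binds to vfio-pci
    ]

for path, value in rebind_steps(PCI_ADDR):
    print(f"echo {value} > {path}")
```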
> > 4) will need to be hotplugged after the migration has finished *and* after
> > the guest CPUs have been restarted on the destination.

> My understanding is that QEMU takes care of this.

So the re-plugging of the device on the destination is not in the v1
of the patches, which I failed to mention, my bad. I will send out a v2
that has this part as well shortly. I added a runstate change handler
that is called on the destination when the run state changes from INMIGRATE
to something else. When the new state is RUNNING, I hotplug the primary device.

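The handler logic described here can be sketched as a small simulation. All names below are illustrative stand-ins, not QEMU's internal API: a callback fires on every runstate change, and the primary device is plugged only on the INMIGRATE-to-RUNNING transition.

```python
# Toy simulation of the described runstate-change handler: when the
# destination leaves INMIGRATE and the new state is RUNNING, hotplug the
# primary device. Names are illustrative, not QEMU's actual API.

INMIGRATE, PAUSED, RUNNING = "inmigrate", "paused", "running"

class DestVM:
    def __init__(self):
        self.state = INMIGRATE
        self.handlers = []
        self.primary_plugged = False

    def add_state_change_handler(self, cb):
        self.handlers.append(cb)

    def set_state(self, new_state):
        old, self.state = self.state, new_state
        for cb in self.handlers:
            cb(old, new_state)

def failover_handler(vm):
    def cb(old, new):
        # Act only on the transition out of INMIGRATE into RUNNING.
        if old == INMIGRATE and new == RUNNING and not vm.primary_plugged:
            vm.primary_plugged = True  # stands in for device_add of the VFIO device
    return cb

vm = DestVM()
vm.add_state_change_handler(failover_handler(vm))
vm.set_state(RUNNING)  # migration finished, guest CPUs restarted
```

One design point the sketch makes visible: if the destination went INMIGRATE to PAUSED to RUNNING, a handler keyed on the old state being INMIGRATE would not fire on the final transition, which is the kind of edge a real implementation has to consider.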
> > a) there isn't anything in libvirt's XML grammar that allows us to signify a
> > device that is "present in the config but shouldn't be included in the
> > commandline"

> I don't think we need that.

> > b) someone will need to replace the device from the source with an
> > equivalent device on the destination in the libvirt XML. There are other
> > cases of management modifying the XML during migration (I think), but this
> > does point out that putting the "auto-unplug" code into qemu isn't turning
> > this into a trivial task.

> The mgmt app should pass the new device details in the XML when starting
> migration. Shouldn't be a big deal as OpenStack already does that for
> quite a few other parts of the config.

> > c) there is nothing in libvirt's migration logic that can cause a device to
> > be re-bound to vfio-pci prior to completion of a migration. Unless this is
> > added to libvirt (or the re-bind operation is passed off to the management
> > application), we will need to live with the possibility that hotplugging the
> > device will fail due to a failed re-bind *after* we've committed to the
> > migration.

> IIUC, we should be binding to vfio-pci during the prepare phase of the
> migration, since that's when QEMU is started by libvirt on the target.

> > d) once the guest CPUs are restarted on the destination, [someone] (libvirt
> > or management) needs to hotplug the new device on the destination. (I'm
> > guessing that a hotplug can only be done while the guest CPUs are running;
> > correct me if this is wrong!)

> I don't believe so, since we'll be able to cold plug it during prepare
> phase.

I think I don't understand what happens during the prepare phase on
the destination; I need to look into that. But I think I had an error in
my logic, namely that I need to plug the device from QEMU on the
destination side. You're saying we could just always cold plug it directly
when the VM is started. I think an exception would be when the guest was
migrated before we added the primary device on the source, i.e. before
virtio feature negotiation.

regards,
Jens

