From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [RFC PATCH 0/2] implement the failover feature for assigned network devices
Date: Mon, 8 Apr 2019 18:00:00 +0100
User-agent: Mutt/1.11.4 (2019-03-13)

* Jens Freimann (address@hidden) wrote:
> On Mon, Apr 08, 2019 at 10:16:50AM +0100, Dr. David Alan Gilbert wrote:
> > * Michael S. Tsirkin (address@hidden) wrote:
> > > On Fri, Apr 05, 2019 at 09:56:29AM +0100, Dr. David Alan Gilbert wrote:
> > > > * Jens Freimann (address@hidden) wrote:
> > > > > On Fri, Mar 22, 2019 at 02:44:45PM +0100, Jens Freimann wrote:
> > > > > > This is another attempt at implementing the host side of the
> > > > > > net_failover concept
> > > > > > (https://www.kernel.org/doc/html/latest/networking/net_failover.html)
> > > > > >
> > > > > > > The general idea is that we have a pair of devices, a vfio-pci and an
> > > > > > > emulated device. Before migration the vfio device is unplugged and
> > > > > > > data flows to the emulated device; on the target side another vfio-pci
> > > > > > > device is plugged in to take over the data-path. In the guest the
> > > > > > > net_failover module will pair net devices with the same MAC address.
> > > > > >
> > > > > > > * In the first patch the infrastructure for hiding the device is
> > > > > > >  added for the qbus and qdev APIs. A "hidden" boolean is added to the
> > > > > > >  device state and it is set based on a callback to the standby device
> > > > > > >  which registers itself for handling the assessment: "should the
> > > > > > >  primary device be hidden?" by cross validating the ids of the devices.
> > > > > >
> > > > > > * In the second patch the virtio-net uses the API to hide the vfio
> > > > > >  device and unhides it when the feature is acked.
> > > > > >
> > > > > > Previous discussion: https://patchwork.ozlabs.org/cover/989098/
> > > > > >
> > > > > > To summarize concerns/feedback from previous discussion:
> > > > > > > 1.- guest OS can reject or worse _delay_ unplug by any amount of time.
> > > > > > >  Migration might get stuck for unpredictable time with unclear reason.
> > > > > > >  This approach combines two tricky things, hot/unplug and migration.
> > > > > > >  -> We can surprise-remove the PCI device and in QEMU we can do all
> > > > > > >     necessary rollbacks transparent to management software. Will it
> > > > > > >     be easy? Probably not.
> > > >
> > > > This sounds 'fun' - bonus cases are things like what happens if the
> > > > guest gets rebooted somewhere during the process or if it's currently
> > > > sitting in the bios/grub/etc
> > > 
> > > Um, during which process? Guests are gradually fixed to support
> > > surprise removal well. Part of it is thunderbolt which makes
> > > it incredibly easy. Yes - bios/grub will need to learn to
> > > handle this well.
> > 
> > Ignoring the actual mechanism of the unplug itself; there are probably
> > loads of cases; e.g.
> > 
> >      running with both cards
> >      hot unplug real card
> >      start migration
> >      guest reboots
> >        Kernel sees only the virtio card
> >      migration completes
> >      hotadd the real card back
> > 
> > so the guest has to know to pair the real card even though it booted
> > with only the virtio card.
> 
> Maybe I misunderstand, but, when the 'real card' is added back after
> migration the net_failover driver in the guest will know to pair it
> with the virtio card because they have the same MAC address. Did you
> mean something else?

OK if it knows to do that.
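
To make the case above concrete, here is a toy model of the pairing, as a
sketch only. It assumes the guest keys the pair purely on MAC address, as
Jens describes; GuestModel, FailoverPair, the device names and the MAC are
all hypothetical, and this is not the kernel's net_failover code. It walks
through the sequence: both cards, unplug the real card, reboot with only
the virtio card, then hot-add the real card back.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class FailoverPair:
    standby: Optional[str] = None  # emulated virtio-net device (hypothetical name)
    primary: Optional[str] = None  # passthrough vfio-pci device (hypothetical name)


@dataclass
class GuestModel:
    """Toy model of MAC-keyed pairing; an assumption, not net_failover itself."""
    pairs: Dict[str, FailoverPair] = field(default_factory=dict)

    def plug(self, name: str, mac: str, standby: bool) -> None:
        # Devices with the same MAC land in the same pair, in any order.
        pair = self.pairs.setdefault(mac.lower(), FailoverPair())
        if standby:
            pair.standby = name
        else:
            pair.primary = name

    def unplug(self, name: str) -> None:
        for pair in self.pairs.values():
            if pair.primary == name:
                pair.primary = None
            if pair.standby == name:
                pair.standby = None

    def reboot(self, present: Dict[str, Tuple[str, bool]]) -> None:
        # A reboot forgets all pairing state and re-probes what is plugged in.
        self.pairs.clear()
        for name, (mac, standby) in present.items():
            self.plug(name, mac, standby)

    def active_path(self, mac: str) -> Optional[str]:
        # Prefer the primary when it is there, otherwise fall back to standby.
        pair = self.pairs.get(mac.lower())
        if pair is None:
            return None
        return pair.primary or pair.standby


def run_scenario() -> List[Optional[str]]:
    mac = "52:54:00:12:34:56"
    guest = GuestModel()
    guest.plug("virtio0", mac, standby=True)
    guest.plug("vf0", mac, standby=False)
    trace = [guest.active_path(mac)]          # running with both cards
    guest.unplug("vf0")                       # hot unplug real card
    trace.append(guest.active_path(mac))
    guest.reboot({"virtio0": (mac, True)})    # guest reboots mid-migration
    trace.append(guest.active_path(mac))
    guest.plug("vf0", mac, standby=False)     # hotadd the real card back
    trace.append(guest.active_path(mac))
    return trace
```

In this model the reboot is harmless because nothing but the MAC is needed
to re-form the pair; whether a real guest behaves the same way is exactly
the thing to check.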

> > I'm sure there are loads of other corners.
> 
> Probably yes.

Yeah, that was just my worry - there are loads of corners of this type
around reboots.

Dave

> regards,
> Jens
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


