qemu-devel

Re: [Qemu-devel] [PATCH 3/4] net/virtio: add failover support


From: Eduardo Habkost
Subject: Re: [Qemu-devel] [PATCH 3/4] net/virtio: add failover support
Date: Mon, 3 Jun 2019 16:36:48 -0300
User-agent: Mutt/1.11.3 (2019-02-01)

On Mon, Jun 03, 2019 at 10:24:56AM +0200, Jens Freimann wrote:
> On Fri, May 31, 2019 at 06:47:48PM -0300, Eduardo Habkost wrote:
> > On Thu, May 30, 2019 at 04:56:45PM +0200, Jens Freimann wrote:
> > > On Tue, May 28, 2019 at 11:04:15AM -0400, Michael S. Tsirkin wrote:
> > > > On Tue, May 21, 2019 at 10:45:05AM +0100, Dr. David Alan Gilbert wrote:
> > > > > * Jens Freimann (address@hidden) wrote:
> > [...]
> > > > > > +    }
> > > > > > > +    if (migration_in_setup(s) && !should_be_hidden && n->primary_dev) {
> > > > > > +        qdev_unplug(n->primary_dev, &err);
> > > > >
> > > > > > Not knowing unplug well; can you just explain - is that device hard
> > > > > > unplugged and it's gone by the time this function returns or is it
> > > > > > still hanging around for some indeterminate time?
> > > 
> > > QEMU will trigger an unplug request via the PCIe attention button, in
> > > which case there could be a delay by the guest operating system. We
> > > could give it some amount of time and, if nothing happens, try surprise
> > > removal or handle the error otherwise.
> > 
> > I'm missing something here:
> > 
> > Isn't the whole point of the new device-hiding infrastructure to
> > prevent QEMU from closing the VFIO device until migration has ended
> > successfully?
> 
> No. The point of hiding it is to add the VFIO device (which is configured
> with the same MAC as the virtio-net device) only once the
> VIRTIO_NET_F_STANDBY feature is negotiated. We don't want to expose two
> devices with the same MAC to guests that can't handle it.
> 
> > What exactly is preventing QEMU from closing the host VFIO device
> > after the guest OS has handled the unplug request?
> 
> We qdev_unplug() the VFIO device and want the virtio-net standby device to
> take over. If something goes wrong with unplug or
> migration in general we have to qdev_plug() the device back.
> 
> This series does not try to implement new functionality to close a
> device without freeing the resources.
> 
> From the discussion in this thread I understand that is what libvirt
> needs though. Something that will trigger the unplug from the
> guest but not free the device's resources in the host system (which is
> what qdev_unplug() does). Correct?

This is what I understand we need, but this is not what
qdev_unplug() does.

> 
> Why is it bad to fully re-create the device in case of a failed migration?

Bad or not, I thought the whole point of doing it inside QEMU was
to do something libvirt wouldn't be able to do (namely,
unplugging the device while not freeing resources).  If we are
doing something that management software is already capable of
doing, what's the point?
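
To make the distinction concrete, here is a minimal toy model in C. It is
only a sketch under stated assumptions: ToyDev, toy_full_unplug(),
toy_hide_from_guest(), toy_replug() and host_resource_available are
hypothetical names invented for illustration, not QEMU or libvirt APIs, and
the real unplug path is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical toy model; none of these names exist in QEMU. */
typedef struct {
    bool guest_visible;  /* the guest currently sees the device */
    bool host_fd_open;   /* the host resource (e.g. a VFIO fd) is still held */
} ToyDev;

/* Assumption: stands in for "can the host resource be acquired again?" */
bool host_resource_available = true;

/* Full unplug: the guest loses the device AND host resources are freed. */
void toy_full_unplug(ToyDev *d)
{
    assert(d != NULL);
    d->guest_visible = false;
    d->host_fd_open = false;
}

/* Guest-only unplug: the guest loses the device, host resources are kept. */
void toy_hide_from_guest(ToyDev *d)
{
    assert(d != NULL);
    d->guest_visible = false;
}

/* Re-add after a failed migration; fails if the resource is gone for good. */
bool toy_replug(ToyDev *d)
{
    assert(d != NULL);
    if (!d->host_fd_open) {
        if (!host_resource_available) {
            return false;        /* freed earlier and cannot be re-acquired */
        }
        d->host_fd_open = true;  /* re-acquire the host resource */
    }
    d->guest_visible = true;
    return true;
}

/* Runs both strategies through the same "migration failed" scenario. */
void toy_demo(void)
{
    ToyDev full = { true, true };
    ToyDev hidden = { true, true };

    toy_full_unplug(&full);
    toy_hide_from_guest(&hidden);

    /* Simulate the host resource being taken while migration was running. */
    host_resource_available = false;

    printf("full unplug, re-add: %s\n", toy_replug(&full) ? "ok" : "failed");
    printf("hidden only, re-add: %s\n", toy_replug(&hidden) ? "ok" : "failed");
}
```

In this model, re-adding after a full unplug can fail once the host
resource is no longer available, while the guest-only variant can always be
reverted because nothing was released in the first place.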

Quoting a previous message from this thread:

On Thu, May 30, 2019 at 02:09:42PM -0400, Michael S. Tsirkin wrote:
| > On Thu, May 30, 2019 at 07:00:23PM +0100, Dr. David Alan Gilbert wrote:
| > >  This patch series is very
| > > odd precisely because it's trying to do the unplug itself in the
| > > migration phase rather than let the management layer do it - so unless
| > > it's nailed down how to make sure that's really really bullet proof
| > > then we've got to go back and ask the question about whether we should
| > > really fix it so it can be done by the management layer.
| > > 
| > > Dave
| > 
| > management already said they can't because files get closed and
| > resources freed on unplug and so they might not be able to re-add device
| > on migration failure. We do it in migration because that is
| > where failures can happen and we can recover.


-- 
Eduardo


