qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC PATCH 0/5] hotplug: fix premature rebinding of VFI


From: Alex Williamson
Subject: Re: [Qemu-devel] [RFC PATCH 0/5] hotplug: fix premature rebinding of VFIO devices to host
Date: Thu, 29 Jun 2017 13:44:18 -0600

On Thu, 29 Jun 2017 14:50:15 -0400
Laine Stump <address@hidden> wrote:

> On 06/28/2017 08:24 PM, Michael Roth wrote:
> > Hi everyone. Hoping to get some feedback on this approach, or some
> > alternatives proposed below, to the following issue:
> > 
> > Currently libvirt immediately attempts to rebind a managed device back to 
> > the
> > host driver when it receives a DEVICE_DELETED event from QEMU. This is
> > problematic for 2 reasons:
> > 
> > 1) If multiple devices from a group are attached to a guest, this can move
> >    the group into a "non-viable" state where some devices are assigned to
> >    the host and some to the guest.  
> 
> Since we don't support hotplug with managed='yes' of individual (or
> multiple) functions of a multifunction host device, I don't know that
> it's very useful to support hot *un*plug of it - it would only be useful
> if the multi-function device were present in the guest when it was
> started, and then was hot-unplugged later. And this is all a lot of
> extra complexity, though, so it would be useful to know what are the
> scenarios where it would actually be used (i.e. is this a legitimate
> need, or just an interesting exercise?)

This doesn't make sense to me, since when do we not support hotplug
with managed='yes' and how is it prevented?  Also, let's just not talk
about multifunction, a multifunction device does not imply a shared
group, nor does a shared group imply multifunction.  So is it hotplug
of a device which is in a shared group that is not supported, and if so
how?  I think libvirt tries to do the hot-add, but it hits the
non-viable group when it gives it to QEMU.  On hot-remove, I'm pretty
sure libvirt just lets the host crash into the ground by re-binding the
device to the host driver.  If we don't want to support it, that's one
thing, but the current model is more just neglectful than unsupported.


> > 2) When QEMU emits the DEVICE_DELETED event, there's still a "finalize" 
> > phase
> >    where additional cleanup occurs. In most cases libvirt can ignore this
> >    cleanup, but in the case of VFIO devices this is where closing of a VFIO
> >    group FD occurs, and failing to wait before rebinding the device to the
> >    host driver can result in unexpected behavior. In the case of powernv
> >    hosts at least, this can lead to a host driver crashing due to the 
> > default
> >    DMA windows not having been fully-restored yet. The window between this 
> > is
> >    and the initial DEVICE_DELETED seems to be ~6 seconds in practice. We've
> >    seen host dumps with Mellanox CX4 VFs being rebound to host driver during
> >    this period (on powernv hosts).  
> 
> I agree with Dan that the situation described here should be considered
> a qemu bug - according to my understanding (from back at the time
> DEVICE_DELETED was added to qemu (I think at libvirt's request) qemu
> should never emit the DEVICE_DELETED event until *everything* related to
> the device is finished - that was the whole point of adding the event in
> the first palce. Covering up this bug with a bunch of extra libvirt
> complexity is just creating the potential for even more bugs in the more
> complex code.

Agree, but ISTR not everyone thinks that way.  I don't remember the
opposing viewpoint though.  Thanks,

Alex



reply via email to

[Prev in Thread] Current Thread [Next in Thread]