qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 2/5] acpi_piix4: Fix PCI hotplug race


From: Alex Williamson
Subject: Re: [Qemu-devel] [PATCH v2 2/5] acpi_piix4: Fix PCI hotplug race
Date: Tue, 10 Apr 2012 12:38:09 -0600

On Tue, 2012-04-10 at 18:45 +0300, Michael S. Tsirkin wrote:
> On Tue, Apr 10, 2012 at 09:35:39AM -0600, Alex Williamson wrote:
> > On Tue, 2012-04-10 at 18:19 +0300, Michael S. Tsirkin wrote:
> > > On Tue, Apr 10, 2012 at 09:14:00AM -0600, Alex Williamson wrote:
> > > > On Sun, 2012-04-08 at 01:57 -0300, Marcelo Tosatti wrote:
> > > > > On Thu, Apr 05, 2012 at 11:07:15AM -0600, Alex Williamson wrote:
> > > > > > As Michael Tsirkin demonstrated, current PCI hotplug is vulnerable
> > > > > > to a few races.  The first is a race with other hotplug operations
> > > > > > because we clear the up & down registers at each event.  If a new
> > > > > > event comes before the last is processed, up/down is cleared and
> > > > > > the event is lost.
> > > > > > 
> > > > > > To fix this for the down register, we create a life cycle for
> > > > > > the event request that starts with the hot unplug request in
> > > > > > piix4_device_hotplug() and ends when the device is ejected.
> > > > > > This allows us to mask and clear individual bits, preserving them
> > > > > > against races.  For the up register, we have no clear end point
> > > > > > for when the event is finished.  We could modify the BIOS to
> > > > > > acknowledge the bit and clear it, but this creates BIOS compatibiliy
> > > > > > issues without offering a complete solution.  Instead we note that
> > > > > > gratuitous ACPI device checks are not harmful, which allows us to
> > > > > > issue a device check for every slot.  We know which slots are 
> > > > > > present
> > > > > > and we know which slots are hotpluggable, so we can easily reduce
> > > > > > this to a more manageable set for the guest.
> > > > > > 
> > > > > > The other race Michael noted was that an unplug request followed
> > > > > > by reset may also lose the eject notification, which may also
> > > > > > result in the eject request being lost which a subsequent add
> > > > > > or remove.  Once we're in reset, the device is unused and we can
> > > > > > flush the queue of device removals ourselves.  Previously if a
> > > > > > device_del was issued to a guest without ACPI PCI hotplug support,
> > > > > > it was necessary to shutdown the guest to recover the device.
> > > > > > With this, a guest reboot is sufficient.
> > > > > > 
> > > > > > Signed-off-by: Alex Williamson <address@hidden>
> > > > > > ---
> > > > > > 
> > > > > >  hw/acpi_piix4.c |   74 
> > > > > > +++++++++++++++++++++++++++++++++++++++----------------
> > > > > >  1 files changed, 53 insertions(+), 21 deletions(-)
> > > > > > 
> > > > > > diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
> > > > > > index 5e8b261..0e7af51 100644
> > > > > > --- a/hw/acpi_piix4.c
> > > > > > +++ b/hw/acpi_piix4.c
> > > > > > @@ -48,7 +48,7 @@
> > > > > >  #define PIIX4_PCI_HOTPLUG_STATUS 2
> > > > > >  
> > > > > >  struct pci_status {
> > > > > > -    uint32_t up;
> > > > > > +    uint32_t up; /* deprecated, maintained for migration 
> > > > > > compatibility */
> > > > > >      uint32_t down;
> > > > > >  };
> > > > > >  
> > > > > > @@ -70,6 +70,7 @@ typedef struct PIIX4PMState {
> > > > > >      /* for pci hotplug */
> > > > > >      struct pci_status pci0_status;
> > > > > >      uint32_t pci0_hotplug_enable;
> > > > > > +    uint32_t pci0_slot_device_present;
> > > > > 
> > > > > Why can't you use the "up" field for this? Fail to see the point
> > > > > of the new variable.
> > > > 
> > > > I suppose we could.  Initially I was creating the present device bitmap
> > > > dynamically, but walking the device tree on every access seemed
> > > > excessive versus updating it runtime.  Since there's no need to migrate
> > > > this and we don't want to have it clobbered by an incoming migration, I
> > > > didn't think to re-use the "up" field.  We could do it, but it would
> > > > mean introducing a post_load function that effectively just calls
> > > > piix4_update_hotplug to recreate it, which seems like an unnecessary
> > > > effort to avoid using an extra 4 bytes of memory.
> > > 
> > > It's probably harmless if we do let it be clobbered by migration
> > > though: worst case we lose an event and that might have
> > > happened before migration :)
> > 
> > Perhaps that's another way to manage it, just let it be a lazy
> > accumulation of anything that has been hotadded.  That makes debug hard
> > though because we can't know what should be set without looking that the
> > entire history of the vm, versus something like "device present", which
> > we can verify at any point in time.  Thanks,
> > 
> > Alex
> 
> Good point. I think Marcelo's comment can be addressed
> if you rename up to up_ignored, old_up, up_legacy or something
> like that.

Easy enough to rename .up to .old_up.  Marcelo, are you looking for more
than that?  Thanks,

Alex






reply via email to

[Prev in Thread] Current Thread [Next in Thread]