qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH RFC] pseries: define coldplugged devices as "con


From: Michael Roth
Subject: Re: [Qemu-devel] [PATCH RFC] pseries: define coldplugged devices as "configured"
Date: Sun, 23 Aug 2015 14:08:30 -0500
User-agent: alot/0.3.6

Quoting Laurent Vivier (2015-08-14 02:46:49)
> 
> 
> On 14/08/2015 07:20, Bharata B Rao wrote:
> > On Thu, Aug 13, 2015 at 02:53:02PM +0200, Laurent Vivier wrote:
> >> When a device is hotplugged, attach() sets "configured" to
> >> false, waiting an action from the OS to configure it and then
> >> to call ibm,configure-connector. On ibm,configure-connector,
> >> the hypervisor sets "configured" to true.
> >>
> >> In case of coldplugged device, attach() sets "configured" to
> >> false, but firmware and OS never call the ibm,configure-connector
> >> in this case, so it remains set to false.
> >>
> >> It could be harmless, but when we unplug a device, hypervisor
> >> waits the device becomes configured because for it, a not configured
> >> device is a device being configured, so it waits the end of configuration
> >> to unplug it... and it never happens, so it is never unplugged.
> > 
> > Not true for at least logical DR device like CPU. I am able to cleanly
> > unplug a cold plugged CPU in the patchset I posted at:
> > 
> > https://lists.gnu.org/archive/html/qemu-ppc/2015-08/msg00041.html
> > 
> > And this is how the state transitions work for cold plugged CPU devices:
> > 
> > - Cold plugged CPU DRC is explicitly set with allocation_state=USABLE
> >   and isolation_state=UNISOLATED.
> > - device_del results in drck->detach() that just returns by setting
> >   drc->awaiting_release to true.
> > - Unplug notification is sent to guest.
> > - Guest comes back with set_indicator RTAS call for setting isolation_state
> >   to ISOLATED. set_isolation_state() sets drc->configured to false.
> > - Guest comes back again with set_indicator RTAS call for setting allocation
> >   state to UNUSABLE. set_allocation_state() finalizes the device removal by
> >   calling drck->detach()
> 
> It doesn't work for PCI, because (QEMU 2.4.0):
> 
> static int set_allocation_state(sPAPRDRConnector *drc,
>                                 sPAPRDRAllocationState state)
> ...
>     if (drc->type != SPAPR_DR_CONNECTOR_TYPE_PCI) {
> ...
>             drck->detach(drc, DEVICE(drc->dev), drc->detach_cb,
>                          drc->detach_cb_opaque, NULL);
> ...

Ok, that makes sense then:

the is_configured() checks were added due to a race specifically with
PCI devices: when we plug the device we hand control over to OS and set
state to unisolated as a result. The guest assumes 'interactive' hotplug
where it sets a slot back to isolated and waits for the user to actually
plug it in. Once plugged in, state is moved back to isolated, and guest
starts configuring device. We use a flag in guest drmgr invocation to skip
the wait, but it *still* does the change to isolated state. So there's an
extra unisolated->isolated->unisolated transition for PCI in guest code.

Because of that check, if management does a quick device_add+device_del,
there's a race where we mark the device as awaiting_release as soon as
the device_del comes in (even though device_add event might still be
getting processed by guest). That would fine normally, but in this state
a transition to isolated state results in the device getting immediately
finalized and then disappearing while the guest is trying to configure
it, so the extra transition in the PCI case races with device_del.

The is_configured() check removes that race window, and the check was
added in set_isolation(). 'logical' resources (lmb/cpu/phb) get
finalized via set_allocation() however, which is why they didn't appear
affected by this bug. And from what I can tell, cpu/lmb don't make extra
'isolated'/'unallocated' transitions, just the ones at the end unplug,
so the fact that we're missing the check in set_allocation() shouldn't
be a problem. Makes sense to set the configured flag appropriately for
those case as well though for consistency.

Reviewed-by: Michael Roth <address@hidden>

>     }
> 
> > - drck->detach() now calls drc->detach_cb() that truly releases the
> >   CPU resource by getting rid of vCPU thread in QEMU.
> 
> Laurent
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]