[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] spapr/xive: skip partially initialized vCPUs in presenter
From: |
Cédric Le Goater |
Subject: |
Re: [PATCH] spapr/xive: skip partially initialized vCPUs in presenter |
Date: |
Wed, 2 Oct 2019 16:47:56 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.1.0 |
On 02/10/2019 16:21, Greg Kurz wrote:
> On Wed, 2 Oct 2019 11:02:45 +1000
> David Gibson <address@hidden> wrote:
>
>> On Tue, Oct 01, 2019 at 06:56:28PM +0200, Greg Kurz wrote:
>>> On Tue, 1 Oct 2019 13:56:10 +0200
>>> Cédric Le Goater <address@hidden> wrote:
>>>
>>>> On 01/10/2019 13:06, Greg Kurz wrote:
>>>>> On Tue, 1 Oct 2019 10:57:22 +0200
>>>>> Cédric Le Goater <address@hidden> wrote:
>>>>>
>>>>>> When vCPUs are hotplugged, they are added to the QEMU CPU list before
>>>>>> being fully realized. This can crash the XIVE presenter because the
>>>>>> 'tctx' pointer is not necessarily initialized when looking for a
>>>>>> matching target.
>>>>>>
>>>>>
>>>>> Ouch... :-\
>>>>>
>>>>>> These vCPUs are not valid targets for the presenter. Skip them.
>>>>>>
>>>>>
>>>>> This likely fixes this specific issue, but more generally, this
>>>>> seems to indicate that using CPU_FOREACH() is rather fragile.
>>>>>
>>>>> What about tracking XIVE TM contexts with a QLIST in xive.c ?
>>>>
>>>> This is a good idea.
>>>>
>>>> On HW, the thread interrupt contexts belong to the XIVE presenter
>>>> subengine. This is the logic doing the CAM line matching to find
>>>> a target for an event notification. So we should model the context
>>>> list below the XiveRouter in QEMU which models both router and
>>>> presenter subengines. We have done without a presenter model for
>>>> the moment and I don't think we will need to introduce one.
>>>>
>>>> This would be a nice improvements of my patchset adding support
>>>> for xive escalations and better support of multi chip systems.
>>>> I have introduced a PNV_CHIP_CPU_FOREACH in this patchset which
>>>> would become useless with a list of tctx under the XIVE interrupt
>>>> controller, XiveRouter, SpaprXive, PnvXive.
>>>>
>>>
>>> I agree. It makes more sense to have the list below the XiveRouter,
>>> rather than relying on CPU_FOREACH(), which looks a bit weird from
>>> a device emulation code perspective.
>>
>> That does sound like a better idea long term. However, for now, I
>> think the NULL check is a reasonable way of fixing the real error
>> we're hitting, so I've applied the patch here.
>>
>
> Fair enough.
>
> Reviewed-by: Greg Kurz <address@hidden>
>
>> Future cleanups to a different approach remain welcome, of course.
>>
>
> I've started to look. This should simplify Cedric's "add XIVE support
> for KVM guests" series, but I'll wait for your massive cleanup series
> to get merged first.
This is a combo patchset.
These are prereqs fixes related to the presenter and how CAM lines
are scanned. This is in direct relation with the CPU_FOREACH() issue.
ppc/xive: Introduce a XivePresenter interface
ppc/xive: Implement the XivePresenter interface
ppc/pnv: Introduce a PNV_CHIP_CPU_FOREACH() helper
ppc/pnv: Introduce a pnv_xive_is_cpu_enabled() helper
ppc/xive: Introduce a XiveFabric interface
ppc/pnv: Implement the XiveFabric interface
ppc/spapr: Implement the XiveFabric interface
ppc/xive: Use the XiveFabric and XivePresenter interfaces
These are for interrupt resend (escalation). To be noted, the removal
of the get_tctx() XiveRouter handler which has some relation with
the previous subserie.
ppc/xive: Extend the TIMA operation with a XivePresenter parameter
ppc/pnv: Clarify how the TIMA is accessed on a multichip system
ppc/xive: Move the TIMA operations to the controller model
ppc/xive: Remove the get_tctx() XiveRouter handler
ppc/xive: Introduce a xive_tctx_ipb_update() helper
ppc/xive: Introduce helpers for the NVT id
ppc/xive: Synthesize interrupt from the saved IPB in the NVT
This is a bug :
ppc/pnv: Remove pnv_xive_vst_size() routine
ppc/pnv: Dump the XIVE NVT table
ppc/pnv: Skip empty slots of the XIVE NVT table
This is a model for the block id configuration and better support
for multichip systems :
ppc/pnv: Introduce a pnv_xive_block_id() helper
ppc/pnv: Extend XiveRouter with a get_block_id() handler
Misc improvements :
ppc/pnv: Quiesce some XIVE errors
ppc/xive: Introduce a xive_os_cam_decode() helper
ppc/xive: Check V bit in TM_PULL_POOL_CTX
ppc/pnv: Improve trigger data definition
ppc/pnv: Use the EAS trigger bit when triggering an interrupt from PSI
I should move some in front to have them merged before hand if
possible. They have been on the list for 3 months.
>>>> Next step would be to get rid of the tctx->cs pointer. In my latest
>>>> patches, it is only used to calculate the HW CAM line.
>>>>
>>>> There might be some consequences on the object hierarchy and it will
>>>> break migration.
>>>>
>>>
>>> This could break if the contexts were devices sitting in a bus, which
>>> isn't the case here. I'll try to come up with a proposal for spapr,
>>> and we can work out the changes on your recent XIVE series for pnv.
>>>
>>>> Thanks,
>>>>
>>>> C.
>>>>
>>>>>
>>>>> ================================================================================
>>>>> diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
>>>>> index 6d38755f8459..89b9ef7f20b1 100644
>>>>> --- a/include/hw/ppc/xive.h
>>>>> +++ b/include/hw/ppc/xive.h
>>>>> @@ -319,6 +319,8 @@ typedef struct XiveTCTX {
>>>>> qemu_irq os_output;
>>>>>
>>>>> uint8_t regs[XIVE_TM_RING_COUNT * XIVE_TM_RING_SIZE];
>>>>> +
>>>>> + QTAILQ_ENTRY(XiveTCTX) next;
>>>>> } XiveTCTX;
>>>>>
>>>>> /*
>>>>> diff --git a/hw/intc/xive.c b/hw/intc/xive.c
>>>>> index b7417210d817..f7721c711041 100644
>>>>> --- a/hw/intc/xive.c
>>>>> +++ b/hw/intc/xive.c
>>>>> @@ -568,6 +568,8 @@ static void xive_tctx_reset(void *dev)
>>>>> ipb_to_pipr(tctx->regs[TM_QW3_HV_PHYS + TM_IPB]);
>>>>> }
>>>>>
>>>>> +static QTAILQ_HEAD(, XiveTCTX) xive_tctx_list =
>>>>> QTAILQ_HEAD_INITIALIZER(xive_tctx_list);
>>>>> +
>>>>> static void xive_tctx_realize(DeviceState *dev, Error **errp)
>>>>> {
>>>>> XiveTCTX *tctx = XIVE_TCTX(dev);
>>>>> @@ -609,10 +611,14 @@ static void xive_tctx_realize(DeviceState *dev,
>>>>> Error **errp)
>>>>> }
>>>>>
>>>>> qemu_register_reset(xive_tctx_reset, dev);
>>>>> + QTAILQ_INSERT_HEAD(&xive_tctx_list, tctx, next);
>>>>> }
>>>>>
>>>>> static void xive_tctx_unrealize(DeviceState *dev, Error **errp)
>>>>> {
>>>>> + XiveTCTX *tctx = XIVE_TCTX(dev);
>>>>> +
>>>>> + QTAILQ_REMOVE(&xive_tctx_list, tctx, next);
>>>>> qemu_unregister_reset(xive_tctx_reset, dev);
>>>>> }
>>>>>
>>>>> @@ -1385,15 +1391,14 @@ static bool xive_presenter_match(XiveRouter
>>>>> *xrtr, uint8_t format,
>>>>> bool cam_ignore, uint8_t priority,
>>>>> uint32_t logic_serv, XiveTCTXMatch
>>>>> *match)
>>>>> {
>>>>> - CPUState *cs;
>>>>> + XiveTCTX *tctx;
>>>>>
>>>>> /*
>>>>> * TODO (PowerNV): handle chip_id overwrite of block field for
>>>>> * hardwired CAM compares
>>>>> */
>>>>>
>>>>> - CPU_FOREACH(cs) {
>>>>> - XiveTCTX *tctx = xive_router_get_tctx(xrtr, cs);
>>>>> + QTAILQ_FOREACH(tctx, &xive_tctx_list, next) {
>>>>> int ring;
>>>>>
>>>>> /*
>>>>> ================================================================================
>>>>>
>>>>>> Signed-off-by: Cédric Le Goater <address@hidden>
>>>>>> ---
>>>>>> hw/intc/xive.c | 8 ++++++++
>>>>>> 1 file changed, 8 insertions(+)
>>>>>>
>>>>>> diff --git a/hw/intc/xive.c b/hw/intc/xive.c
>>>>>> index b7417210d817..29df06df1136 100644
>>>>>> --- a/hw/intc/xive.c
>>>>>> +++ b/hw/intc/xive.c
>>>>>> @@ -1396,6 +1396,14 @@ static bool xive_presenter_match(XiveRouter
>>>>>> *xrtr, uint8_t format,
>>>>>> XiveTCTX *tctx = xive_router_get_tctx(xrtr, cs);
>>>>>> int ring;
>>>>>>
>>>>>> + /*
>>>>>> + * Skip partially initialized vCPUs. This can happen when
>>>>>> + * vCPUs are hotplugged.
>>>>>> + */
>>>>>> + if (!tctx) {
>>>>>> + continue;
>>>>>> + }
>>>>>> +
>>>>>> /*
>>>>>> * HW checks that the CPU is enabled in the Physical Thread
>>>>>> * Enable Register (PTER).
>>>>>
>>>>
>>>
>>
>
- [PATCH] spapr/xive: skip partially initialized vCPUs in presenter, Cédric Le Goater, 2019/10/01
- Re: [PATCH] spapr/xive: skip partially initialized vCPUs in presenter, Greg Kurz, 2019/10/01
- Re: [PATCH] spapr/xive: skip partially initialized vCPUs in presenter, Cédric Le Goater, 2019/10/01
- Re: [PATCH] spapr/xive: skip partially initialized vCPUs in presenter, Greg Kurz, 2019/10/01
- Re: [PATCH] spapr/xive: skip partially initialized vCPUs in presenter, David Gibson, 2019/10/01
- Re: [PATCH] spapr/xive: skip partially initialized vCPUs in presenter, Greg Kurz, 2019/10/02
- Re: [PATCH] spapr/xive: skip partially initialized vCPUs in presenter,
Cédric Le Goater <=
- Re: [PATCH] spapr/xive: skip partially initialized vCPUs in presenter, David Gibson, 2019/10/02
- Re: [PATCH] spapr/xive: skip partially initialized vCPUs in presenter, Cédric Le Goater, 2019/10/03