qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH for-2.12 v3 1/3] spapr/rtas: disable the decreme


From: Cédric Le Goater
Subject: Re: [Qemu-devel] [PATCH for-2.12 v3 1/3] spapr/rtas: disable the decrementer interrupt when a CPU is unplugged
Date: Wed, 22 Nov 2017 19:55:14 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0

On 11/22/2017 03:33 AM, David Gibson wrote:
> On Mon, Nov 20, 2017 at 11:03:45AM +0100, Cédric Le Goater wrote:
>> When a CPU is stopped with the 'stop-self' RTAS call, its state
>> 'halted' is switched to 1 and, in this case, the MSR is not taken into
>> account anymore in the cpu_has_work() routine. Only the pending
>> hardware interrupts are checked with their LPCR:PECE* enablement bit.
>>
>> If the DECR timer fires after 'stop-self' is called and before the CPU
>> 'stop' state is reached, the nearly-dead CPU will have some work to do
>> and the guest will crash. This case happens very frequently with the
>> not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
>> occasionally fired but after 'stop' state, so no work is to be done
>> and the guest survives.
>>
>> I suspect there is a race between the QEMU mainloop triggering the
>> timers and the TCG CPU thread but I could not quite identify the root
>> cause. To be safe, let's disable in the LPCR all the exceptions which
>> can cause an exit while the CPU is in power-saving mode and reenable
>> them when the CPU is started.
>>
>> For this purpose, we introduce a little helper routine to calculate
>> the PECE bits for a processor variant. We could also use the mask
>> value LPCR_PECE_L_MASK for the P8 and P9 processors. bit 47 and 48 are
>> reserved on P7 but it is still compatible.
>>
>> Signed-off-by: Cédric Le Goater <address@hidden>
> 
> I'm not thrilled about addressing this without 100% knowing what's
> going on, 

me either :/ I have spent hours, days, on QEMU logs trying to catch 
a possible race in the state machine of the CPUs and didn't find it.
I need a better understanding of the internals. 

> but this seems like a sensible change in any case, so I'm ok
> with applying something like this.
>
> A detail however..
> 
> [snip]
>>  #if !defined(CONFIG_USER_ONLY)
>> +
>> +target_ulong cpu_ppc_papr_pece_bits(CPUPPCState *env)
>> +{
>> +    switch (env->mmu_model) {
>> +    case POWERPC_MMU_3_00:
>> +        return LPCR_PDEE | LPCR_HDEE | LPCR_EEE | LPCR_DEE | LPCR_OEE;
>> +    default:
>> +        /* P7 and P8 has slightly different PECE bits, mostly because P8 
>> adds
>> +         * bit 47 and 48 which are reserved on P7. Here we set them all, 
>> which
>> +         * will work as expected for both implementations
>> +         */
>> +        return LPCR_P8_PECE0 | LPCR_P8_PECE1 | LPCR_P8_PECE2 | 
>> LPCR_P8_PECE3 |
>> +            LPCR_P8_PECE4;
>> +    }
>> +}
> 
> ..since we're working in this area, might as well clean up this
> inappropriate use of mmu_model.  Two options which I'd be ok with:
> 
> 1) Add a pece_bits field to the PowerPCCPUClass, correctly initialized
> for the various processors.
> 
> 2) A similar helper but using ppc_check_compat() to check the arch
> level, instead of using env->mmu_model.
> 

OK. 

Thanks,

C.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]