qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v6 wave 2 2/3] hw/isa/lpc_ich9: add broadcast SM


From: Laszlo Ersek
Subject: Re: [Qemu-devel] [PATCH v6 wave 2 2/3] hw/isa/lpc_ich9: add broadcast SMI feature
Date: Tue, 17 Jan 2017 14:46:05 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.6.0

On 01/17/17 14:20, Igor Mammedov wrote:
> On Tue, 17 Jan 2017 13:06:13 +0100
> Laszlo Ersek <address@hidden> wrote:
> 
>> On 01/17/17 12:21, Igor Mammedov wrote:
>>> On Mon, 16 Jan 2017 21:46:55 +0100
>>> Laszlo Ersek <address@hidden> wrote:
>>>   
>>>> On 01/13/17 14:09, Igor Mammedov wrote:  
>>>>> On Thu, 12 Jan 2017 19:24:45 +0100
>>>>> Laszlo Ersek <address@hidden> wrote:
>>>>>     
>>>>>> The generic edk2 SMM infrastructure prefers
>>>>>> EFI_SMM_CONTROL2_PROTOCOL.Trigger() to inject an SMI on each processor. 
>>>>>> If
>>>>>> Trigger() only brings the current processor into SMM, then edk2 handles 
>>>>>> it
>>>>>> in the following ways:
>>>>>>
>>>>>> (1) If Trigger() is executed by the BSP (which is guaranteed before
>>>>>>     ExitBootServices(), but is not necessarily true at runtime), then:
>>>>>>
>>>>>>     (a) If edk2 has been configured for "traditional" SMM 
>>>>>> synchronization,
>>>>>>         then the BSP sends directed SMIs to the APs with APIC delivery,
>>>>>>         bringing them into SMM individually. Then the BSP runs the SMI
>>>>>>         handler / dispatcher.
>>>>>>
>>>>>>     (b) If edk2 has been configured for "relaxed" SMM synchronization,
>>>>>>         then the APs that are not already in SMM are not brought in, and
>>>>>>         the BSP runs the SMI handler / dispatcher.
>>>>>>
>>>>>> (2) If Trigger() is executed by an AP (which is possible after
>>>>>>     ExitBootServices(), and can be forced e.g. by "taskset -c 1
>>>>>>     efibootmgr"), then the AP in question brings in the BSP with a
>>>>>>     directed SMI, and the BSP runs the SMI handler / dispatcher.
>>>>>>
>>>>>> The smaller problem with (1a) and (2) is that the BSP and AP
>>>>>> synchronization is slow. For example, the "taskset -c 1 efibootmgr"
>>>>>> command from (2) can take more than 3 seconds to complete, because
>>>>>> efibootmgr accesses non-volatile UEFI variables intensively.
>>>>>>
>>>>>> The larger problem is that QEMU's current behavior diverges from the
>>>>>> behavior usually seen on physical hardware, and that keeps exposing
>>>>>> obscure corner cases, race conditions and other instabilities in edk2,
>>>>>> which generally expects / prefers a software SMI to affect all CPUs at
>>>>>> once.
>>>>>>
>>>>>> Therefore introduce the "broadcast SMI" feature that causes QEMU to 
>>>>>> inject
>>>>>> the SMI on all VCPUs.
>>>>>>
>>>>>> While the original posting of this patch
>>>>>> <http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg05658.html>
>>>>>> only intended to speed up (2), based on our recent "stress testing" of 
>>>>>> SMM
>>>>>> this patch actually provides functional improvements.
>>>>>>
>>>>>> Cc: "Michael S. Tsirkin" <address@hidden>
>>>>>> Cc: Gerd Hoffmann <address@hidden>
>>>>>> Cc: Igor Mammedov <address@hidden>
>>>>>> Cc: Paolo Bonzini <address@hidden>
>>>>>> Signed-off-by: Laszlo Ersek <address@hidden>
>>>>>> Reviewed-by: Michael S. Tsirkin <address@hidden>
>>>>>> ---
>>>>>>
>>>>>> Notes:
>>>>>>     v6:
>>>>>>     - no changes, pick up Michael's R-b
>>>>>>     
>>>>>>     v5:
>>>>>>     - replace the ICH9_LPC_SMI_F_BROADCAST bit value with the
>>>>>>       ICH9_LPC_SMI_F_BROADCAST_BIT bit position (necessary for
>>>>>>       DEFINE_PROP_BIT() in the next patch)
>>>>>>
>>>>>>  include/hw/i386/ich9.h |  3 +++
>>>>>>  hw/isa/lpc_ich9.c      | 10 +++++++++-
>>>>>>  2 files changed, 12 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/include/hw/i386/ich9.h b/include/hw/i386/ich9.h
>>>>>> index da1118727146..18dcca7ebcbf 100644
>>>>>> --- a/include/hw/i386/ich9.h
>>>>>> +++ b/include/hw/i386/ich9.h
>>>>>> @@ -250,4 +250,7 @@ Object *ich9_lpc_find(void);
>>>>>>  #define ICH9_SMB_HST_D1                         0x06
>>>>>>  #define ICH9_SMB_HOST_BLOCK_DB                  0x07
>>>>>>  
>>>>>> +/* bit positions used in fw_cfg SMI feature negotiation */
>>>>>> +#define ICH9_LPC_SMI_F_BROADCAST_BIT            0
>>>>>> +
>>>>>>  #endif /* HW_ICH9_H */
>>>>>> diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
>>>>>> index 376b7801a42c..ced6f803a4f2 100644
>>>>>> --- a/hw/isa/lpc_ich9.c
>>>>>> +++ b/hw/isa/lpc_ich9.c
>>>>>> @@ -437,7 +437,15 @@ static void ich9_apm_ctrl_changed(uint32_t val, 
>>>>>> void *arg)
>>>>>>  
>>>>>>      /* SMI_EN = PMBASE + 30. SMI control and enable register */
>>>>>>      if (lpc->pm.smi_en & ICH9_PMIO_SMI_EN_APMC_EN) {
>>>>>> -        cpu_interrupt(current_cpu, CPU_INTERRUPT_SMI);
>>>>>> +        if (lpc->smi_negotiated_features &
>>>>>> +            (UINT64_C(1) << ICH9_LPC_SMI_F_BROADCAST_BIT)) {
>>>>>> +            CPUState *cs;
>>>>>> +            CPU_FOREACH(cs) {
>>>>>> +                cpu_interrupt(cs, CPU_INTERRUPT_SMI);
>>>>>> +            }    
>>>>> Shouldn't CPUs with default SMI base be excluded from broadcast?    
>>>>
>>>> I don't think so; as far as I recall, in edk2 the initial SMM entry code
>>>> -- before SMBASE relocation -- accommodates all VCPUs entering the same
>>>> code area for execution. The VCPUs are told apart from each other by
>>>> Initial APIC ID (that is, "who am I"), which is used as an index or
>>>> search criterion into a global shared array. Once they find their
>>>> respective private data blocks, the VCPUs don't step on each other's toes. 
>>>>  
>>> That's what I'm not sure.
>>> If broadcast SMI is sent before relocation all CPUs will use
>>> the same SMBASE and as result save their state in the same
>>> memory (even if it's state after RESET/POWER ON) racing/overwriting
>>> each other's saved state  
>>
>> Good point!
>>
>>> and then on exit from SMI they all will restore
>>> whatever state that race produced so it seems plain wrong thing to do.
>>>
>>> Are you sure that edk2 does broadcast for SMI initialization or does it
>>> using directed SMIs?  
>>
>> Hmmm, you are right. The SmmRelocateBases() function in
>> "UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c" first relocates SMBASE for
>> each individual AP in succession, by sending SMI IPIs to them, one by
>> one. Then it does the same for the BSP, by sending itself a similar SMI IPI.
>>
>> The above QEMU code is only exercised much later, after the SMBASE
>> relocation, with the SMM stack up and running. It is used when a
>> (preferably broadcast) SMI is triggered for firmware services (for
>> example, the UEFI variable services).
>>
>> So, you are right.
>>
>> Considering edk2 only, the difference shouldn't matter -- when this code
>> is invoked (via an IO port write), the SMBASEs will all have been relocated.
>>
>> Also, I've been informed that on real hardware, writing to APM_CNT
>> triggers an SMI on all CPUs, regardless of the individual SMBASE values.
>> So I believe the above code should be closest to real hardware, and the
>> pre-patch code was only written in unicast form for SeaBIOS's sake.
>>
>> I think the code is safe above. If the guest injects an SMI via APM_CNT
>> after negotiating SMI broadcast, it should have not left any VCPUs
>> without an SMI handler by that time. edk2 / OVMF conforms to this. In
>> the future we can tweak this further, if necessary; we have 63 more
>> bits, and we'll be able to reject invalid combinations of bits.
>>
>> Do you feel that we should skip VCPUs whose SMBASE has not been changed
>> from the default?
>>
>> If so, doesn't that run the risk of missing a VCPU that has an actual
>> SMI handler installed, and it just happens to be placed at the default
>> SMBASE location?
>>
>> ... I wouldn't mind skipping VCPUs with apparently unrelocated SMBASEs,
>> but I think (a) that's not what real hardware does, and (b) it is risky
>> if a VCPU's actual SMI handler has been installed while keeping the
>> default SMBASE value.
>>
>> What do you think?
> (a) probably doesn't consider hotplug, so it's totally fine for the case
> where firmware is the only one who wakes up/initializes CPUs.
> however consider 2 CPU hotplugged and then broadcast SMI happens
> to serve some other task (if there isn't unpark 'feature').
> Even if real hw does it, it's behavior not defined by SDM with possibly
> bad consequences.
> 
> (b) alone it's risky so I'd skip based on combination of
>  
>  if (default SMBASE & CPU is in reset state)
>     skip;
> 
> that should be safe and it leaves open possibility to avoid adding
> unpark 'feature' to CPU.

The above attributes ("SMBASE" and "CPU is in reset state") are specific
to x86 VCPUs. Should I use some dynamic QOM casting for this? Like the
X86_CPU() macro?

Can I assume here that the macro will never return NULL? (I think so,
LPC is only used with x86.)

... I guess SMBASE can be accessed as

  X86_CPU(cs)->env.smbase

How do I figure out if the CPU is in reset state ("waiting for first
INIT") though? Note that this state should be distinguished from simply
"halted" (i.e., after a HLT). After a HLT, the SMI should be injected
and delivered; see commit a9bad65d2c1f. Edk2 regularly uses HLT to send
APs to sleep.

Thanks
Laszlo

> 
>>
>> Thanks!
>> Laszlo
>>
>>>   
>>>>
>>>> Thanks
>>>> Laszlo
>>>>  
>>>>>     
>>>>>> +        } else {
>>>>>> +            cpu_interrupt(current_cpu, CPU_INTERRUPT_SMI);
>>>>>> +        }
>>>>>>      }
>>>>>>  }
>>>>>>      
>>>>>     
>>>>  
>>>   
>>
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]