qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] [PATCH 1/1 V6] qemu-kvm: fix improper nmi emulation


From: Lai Jiangshan
Subject: [Qemu-devel] [PATCH 1/1 V6] qemu-kvm: fix improper nmi emulation
Date: Tue, 18 Oct 2011 00:00:06 +0800
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.9) Gecko/20100921 Fedora/3.1.4-1.fc14 Thunderbird/3.1.4

On 10/17/2011 05:49 PM, Avi Kivity wrote:
> On 10/17/2011 11:40 AM, Lai Jiangshan wrote:
>>>>
>>>
>>> LINT1 may have been programmed as a level -triggered interrupt instead
>>> of edge triggered (NMI or interrupt).  We can use the ioctl argument for
>>> the level (and pressing the NMI button needs to pulse the level to 1 and
>>> back to 0).
>>>
>>
>> Hi, Avi, Jan,
>>
>> Which approach you prefer to?
>> I need to know the result before wasting too much time to respin
>> the approach.
> 
> Yes, sorry about the slow and sometimes conflicting feedback.
> 
>> 1) Fix KVM_NMI emulation approach  (which is v3 patchset)
>>      - It directly fixes the problem and matches the
>>        real hard ware more, but it changes KVM_NMI bahavior.
>>      - Require both kernel-site and userspace-site fix.
>>
>> 2) Get the LAPIC state from kernel irqchip, and inject NMI if it is allowed
>>    (which is v4 patchset)
>>      - Simple, don't changes any kernel behavior.
>>      - Only need the userspace-site fix
>>
>> 3) Add KVM_SET_LINT1 approach (which is v5 patchset)
>>      - don't changes the kernel's KVM_NMI behavior.
>>      - much complex
>>      - Require both kernel-site and userspace-site fix.
>>      - userspace-site should also handle the !KVM_SET_LINT1
>>        condition, it uses all the 2) approach' code. it means
>>        this approach equals the 2) approach + KVM_SET_LINT1 ioctl.
>>
>> This is an urgent bug of us, we need to settle it down soo
> 
> While (1) is simple, it overloads a single ioctl with two meanings,
> that's not so good.
> 
> Whether we do (1) or (3), we need (2) as well, for older kernels.
> 
> So I recommend first focusing on (2) and merging it, then doing (3).
> 
> (note an additional issue with 3 is whether to make it a vm or vcpu
> ioctl - we've been assuming vcpu ioctl but it's not necessarily the best
> choice).
> 

It is the 2) approach.
It only changes the user space site, the kernel site is not touched.
It is changed from previous v4 patch, fixed problems found by Jan.
----------------------------

From: Lai Jiangshan <address@hidden>

Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
button event happens. This doesn't properly emulate real hardware on
which NMI button event triggers LINT1. Because of this, NMI is sent to
the processor even when LINT1 is maskied in LVT. For example, this
causes the problem that kdump initiated by NMI sometimes doesn't work
on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.

With this patch, inject-nmi request is handled as follows.

- When in-kernel irqchip is disabled, deliver LINT1 instead of NMI
  interrupt.
- When in-kernel irqchip is enabled, get the in-kernel LAPIC states
  and test the APIC_LVT_MASKED, if LINT1 is unmasked, and then
  delivering the NMI directly. (Suggested by Jan Kiszka)

Changed from old version:
  re-implement it by the Jan's suggestion.
  fix the race found by Jan.

Signed-off-by: Lai Jiangshan <address@hidden>
Reported-by: Kenji Kaneshige <address@hidden>
---
 hw/apic.c |   33 +++++++++++++++++++++++++++++++++
 hw/apic.h |    1 +
 monitor.c |    6 +++++-
 3 files changed, 39 insertions(+), 1 deletions(-)
diff --git a/hw/apic.c b/hw/apic.c
index 69d6ac5..922796a 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -205,6 +205,39 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
     }
 }
 
+static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id);
+
+static void kvm_irqchip_deliver_nmi(void *p)
+{
+    APICState *s = p;
+    struct kvm_lapic_state klapic;
+    uint32_t lvt;
+
+    kvm_get_lapic(s->cpu_env, &klapic);
+    lvt = kapic_reg(&klapic, 0x32 + APIC_LVT_LINT1);
+
+    if (lvt & APIC_LVT_MASKED) {
+        return;
+    }
+
+    if (((lvt >> 8) & 7) != APIC_DM_NMI) {
+        return;
+    }
+
+    kvm_vcpu_ioctl(s->cpu_env, KVM_NMI);
+}
+
+void apic_deliver_nmi(DeviceState *d)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+
+    if (kvm_irqchip_in_kernel()) {
+        run_on_cpu(s->cpu_env, kvm_irqchip_deliver_nmi, s);
+    } else {
+        apic_local_deliver(s, APIC_LVT_LINT1);
+    }
+}
+
 #define foreach_apic(apic, deliver_bitmask, code) \
 {\
     int __i, __j, __mask;\
diff --git a/hw/apic.h b/hw/apic.h
index c857d52..3a4be0a 100644
--- a/hw/apic.h
+++ b/hw/apic.h
@@ -10,6 +10,7 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
                              uint8_t trigger_mode);
 int apic_accept_pic_intr(DeviceState *s);
 void apic_deliver_pic_intr(DeviceState *s, int level);
+void apic_deliver_nmi(DeviceState *d);
 int apic_get_interrupt(DeviceState *s);
 void apic_reset_irq_delivered(void);
 int apic_get_irq_delivered(void);
diff --git a/monitor.c b/monitor.c
index cb485bf..0b81f17 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2616,7 +2616,11 @@ static int do_inject_nmi(Monitor *mon, const QDict 
*qdict, QObject **ret_data)
     CPUState *env;
 
     for (env = first_cpu; env != NULL; env = env->next_cpu) {
-        cpu_interrupt(env, CPU_INTERRUPT_NMI);
+        if (!env->apic_state) {
+            cpu_interrupt(env, CPU_INTERRUPT_NMI);
+        } else {
+            apic_deliver_nmi(env->apic_state);
+        }
     }
 
     return 0;



reply via email to

[Prev in Thread] Current Thread [Next in Thread]