qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 1/3] q35: set split kernel irqchip as default


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH v2 1/3] q35: set split kernel irqchip as default
Date: Sat, 27 Apr 2019 01:29:10 -0400 (EDT)

> > In my testing it looks like KVM advertises supporting the KVM_IRQFD
> > resample feature, but vfio never gets the unmask notification, so the
> > device remains with DisINTx set and no further interrupts are
> > generated.  Do we expect KVM's IRQFD with resampler to work in the
> > split IRQ mode?  We can certainly hope that "high performance" devices
> > use MSI or MSI/X, but this would be quite a performance regression with
> > split mode if our userspace bypass for INTx goes away.  Thanks,
> 
> arch/x86/kvm/lapic.c:kvm_ioapic_send_eoi() dumps to userspace before
> kvm_ioapic_update_eoi() can handle the irq_ack_notifier_list via
> kvm_notify_acked_gsi(),

That wouldn't help because kvm_ioapic_update_eoi would not even be
able to access vcpu->kvm->arch.vioapic (it's NULL).

The following untested patch would signal the resamplefd in kvm_ioapic_send_eoi,
before requesting the exit to userspace.  However I am not sure how QEMU
sets up the VFIO eventfds: if I understand correctly, when VFIO writes again to
the irq eventfd, the interrupt request would not reach the userspace IOAPIC, but
only the in-kernel LAPIC.  That would be incorrect, and if my understanding is
correct we need to trigger resampling from hw/intc/ioapic.c.

Something like:

1) VFIO uses kvm_irqchip_add_irqfd_notifier_gsi instead of KVM_IRQFD directly

2) add a NotifierList in kvm_irqchip_assign_irqfd

3) if kvm_irqchip_is_split(), register a corresponding Notifier in ioapic.c 
which
stores the resamplefd as an EventNotifier*

4) ioapic_eoi_broadcast writes to the resamplefd

BTW, how do non-x86 platforms handle intx resampling?

Paolo

---- 8< -----
From: Paolo Bonzini <address@hidden>
Subject: [PATCH WIP] kvm: lapic: run ack notifiers for userspace IOAPIC routes

diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h
index ea1a4e0297da..be337e06e3fd 100644
--- a/arch/x86/kvm/ioapic.h
+++ b/arch/x86/kvm/ioapic.h
@@ -135,4 +135,6 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu,
                           ulong *ioapic_handled_vectors);
 void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu,
                            ulong *ioapic_handled_vectors);
+void kvm_notify_userspace_ioapic_routes(struct kvm *kvm, int vector);
+
 #endif
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index 3cc3b2d130a0..46ea79a0dd8a 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -435,6 +435,34 @@ void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu,
        srcu_read_unlock(&kvm->irq_srcu, idx);
 }
 
+void kvm_notify_userspace_ioapic_routes(struct kvm *kvm, int vector)
+{
+        struct kvm_kernel_irq_routing_entry *entry;
+        struct kvm_irq_routing_table *table;
+        u32 i, nr_ioapic_pins;
+        int idx;
+
+        idx = srcu_read_lock(&kvm->irq_srcu);
+        table = srcu_dereference(kvm->irq_routing, &kvm->irq_srcu);
+        nr_ioapic_pins = min_t(u32, table->nr_rt_entries,
+                               kvm->arch.nr_reserved_ioapic_pins);
+
+       for (i = 0; i < nr_ioapic_pins; i++) {
+                hlist_for_each_entry(entry, &table->map[i], link) {
+                        struct kvm_lapic_irq irq;
+
+                        if (entry->type != KVM_IRQ_ROUTING_MSI)
+                                continue;
+
+                       kvm_set_msi_irq(kvm, entry, &irq);
+                       if (irq.level && irq.vector == vector)
+                               kvm_notify_acked_gsi(kvm, i);
+               }
+       }
+
+       srcu_read_unlock(&kvm->irq_srcu, idx);
+}
+
 void kvm_arch_irq_routing_update(struct kvm *kvm)
 {
        kvm_hv_irq_routing_update(kvm);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 9f089e2e09d0..8f8e76d77925 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1142,6 +1142,7 @@ static bool kvm_ioapic_handles_vector(struct kvm_lapic 
*apic, int vector)
 
 static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, int vector)
 {
+       struct kvm_vcpu *vcpu = apic->vcpu;
        int trigger_mode;
 
        /* Eoi the ioapic only if the ioapic doesn't own the vector. */
@@ -1149,9 +1150,10 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, 
int vector)
                return;
 
        /* Request a KVM exit to inform the userspace IOAPIC. */
-       if (irqchip_split(apic->vcpu->kvm)) {
-               apic->vcpu->arch.pending_ioapic_eoi = vector;
-               kvm_make_request(KVM_REQ_IOAPIC_EOI_EXIT, apic->vcpu);
+       if (irqchip_split(vcpu->kvm)) {
+               kvm_notify_userspace_ioapic_routes(vcpu->kvm, vector);
+               vcpu->arch.pending_ioapic_eoi = vector;
+               kvm_make_request(KVM_REQ_IOAPIC_EOI_EXIT, vcpu);
                return;
        }
 
@@ -1160,7 +1162,7 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, 
int vector)
        else
                trigger_mode = IOAPIC_EDGE_TRIG;
 
-       kvm_ioapic_update_eoi(apic->vcpu, vector, trigger_mode);
+       kvm_ioapic_update_eoi(vcpu, vector, trigger_mode);
 }
 
 static int apic_set_eoi(struct kvm_lapic *apic)


> so it looks like KVM really ought to return an
> error when trying to register a resample IRQFD when irqchip_split(),
> but do we have better options?  EOI handling in QEMU is pretty much
> non-existent and even if it was in place, it's a big performance
> regression for vfio INTx handling.  Is a split irqchip that retains
> performant resampling (EOI) support possible?  Thanks,



reply via email to

[Prev in Thread] Current Thread [Next in Thread]