qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v5 00/18] IOMMU: Enable interrupt remapping for


From: Peter Xu
Subject: Re: [Qemu-devel] [PATCH v5 00/18] IOMMU: Enable interrupt remapping for Intel IOMMU
Date: Tue, 3 May 2016 14:00:42 +0800
User-agent: Mutt/1.5.24 (2015-08-30)

On Tue, May 03, 2016 at 07:40:50AM +0200, Jan Kiszka wrote:
> On 2016-05-03 07:30, Peter Xu wrote:
> > On Tue, May 03, 2016 at 06:38:28AM +0200, Jan Kiszka wrote:
> >> On 2016-05-03 05:22, Peter Xu wrote:
> >>> On Fri, Apr 29, 2016 at 09:52:14PM +0200, Radim Krčmář wrote:
> >>>> 2016-04-28 17:18+0800, Peter Xu:
> >>>>> On Thu, Apr 28, 2016 at 09:19:28AM +0200, Jan Kiszka wrote:
> >>>>>> Instead of fiddling with irq routes for the IOAPIC - where we don't 
> >>>>>> need
> >>>>>> it -, I would suggest to do the following: Send IOAPIC events via
> >>>>>> kvm_irqchip_send_msi to the kernel. Only irqfd users (vhost, vfio)
> >>>>>> should use the pattern you are now applying to the IOAPIC: establish
> >>>>>> static routes as an irqfd is set up, and that route should be 
> >>>>>> translated
> >>>>>> by the iommu first, register an IEC notifier to update any affected
> >>>>>> route when the iommu translation changes.
> >>>>>
> >>>>> Yes, maybe that's the right thing to do. Or say, when split irqchip,
> >>>>> IOAPIC can avoid using GSI routes any more. If with that, I should
> >>>>> also remove lots of things, like: IEC notifiers for IOAPIC, and all
> >>>>> things related to msi route sync-up in IOAPIC codes with KVM (so I
> >>>>> suppose we will save 24 gsi route entries for KVM, which sounds
> >>>>> good).
> >>>>
> >>>> Sadly, we can't get rid of those GSI routes.  KVM uses them to decide
> >>>> whether it should forward EOI to userspace.  And QEMU also has to remap
> >>>> them, because KVM uses dest_id for that decision.
> >>>
> >>> Thanks to point out.  Actually I have started to test the new patch
> >>> and did encountered issue when using split irqchip mode. The above
> >>> explains. :)
> >>>
> >>> So I think I will keep this part of codes un-touched in this
> >>> series, and I can move forward with IR changes.
> >>>
> >>> However, I am still wondering whether we can avoid this in KVM when
> >>> when using split irqchip mode? E.g., what if we do not do this check
> >>> in KVM and let all EOI be passed to userspace? Like:
> >>>
> >>> ---
> >>>
> >>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> >>> index 36591fa..c0a6e51 100644
> >>> --- a/arch/x86/kvm/lapic.c
> >>> +++ b/arch/x86/kvm/lapic.c
> >>> @@ -936,10 +936,6 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic 
> >>> *apic, int vector)
> >>>  {
> >>>         int trigger_mode;
> >>>
> >>> -       /* Eoi the ioapic only if the ioapic doesn't own the vector. */
> >>> -       if (!kvm_ioapic_handles_vector(apic, vector))
> >>> -               return;
> >>> -
> >>>         /* Request a KVM exit to inform the userspace IOAPIC. */
> >>>         if (irqchip_split(apic->vcpu->kvm)) {
> >>>                 apic->vcpu->arch.pending_ioapic_eoi = vector;
> >>> @@ -947,6 +943,10 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic 
> >>> *apic, int vector)
> >>>                 return;
> >>>         }
> >>>
> >>> +       /* Eoi the ioapic only if the ioapic doesn't own the vector. */
> >>> +       if (!kvm_ioapic_handles_vector(apic, vector))
> >>> +               return;
> >>> +
> >>>         if (apic_test_vector(vector, apic->regs + APIC_TMR))
> >>>                 trigger_mode = IOAPIC_LEVEL_TRIG;
> >>>         else
> >>>
> >>> ---
> >>>
> >>> I believe this will brings more user-space context switches, I just
> >>> do not know how this will impact the system (only for curiosity :).
> >>
> >> Yes, this doesn't look good from the performance POV. Given that most
> >> EOIs of the APICs will not trigger a message to an IOAPIC and userspace
> >> exits are expensive, that should be measurable.
> >>
> >> But you should optimize route updating a bit: I noticed real delays
> >> (almost a second) when reprogramming all IOAPIC pins during Jailhouse
> >> handover. That's because of the heavyweight synchronize_srcu we have on
> >> each route change. Even if that is an extreme case, you should try to
> >> reduce route updates to only those that were actually affected by an
> >> IRTE change.
> > 
> > Yes, this solution will possibly need one more IOMMU API besides
> > int_remap(), for IOAPIC to know whether specific IOAPIC entry needs
> > to be updated (IOAPIC needs to know the index information of each
> > IOAPIC entry, and see whether it is matched with IEC notified index
> > and mask, in ioapic_iec_notifier()).
> > 
> > Another problem that may cause this issue is that: now we are
> > committing route information every time when updating route entries
> > (please check kvm_update_routing_entry()). IMHO this is not
> > necessary - we will need 25 times of committing for each IEC
> > invalication, while actually we only need one at the end.
> > 
> > So... We'll have another solution for this: how about not committing
> > for each update, and just do one commit after all changes settled?
> > This seems to be much cleaner, and easier comparing to the above
> > proposal.
> 
> If you see a number of invalidation requests queued in a row, you may
> also process them atomically from the guest POV, which includes a single
> route update at the end, before the guest gets told that everything was
> done.

Yes, this would be better as a 3rd step after the above two. If you
would not mind, I'll noting this down as well with the 1st proposal
(considering that this will need more code changes, not only in QI
logic, also in IEC notification path), and will only contain the 2nd
change (explicit route commit) in v6, to partially solve the problem
for now.

Thanks,

-- peterx



reply via email to

[Prev in Thread] Current Thread [Next in Thread]