qemu-devel


From: Jan Kiszka
Subject: Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU
Date: Tue, 26 Apr 2016 09:57:51 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12) Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666

On 2016-04-26 09:34, Peter Xu wrote:
> On Mon, Apr 25, 2016 at 09:24:12AM +0200, Jan Kiszka wrote:
>> On 2016-04-25 09:18, Peter Xu wrote:
>>> On Mon, Apr 25, 2016 at 07:16:19AM +0200, Jan Kiszka wrote:
>>>> On 2016-04-19 10:38, Peter Xu wrote:
>>>
>>> [...]
>>>
>>>>> By default, IR is disabled for better compatibility with current
>>>>> QEMU. To enable IR, we can use the following command to boot an
>>>>> IR-supported VM with a virtio-net device with vhost (kvm-ioapic is
>>>>> not supported yet, so we need to specify kernel-irqchip={split|off}
>>>>> here):
>>>>>
>>>>> $ qemu-system-x86_64 -M q35,iommu=on,intr=on,kernel-irqchip=split \
>>>>
>>>> "intr" sounds a bit too much like "interrupt", not "interrupt
>>>> remapping". Why not use the kernel's form, "intremap"?
>>>
>>> Sure. It sounds nice to be aligned with the kernel one. Let me take
>>> it in v5.
>>>
>>>>
>>>>>      -enable-kvm -m 1024 \
>>>>>      -netdev tap,id=net0,vhost=on \
>>>>>      -device virtio-net-pci,netdev=net0 \
>>>>>      -monitor telnet::3333,server,nowait \
>>>>>      /var/lib/libvirt/images/vm1.qcow2
>>>>>
>>>>> When the guest boots, we can verify whether IR is enabled by
>>>>> grepping the kernel log:
>>>>>
>>>>> address@hidden ~]# journalctl -k | grep "DMAR-IR"
>>>>> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: IOAPIC id 0 under DRHD base  0xfed90000 IOMMU 0
>>>>> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: Enabled IRQ remapping in xapic mode
>>>>>
>>>>> Currently supported devices:
>>>>>
>>>>> - Emulated/split irqchip
>>>>> - Generic PCI devices
>>>>> - vhost devices
>>>>> - Pass-through devices? Not tested, but I suppose they should work.
>>>>
>>>> I've tested this series against my Jailhouse setup, and it works pretty
>>>> well! I'm actually considering moving my test setup over to this branch.
>>>
>>> This is really encouraging feedback! Btw, thanks for all kinds of
>>> help on this patchset. :-)
>>>
>>>>
>>>> However, split irqchip still has some issues: When I boot a q35 machine
>>>> with Linux, the e1000 network adapter only gets a single IRQ delivered.
>>>> Interestingly, other IOAPIC IRQs like the keyboard work all the time. I
>>>> haven't debugged this in detail yet.
>>>
>>> I reproduced this problem. It seems that it fails even with
>>> kernel-irqchip=off. Will try to dig it out.
>>
>> Very good. Hope it can be easily fixed.
> 
> Hi, Jan,
> 
> The above issue appears to be caused by missing EOIs for
> level-triggered interrupts. Until now I had only tested with
> edge-triggered interrupts, so I didn't encounter it. Would you please
> try the patch below? It can be applied directly on top of the series
> and should solve the issue (it works on my test VM, and I'll include
> it in v5 as well if it also works for you):
> 

Works here as well. I even got EIM working with a hack, though
Jailhouse spits out strange warnings even though it works fine (x2apic
mode, split irqchip).

> -------------------------
> 
> diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
> index b41ab89..de6a8cf 100644
> --- a/hw/intc/ioapic.c
> +++ b/hw/intc/ioapic.c
> @@ -281,6 +281,36 @@ ioapic_mem_read(void *opaque, hwaddr addr, unsigned int size)
>      return val;
>  }
> 
> +/*
> + * This is to satisfy the hack in Linux kernel. One hack of it is to
> + * simulate clearing the Remote IRR bit of IOAPIC entry using the
> + * following:
> + *
> + * "For IO-APIC's with EOI register, we use that to do an explicit EOI.
> + * Otherwise, we simulate the EOI message manually by changing the trigger
> + * mode to edge and then back to level, with RTE being masked during
> + * this."
> + *
> + * (See linux kernel __eoi_ioapic_pin() comment in commit c0205701)
> + *
> + * This is based on the assumption that, Remote IRR bit will be
> + * cleared by IOAPIC hardware for edge-triggered interrupts (I
> + * believe that's what the IOAPIC version 0x1X hardware does). So
> + * if we are emulating it, we'd better do it the same here, so that
> + * the guest kernel hack will work as well on QEMU.
> + *
> + * Without this, level-triggered interrupts in IR mode might fail to
> + * work correctly.
> + */
> +static inline void
> +ioapic_fix_edge_remote_irr(uint64_t *entry)
> +{
> +    if (*entry & IOAPIC_LVT_TRIGGER_MODE) {
> +        /* Level triggered interrupts, make sure remote IRR is zero */
> +        *entry &= ~((uint64_t)IOAPIC_LVT_REMOTE_IRR);
> +    }
> +}
> +
>  static void
>  ioapic_mem_write(void *opaque, hwaddr addr, uint64_t val,
>                   unsigned int size)
> @@ -314,6 +344,7 @@ ioapic_mem_write(void *opaque, hwaddr addr, uint64_t val,
>                      s->ioredtbl[index] &= ~0xffffffffULL;
>                      s->ioredtbl[index] |= val;
>                  }
> +                ioapic_fix_edge_remote_irr(&s->ioredtbl[index]);
>                  ioapic_service(s);
>              }
>          }
> 
> ------------------------
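
The behaviour described in the patch comment can be sketched outside QEMU as follows. This is only an illustration under stated assumptions, not the code from the series: the `RTE_*` macros and both helpers are hypothetical names local to this example, the bit positions follow the usual I/O APIC redirection table entry layout, and Remote IRR is assumed to be read-only for the guest. It shows the kernel's "switch to edge and back to level while masked" sequence ending with Remote IRR cleared once the emulation clears that bit for edge-triggered entries.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for this sketch; they are not QEMU's macros. */
#define RTE_MASKED       (1ULL << 16)
#define RTE_TRIGGER_MODE (1ULL << 15)   /* 1 = level, 0 = edge */
#define RTE_REMOTE_IRR   (1ULL << 14)

/*
 * Emulate a guest write of 'val' to a redirection table entry whose
 * current value is 'old'. Assumption: Remote IRR is read-only for the
 * guest, so it is carried over from the old value. If the resulting
 * entry is edge-triggered, Remote IRR is forced to zero.
 */
static uint64_t rte_write(uint64_t old, uint64_t val)
{
    uint64_t entry = (val & ~RTE_REMOTE_IRR) | (old & RTE_REMOTE_IRR);

    if (!(entry & RTE_TRIGGER_MODE)) {
        /* Edge-triggered entry: Remote IRR has no meaning, clear it. */
        entry &= ~RTE_REMOTE_IRR;
    }
    return entry;
}

/*
 * Guest-side sequence from the quoted kernel comment: mask the entry,
 * flip it to edge, then restore the original level-triggered value.
 */
static uint64_t simulate_eoi_by_trigger_flip(uint64_t rte)
{
    uint64_t saved = rte;

    /* The trick only makes sense for level-triggered entries. */
    assert(saved & RTE_TRIGGER_MODE);

    rte = rte_write(rte, (saved | RTE_MASKED) & ~RTE_TRIGGER_MODE);
    rte = rte_write(rte, saved);
    return rte;
}
```

Calling `simulate_eoi_by_trigger_flip()` on a level-triggered entry with Remote IRR set returns the same entry with Remote IRR cleared, while a plain rewrite of the entry through `rte_write()` leaves the bit set.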
> 
> I am still looking into the guest-side code. Although the above patch
> should solve the issue, there are still problems in the guest code when
> IR is enabled:
> 
> - mismatched "vector" in the IOAPIC entry and the IRTE (a match is
>   required by VT-d spec 5.1.5.1, and I guess it is needed to deliver
>   the EOI broadcast correctly). See intel_irq_remapping_prepare_irte():
> 
>         ...
>         /*
>          * IO-APIC RTE will be configured with virtual vector.
>          * irq handler will do the explicit EOI to the io-apic.
>          */
>         entry->vector   = info->ioapic_pin;
>         ...
> 
> - I found that level-triggered entries in the IOAPIC are marked as
>   edge-triggered interrupts in the APIC (which is strange)... This
>   will also affect correct delivery of the EOI broadcast. I still need
>   time to figure out why.
> 
> If EOI broadcast worked, the e1000 issue would be solved as
> well, even without the above patch.
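
To make the vector-mismatch point concrete, here is a minimal sketch of how an emulated IOAPIC might react to an EOI broadcast. It assumes the common model in which the broadcast carries only a vector and the IOAPIC clears Remote IRR on level-triggered entries whose vector field matches. `eoi_broadcast()` and the `RTE_*` macros are hypothetical names for this example, not QEMU or kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for this sketch; they are not QEMU's macros. */
#define RTE_TRIGGER_MODE (1ULL << 15)   /* 1 = level, 0 = edge */
#define RTE_REMOTE_IRR   (1ULL << 14)
#define RTE_VECTOR_MASK  0xffULL

/*
 * Handle an EOI broadcast for 'vector' over 'n' redirection entries.
 * Only level-triggered entries whose vector field equals 'vector' and
 * that have Remote IRR set are affected. Returns how many entries had
 * Remote IRR cleared.
 */
static size_t eoi_broadcast(uint64_t *rte, size_t n, uint8_t vector)
{
    size_t cleared = 0;

    assert(rte != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        int level = (rte[i] & RTE_TRIGGER_MODE) != 0;
        int match = (rte[i] & RTE_VECTOR_MASK) == vector;

        if (level && match && (rte[i] & RTE_REMOTE_IRR)) {
            rte[i] &= ~RTE_REMOTE_IRR;
            cleared++;
        }
    }
    return cleared;
}
```

Under this model, an entry programmed with a "virtual vector" (the pin number, as in the quoted kernel snippet) never matches a broadcast that carries the vector from the IRTE, so its Remote IRR stays set and later level-triggered interrupts on that pin are held back.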
> 
> [...]

I don't remember the details in this area, but maybe it's worth looking
at how my hacks dealt with these cases (or made Linux not create such
weird configurations).

Jan



