Re: [Qemu-devel] [PATCH] vfio pci: kernel support of error recovery only
From: Cao jin
Subject: Re: [Qemu-devel] [PATCH] vfio pci: kernel support of error recovery only for non fatal error
Date: Tue, 21 Mar 2017 16:05:28 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.1.0

On 03/20/2017 10:30 PM, Alex Williamson wrote:
> On Mon, 20 Mar 2017 20:50:39 +0800
> Cao jin <address@hidden> wrote:
>
>> Sorry for the late reply.
>>
>> On 03/14/2017 06:06 AM, Alex Williamson wrote:
>>> On Mon, 27 Feb 2017 15:28:43 +0800
>>> Cao jin <address@hidden> wrote:
>>>
>>>> 0. What happens now (PCIE AER only)
>>>> Fatal errors cause a link reset.
>>>> Non fatal errors don't.
>>>> All errors stop the VM eventually, but not immediately
>>>> because it's detected and reported asynchronously.
>>>> Interrupts are forwarded as usual.
>>>> Correctable errors are not reported to guest at all.
>>>> Note: PPC EEH is different. This focuses on AER.
>>>
>>> Perhaps you're only focusing on AER, but don't the error handlers we're
>>> using support both AER and EEH generically? I don't think we can
>>> completely disregard how this affects EEH behavior, if at all.
>>>
>>
>> After a rough look at EEH, I found that EEH always passes
>> pci_channel_io_frozen to error_detected, so from the perspective of
>> error_detected, EEH is not affected.
>>
>> One thing I am not sure about: when assigning devices on an spapr host,
>> must all functions/devices in a PE be bound to vfio? I am somewhat
>> confused about the relationship between a PE and a TCE IOMMU group.
>
> AIUI, yes all devices within the PE are part of the same IOMMU group
> and therefore all endpoints must be bound to vfio or pci-stub.
>
Thanks. Then I think this approach won't affect EEH. I was concerned
that the slot_reset issue you mentioned might also affect EEH, but if all
devices in the PE must be bound to vfio, that issue should not arise for EEH.
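
To make the reasoning above concrete, here is a minimal userspace sketch,
not the actual patch. It assumes the handler chooses its path purely from
the channel state passed to error_detected. The enum values are stand-ins
named after the kernel's pci_channel_io_normal, pci_channel_io_frozen and
pci_channel_io_perm_failure; the recovery_path type and both functions are
hypothetical names used only for illustration.

```c
#include <assert.h>

/* Userspace stand-ins for the kernel's pci_channel_state values. */
enum channel_state {
    CHANNEL_IO_NORMAL,       /* models pci_channel_io_normal (non-fatal error) */
    CHANNEL_IO_FROZEN,       /* models pci_channel_io_frozen (fatal error, EEH) */
    CHANNEL_IO_PERM_FAILURE  /* models pci_channel_io_perm_failure */
};

/* Hypothetical: which path an error_detected handler would take. */
enum recovery_path {
    PATH_NON_FATAL_RECOVERY, /* assumed new path, non-fatal errors only */
    PATH_UNCHANGED           /* existing behaviour is left as it is */
};

/* Assumption: the decision depends only on the reported channel state. */
static enum recovery_path error_detected_path(enum channel_state state)
{
    switch (state) {
    case CHANNEL_IO_NORMAL:
        return PATH_NON_FATAL_RECOVERY;
    case CHANNEL_IO_FROZEN:
    case CHANNEL_IO_PERM_FAILURE:
        return PATH_UNCHANGED;
    }
    assert(!"unknown channel state");
    return PATH_UNCHANGED;
}

/* Per the discussion, EEH always reports pci_channel_io_frozen. */
static enum recovery_path eeh_error_path(void)
{
    return error_detected_path(CHANNEL_IO_FROZEN);
}
```

Under these assumptions, eeh_error_path() can never reach the non-fatal
branch, which is the sense in which EEH is unaffected from the point of
view of error_detected.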
--
Sincerely,
Cao jin