Re: [Qemu-devel] [PATCH 6/8] spapr: move interrupt allocator to xics


From: Alexey Kardashevskiy
Subject: Re: [Qemu-devel] [PATCH 6/8] spapr: move interrupt allocator to xics
Date: Sat, 12 Apr 2014 02:30:02 +1000
User-agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0

On 04/12/2014 02:15 AM, Alexander Graf wrote:
> 
> On 11.04.14 18:01, Alexey Kardashevskiy wrote:
>> On 04/12/2014 01:38 AM, Alexander Graf wrote:
>>> On 11.04.14 17:27, Alexey Kardashevskiy wrote:
>>>> On 04/12/2014 12:58 AM, Alexander Graf wrote:
>>>>> On 11.04.14 16:50, Alexey Kardashevskiy wrote:
>>>>>> On 04/11/2014 11:58 PM, Alexander Graf wrote:
>>>>>>> On 11.04.2014, at 14:38, Alexey Kardashevskiy <address@hidden> wrote:
>>>>>>>
>>>>>>>> On 04/11/2014 07:24 PM, Alexander Graf wrote:
>>>>>>>>> On 10.04.14 16:43, Alexey Kardashevskiy wrote:
>>>>>>>>>> On 04/10/2014 11:26 PM, Alexander Graf wrote:
>>>>>>>>>>> On 10.04.14 15:24, Alexey Kardashevskiy wrote:
>>>>>>>>>>>> On 04/10/2014 10:51 PM, Alexander Graf wrote:
>>>>>>>>>>>>> On 14.03.14 05:18, Alexey Kardashevskiy wrote:
>>>>>>>>>>>>>> The current allocator returns IRQ numbers from a pool and does not
>>>>>>>>>>>>>> support IRQs reuse in any form as it did not keep track of what it
>>>>>>>>>>>>>> previously returned, it only had the last returned IRQ.
>>>>>>>>>>>>>> However migration may change interrupts for devices depending on
>>>>>>>>>>>>>> their order in the command line.
>>>>>>>>>>>>> Wtf? Nonono, this sounds very bogus and wrong. Migration shouldn't
>>>>>>>>>>>>> change anything.
>>>>>>>>>>>> I wrote the wrong commit message. By "change" I meant that the default
>>>>>>>>>>>> state before the destination guest started accepting migration is
>>>>>>>>>>>> different from what the destination guest became after migration
>>>>>>>>>>>> finished. And migration cannot avoid changing this default state.
>>>>>>>>>>> Ok, why is the IRQ configuration different?
>>>>>>>>>> Because QEMU creates devices in the order given on the command line,
>>>>>>>>>> and libvirt changes this order - the XML used to create the guest and
>>>>>>>>>> the XML which it sends during migration are different. libvirt thinks
>>>>>>>>>> this is ok as long as it keeps the @reg property for (for example)
>>>>>>>>>> spapr-vscsi devices, but it is not: since the order is different,
>>>>>>>>>> devices call the IRQ allocator in a different order and get
>>>>>>>>>> different IRQs.
>>>>>>>>> So your patch migrates the current IRQ configuration, but once you
>>>>>>>>> restart the virtual machine on the destination host it will have
>>>>>>>>> different IRQ numbering again, right?
>>>>>>>> No, why? IRQs are assigned at init time from realize() callbacks (and
>>>>>>>> survive reset) or as a part of ibm,change-msi rtas call which happens in
>>>>>>>> the same order as it only depends on pci addresses and we do not change
>>>>>>>> this either.
>>>>>>> Ok, let me rephrase. If I shut the machine down because I'm doing
>>>>>>> on-disk hibernate and then boot it back up, will the guest find the same
>>>>>>> configuration?
>>>>>> I do not understand what you mean by this. Hibernation by the guest OS
>>>>>> itself or by QEMU? If this involves QEMU exit and QEMU start - then yes,
>>>>> by the guest OS. The host will only see a genuine "shutdown" event. The
>>>>> guest OS will expect the machine to look *the exact same* as before the
>>>>> shutdown.
>>>> Ok. So. I have to implement "irq" property everywhere (PHB is missing
>>>> INTA/B/C/D now) and check if they did not change during migration via those
>>> Hrm. Not sure. Maybe it'd make sense to join next week's call on platform
>>> device creation. The problem seems pretty closely related.
>> What are those platform devices and what are you going to discuss exactly?
> 
> Devices that don't have a unified interrupt routing scheme like PCI where
> you just link lines A/B/C/D to your controller and you're good to go.

Ah. VIO in my case.



>>>> VMSTATE.*EQUAL. Correct?
>>> Why would you need this? I think we already said a couple dozen times that
>>> configuration matching is a bigger problem, no?
>> For debug! It is not needed in general, yes.
>>
>>
>>>> If so (more or less), I still would like to keep patches 1..7.
>>>> In fact, the first one is independent and we need it anyway.
>>>> Yes/no?
>>> Why?
>> IOMMUs do not migrate correctly - they only have a class name and
>> instance_id, and this instance_id depends on command line arguments order.
>> The #1 patch makes it classname + liobn.
> 
> Why do we need a bus for that?


For the BusClass::get_dev_path callback to get a unique name.
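
A minimal sketch of the naming problem, in Python rather than QEMU code. The class name string, the LIOBN values and both helper functions are made up for illustration; the only point is that a key built from a creation-order counter changes when the command line is reordered, while a key built from classname + liobn does not.

```python
# Illustrative only: not QEMU code. All names and values are hypothetical.

def ids_by_creation_order(liobns):
    """Key each IOMMU by a running counter, i.e. by command line order."""
    return {("spapr_iommu", instance_id): liobn
            for instance_id, liobn in enumerate(liobns)}


def ids_by_liobn(liobns):
    """Key each IOMMU by class name plus its LIOBN, independent of order."""
    return {("spapr_iommu", liobn): liobn for liobn in liobns}


source = [0x80000000, 0x71000002]       # order on the source command line
destination = [0x71000002, 0x80000000]  # same devices, reordered

# Counter-based keys refer to different devices on each side.
mismatch = ids_by_creation_order(source) != ids_by_creation_order(destination)

# LIOBN-based keys refer to the same devices regardless of order.
match = ids_by_liobn(source) == ids_by_liobn(destination)
```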



>>>>>> config may be different. If it is "migrate to file" and then "migrate
>>>>>> from file" (do not know what you call it when migration goes to a pipe
>>>>>> which is "tar") - then config will be the same.
>>>>>>
>>>>>>
>>>>>>>>> I'm not sure that's a good solution to the problem. I guess we should
>>>>>>>>> rather aim to make sure that we can make IRQ allocation explicit.
>>>>>>>>> Fundamentally the problem sounds very similar to the PCI slot
>>>>>>>>> allocation which eventually got solved by libvirt specifying the slots
>>>>>>>>> manually.
>>>>>>>> We can do that too. Who decides? :)
>>>>>>> The better solution wins :)
>>>>>> We both know who decides ;) I posted series, I need heads up if it is
>>>>>> going the right way or not.
>>>>> It's not :). If a guest may not have different IRQ allocation after
>>>>> migration, it also must not have different IRQ allocation after shutdown +
>>>>> restart.
>>>> Ok. That's good answer, thanks. How does x86 work then? IRQs are hardcoded
>>>> (some are for sure but I do not know about MSI)? Or in order to support
>>> Non-PCI IRQs are hardcoded, yes. PCI IRQs are mapped to one of the 4 PCI
>>> interrupts which again are hardcoded to IOAPIC interrupt lines after some
>>> PCI line swizzling.
>> This is what I meant - I need to have a way to tell PHB IRQ numbers for
>> INTA/B/C/D.
> 
> Yes, just like platform devices ;).
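
To make the difference between the two approaches in this thread concrete, here is a rough sketch in Python, not QEMU code. The allocator class, the device names, the base IRQ number and the "irq" mapping are all hypothetical. It assumes a sequential allocator that hands out the next free number, and compares it with IRQs given explicitly per device, the way libvirt ended up specifying PCI slots.

```python
# Illustrative only: not QEMU code. All names and numbers are hypothetical.

class SequentialAllocator:
    """Hands out the next IRQ number; remembers only the last one returned."""

    def __init__(self, first_irq):
        self.next_irq = first_irq

    def alloc(self):
        irq = self.next_irq
        self.next_irq += 1
        return irq


def assign_in_order(devices, first_irq=16):
    """Each device asks the allocator in creation (command line) order."""
    allocator = SequentialAllocator(first_irq)
    return {name: allocator.alloc() for name in devices}


def assign_explicit(devices, irq_property):
    """Each device takes the IRQ it was explicitly told to use."""
    return {name: irq_property[name] for name in devices}


first_boot = ["vscsi0", "vty0"]
after_reorder = ["vty0", "vscsi0"]  # same devices, different order

# Order-dependent allocation: the same device ends up with a different IRQ.
implicit_a = assign_in_order(first_boot)
implicit_b = assign_in_order(after_reorder)

# Explicit allocation: the result does not depend on creation order.
irqs = {"vscsi0": 16, "vty0": 17}
explicit_a = assign_explicit(first_boot, irqs)
explicit_b = assign_explicit(after_reorder, irqs)
```

With explicit numbers the guest sees the same configuration after migration and after shutdown + restart, which is the requirement stated above.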




-- 
Alexey


