qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [PATCH V15 0/7] Add architecture agnostic code to support vCPU Hotpl


From: Salil Mehta
Subject: RE: [PATCH V15 0/7] Add architecture agnostic code to support vCPU Hotplug
Date: Tue, 16 Jul 2024 11:43:00 +0000

Hi Igor,

>  From: Igor Mammedov <imammedo@redhat.com>
>  Sent: Tuesday, July 16, 2024 10:52 AM
>  To: Salil Mehta <salil.mehta@opnsrc.net>
>  
>  On Tue, 16 Jul 2024 03:38:29 +0000
>  Salil Mehta <salil.mehta@opnsrc.net> wrote:
>  
>  > Hi Igor,
>  >
>  > On 15/07/2024 15:11, Igor Mammedov wrote:
>  > > On Mon, 15 Jul 2024 14:19:12 +0000
>  > > Salil Mehta <salil.mehta@huawei.com> wrote:
>  > >
>  > >>>   From: qemu-arm-bounces+salil.mehta=huawei.com@nongnu.org  <qemu-
>  > >>>   arm-bounces+salil.mehta=huawei.com@nongnu.org> On Behalf Of  Salil
>  > >>>   Mehta via
>  > >>>   Sent: Monday, July 15, 2024 3:14 PM
>  > >>>   To: Igor Mammedov <imammedo@redhat.com>
>  > >>>
>  > >>>   Hi Igor,
>  > >>>
>  > >>>   >  From: Igor Mammedov <imammedo@redhat.com>
>  > >>>   >  Sent: Monday, July 15, 2024 2:55 PM
>  > >>>   >  To: Salil Mehta <salil.mehta@huawei.com>
>  > >>>   >
>  > >>>   >  On Sat, 13 Jul 2024 19:25:09 +0100
>  > >>>   >  Salil Mehta <salil.mehta@huawei.com> wrote:
>  > >>>   >
>  > >>>   >  > [Note: References are present at the last after the revision
>  > >>>   > history]  >  > Virtual CPU hotplug support is being added across
>  > >>>   > various architectures  [1][3].
>  > >>>   >  > This series adds various code bits common across all 
> architectures:
>  > >>>   >  >
>  > >>>   >  > 1. vCPU creation and Parking code refactor [Patch 1] 2. Update 
> ACPI
>  > >>>   > > GED framework to support vCPU Hotplug [Patch 2,3] 3. ACPI CPUs 
> AML
>  > >>>   > > code change [Patch 4,5] 4. Helper functions to support 
> unrealization
>  > >>>   > > of CPU objects [Patch 6,7]
>  > >>>   >
>  > >>>   >  with patch 1 and 3 fixed should be good to go.
>  > >>>   >
>  > >>>   >  Salil,
>  > >>>   >  Can you remind me what happened to migration part of this?
>  > >>>   >  Ideally it should be a part of of this series as it should be 
> common
>  > >>>   > for  everything that uses GED and should be a conditional part of
>  > >>>   > GED's  VMSTATE.
>  > >>>   >
>  > >>>   >  If this series is just a common base and no actual hotplug on top 
> of
>  > >>>   > it is  merged in this release (provided patch 13 is fixed), I'm 
> fine
>  > >>>   > with migration  bits being a separate series on top.
>  > >>>   >
>  > >>>   >  However if some machine would be introducing cpu hotplug in the 
> same
>  > >>>   > release, then the migration part should be merged before it or be a
>  > >>>   > part  that cpu hotplug series.
>  > >>>
>  > >>>   We have tested Live/Pseudo Migration and it seem to work with the
>  > >>>   changes part of the architecture specific patch-set.
>  > >
>  > > have you tested, migration from new QEMU to an older one (that doesn't 
> have cpuhotplug builtin)?
>  >
>  >
>  > Just curious, how can we detect at source Qemu what version of the
>  > Qemu destination is running. We require some sort of compatibility
>  > check but then this is a problem not specific to CPU Hotplug?
>  
>  it's usually managed by version machine types + compat settings for
>  machine/device.

Ok. it looks to be a static checking at the source. I'm sure there must be
a way to dynamically do the same by negotiating the features i.e. only
enabling the common subset at the destination. I quickly skimmed the
migration code and I cannot find any thing like this being done as of now.
And this problem looks to be a pandoras box to me. 

>  
>  > We  are not initializing CPU Hotplug VMSD in this patch-set. I was
>  > wondering then how can a new machine attempt to migrate VMSD state
>  > from new Qemu to older Qemu.
>  
>  If I'm not mistaken without VMSD it shouldn't explode, since CPUHP code
>  shouldn't create memory-regions that are migrated.
>  (If I recall correctly, mmio regions aren't going into migration stream)

Correct.

>  
>  > ARM vCPU Hotplug patches will be on top of this later in next Qemu cycle.
>  then it's fine to introduce VMSD later on, just make sure others who adding
>  cpu hotplug elsewhere also aware of it and pickup the same patch.


Yes, thanks.


>  > >>>
>  > >>>   Ampere: https://lore.kernel.org/all/e17e28ac-28c7-496f-b212-
>  > >>>   2c9b552dbf63@amperemail.onmicrosoft.com/
>  > >>>   Oracle: https://lore.kernel.org/all/46D74D30-EE54-4AD2-8F0E-
>  > >>>   BA5627FAA63E@oracle.com/
>  > >>>
>  > >>>
>  > >>>   For ARM, please check below patch part of RFC V3 for changes related 
> to
>  > >>>   migration:
>  > >>>   https://lore.kernel.org/qemu-devel/20240613233639.202896-15-
>  > >>>   salil.mehta@huawei.com/
>  > >>
>  > >>
>  > >> Do you wish to move below change into this path-set and make it
>  > >> common to all instead?
>  > >
>  > > it would be the best to include this with here.
>  > >
>  > >>
>  > >>
>  > >> diff --git a/hw/acpi/generic_event_device.c
>  > >> b/hw/acpi/generic_event_device.c index 63226b0040..e92ce07955
>  > >> 100644
>  > >> --- a/hw/acpi/generic_event_device.c
>  > >> +++ b/hw/acpi/generic_event_device.c
>  > >> @@ -333,6 +333,16 @@ static const VMStateDescription
>  vmstate_memhp_state = {
>  > >>       }
>  > >>   };
>  > >>
>  > >> +static const VMStateDescription vmstate_cpuhp_state = {
>  > >> +    .name = "acpi-ged/cpuhp",
>  > >> +    .version_id = 1,
>  > >> +    .minimum_version_id = 1,
>  > >> +    .fields      = (VMStateField[]) {
>  > >> +        VMSTATE_CPU_HOTPLUG(cpuhp_state, AcpiGedState),
>  > >> +        VMSTATE_END_OF_LIST()
>  > >> +    }
>  > >> +};
>  > >> +
>  > >>   static const VMStateDescription vmstate_ged_state = {
>  > >>       .name = "acpi-ged-state",
>  > >>       .version_id = 1,
>  > >> @@ -381,6 +391,7 @@ static const VMStateDescription
>  vmstate_acpi_ged = {
>  > >>       },
>  > >>       .subsections = (const VMStateDescription * const []) {
>  > >>           &vmstate_memhp_state,
>  > >> +        &vmstate_cpuhp_state,
>  > >
>  > > I'm not migration guru but I believe this should be conditional to
>  > > avoid breaking cross-version migration.
>  > > See 679dd1a957d '.needed = vmstate_test_use_cpuhp. part
>  >
>  >
>  > Sure, thanks for this. As I can see, the needed() function is used at
>  > the source to decide if the state corresponding to a particular device
>  > can be forwarded to the destination QEMU/VM. But how can this be used
>  > to check for cross-version migration?
>  
>  what I'd do is to make sure that older machine types to not have cpu
>  hotplug enabled in supported events, and only machine that has full
>  support for hotplug enabled the bit. And then machine_init depending on
>  that would manage actual 'ged-event'
>  property.
>  
>  Then later VMSD.needed would check ged-event to decide if section should
>  be used or omitted.

I understand this part, as discussed above this is not relevant to this 
patch-set
as we are not adding CPU Hotplug specific VMSD. But yes, I take your point and
will add above suggested change in the following patches in the next Qemu Cycle.


>  > BTW, I've prepared V16. May I request a quick peek at:
>  >
>  > https://github.com/salil-mehta/qemu/commits/virt-cpuhp-armv8/rfc-
>  v3.ar
>  > ch.agnostic.v16/
>  
>  looked at
>     hw/acpi: Update ACPI GED framework to support vCPU Hotplug
>  
>  I get that ged_event loop in realize was copy-pasted from _EVT handler, but
>  that looks a bit complicated (though I won't object, it's matter of taste)

Yes, I did that intentionally to increase the code reuse which usually is 
encouraged.
This reduces the amount of re-testing required. But yes, there is a con side as
well to this approach if not done properly i.e. any issues in the older code 
also
get propagated in the newer code.

Generally we bank upon FFS function to get the next bit and use bit shifting to
deal with this kind of logic efficiently. 

>  
>  I'd prefer simpler condition than for() {} loop, and just use simpler 'if'
>  
>  if (enabled_events & ACPI_GED_CPU_HOTPLUG_EVT) {
>      init cpu hp code
>  }
>  
>  PS:
>  if you keep for loop, I'd replace  error_report() + abort() with 
> 'error_abort'

Maybe I should have looked a this earlier. I've  missed to address this in V16.
Do you mean error_setg(&error_abort,....) kind of logic? it has been discouraged
in favor of assert() I think?

Can we live with this for now or do you want me to send V17 for this change?


Thanks
Salil.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]