From: Jaggi, Manish
Subject: Re: [Qemu-devel] [Query] Live Migration between machines with different processor ids
Date: Wed, 29 Aug 2018 12:40:08 +0000


> On 28-Aug-2018, at 10:57 PM, Dr. David Alan Gilbert <address@hidden> wrote:
> 
> (Cc'ing in Eric, Drew, and Peter for ARM stuff)
> 
Thanks,
> * Jaggi, Manish (address@hidden) wrote:
>> 
>> 
>>> On 23-Aug-2018, at 7:59 PM, Juan Quintela <address@hidden> wrote:
>>> 
>>> "Jaggi, Manish" <address@hidden> wrote:
>>>> Hi,
>>> 
>>> Hi
>>> 
>>> [Note that I was confused about what you meant by problems with
>>> processorID.  There is no processorID in the migration stream, so I
>>> didn't understand what you were talking about, until I realized that
>>> you were trying to migrate between different cpu types]
>>> 
>>>> Posting again with my cavium ID and CCing relevant folks
>>> 
>>> It would be good to say what architecture we are talking about: MIPS,
>>> ARM, anything else?
>>> 
>> arm64
>> 
>>> Why?  Because we do this continuously in the x86_64 world.  How do we do
>>> this?  We emulate the _processor_ capabilities, so "in general" you can
>>> always migrate from a processor to another with a superset of the
>>> features.  If you look at the output of:
>>> 
>>>   qemu-system-x86_64 -cpu ?
>>> 
>>> You can see that we have lots of cpu types that we emulate, and cpuid
>>> (features really).  Migration intel<->amd is tricky.  But from "intel
>>> with fewer features" to "intel with more features" (or the same with AMD)
>>> it is a common thing to do.  That said, it is a lot of work: simple
>>> things, like processors running at different clock speeds, imply that you
>>> need to be careful during migration with timers and anything that
>>> depends on frequencies.
>>> 
>>> I don't know enough about other architectures to know how to do it, or
>>> how feasible it is.
>> 
>> For arm64 qemu/kvm throws an error when processorID does not match.
>>> 
>>>> Live Migration between machines with different processorIds
>>>> 
>>>> VM Migration between machines with different processorId values throws
>>>> an error in qemu/kvm. Though this check is appropriate, it is overkill where
>>>> two machines are of the same SoC/arch family and have the same core/gic, and
>>>> the delta is in other parts of the SoC which have no effect on VM
>>>> operation.
>>> 
>>> Let's call both processors A1 and A2.  Then you need to do the whole
>>> process of:
>>> 
>>> a- defining cpu A1
>>> b- making sure that when you run qemu/kvm on processor A2, the guest
>>> sees the features/behaviours of A1.  This is not trivial at
>>> all.
>>> c- when migration comes, you can see that you need to adjust to whatever
>>> is the architecture of the destination.
>>> 
>>>> There could be two ways to address this issue by ignoring the
>>>> comparison of processorIDs, and I need feedback from the
>>>> community on this.
>>>> 
>>>> a) Maintain a whitelist in qemu:
>>>> 
>>>> This will be a set of all processorIds which are compatible, and migration
>>>> can happen between any of the machines with the Ids from this set. This set
>>>> can be statically built within the qemu binary.
>>> 
>>> In general, I prefer whitelists over blacklists.
>>> 
>>>> b) Provide an extra option with migrate command
>>>> 
>>>> migrate tcp:<ip>:<port>:<dest_processor_id>
>>>> 
>>>> This is to fake the src_processor_id as dest_processor_id, so the qemu
>>>> running on the destination machine will not complain. The overhead with
>>>> this approach is that the destination machine's Id needs to be known
>>>> beforehand.
>>> 
>>> Please, don't even think about this:
>>> a- migration commands are architecture agnostic
>>> b- in general it is _much_, _much_ easier to fix things on destination
>>> than on source.
>>> 
>>>> If there is some better way… please suggest.
>>> 
>>> Look at how it is done on x86_64.  But be aware that "doing it right"
>>> takes a lot of work.  To give you one idea:
>>> - upstream, i.e. qemu, "guarantees" that migration of:
>>> qemu-X -M machine-type-X -> qemu-Y -M machine-type-X
>>> works when X < Y.
>>> 
>>> - downstream (i.e. redhat in my case, but I am sure that others also
>>> "suffer" this) also allows:
>>> 
>>> qemu-Y -M machine-type-X -> qemu-X -M machine-type-X (Y > X)
>>> 
>>> In general it is a very complicated problem, so we limit _what_ you
>>> can do.  Basically we only support our machine-types, do a lot of
>>> testing, and are very careful when we add new features.  I.e. be
>>> prepared to do a lot of testing and a lot of fixing.
>> 
>> At this point I am targeting a simpler case where machines A1 and A2 have
>> a core from the same SoC family.
>> For example, incremental versions of the Cavium ThunderX2 core, which have
>> an identical core and GIC and may have some errata fixes.
>> In that case Y=X since migration only takes care of PV devices.
>> 
>> In that case a whitelist could be an easier option?
>> 
>> How to provide the whitelist to qemu in a platform agnostic way?
>> - I will look into the intel model as you have suggested. Does intel keep a
>> whitelist or mask off some bits of the processorID?
>> How does intel do it?
> 
> Purely based on features rather than IDs.
> 
> If it's an Intel processor and it's got that set of CPU features
> migration to it will normally work.
> (There are some gotchas that we hit from time to time, but
> the basic idea holds)
> 

Just to add what happens in the ARM64 case: qemu running on Machine A sends cpu
state information to Machine B.
This state contains the MIDR value, and the Processor ID value is compared in
KVM and not in qemu (correcting myself).
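
For reference, a minimal sketch of how a MIDR_EL1 value splits into its fields, using the layout from the Arm architecture manual (implementer [31:24], variant [23:20], architecture [19:16], part number [15:4], revision [3:0]). The helper names, and midr_same_core_model() in particular, are hypothetical and exist neither in qemu nor in the kernel; they only illustrate what "incremental versions of the same core" looks like at the register level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract `width` bits of `midr`, starting at bit `shift`. */
uint32_t midr_field(uint32_t midr, unsigned shift, unsigned width)
{
    assert(width > 0 && width < 32 && shift + width <= 32);
    return (midr >> shift) & ((1u << width) - 1u);
}

/* Field positions follow the architectural MIDR_EL1 layout. */
uint32_t midr_implementer(uint32_t midr)  { return midr_field(midr, 24, 8); }
uint32_t midr_variant(uint32_t midr)      { return midr_field(midr, 20, 4); }
uint32_t midr_architecture(uint32_t midr) { return midr_field(midr, 16, 4); }
uint32_t midr_partnum(uint32_t midr)      { return midr_field(midr, 4, 12); }
uint32_t midr_revision(uint32_t midr)     { return midr_field(midr, 0, 4); }

/*
 * Hypothetical helper: true when two MIDR values name the same
 * implementer, architecture and part number, i.e. they differ at most
 * in the variant and revision fields.
 */
bool midr_same_core_model(uint32_t a, uint32_t b)
{
    return midr_implementer(a) == midr_implementer(b) &&
           midr_architecture(a) == midr_architecture(b) &&
           midr_partnum(a) == midr_partnum(b);
}
```

Two values that differ only in variant/revision compare as the same core model here, yet they are still different MIDR values, so an exact-match check rejects them.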

The flow below is from memory; Peter/Eric, please point out if there is
something incorrect in it...

(Machine B)
target/arm/machine.c: cpu_post_load()
    - updates cpu->cpreg_values[i], which includes MIDR (processor ID register)
    - calls write_list_to_kvmstate(cpu, KVM_PUT_FULL_STATE)

        target/arm/kvm.c: write_list_to_kvmstate()
        - calls => kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &r);

            => and it eventually lands up, IIRC, in Linux code in

                => arch/arm64/kvm/sys_regs.c: set_invariant_sys_reg(u64 id, void __user *uaddr)
                    /* This is what we mean by invariant: you can't change it. */
                    if (r->val != val)
                        return -EINVAL;
                    Note: MIDR_EL1 is an invariant register.
result: Migration fails on Machine B.
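
To make the failure mode concrete, here is a toy user-space model of the invariant check quoted above. It is only a sketch: struct invariant_reg and set_invariant_reg() are made-up names, and the real kernel function copies the value from user memory and looks the register up by id, which is left out here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an invariant register; not a kernel structure. */
struct invariant_reg {
    uint64_t id;   /* register identifier (unused in this sketch) */
    uint64_t val;  /* value as read from the host CPU */
};

/*
 * Accept a write only when the incoming value equals the host value,
 * mirroring the "r->val != val => -EINVAL" comparison in the flow above.
 */
int set_invariant_reg(const struct invariant_reg *reg, uint64_t incoming)
{
    assert(reg != NULL);
    if (reg->val != incoming)
        return -EINVAL;   /* invariant: the value cannot be changed */
    return 0;
}
```

With the destination's own MIDR the write succeeds; with any other value, even one that differs only in the revision field, it fails with -EINVAL, and that is what aborts the incoming migration.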

A few points:
- qemu on arm64 is invoked with -machine virt and -cpu as host. So we don't 
explicitly define which cpu. 

- In case Machine A and Machine B have almost same Core and the delta may-not 
have any effect on qemu operation, migration should work by just looking into 
whitelist.
whitelist can be given as a parameter for qemu on machine B.

qemu-system-aarch64 -whitelist <ids separated by commas>

(This is my proposal)

- So in cpu_post_load (Machine B) qemu can look up the whitelist and replace the
incoming MIDR with the one of Machine B.
Sounds good?
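
A rough sketch of the lookup I have in mind, as standalone C rather than a qemu patch. All names here (midr_whitelisted, midr_for_destination) are hypothetical; nothing like this exists in qemu today, and the rule that both the source and the destination ID must be in the set is my assumption, based on the whitelist being a set of mutually compatible IDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Linear search of the whitelist; the list is expected to be tiny. */
bool midr_whitelisted(const uint64_t *list, size_t n, uint64_t midr)
{
    assert(list != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (list[i] == midr)
            return true;
    }
    return false;
}

/*
 * Pick the MIDR value to hand to KVM on the destination.
 * If the incoming (source) and host (destination) IDs are both in the
 * whitelist, substitute the host value so the invariant check passes.
 * Otherwise return the incoming value unchanged, so a mismatch is
 * still rejected by the existing check.
 */
uint64_t midr_for_destination(const uint64_t *list, size_t n,
                              uint64_t incoming, uint64_t host)
{
    if (midr_whitelisted(list, n, incoming) &&
        midr_whitelisted(list, n, host))
        return host;
    return incoming;
}
```

With an empty whitelist this changes nothing, so the default behaviour stays as strict as it is now.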

- Juan raised a point about clock speed. I am not sure it will have any effect
on arm, since qemu is run with the -cpu host param.
I could be wrong here; Peter/Eric, can you please correct me...

-Thanks
Manish



> Dave
>> - Does providing a -migrate-compat-whitelist <file> option, for arm only,
>> look good?
>> This option can be added to the A1/A2 qemu command, so it would be upstream /
>> downstream agnostic.
> 
>>> 
>>> I am sorry to not be able to tell you that this is an easy problem.
>>> 
>>> Later, Juan.
>> 
> --
> Dr. David Alan Gilbert / address@hidden / Manchester, UK

