qemu-devel

Re: [Qemu-devel] QEMU/KVM migration backwards compatibility broken?


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] QEMU/KVM migration backwards compatibility broken?
Date: Thu, 6 Jun 2019 09:42:22 +0100
User-agent: Mutt/1.11.4 (2019-03-13)

* Liran Alon (address@hidden) wrote:
> Hi,
> 
> Looking at QEMU source code, I am puzzled regarding how migration backwards 
> compatibility is preserved regarding X86CPU.
> 
> As I understand it, fields that are based on KVM capabilities and guest 
> runtime usage are defined in VMState subsections in order to not send them if 
> not necessary.
> This is done so that, when they are not needed and we migrate to an old
> QEMU which doesn’t support loading this state, migration will still succeed
> (as the .needed() method will return false and therefore this state won’t be
> sent as part of the migration stream).
> Furthermore, in case .needed() returns true and the old QEMU doesn’t support
> loading this state, migration fails, as it should, because we are aware that
> guest state is not going to be restored properly on the destination.
> 
> I’m puzzled about what will happen in the following scenario:
> 1) Source is running new QEMU with a new KVM that supports saving some VMState
> subsection.
> 2) Destination is running new QEMU that supports loading this state, but with
> an old kernel that doesn’t know how to load this state.
> 
> I would have expected in this case that if source .needed() returns true, 
> then migration will fail because of lack of support in destination kernel.
> However, it seems from current QEMU code that this will actually succeed in 
> many cases.
> 
> For example, if msr_smi_count is sent as part of the migration stream (see
> vmstate_msr_smi_count) and the destination has has_msr_smi_count==false,
> then the destination will succeed in loading the migration stream, but
> kvm_put_msrs() will actually ignore env->msr_smi_count and will successfully
> load guest state.
> Therefore, migration will succeed even though it should have failed…
> 
> It seems to me that QEMU should have, for every such VMState subsection, a
> .post_load() method that verifies that the relevant capability is supported
> by the kernel and otherwise fails migration.
> 
> What do you think? Should I really create a patch to modify all these CPUX86 
> VMState subsections to behave like this?

I don't know the x86-specific side that much, but from the migration side
the answer should mostly be through machine types. Indeed, for smi-count
there's a property 'x-migrate-smi-count' which is off for machine types
pre 2.11 (see hw/i386/pc.c pc_compat_2_11), so if you've got an old
kernel you should stick to the old machine types.
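
To illustrate how a machine-type compat property can keep a subsection out of
the stream, here is a minimal, self-contained sketch. It is not QEMU's actual
code: CpuModel, CompatProp, old_machine_compat, cpu_init_defaults(),
apply_compat() and smi_count_needed() are hypothetical stand-ins, and only the
property name 'x-migrate-smi-count' comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the CPU object; not QEMU's type. */
typedef struct CpuModel {
    bool migrate_smi_count;            /* mirrors 'x-migrate-smi-count' */
    unsigned long long msr_smi_count;  /* guest-visible SMI counter */
} CpuModel;

/* One entry of a machine-type compat list (illustrative only). */
typedef struct CompatProp {
    const char *property;
    bool value;
} CompatProp;

/* Assumed compat list for an old machine type: turn the property off. */
static const CompatProp old_machine_compat[] = {
    { "x-migrate-smi-count", false },
};

/* New machine types get the property switched on by default. */
static void cpu_init_defaults(CpuModel *cpu)
{
    cpu->migrate_smi_count = true;
    cpu->msr_smi_count = 0;
}

/* Apply a machine type's compat properties on top of the defaults. */
static void apply_compat(CpuModel *cpu, const CompatProp *props, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(props[i].property, "x-migrate-smi-count") == 0) {
            cpu->migrate_smi_count = props[i].value;
        }
    }
}

/* Analogue of a subsection .needed() callback: the state is sent only if
 * the machine type allows it and there is something to send. */
static bool smi_count_needed(const CpuModel *cpu)
{
    return cpu->migrate_smi_count && cpu->msr_smi_count != 0;
}
```

With the compat list applied, smi_count_needed() returns false even for a
non-zero counter, so a destination that cannot restore the value never sees
the subsection.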

There's nothing guarding against running the new machine type on old kernels,
and arguably we should have a check at startup that complains if
your kernel is missing something the machine type uses.
However, that would mean that people running with -M pc would fail
on old kernels.
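
A startup check of that kind could look roughly like the sketch below. All
names here (CAP_SMI_COUNT, CAP_OTHER, MachineType, missing_caps(),
check_machine_at_startup()) are made up for illustration, and the bitmask
representation of kernel capabilities is an assumption, not how QEMU tracks
KVM capabilities.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical capability bits; real KVM capabilities are queried
 * differently and are not a single bitmask. */
enum {
    CAP_SMI_COUNT = 1u << 0,
    CAP_OTHER     = 1u << 1,
};

/* Hypothetical description of what a machine type relies on. */
typedef struct MachineType {
    const char *name;
    uint32_t required_caps;
} MachineType;

/* Capabilities the machine type uses but the kernel does not offer. */
static uint32_t missing_caps(const MachineType *mt, uint32_t kernel_caps)
{
    return mt->required_caps & ~kernel_caps;
}

/* Complain at startup and return -1 if anything is missing, else 0. */
static int check_machine_at_startup(const MachineType *mt,
                                    uint32_t kernel_caps)
{
    uint32_t missing = missing_caps(mt, kernel_caps);

    if (missing != 0) {
        fprintf(stderr,
                "machine type '%s': kernel lacks capabilities 0x%x\n",
                mt->name, (unsigned)missing);
        return -1;
    }
    return 0;
}
```

The downside mentioned above shows up directly: a machine type whose
required_caps grows over time would start returning -1 on kernels that
previously ran it.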

A post-load is also a valid check, but one question is whether,
for a particular register, the pain is worth it; it depends on the
symptom that the missing state causes. If it's minor then you might
conclude it's not worth a failed migration; if it's a hung or
corrupt guest then yes, it is. Certainly printing a warning is worth
it.
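
As a rough sketch of that trade-off, a post-load hook could always warn and
only fail the load when the missing state is considered serious. DestCpu,
MissingStatePolicy and smi_count_post_load() are hypothetical names, not
QEMU's API; the sketch also assumes that a negative return value fails the
load and that a zero counter means there is nothing to restore.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* How to react when loaded state cannot be restored (hypothetical). */
typedef enum {
    MISSING_STATE_WARN,   /* minor symptom: warn, let migration continue */
    MISSING_STATE_FAIL,   /* hung or corrupt guest: fail the migration */
} MissingStatePolicy;

/* Hypothetical, simplified destination-side CPU state. */
typedef struct DestCpu {
    bool has_msr_smi_count;            /* destination kernel supports it */
    unsigned long long msr_smi_count;  /* value loaded from the stream */
} DestCpu;

/* Analogue of a subsection .post_load() hook.
 * Returns 0 on success, a negative errno value to fail the load. */
static int smi_count_post_load(const DestCpu *cpu, MissingStatePolicy policy)
{
    /* Assumption: a zero counter carries no state worth restoring. */
    if (cpu->has_msr_smi_count || cpu->msr_smi_count == 0) {
        return 0;
    }

    /* The warning is printed in both cases. */
    fprintf(stderr,
            "warning: destination kernel cannot restore SMI count %llu\n",
            cpu->msr_smi_count);

    return policy == MISSING_STATE_FAIL ? -EINVAL : 0;
}
```

Which policy a given register gets is the per-register judgment call
described above.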

Dave

> Thanks,
> -Liran
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK
