From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [Query] Live Migration between machines with different processor ids
Date: Tue, 4 Sep 2018 11:32:14 +0100
User-agent: Mutt/1.10.1 (2018-07-13)

* Andrew Jones (address@hidden) wrote:
> On Tue, Sep 04, 2018 at 09:16:58AM +0000, Jaggi, Manish wrote:
> > > On 31-Aug-2018, at 4:41 PM, Andrew Jones <address@hidden> wrote:
> > > I think the sequence should look something like this:
> > > 
> > >  1) Guest running on Host A with processor a
> > >  2) Stop guest and save its state for migration
> > >  3) Migrate guest to Host B with processor b (b is "close enough" to a)
> > >  4) Restore guest state after migration
> > >     If guest is running with '-cpu host'
> > >       4.a) Inform KVM of any configuration that impacts invariant
> > >            registers
> > >       4.b) Update the guest's view of all invariant registers to match the
> > >            host
> > >     EndIf
> > >  5) Run guest
> > > 
> > > 4.a and 4.b require new code both in QEMU and KVM. 4.a may require a new
> > > KVM API, unless the existing API can be leveraged.
> > > 
> > > The definition of "close enough" is left to the users and/or higher layers
> > > of the Virt stack.
> > > 
> > 
> > Thanks for detailing the sequence.
> > I got another comment from Juan and David, which is not to use -cpu host;
> > see below:
> > 
> > "I really think that the right approach here is not using -cpu host.  You
> > do the full work, create a model as David says, and be sure that you can
> > run that model on both cpus.  It is a lot of work, but it is the only
> > way to make sure that this is going to work long term."
> > 
> > Not using -cpu host is orthogonal to the sequence we have been discussing
> > in this thread.
> > It would mean using something like -cpu cortex-a57 (which, however, does
> > not work so far).
> > This would avoid the need for a close-enough definition, but would need
> > substantial work.
> > 
> > So which approach should be taken here? What's your take?
> > 
> 
> Inventing a base-AArch64 cpu model that can then be extended with optional
> features is a nice way to extend the migratability of a guest, however
> it's hard to do because of errata. Since errata workarounds are enabled
> per MIDR, we'd need to invent our own MIDR and also some way to
> communicate which errata we want to enable, possibly through some paravirt
> mechanism or through some implementation defined system registers that
> KVM would need to reserve and define.
> 
> That's not just a ton of work for the entire virt stack (not just KVM and
> QEMU, but also all the layers above), but it's possible that it won't be
> useful in the end anyway. There's risk that enabling just one erratum
> workaround would restrict the guest to hosts of the exact same type
> anyway. For each erratum that needs to be enabled, the probability of
> enabling an incompatible one goes up, so it may not be likely to do much
> better than '-cpu host' in the end. I'm afraid that until errata are
> primarily showing up in optional CPU features that can simply be disabled
> for the workaround, we're stuck with '-cpu host'. I'd be happy to
> discuss it more though.
> 
> In short, I'd go with the proposal above, for now, with possibly one
> change. libvirt folk (Andrea Bolognani and Pino Toscano) suggest that
> the guest invariant register updating on the destination host only be
> done if the user opts in to it. This is because right now if a user
> tries to migrate to a host that is not 100% identical the migration
> will fail, which makes the "mistake" clear. If we silently change the
> behavior to allow it, then what could have been a mistake, because
> the hosts aren't actually "close enough", may go unnoticed. I'm not
> 100% sure we need another user opt-in flag to be set, though, as I
> think the '-cpu host' indicates the user expects the VCPU to look
> like the host CPU, and even after migration that expectation should be
> met. Simply, users that migrate '-cpu host' VMs need to know what they're
> doing.

The problem here is that in the x86 world we've said:
  'don't use -cpu host;
   if you are using it, be careful: you need to know what you're doing'

On ARM it's looking like the default; that means lots more people will
use it who don't know what they're doing.

At the very least you need to log cases like this (to stderr) when you
detect them on the destination, so that when we get the inevitable
'guest crashed after migration' reports we can see the problem in the
log.
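
As a rough illustration of the kind of destination-side check meant here,
the following is a minimal Python sketch, not QEMU or KVM code. It assumes
the source's invariant register values arrive with the migration stream and
that the destination can read its own; the register list, the function
names and the warning text are all hypothetical, and the MIDR values in the
usage note are example values only.

```python
import sys

# Illustrative subset of AArch64 registers treated as invariant for this
# sketch; the real set is whatever KVM defines as invariant.
INVARIANT_REGS = ("MIDR_EL1", "REVIDR_EL1", "AIDR_EL1")


def find_invariant_mismatches(source_regs, dest_regs):
    """Return (name, source_value, dest_value) for each register that differs.

    Both arguments are dicts mapping register name to integer value.
    A register missing on either side is reported with a value of None.
    """
    mismatches = []
    for name in INVARIANT_REGS:
        src = source_regs.get(name)
        dst = dest_regs.get(name)
        if src != dst:
            mismatches.append((name, src, dst))
    return mismatches


def _fmt(value):
    """Format a register value as hex, or 'missing' if it was not supplied."""
    return "missing" if value is None else f"{value:#x}"


def log_invariant_mismatches(source_regs, dest_regs, stream=sys.stderr):
    """Write one warning line per differing register; return the count."""
    mismatches = find_invariant_mismatches(source_regs, dest_regs)
    for name, src, dst in mismatches:
        print(
            f"warning: migration: invariant register {name} differs: "
            f"source={_fmt(src)} destination={_fmt(dst)}",
            file=stream,
        )
    return len(mismatches)
```

Calling log_invariant_mismatches({"MIDR_EL1": 0x411fd070}, {"MIDR_EL1":
0x410fd083}) on the destination would leave one warning line on stderr
naming MIDR_EL1 and both values, which is the trace one would want to find
in the log when a 'guest crashed after migration' report comes in.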

Dave

> Thanks,
> drew
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK
