qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [edk2-devel] A problem with live migration of UEFI virtual machines


From: Andrew Fish
Subject: Re: [edk2-devel] A problem with live migration of UEFI virtual machines
Date: Thu, 27 Feb 2020 20:04:00 -0800



On Feb 26, 2020, at 1:42 AM, Laszlo Ersek <address@hidden> wrote:

Hi Andrew,

On 02/25/20 22:35, Andrew Fish wrote:

Laszlo,

The FLASH offsets changing breaking things makes sense.

I now realize this is like updating the EFI ROM without rebooting the
system.  Thus changes in how the new EFI code works is not the issue.

Is this migration event visible to the firmware? Traditionally the
NVRAM is a region in the FD so if you update the FD you have to skip
NVRAM region or save and restore it. Is that activity happening in
this case? Even if the ROM layout does not change how do you not lose
the contents of the NVRAM store when the live migration happens? Sorry
if this is a remedial question but I'm trying to learn how this
migration works.

With live migration, the running guest doesn't notice anything. This is
a general requirement for live migration (regardless of UEFI or flash).

You are very correct to ask about "skipping" the NVRAM region. With the
approach that OvmfPkg originally supported, live migration would simply
be unfeasible. The "build" utility would produce a single (unified)
OVMF.fd file, which would contain both NVRAM and executable regions, and
the guest's variable updates would modify the one file that would exist.
This is inappropriate even without considering live migration, because
OVMF binary upgrades (package updates) on the virtualization host would
force guests to lose their private variable stores (NVRAMs).

Therefore, the "build" utility produces "split" files too, in addition
to the unified OVMF.fd file. Namely, OVMF_CODE.fd and OVMF_VARS.fd.
OVMF.fd is simply the concatenation of the latter two.

$ cat OVMF_VARS.fd OVMF_CODE.fd | cmp - OVMF.fd
[prints nothing]


Laszlo,

Thanks for the detailed explanation. 

Maybe I was overcomplicating this. Given your explanation I think the part I'm missing is OVMF is implying FLASH layout, in this split model, based on the size of the OVMF_CODE.fd and OVMF_VARS.fd.  Given that if OVMF_CODE.fd gets bigger the variable address changes from a QEMU point of view. So basically it is the QEMU  API that is making assumptions about the relative layout of the FD in the split model that makes a migration to larger ROM not work. Basically the -pflash API does not support changing the size of the ROM without moving NVRAM given the way it is currently defined. 

Given the above it seems like the 2 options are:
1) Pad OVMF_CODE.fd to be very large so there is room to grow.
2) Add some feature to QUEM that allows the variable store address to not be based on OVMF_CODE.fd size. 

I did see this [1] and combined with your email I either understand, or I'm still confused? :)

I'm not saying we need to change anything, I'm just trying to make sure I understand how OVMF and QEMU are tied to together. 


Thanks,

Andrew Fish




When you define a new domain (VM) on a virtualization host, the domain
definition saves a reference (pathname) to the OVMF_CODE.fd file.
However, the OVMF_VARS.fd file (the variable store *template*) is not
directly referenced; instead, it is *copied* into a separate (private)
file for the domain.

Furthermore, once booted, guest has two flash chips, one that maps the
firmware executable OVMF_CODE.fd read-only, and another pflash chip that
maps its private varstore file read-write.

This makes it possible to upgrade OVMF_CODE.fd and OVMF_VARS.fd (via
package upgrades on the virt host) without messing with varstores that
were earlier instantiated from OVMF_VARS.fd. What's important here is
that the various constants in the new (upgraded) OVMF_CODE.fd file
remain compatible with the *old* OVMF_VARS.fd structure, across package
upgrades.

If that's not possible for introducing e.g. a new feature, then the
package upgrade must not overwrite the OVMF_CODE.fd file in place, but
must provide an additional firmware binary. This firmware binary can
then only be used by freshly defined domains (old domains cannot be
switched over). Old domains can be switched over manually -- and only if
the sysadmin decides it is OK to lose the current variable store
contents. Then the old varstore file for the domain is deleted
(manually), the domain definition is updated, and then a new (logically
empty, pristine) varstore can be created from the *new* OVMF_2_VARS.fd
that matches the *new* OVMF_2_CODE.fd.


During live migration, the "RAM-like" contents of both pflash chips are
migrated (the guest-side view of both chips remains the same, including
the case when the writeable chip happens to be in "programming mode",
i.e., during a UEFI variable write through the Fault Tolerant Write and
Firmware Volume Block(2) protocols).

Once live migration completes, QEMU dumps the full contents of the
writeable chip to the backing file (on the destination host). Going
forward, flash writes from within the guest are reflected to said
host-side file on-line, just like it happened on the source host before
live migration. If the file backing the r/w pflash chip is on NFS
(shared by both src and dst hosts), then this one-time dumping when the
migration completes is superfluous, but it's also harmless.

The interesting question is, what happens when you power down the VM on
the destination host (= post migration), and launch it again there, from
zero. In that case, the firmware executable file comes from the
*destination host* (it was never persistently migrated from the source
host, i.e. never written out on the dst). It simply comes from the OVMF
package that had been installed on the destination host, by the
sysadmin. However, the varstore pflash does reflect the permanent result
of the previous migration. So this is where things can fall apart, if
both firmware binaries (on the src host and on the dst host) don't agree
about the internal structure of the varstore pflash.

Thanks
Laszlo


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#54845): https://edk2.groups.io/g/devel/message/54845
Mute This Topic: https://groups.io/mt/71141681/1755084
Group Owner: address@hidden
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [address@hidden]
-=-=-=-=-=-=-=-=-=-=-=-



reply via email to

[Prev in Thread] Current Thread [Next in Thread]