+Ard, +Drew
On 11/03/20 13:39, Igor Mammedov wrote:
On Fri, 30 Oct 2020 10:50:01 +0800
Ying Fang <fangying1@huawei.com> wrote:
Hi,
I have a question on UEFI/ACPI tables setup and probing on arm64 platform.
CCing Laszlo,
who might know how it's implemented.
Currently, on the arm64 platform, a guest can be booted with either FDT
or ACPI. If ACPI is enabled, [1] says the only defined method for
passing ACPI tables to the kernel is via the UEFI system configuration
table. So AFAIK, ACPI should be dependent on UEFI.
That's correct. The ACPI entry point (RSD PTR) on AARCH64 is defined in
terms of UEFI.
What's more, [2] says UEFI kernel support on the ARM architectures
is only available through a *stub*. The stub populates the FDT /chosen
node with some UEFI parameters describing the UEFI location info.
Yes.
So I dumped /sys/firmware/fdt from the guest, and it does contain something like:
/dts-v1/;

/ {
    #size-cells = <0x02>;
    #address-cells = <0x02>;

    chosen {
        linux,uefi-mmap-desc-ver = <0x01>;
        linux,uefi-mmap-desc-size = <0x30>;
        linux,uefi-mmap-size = <0x810>;
        linux,uefi-mmap-start = <0x04 0x3c0ce018>;
        linux,uefi-system-table = <0x04 0x3f8b0018>;
        bootargs =
            "BOOT_IMAGE=/vmlinuz-4.19.90-2003.4.0.0036.oe1.aarch64
            root=/dev/mapper/openeuler-root ro rd.lvm.lv=openeuler/root
            rd.lvm.lv=openeuler/swap video=VGA-1:640x480-32@60me
            smmu.bypassdev=0x1000:0x17 smmu.bypassdev=0x1000:0x15
            crashkernel=1024M,high video=efifb:off video=VGA-1:640x480-32@60me";
        linux,initrd-end = <0x04 0x3a85a5da>;
        linux,initrd-start = <0x04 0x392f2000>;
    };
};
But the question is that I did not see any code adding the UEFI
properties to the FDT chosen node in *arm_load_dtb* or anywhere else.
That's because the "UEFI stub" is a part of the guest kernel. It wraps
the guest kernel image into a UEFI application binary. For a while, the
guest kernel runs as a UEFI application, stashing some UEFI artifacts in
*a* device tree, and then (after some other heavy lifting) jumping into
the kernel proper.
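As an aside on reading the /chosen properties in the dump above: with
#address-cells = 2, each address is printed as two 32-bit cells, high
half first. A minimal Python sketch of turning them back into 64-bit
guest-physical addresses follows. The helper name cells_to_u64 is made
up for illustration; the values are copied from the dump.

```python
def cells_to_u64(hi, lo):
    """Combine two 32-bit FDT cells (high cell first) into one 64-bit value."""
    return (hi << 32) | lo


# Values copied from the /chosen node in the dump.
chosen = {
    "linux,uefi-system-table": (0x04, 0x3F8B0018),
    "linux,uefi-mmap-start": (0x04, 0x3C0CE018),
    "linux,uefi-mmap-size": 0x810,
    "linux,uefi-mmap-desc-size": 0x30,
}

system_table = cells_to_u64(*chosen["linux,uefi-system-table"])
mmap_start = cells_to_u64(*chosen["linux,uefi-mmap-start"])

# The memory map is an array of fixed-size descriptors.
descriptor_count = (
    chosen["linux,uefi-mmap-size"] // chosen["linux,uefi-mmap-desc-size"]
)

print(hex(system_table), hex(mmap_start), descriptor_count)
```

So the stub recorded a UEFI system table at 0x43f8b0018 and a memory map
of 43 descriptors (0x810 / 0x30) at 0x43c0ce018.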
Qemu only maps the OVMF binary file into a pflash device.
So I'm really confused about how UEFI information is provided to the
guest by QEMU. Does anybody know the details?
It's complex, unfortunately.
(1) QEMU always generates a DTB for the guest firmware. This DTB is
placed at the base of the guest RAM.
I think the relevant code is the arm_load_dtb() call in
virt_machine_done() [hw/arm/virt.c] in QEMU.
(2) QEMU generates ACPI content, and exposes it via fw_cfg.
See the virt_acpi_setup() call in the same virt_machine_done() function
[hw/arm/virt.c] in QEMU.
(3) The fw_cfg device itself is apparent to the guest firmware via the
DTB from point (1). See the following steps in edk2:
(3a) "ArmVirtPkg/Library/PlatformPeiLib/PlatformPeiLib.c"
This saves the initial DTB (from the base of guest RAM, where it could
be overwritten by whatever) to a dynamically allocated area. This
"stashing" occurs early.
(3b) "ArmVirtPkg/FdtClientDxe/FdtClientDxe.c"
This driver exposes the (dynamically reallocated / copied) DTB via a
custom UEFI protocol to the rest of the firmware. (This happens much
later.) This protocol / driver can be considered the "owner" of the
stashed DTB from (3a).
(3c) "ArmVirtPkg/Library/QemuFwCfgLib/QemuFwCfgLib.c"
This is the fw_cfg device access library, discovering the fw_cfg
registers via the above UEFI protocol. The library is linked into each
firmware module that needs fw_cfg access.
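To make the "stashing" in (3a) a bit more concrete, here is a rough
Python model of the idea, not the edk2 code: check the flattened device
tree magic at the base of RAM, read the total size from the header, and
copy that many bytes somewhere safe. The function name stash_dtb and the
fake RAM contents are assumptions for illustration; the header layout
(ten big-endian 32-bit fields, magic 0xd00dfeed, then totalsize) comes
from the Devicetree Specification.

```python
import struct

FDT_MAGIC = 0xD00DFEED
FDT_HEADER = struct.Struct(">10I")  # ten big-endian 32-bit header fields


def stash_dtb(ram, base_offset=0):
    """Return a private copy of the DTB found at ram[base_offset:]."""
    magic, totalsize, *_ = FDT_HEADER.unpack_from(ram, base_offset)
    if magic != FDT_MAGIC:
        raise ValueError("no flattened device tree at the expected address")
    return bytes(ram[base_offset:base_offset + totalsize])


# Fake "guest RAM": a header claiming 48 bytes in total, 8 payload bytes,
# then unrelated data. This is not a valid tree, only a valid-looking header.
header = FDT_HEADER.pack(FDT_MAGIC, 48, 0, 0, 0, 17, 16, 0, 0, 0)
ram = bytearray(header + bytes(8) + b"\xff" * 16)

stashed = stash_dtb(ram)

# Later users of RAM may clobber the original; the stashed copy survives.
ram[0:4] = b"\x00\x00\x00\x00"
print(len(stashed), stashed[:4].hex())
```

The point of the copy is only that the original location is ordinary
RAM, which nothing protects from being reused.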
(4) The firmware interprets QEMU's DTB for actual content (parsing
values, configuring hardware, accessing devices).
This occurs in a whole bunch of locations, mostly via consuming the
custom protocol from (3b). Some info that's needed very early is parsed
out of the DTB right in step (3a).
(5) The guest firmware has a dedicated driver that checks whether QEMU
was configured with ACPI enabled or disabled, and publishes that choice
to the rest of the firmware. This is necessary because some firmware
actions / infrastructure parts cannot (must not) proceed until this
decision has been interpreted.
See in edk2:
- ArmVirtPkg/PlatformHasAcpiDtDxe
This driver keys off of the presence of the "etc/table-loader" fw_cfg
file, coming from step (2), using the fw_cfg access library from step (3c).
If ACPI was enabled on the QEMU cmdline, then the rest of the firmware
is "level-triggered" to proceed with the ACPI infrastructure.
Otherwise, the rest of the firmware is "level-triggered" to proceed with
DT as the hardware description for the OS.
("Level-triggering" means the installation of custom NULL protocols,
which permits drivers dependent on DT vs ACPI to be dispatched.)
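The decision in step (5) can be modelled in a few lines of Python. This
is only a sketch of the logic as described above; the marker names
"PlatformHasAcpi" and "PlatformHasDeviceTree" and the driver names are
placeholders I made up, not the identifiers used in edk2.

```python
def select_marker_protocol(fw_cfg_file_names):
    """Pick the marker to install, keyed off the "etc/table-loader" file."""
    if "etc/table-loader" in fw_cfg_file_names:
        return "PlatformHasAcpi"
    return "PlatformHasDeviceTree"


def dispatchable_drivers(drivers, installed_protocols):
    """Return the drivers whose single dependency has been installed."""
    return sorted(
        name for name, dependency in drivers.items()
        if dependency in installed_protocols
    )


# Hypothetical drivers, each waiting on exactly one marker.
drivers = {
    "acpi-table-installer": "PlatformHasAcpi",
    "dtb-forwarder": "PlatformHasDeviceTree",
}

acpi_guest = {select_marker_protocol({"etc/table-loader", "etc/acpi/tables"})}
dt_guest = {select_marker_protocol({"etc/boot-fail-wait"})}

print(dispatchable_drivers(drivers, acpi_guest))
print(dispatchable_drivers(drivers, dt_guest))
```

Because the marker carries no data, its mere presence is the signal;
that is why exactly one of the two groups of drivers ever runs.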
(6) If DT was selected (ACPI was disabled), per step (5), then
FdtClientDxe (introduced under step (3b)) has another job: it forwards
the original stashed DTB (see (3a)) to the guest OS.
This "DTB forwarding" occurs through a particular UEFI config table; the
GUID is B1B621D5-F19C-41A5-830B-D9152C69AAE0 -- known as
DEVICE_TREE_GUID in the kernel ("include/linux/efi.h").
See the OnPlatformHasDeviceTree() function in
"ArmVirtPkg/FdtClientDxe/FdtClientDxe.c", in edk2.
(7) If ACPI was selected instead, according to step (5), then through
the fw_cfg access described in (3c), the guest firmware "blindly"
processes the ACPI payload from QEMU (from step (2)).
This "blind processing" means that the guest firmware runs the "ACPI
linker/loader script" (the "etc/table-loader" fw_cfg file), installing a
number of ACPI tables for the guest OS. The guest firmware does not
interpret the ACPI tables.
"Installing ACPI tables" ultimately means exposing stuff under the
particular UEFI config table that stands for the RSD PTR -- the GUID is
8868E871-E4F1-11D3-BC22-0080C73C8881. (Known as ACPI_20_TABLE_GUID in
Linux, "include/linux/efi.h".)
See the following in edk2:
- OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpiPlatformDxe.inf
In this case, the guest firmware does not forward QEMU's original DTB to
the guest OS.
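For a feel of what "blindly running the linker/loader script" in step
(7) involves, here is a heavily simplified Python model. It assumes three
command kinds (allocate a blob, patch a pointer, patch a checksum)
expressed as plain tuples; the real "etc/table-loader" file is a binary
format with more fields, and the blob contents below are toy data, so
treat this as a sketch of the principle only.

```python
def run_loader_script(script, fw_cfg_files, alloc_base=0x40000000):
    """Execute a simplified script; return {file name: (address, bytearray)}."""
    loaded = {}
    next_free = alloc_base
    for command in script:
        op, args = command[0], command[1:]
        if op == "ALLOCATE":
            name, align = args
            address = -(-next_free // align) * align  # round up to alignment
            loaded[name] = (address, bytearray(fw_cfg_files[name]))
            next_free = address + len(fw_cfg_files[name])
        elif op == "ADD_POINTER":
            dest, src, offset, size = args
            blob = loaded[dest][1]
            # The field holds an offset into src; turn it into an address.
            relative = int.from_bytes(blob[offset:offset + size], "little")
            absolute = relative + loaded[src][0]
            blob[offset:offset + size] = absolute.to_bytes(size, "little")
        elif op == "ADD_CHECKSUM":
            name, result_offset, start, length = args
            blob = loaded[name][1]
            blob[result_offset] = 0  # this sketch zeroes the field first
            blob[result_offset] = -sum(blob[start:start + length]) & 0xFF
        else:
            raise ValueError(f"unknown command: {op}")
    return loaded


# Toy blobs: a 16-byte "tables" blob, and an 8-byte "rsdp" blob whose first
# four bytes hold offset 4 into the tables blob.
files = {
    "etc/acpi/tables": bytes(16),
    "etc/acpi/rsdp": (4).to_bytes(4, "little") + bytes(4),
}
script = [
    ("ALLOCATE", "etc/acpi/tables", 64),
    ("ALLOCATE", "etc/acpi/rsdp", 16),
    ("ADD_POINTER", "etc/acpi/rsdp", "etc/acpi/tables", 0, 4),
    ("ADD_CHECKSUM", "etc/acpi/rsdp", 7, 0, 8),
]

loaded = run_loader_script(script, files)
rsdp_address, rsdp = loaded["etc/acpi/rsdp"]
print(hex(rsdp_address), rsdp.hex())
```

Note that nothing in the interpreter knows what an RSDP or an XSDT is;
it only moves bytes and patches fields, which is what "blind" means here.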
(8) Ultimately, from the guest OS's point of view, a UEFI config table
for *either* the RSD PTR (ACPI_20_TABLE_GUID) *or* QEMU's DTB
(DEVICE_TREE_GUID) is going to exist.
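Step (8) amounts to a lookup by GUID in the UEFI configuration table
array. A small Python sketch under these assumptions: the array is
modelled as a list of (16-byte GUID, address) pairs, the addresses are
invented, and the function name is mine. The GUIDs are the two quoted
above; UEFI stores the first three GUID fields little-endian, which is
what Python's bytes_le layout corresponds to.

```python
import uuid

ACPI_20_TABLE_GUID = uuid.UUID("8868E871-E4F1-11D3-BC22-0080C73C8881")
DEVICE_TREE_GUID = uuid.UUID("B1B621D5-F19C-41A5-830B-D9152C69AAE0")


def find_hardware_description(config_tables):
    """Scan (raw 16-byte GUID, address) pairs for the ACPI or DT entry."""
    for raw_guid, address in config_tables:
        guid = uuid.UUID(bytes_le=bytes(raw_guid))
        if guid == ACPI_20_TABLE_GUID:
            return ("ACPI", address)
        if guid == DEVICE_TREE_GUID:
            return ("DT", address)
    return None


# Invented example: one unrelated table plus the ACPI 2.0 entry.
unrelated = uuid.UUID("00000000-0000-0000-0000-000000000001")
config_tables = [
    (unrelated.bytes_le, 0x1000),
    (ACPI_20_TABLE_GUID.bytes_le, 0x2000),
]

print(ACPI_20_TABLE_GUID.bytes_le.hex())
print(find_hardware_description(config_tables))
```

The first line printed is the byte sequence one would see for the ACPI
GUID in a memory dump of the configuration table array.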
(9) (Ard, please correct the below if necessary; thanks.)
The UEFI stub of the guest kernel (which is a UEFI application) uses a
device tree as its main communication channel to the (later-started)
kernel entry point, AIUI.
The UEFI stub basically inverts the importance of the UEFI system table
versus the device tree -- the UEFI stub *converts* the UEFI system table
(the multitude of UEFI config tables) into a device tree. This is my
understanding anyway.
(9a) If ACPI was disabled on the QEMU command line, then the guest
kernel *adopts* the device tree that was forwarded to it in (6), via the
UEFI config table marked with DEVICE_TREE_GUID.
(9b) If ACPI was enabled on the QEMU command line, then the UEFI stub
creates a brand new (empty) device tree (AIUI).
Either way, the UEFI system table is linked *under* the -- adopted or
new -- device tree, through the "chosen" node. And so, if ACPI was
enabled, the ACPI RSD PTR (coming from step (7)) becomes visible to the
kernel proper as well, through the UEFI config table with
ACPI_20_TABLE_GUID.
I believe this is implemented under "drivers/firmware/efi/libstub" in
the kernel tree.
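My understanding of (9a) and (9b) can be summed up in a short Python
model. It is not the libstub code: the device tree is a nested dict, the
function name and the sample inputs are invented, and only the property
names are taken from the /chosen dump earlier in this thread.

```python
import copy


def build_kernel_fdt(forwarded_dtb, system_table_addr, mmap_start,
                     mmap_size, desc_size, desc_ver):
    """Adopt the forwarded DTB if there is one, else start from an empty tree."""
    if forwarded_dtb is not None:
        fdt = copy.deepcopy(forwarded_dtb)   # (9a): DT was selected
    else:
        fdt = {}                             # (9b): ACPI was selected
    chosen = fdt.setdefault("chosen", {})
    # Either way, the UEFI artifacts are hung under /chosen.
    chosen["linux,uefi-system-table"] = system_table_addr
    chosen["linux,uefi-mmap-start"] = mmap_start
    chosen["linux,uefi-mmap-size"] = mmap_size
    chosen["linux,uefi-mmap-desc-size"] = desc_size
    chosen["linux,uefi-mmap-desc-ver"] = desc_ver
    return fdt


uefi_info = (0x43F8B0018, 0x43C0CE018, 0x810, 0x30, 1)

# ACPI guest: no DTB was forwarded, so the tree only has /chosen.
acpi_fdt = build_kernel_fdt(None, *uefi_info)

# DT guest: a (made-up) forwarded tree keeps its hardware nodes.
dt_fdt = build_kernel_fdt({"memory@40000000": {}}, *uefi_info)

print(sorted(acpi_fdt))
print(sorted(dt_fdt))
```

This matches the dump in the question: apart from bootargs and the
initrd properties, the tree contains nothing except /chosen, which is
what one would expect for an ACPI guest.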
Thanks,
Laszlo