From: Ani Sinha
Subject: Re: [PATCH] hw/misc/vmfwupdate: Introduce hypervisor fw-cfg interface support
Date: Tue, 10 Dec 2024 08:58:20 +0530

On Mon, Dec 2, 2024 at 1:17 PM Ani Sinha <anisinha@redhat.com> wrote:
>
>
>
> > On 29 Nov 2024, at 3:42 PM, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
> >
> > On 29/11/24 10:16, Ani Sinha wrote:
> >> VM firmware update is a mechanism where virtual machines can use their
> >> preferred and trusted firmware image in their execution environment
> >> without having to depend on an untrusted party to provide the firmware
> >> bundle. This is particularly useful for confidential virtual machines
> >> that are deployed in the cloud, where the tenant and the cloud provider
> >> are two different entities. In this scenario, virtual machines can bring
> >> their own trusted firmware image bundled as a part of their filesystem
> >> (using UKIs, for example [1]) and then use this hypervisor interface to
> >> update to their trusted firmware image. This also allows the guests to
> >> have consistent measurements of the firmware image.
> >> This change introduces basic support for the fw-cfg based hypervisor
> >> interface and the corresponding device. The change also includes the
> >> specification document for this interface. The interface is made generic
> >> enough so that guests are free to use their own ABI to pass required
> >> information between initial and trusted execution contexts (where they are
> >> running their own trusted firmware image) without the hypervisor getting
> >> involved in between. In subsequent patches, we will introduce other minimal
> >> changes on the hypervisor that are required to make the mechanism work.
> >> [1] See systemd pull requests https://github.com/systemd/systemd/pull/35091
> >> and https://github.com/systemd/systemd/pull/35281 for some discussions on
> >> how we can bundle a firmware image within a UKI.
> >> CC: Alex Graf <graf@amazon.com>
> >> CC: Paolo Bonzini <pbonzini@redhat.com>
> >> CC: Gerd Hoffman <kraxel@redhat.com>
> >> CC: Igor Mammedov <imammedo@redhat.com>
> >> CC: Vitaly Kuznetsov <vkuznets@redhat.com>
> >> Signed-off-by: Ani Sinha <anisinha@redhat.com>

I know we are in code freeze, but I would appreciate any further feedback
on this patch so that we can merge it once the freeze lifts.

> >> ---
> >>  MAINTAINERS                  |   9 +++
> >>  docs/specs/index.rst         |   1 +
> >>  docs/specs/vmfwupdate.rst    | 109 +++++++++++++++++++++++++
> >>  hw/misc/meson.build          |   2 +
> >>  hw/misc/vmfwupdate.c         | 152 +++++++++++++++++++++++++++++++++++
> >>  include/hw/misc/vmfwupdate.h | 103 ++++++++++++++++++++++++
> >>  6 files changed, 376 insertions(+)
> >>  create mode 100644 docs/specs/vmfwupdate.rst
> >>  create mode 100644 hw/misc/vmfwupdate.c
> >>  create mode 100644 include/hw/misc/vmfwupdate.h
> >> diff --git a/MAINTAINERS b/MAINTAINERS
> >> index 095420f8b0..cd4135fb5b 100644
> >> --- a/MAINTAINERS
> >> +++ b/MAINTAINERS
> >> @@ -2531,6 +2531,15 @@ F: include/hw/acpi/vmgenid.h
> >>  F: docs/specs/vmgenid.rst
> >>  F: tests/qtest/vmgenid-test.c
> >>
> >> +VM Firmware Update
> >> +M: Ani Sinha <anisinha@redhat.com>
> >> +M: Alex Graf <graf@amazon.com>
> >> +M: Paolo Bonzini <pbonzini@redhat.com>
> >> +S: Maintained
> >> +F: hw/misc/vmfwupdate.c
> >> +F: include/hw/misc/vmfwupdate.h
> >> +F: docs/specs/vmfwupdate.rst
> >> +
> >>  LED
> >>  M: Philippe Mathieu-Daudé <philmd@linaro.org>
> >>  S: Maintained
> >> diff --git a/docs/specs/index.rst b/docs/specs/index.rst
> >> index ff5a1f03da..cbda7e0398 100644
> >> --- a/docs/specs/index.rst
> >> +++ b/docs/specs/index.rst
> >> @@ -34,6 +34,7 @@ guest hardware that is specific to QEMU.
> >>     virt-ctlr
> >>     vmcoreinfo
> >>     vmgenid
> >> +   vmfwupdate
> >>     rapl-msr
> >>     rocker
> >>     riscv-iommu
> >> diff --git a/docs/specs/vmfwupdate.rst b/docs/specs/vmfwupdate.rst
> >> new file mode 100644
> >> index 0000000000..3a36ca14c7
> >> --- /dev/null
> >> +++ b/docs/specs/vmfwupdate.rst
> >> @@ -0,0 +1,109 @@
> >> +VMFWUPDATE INTERFACE SPECIFICATION
> >> +##################################
> >> +
> >> +Introduction
> >> +************
> >> +
> >> +``Vmfwupdate`` is an extension to ``fw-cfg`` that allows guests to replace early boot
> >> +code in their virtual machine. Through a combination of vmfwupdate and
> >> +hypervisor stack knowledge, guests can deterministically replace the launch
> >> +payload for guests. This is useful for environments like SEV-SNP where the
> >> +launch payload becomes the launch digest. Guests can use vmfwupdate to provide
> >> +a measured, full guest payload (BIOS image, kernel, initramfs, kernel
> >> +command line) to the virtual machine which enables them to easily reason about
> >> +integrity of the resulting system.
> >> +For more information, please see the `KVM Forum 2024 presentation <KVMFORUM_>`__
> >> +about this work from the authors [1]_.
> >> +
> >> +
> >> +.. _KVMFORUM: https://www.youtube.com/watch?v=VCMBxU6tAto
> >> +
> >> +Base Requirements
> >> +*****************
> >> +
> >> +#. **fw-cfg**:
> >> +     The target system must provide a ``fw-cfg`` interface. For x86 based
> >> +     environments, this ``fw-cfg`` interface must be accessible through PIO ports
> >> +     0x510 and 0x511. The ``fw-cfg`` interface does not need to be announced as part
> >> +     of system device tables such as DSDT. The ``fw-cfg`` interface must support the
> >> +     DMA interface. It may only support the DMA interface for write operations.
> >> +
> >> +#. **BIOS region**:
> >> +     The hypervisor must provide a BIOS region which may be
> >> +     statically sized. Through vmfwupdate, the guest is able to atomically replace
> >> +     its contents. The BIOS region must be mapped as read-write memory. In a
> >> +     SEV-SNP environment, the BIOS region must be mapped as private memory at
> >> +     launch time.
> >> +
> >> +Fw-cfg Files
> >> +************
> >> +
> >> +Guests drive vmfwupdate through special ``fw-cfg`` files that control its flow
> >> +followed by a standard system reset operation. When vmfwupdate is available,
> >> +it provides the following ``fw-cfg`` files:
> >> +
> >> +* ``vmfwupdate/cap`` (``u64``) - Read-only Little Endian encoded bitmap of additional
> >> +  capabilities the interface supports. List of available capabilities:
> >> +
> >> +     ``VMFWUPDATE_CAP_BIOS_RESIZE        0x0000000000000001``
> >> +
> >> +* ``vmfwupdate/bios-size`` (``u32``) - Little Endian encoded size of the BIOS region.
> >> +  Read-only by default. Optionally Read-write if ``vmfwupdate/cap`` contains
> >> +  ``VMFWUPDATE_CAP_BIOS_RESIZE``. On write, the BIOS region may resize. Guests are
> >> +  required to read the value after writing and compare it with the requested size
> >> +  to determine whether the resize was successful. Note, x86 BIOS regions always
> >> +  start at 4GiB - bios-size.
> >> +
> >> +* ``vmfwupdate/opaque`` (``1024 bytes``) - A 1KiB buffer that survives the BIOS replacement
> >> +  flow. Can be used by the guest to propagate guest physical addresses of payloads
> >> +  to its BIOS stage. It’s recommended to make the new BIOS clear this file on boot
> >> +  if it exists. Contents of this file are under control by the hypervisor. In an
> >> +  environment that considers the hypervisor outside of its trust boundary, guests
> >> +  are advised to validate its contents before consumption.
> >> +
> >> +* ``vmfwupdate/disable`` (``u8``) - Indicates whether the interface is disabled.
> >> +  Returns 0 for enabled, 1 for disabled. Writing any value disables it. Writing is
> >> +  only allowed if the value is 0. When the interface is disabled, the replace file
> >> +  is ignored on reset. This value resets to 0 on system reset.
> >> +
> >> +* ``vmfwupdate/bios-addr`` (``u64``) - A 64bit Little Endian encoded guest physical address
> >> +  at the beginning of the replacement BIOS region. The provided payload must reside
> >> +  in shared memory. 0 on system reset.
> >> +
> >> +
> >> +Triggering the Firmware Update
> >> +******************************
> >> +
> >> +To initiate the firmware update process, the guest issues a standard system reset
> >> +operation through any of the means implemented by the machine model.
> >> +
> >> +On reset, the hypervisor evaluates whether ``vmfwupdate/disable`` is ``1``. If it is, it ignores
> >> +any other vmfwupdate values and performs a standard system reset.
> >> +
> >> +If ``vmfwupdate/disable`` is ``0``, the hypervisor checks if bios-addr is ``0``. If it is, it
> >> +performs a standard system reset.
> >> +
> >> +If ``vmfwupdate/bios-addr`` is ``non-0``, the hypervisor replaces the contents of the system’s
> >> +BIOS region with the guest physically contiguous ``vmfwupdate/bios-size`` sized payload at the
> >> +guest physical address ``vmfwupdate/bios-addr``.
> >> +
> >> +As part of the reset operation, all existing guest shared memory as well as the
> >> +``vmfwupdate/opaque`` file are preserved. CPU and device state are reset to the default
> >> +hypervisor specific reset states. In SEV-SNP environments, the reset causes recreation
> >> +of the VM context which triggers a fresh measurement of the replaced BIOS region and
> >> +reset CPU state. The guest always resumes operation in the highest privileged mode
> >> +available to it (VMPL0 in SEV-SNP).
> >> +
> >> +Closing Remarks
> >> +***************
> >> +The handover protocol (format of the ``vmfwupdate/opaque`` file etc.) will be implemented by
> >> +the firmware loader and firmware image, both provided by the guest.  The hypervisor does
> >> +not need to know these details, so they are not included in this specification.
> >> +
> >> +
> >> +
> >> +Footnotes:
> >> +^^^^^^^^^^
> >> +.. [1] Original author of the specification: *Alex Graf <graf@amazon.com>*,
> >> +       converted to re-structured-text (rst format) and slightly edited
> >> +       by *Ani Sinha <anisinha@redhat.com>*.
> >> diff --git a/hw/misc/meson.build b/hw/misc/meson.build
> >> index d02d96e403..4c5bdb0de2 100644
> >> --- a/hw/misc/meson.build
> >> +++ b/hw/misc/meson.build
> >> @@ -148,6 +148,8 @@ specific_ss.add(when: 'CONFIG_MAC_VIA', if_true: files('mac_via.c'))
> >>  specific_ss.add(when: 'CONFIG_MIPS_CPS', if_true: files('mips_cmgcr.c', 'mips_cpc.c'))
> >>  specific_ss.add(when: 'CONFIG_MIPS_ITU', if_true: files('mips_itu.c'))
> >>
> >> +specific_ss.add(when: 'CONFIG_FW_CFG_DMA', if_true: files('vmfwupdate.c'))
> >> +
> >>  system_ss.add(when: 'CONFIG_SBSA_REF', if_true: files('sbsa_ec.c'))
> >>
> >>  # HPPA devices
> >> diff --git a/hw/misc/vmfwupdate.c b/hw/misc/vmfwupdate.c
> >> new file mode 100644
> >> index 0000000000..39fac68cbe
> >> --- /dev/null
> >> +++ b/hw/misc/vmfwupdate.c
> >> @@ -0,0 +1,152 @@
> >> +/*
> >> + * Guest driven VM boot component update device
> >> + * For details and specification, please look at docs/specs/vmfwupdate.rst.
> >> + *
> >> + * Copyright (C) 2024 Red Hat, Inc.
> >> + *
> >> + * Authors: Ani Sinha <anisinha@redhat.com>
> >> + *
> >> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> >> + * See the COPYING file in the top-level directory.
> >> + *
> >> + */
> >> +
> >> +#include "qemu/osdep.h"
> >> +#include "qapi/error.h"
> >> +#include "qemu/module.h"
> >> +#include "sysemu/reset.h"
> >> +#include "hw/nvram/fw_cfg.h"
> >> +#include "hw/i386/pc.h"
> >> +#include "hw/qdev-properties.h"
> >> +#include "hw/misc/vmfwupdate.h"
> >> +#include "qemu/error-report.h"
> >> +
> >> +static void fw_update_reset(void *dev)
> >> +{
> >> +    /* a NOOP at present */
> >> +    return;
> >> +}
> >> +
> >> +
> >> +static uint64_t get_max_fw_size(void)
> >> +{
> >> +    Object *m_obj = qdev_get_machine();
> >> +    PCMachineState *pcms = PC_MACHINE(m_obj);
> >> +
> >> +    if (pcms) {
> >> +        return pcms->max_fw_size;
> >> +    } else {
> >> +        return 0;
> >
> > Isn't it a configuration error?
>
> It isn’t if we do not expose the VMFWUPDATE_CAP_BIOS_RESIZE capability to
> other machines. I will fix this in v2.
> Also, I am not sure what a consistent way to get this value for non-PC
> machines would be.
>
> >
> >> +    }
> >> +}
> >> +
> >> +static void fw_blob_write(void *dev, off_t offset, size_t len)
> >> +{
> >> +    VMFwUpdateState *s = VMFWUPDATE(dev);
> >> +
> >> +    /*
> >> +     * In order to change the BIOS size, the appropriate capability
> >> +     * must be enabled.
> >> +     */
> >> +    if (s->fw_blob.bios_size &&
> >> +        !(s->capability & VMFWUPDATE_CAP_BIOS_RESIZE)) {
> >> +        warn_report("vmfwupdate: VMFWUPDATE_CAP_BIOS_RESIZE not enabled");
> >> +        return;
> >> +    }
> >> +
> >> +    s->plat_bios_size = s->fw_blob.bios_size;
> >> +
> >> +    return;
> >> +}
> >> +
> >> +static void vmfwupdate_realize(DeviceState *dev, Error **errp)
> >> +{
> >> +    VMFwUpdateState *s = VMFWUPDATE(dev);
> >> +    FWCfgState *fw_cfg = fw_cfg_find();
> >> +
> >> +    /* multiple devices are not supported */
> >> +    if (!vmfwupdate_find()) {
> >> +        error_setg(errp, "at most one %s device is permitted",
> >> +                   TYPE_VMFWUPDATE);
> >> +        return;
> >> +    }
> >> +
> >> +    /* fw_cfg with DMA support is necessary to support this device */
> >> +    if (!fw_cfg || !fw_cfg_dma_enabled(fw_cfg)) {
> >> +        error_setg(errp, "%s device requires fw_cfg with DMA support",
> >> +                   TYPE_VMFWUPDATE);
> >> +        return;
> >> +    }
> >> +
> >> +    memset(&s->fw_blob, 0, sizeof(s->fw_blob));
> >> +    memset(&s->opaque_blobs, 0, sizeof(s->opaque_blobs));
> >> +
> >> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_OBLOB,
> >> +                             NULL, NULL, s,
> >> +                             &s->opaque_blobs,
> >> +                             sizeof(s->opaque_blobs),
> >> +                             false);
> >> +
> >> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_FWBLOB,
> >> +                             NULL, fw_blob_write, s,
> >> +                             &s->fw_blob,
> >> +                             sizeof(s->fw_blob),
> >> +                             false);
> >> +
> >> +    /*
> >> +     * Add global capability fw_cfg file. This will be used by the guest to
> >> +     * check capability of the hypervisor.
> >> +     */
> >> +    s->capability = cpu_to_le64(CAP_VMFWUPD_MASK | VMFWUPDATE_CAP_EDKROM);
> >> +    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_CAP,
> >> +                    &s->capability, sizeof(s->capability));
> >> +
> >> +    s->plat_bios_size = get_max_fw_size();
> >> +    /* size of bios region for the platform - read only by the guest */
> >> +    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_BIOS_SIZE,
> >> +                    &s->plat_bios_size, sizeof(s->plat_bios_size));
> >> +    /*
> >> +     * add fw cfg control file to disable the hypervisor interface.
> >> +     */
> >> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_CONTROL,
> >> +                             NULL, NULL, s,
> >> +                             &s->disable,
> >> +                             sizeof(s->disable),
> >> +                             false);
> >> +    /*
> >> +     * This device requires to register a global reset because it is
> >> +     * not plugged to a bus (which, as its QOM parent, would reset it).
> >> +     */
> >> +    qemu_register_reset(fw_update_reset, dev);
> >> +}
> >> +
> >> +static Property vmfwupdate_properties[] = {
> >> +    DEFINE_PROP_UINT8("disable", VMFwUpdateState, disable, 0),
> >> +    DEFINE_PROP_END_OF_LIST(),
> >> +};
> >> +
> >> +static void vmfwupdate_device_class_init(ObjectClass *klass, void *data)
> >> +{
> >> +    DeviceClass *dc = DEVICE_CLASS(klass);
> >> +
> >> +    /* we are not interested in migration - so no need to populate dc->vmsd */
> >> +    dc->desc = "VM firmware blob update device";
> >> +    dc->realize = vmfwupdate_realize;
> >> +    dc->hotpluggable = false;
> >> +    device_class_set_props(dc, vmfwupdate_properties);
> >> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> >> +}
> >> +
> >> +static const TypeInfo vmfwupdate_device_info = {
> >> +    .name          = TYPE_VMFWUPDATE,
> >> +    .parent        = TYPE_DEVICE,
> >> +    .instance_size = sizeof(VMFwUpdateState),
> >> +    .class_init    = vmfwupdate_device_class_init,
> >> +};
> >> +
> >> +static void vmfwupdate_register_types(void)
> >> +{
> >> +    type_register_static(&vmfwupdate_device_info);
> >> +}
> >> +
> >> +type_init(vmfwupdate_register_types);
> >> diff --git a/include/hw/misc/vmfwupdate.h b/include/hw/misc/vmfwupdate.h
> >> new file mode 100644
> >> index 0000000000..e9229d807b
> >> --- /dev/null
> >> +++ b/include/hw/misc/vmfwupdate.h
> >> @@ -0,0 +1,103 @@
> >> +/*
> >> + * Guest driven VM boot component update device
> >> + * For details and specification, please look at docs/specs/vmfwupdate.rst.
> >> + *
> >> + * Copyright (C) 2024 Red Hat, Inc.
> >> + *
> >> + * Authors: Ani Sinha <anisinha@redhat.com>
> >> + *
> >> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> >> + * See the COPYING file in the top-level directory.
> >> + *
> >> + */
> >> +#ifndef VMFWUPDATE_H
> >> +#define VMFWUPDATE_H
> >> +
> >> +#include "hw/qdev-core.h"
> >> +#include "qom/object.h"
> >> +#include "qemu/units.h"
> >> +
> >> +#define TYPE_VMFWUPDATE "vmfwupdate"
> >> +
> >> +#define VMFWUPDCAPMSK  0xffff /* least significant 16 capability bits */
> >> +
> >> +#define VMFWUPDATE_CAP_EDKROM 0x08 /* bit 4 represents support for EDKROM */
> >> +#define VMFWUPDATE_CAP_BIOS_RESIZE 0x04 /* guests may resize bios region */
> >> +#define CAP_VMFWUPD_MASK 0x80
> >> +
> >> +#define VMFWUPDATE_OPAQUE_SIZE (1 * KiB) /* the spec defines a 1KiB buffer */
> >> +
> >> +/* fw_cfg file definitions */
> >> +#define FILE_VMFWUPDATE_OBLOB "etc/vmfwupdate/opaque-blob"
> >> +#define FILE_VMFWUPDATE_FWBLOB "etc/vmfwupdate/fw-blob"
> >> +#define FILE_VMFWUPDATE_CAP "etc/vmfwupdate/cap"
> >> +#define FILE_VMFWUPDATE_BIOS_SIZE "etc/vmfwupdate/bios-size"
> >> +#define FILE_VMFWUPDATE_CONTROL "etc/vmfwupdate/disable"
> >> +
> >> +/*
> >> + * Address and length of the guest provided firmware blob.
> >> + * The blob itself is passed using the guest shared memory to QEMU.
> >> + * This is then copied to the guest private memory in the secure VM
> >> + * by the hypervisor.
> >> + */
> >> +typedef struct {
> >> +    uint32_t bios_size; /*
> >> +                         * this is used by the guest to update plat_bios_size
> >> +                         * when VMFWUPDATE_CAP_BIOS_RESIZE is set.
> >> +                         */
> >> +    uint64_t bios_paddr; /*
> >> +                          * starting gpa where the blob is in shared guest
> >> +                          * memory. Cleared upon system reset.
> >> +                          */
> >> +} VMFwUpdateFwBlob;
> >> +
> >> +typedef struct VMFwUpdateState {
> >> +    DeviceState parent_obj;
> >> +
> >> +    /*
> >> +     * capabilities - 64 bits.
> >> +     * Little endian format.
> >> +     */
> >> +    uint64_t capability;
> >> +
> >> +    /*
> >> +     * size of the bios region - architecture dependent.
> >> +     * Read-only by the guest unless VMFWUPDATE_CAP_BIOS_RESIZE
> >> +     * capability is set.
> >> +     */
> >> +    uint32_t plat_bios_size;
> >> +
> >> +    /*
> >> +     * disable - disables the interface when non-zero value is written to it.
> >> +     * Writing 0 to this file enables the interface.
> >> +     */
> >> +    uint8_t disable;
> >> +
> >> +    /*
> >> +     * The first stage boot uses this opaque blob to convey to the next stage
> >> +     * where the next stage components are loaded. The exact structure and
> >> +     * number of entries are unknown to the hypervisor and the hypervisor
> >> +     * does not touch this memory or do any validations.
> >> +     * The contents of this memory need to be validated by the guest and
> >> +     * must be ABI compatible between the first and second stages.
> >> +     */
> >> +    unsigned char opaque_blobs[VMFWUPDATE_OPAQUE_SIZE];
> >> +
> >> +    /*
> >> +     * firmware blob addresses and sizes. These are moved to guest
> >> +     * private memory.
> >> +     */
> >> +    VMFwUpdateFwBlob fw_blob;
> >> +} VMFwUpdateState;
> >> +
> >> +OBJECT_DECLARE_SIMPLE_TYPE(VMFwUpdateState, VMFWUPDATE);
> >> +
> >> +/* returns NULL unless there is exactly one device */
> >> +static inline VMFwUpdateState *vmfwupdate_find(void)
> >> +{
> >> +    Object *o = object_resolve_path_type("", TYPE_VMFWUPDATE, NULL);
> >> +
> >> +    return o ? VMFWUPDATE(o) : NULL;
> >> +}
> >> +
> >> +#endif
>
>
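
Since fw_update_reset() is still a no-op in this version, here is a
minimal sketch of the reset-time decision as I read it from the
"Triggering the Firmware Update" section of the spec. It is a standalone
model under stated assumptions, not the eventual implementation:
ResetInputs, ResetAction and vmfwupdate_reset_action() are hypothetical
names, and the actual copy into the BIOS region is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the vmfwupdate fw-cfg values at reset time. */
typedef struct {
    uint8_t  disable;    /* vmfwupdate/disable: 0 = enabled, 1 = disabled */
    uint64_t bios_addr;  /* vmfwupdate/bios-addr: 0 means no replacement */
    uint32_t bios_size;  /* vmfwupdate/bios-size: length of the payload */
} ResetInputs;

typedef enum {
    RESET_STANDARD,      /* plain system reset, BIOS region untouched */
    RESET_REPLACE_BIOS   /* replace the BIOS region, then reset */
} ResetAction;

/* Mirrors the three cases in the spec, evaluated in the same order. */
static ResetAction vmfwupdate_reset_action(const ResetInputs *in)
{
    assert(in != NULL);
    if (in->disable != 0) {
        return RESET_STANDARD;      /* interface disabled: ignore the rest */
    }
    if (in->bios_addr == 0) {
        return RESET_STANDARD;      /* nothing to replace */
    }
    return RESET_REPLACE_BIOS;      /* bios_size bytes at bios_addr */
}
```

Keeping the decision separate from the copy makes the ordering of the
checks easy to test on its own, e.g. that a set disable flag wins even
when bios-addr is non-zero.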



