qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] 64-bit MMIO aperture expansion


From: Laszlo Ersek
Subject: [Qemu-devel] 64-bit MMIO aperture expansion
Date: Thu, 20 Sep 2018 16:49:54 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

Hi Marcel,

this email should actually be an RFC patch. But RFC patches tend to turn
into real PATCHes (if the submitter is lucky, that is), and I can't
really promise sending multiple versions of a PATCH at this time. So
please consider this a "maybe bug report".

In commit 9fa99d2519cb ("hw/pci-host: Fix x86 Host Bridges 64bit PCI
hole", 2017-11-16) we added logic so that QEMU expand the 64-bit PCI
hole (for hotplug purposes), if (a) the firmware doesn't "configure" one
(via programming individual BARs with 64-bit addresses), or (b) the
firmware's programming results in an aperture smaller than we'd like
(32GB on Q35).

We made sure that the aperture required by the firmware's programming
would never be shrunken or otherwise truncated by QEMU, so that's fine.
However, the expansion doesn't work as "widely" in all cases as it
should.

Consider the following three functions, at current master (= commit
19b599f7664b):

[hw/i386/pc.c]

> /*
>  * The 64bit pci hole starts after "above 4G RAM" and
>  * potentially the space reserved for memory hotplug.
>  */
> uint64_t pc_pci_hole64_start(void)
> {
>     PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
>     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
>     MachineState *ms = MACHINE(pcms);
>     uint64_t hole64_start = 0;
>
>     if (pcmc->has_reserved_memory && ms->device_memory->base) {
>         hole64_start = ms->device_memory->base;
>         if (!pcmc->broken_reserved_end) {
>             hole64_start += memory_region_size(&ms->device_memory->mr);
>         }
>     } else {
>         hole64_start = 0x100000000ULL + pcms->above_4g_mem_size;
>     }
>
>     return ROUND_UP(hole64_start, 1 * GiB);
> }

[hw/pci-host/q35.c]

> /*
>  * The 64bit PCI hole start is set by the Guest firmware
>  * as the address of the first 64bit PCI MEM resource.
>  * If no PCI device has resources on the 64bit area,
>  * the 64bit PCI hole will start after "over 4G RAM" and the
>  * reserved space for memory hotplug if any.
>  */
> static void q35_host_get_pci_hole64_start(Object *obj, Visitor *v,
>                                           const char *name, void *opaque,
>                                           Error **errp)
> {
>     PCIHostState *h = PCI_HOST_BRIDGE(obj);
>     Q35PCIHost *s = Q35_HOST_DEVICE(obj);
>     Range w64;
>     uint64_t value;
>
>     pci_bus_get_w64_range(h->bus, &w64);
>     value = range_is_empty(&w64) ? 0 : range_lob(&w64);
>     if (!value && s->pci_hole64_fix) {
>         value = pc_pci_hole64_start();
>     }
>     visit_type_uint64(v, name, &value, errp);
> }
>
> /*
>  * The 64bit PCI hole end is set by the Guest firmware
>  * as the address of the last 64bit PCI MEM resource.
>  * Then it is expanded to the PCI_HOST_PROP_PCI_HOLE64_SIZE
>  * that can be configured by the user.
>  */
> static void q35_host_get_pci_hole64_end(Object *obj, Visitor *v,
>                                         const char *name, void *opaque,
>                                         Error **errp)
> {
>     PCIHostState *h = PCI_HOST_BRIDGE(obj);
>     Q35PCIHost *s = Q35_HOST_DEVICE(obj);
>     uint64_t hole64_start = pc_pci_hole64_start();
>     Range w64;
>     uint64_t value, hole64_end;
>
>     pci_bus_get_w64_range(h->bus, &w64);
>     value = range_is_empty(&w64) ? 0 : range_upb(&w64) + 1;
>     hole64_end = ROUND_UP(hole64_start + s->mch.pci_hole64_size, 1ULL << 30);
>     if (s->pci_hole64_fix && value < hole64_end) {
>         value = hole64_end;
>     }
>     visit_type_uint64(v, name, &value, errp);
> }
>

Now consider the following scenario:

- the firmware programs some BARs with 64-bit addresses such that the
  aperture that we deduce starts at 32GB,

- the guest has 4GB of RAM, and no DIMM hotplug range.

Consequences:

- Because the "32-bit RAM split" for Q35 is at 2GB, the
  pc_pci_hole64_start() function will return 6GB.

- The q35_host_get_pci_hole64_start() function will return 32GB. (It
  will not fall back to pc_pci_hole64_start() -- correctly -- because
  the firmware has programmed some BARs with 64-bit addresses.)

- The q35_host_get_pci_hole64_end() function *intends* to return 64GB,
  because -- let's say -- the guest assigned BARs covering the
  32GB..34GB range, which is 2GB in size, and we *intend* to round that
  size up to 32GB, so that 30GB be left for hotplug purposes. (This is
  the original intent of commit 9fa99d2519cb.)

- However, because we initialize "hole64_start" from
  pc_pci_hole64_start(), and not from q35_host_get_pci_hole64_start(),
  we add "mch.pci_hole64_size" (32GB by default) to 6GB (the end of
  RAM), and not to 32GB (the aperture base deduced from the firmware's
  programming). As a result, we'll extend the aperture end address only
  to 38GB, and not to 64GB.

My suggestion is simply to initialize "hole64_start" from
q35_host_get_pci_hole64_start(), in the q35_host_get_pci_hole64_end()
function. If the firmware doesn't program 64-bit addresses, then this
change is a no-op -- q35_host_get_pci_hole64_start() will fall back to
pc_pci_hole64_start() in that case. Otherwise, the behavior will be
fixed.

Now, there's another complication, obviously -- machine type compat. In
commit 9fa99d2519cb, we added the "pci_hole64_fix" compat property. I
assume the additional fix I'm proposing requires another compat
property? Something like:

> diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
> index 02f95765880a..c02b128189cd 100644
> --- a/hw/pci-host/q35.c
> +++ b/hw/pci-host/q35.c
> @@ -138,12 +138,14 @@ static void q35_host_get_pci_hole64_end(Object *obj, 
> Visitor *v,
>  {
>      PCIHostState *h = PCI_HOST_BRIDGE(obj);
>      Q35PCIHost *s = Q35_HOST_DEVICE(obj);
> -    uint64_t hole64_start = pc_pci_hole64_start();
> +    uint64_t hole64_start;
>      Range w64;
>      uint64_t value, hole64_end;
>
>      pci_bus_get_w64_range(h->bus, &w64);
>      value = range_is_empty(&w64) ? 0 : range_upb(&w64) + 1;
> +    hole64_start = s->pci_hole64_fix2 ? q35_host_get_pci_hole64_start() :
> +                                        pc_pci_hole64_start();
>      hole64_end = ROUND_UP(hole64_start + s->mch.pci_hole64_size, 1ULL << 30);
>      if (s->pci_hole64_fix && value < hole64_end) {
>          value = hole64_end;

The same would apply to i440fx too.

What do you think?

Thanks
Laszlo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]