qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [edk2] [edk2 PATCH 0/1] OvmfPkg: grab ACPI tables from


From: Laszlo Ersek
Subject: Re: [Qemu-devel] [edk2] [edk2 PATCH 0/1] OvmfPkg: grab ACPI tables from QEMU
Date: Tue, 19 Nov 2013 22:39:06 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20131028 Thunderbird/17.0.10

On 11/19/13 22:13, Michael S. Tsirkin wrote:
> On Tue, Nov 19, 2013 at 01:31:39PM +0100, Laszlo Ersek wrote:
>> On 11/19/13 09:54, Gerd Hoffmann wrote:
>>>   Hi,
>>>
>>>>   ACPI PciWindow32: Base=0xA0000000 End=0xFEEFFFFF Length=0x5EF00000
>>>
>>>>   begin32=c0000000 end32=fec00000 begin64=0 end64=0
>>>
>>> qemu & seabios pick a 32bit window size which allows to cover it with
>>> a single mtrr entry.  Size must be a power of two for that to work.
>>>
>>>   address@hidden ~]# cat /proc/mtrr
>>>   reg00: base=0x080000000 ( 2048MB), size= 2048MB, count=1: uncachable
>>>
>>> So we have three cases for piix (as you've figured in the source code
>>> already).  Start at 0x80000000 (2G window), 0xc0000000 (1G window) and
>>> 0xe0000000 (512M window).
>>>
>>> btw: q35 has a fixed 1G window, max low ram addr is 0xb000000, the
>>> remaining address space (0xb0000000 -> 0xc0000000) is used for
>>> mmconfig.
>>>
>>>> I guess the range starting at 0xc0000000 is somehow "incompatible"
>>>> with the EFI memory map. (I can't actually explain this idea because,
>>>> again, this second range is a proper subset of the former, and its
>>>> size is still 1004MB.)
>>>
>>> Probably efi places the gfx memory bar somewhere between 0xa0000000
>>> and 0xc0000000.  Sets up efifb accordingly.  Then the linux kernel
>>> loads and figures "Oh, that bar is outside the 0xc0000000+ window" and
>>> goes remap it -> efifb writes go into nowhere.
>>
>> Thank you for the explanation.
>>
>> How do I fix it though? :) I could change the MMIO HOBs in PEI
>> (OvmfPkg/PlatformPei/Platform.c):
>>
>>   //
>>   // Video memory + Legacy BIOS region
>>   //
>>   AddIoMemoryRangeHob (0x0A0000, BASE_1MB);
>>
>>   //
>>   // address       purpose   size
>>   // ------------  --------  -------------------------
>>   // max(top, 2g)  PCI MMIO  0xFC000000 - max(top, 2g)
>>   // 0xFC000000    gap                           44 MB
>>   // 0xFEC00000    IO-APIC                        4 KB
>>   // 0xFEC01000    gap                         1020 KB
>>   // 0xFED00000    HPET                           1 KB
>>   // 0xFED00400    gap                         1023 KB
>>   // 0xFEE00000    LAPIC                          1 MB
>>   //
>>   AddIoMemoryRangeHob (TopOfMemory < BASE_2GB ? BASE_2GB : TopOfMemory, 
>> 0xFC000000);
>>   AddIoMemoryBaseSizeHob (0xFEC00000, SIZE_4KB);
>>   AddIoMemoryBaseSizeHob (0xFED00000, SIZE_1KB);
>>   AddIoMemoryBaseSizeHob (PcdGet32(PcdCpuLocalApicBaseAddress), SIZE_1MB);
>>
>> to imitate the same three cases. The HOB with the lowest address
>> produced here would affect the BAR placement as well.
>>
>> ... Yes, I tested the attached patch, and it makes the display work in
>> both 2560MB guests.
>>
>> However, I don't like the idea of hardwiring those boundaries here. (The
>> current values are also hardwired, but they are better: they are not the
>> consequence of "SeaBIOS has done it like this forever" -- inter-firmware
>> dependency, especially when they aren't each other's payloads, is bad
>> IMHO.) We'd need something dynamic here, like a memory map from qemu.
>>
>> ... Which puts us in the same boat with Wei :)
>>
>> Thanks
>> Laszlo
> 
> So the bug is really qemu being too conservative with it's ACPI
> tables, isn't it?

Here's what I think happens (and I'll shamelessly steal Gerd's explanation):

(1) OVMF configures its own memory map very early (in PEI). The ranges
configured are quite static. The only variable that goes into this
configuration is "top of RAM below 4G", which is capped at 3.5GB by qemu
(the variable comes from the CMOS):

  TopOfMemory <=  2G ---> PCI hole starts at 2G
  TopOfMemory >   2G ---> PCI hole starts at TopOfMemory
  TopOfMemory > 3.5G ---> not possible

The low bound of the 32-bit PCI hole determined thus further affects two
things (in a later phase (DXE)) of OVMF:

(2) PCI resource allocation -- in practice the video framebuffer gets
allocated right at the start of the hole,
(3) the value that OVMF's own ACPI table population code advertises in
the PCI bridge's _CRS.


              > (2) PCI resource alloc
             /
  (1) memmap
            \
             > (3) ACPI
                       \
                        > (4) guest OS access -- match (2)

So, until this patch, everything was in sync. With this patch, the low
bound exported over ACPI -- in (3) -- got replaced (whole-sale) by
qemu's ACPI table. Therefore (1) only determined (2), and a conflict
ensued between (2) and (3). (Which is where I refer back to Gerd's
explanation -- the guest OS tries to access the framebuffer according to
(3), the value in the ACPI table, but that's not where the framebuffer
has actually been configured in (2), which had keyed off (1).)

              > (2) PCI resource alloc
             /
  (1) memmap

   (X) qemu
           \
            > (3) ACPI
                      \
                       > (4) guest OS access -- should match (2),
                                                but doesn't


The *real* bug is that OVMF's memory map configuration (1) is not
*dynamically* synchronized with qemu's ACPI table population logic (X).
My patch that I attached earlier to this thread fixes this (by simply
duplicating the same "quantization" logic), but it's ugly:

              > (2) PCI resource alloc
             /
  (1) memmap
        ||
   (X) qemu
           \
            > (3) ACPI
                      \
                       > (4) guest OS access -- matches (2) by way of
                                                code duplication
                                                between (1) and (X)

What we really should implement is this:

                          > (2) PCI resource alloc
                         /
            > (1) memmap
           /
   (X) qemu
           \
            > (3) ACPI
                       \
                        > (4) guest OS access -- matches (2) by (X)
                                                 being the common root
                                                 of everything

That is, OVMF should compose its own memmap in (1) based on dynamic
information from qemu (importantly, the start address of the 32-bit PCI
hole that qemu is going to advertise later in the ACPI tables).

> The MTRRs in BIOS don't affect much: AFAIK they are completely ignored
> if it's exiting to the host, but I'm not sure about assigned
> devices.
> Alex, can you remind us all what happens there?

If you simply relax (drop) the stage-wise hole-setting, that will at
best create a situation where OVMF and qemu duplicate the logic. This
info needs to be passed down to OVMF (maybe it already is, somewhere in
fw_cfg!), in some form that's more approachable than an ACPI table. (We
absolutely don't want to parse ACPI (AML) in OVMF.)

Thanks!
Laszlo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]