From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] Default for phys-addr-bits? (was Re: [PATCH 4/5] x86: Allow physical address bits to be set)
Date: Thu, 30 Jun 2016 11:59:09 +0100
User-agent: Mutt/1.6.1 (2016-04-27)

* Gerd Hoffmann (address@hidden) wrote:
>   Hi,
> 
> > Something somewhere in qemu/ kernel/ firmware is already reading the number
> > of physical bits to determine PCI mapping; if I do:
> > 
> > ./x86_64-softmmu/qemu-system-x86_64 -m 4096,slots=16,maxmem=128T
> 
> No, it's not the physbits.  You add some memory hotplug slots here.
> Qemu will ask seabios to reserve address space for those, which seabios
> promptly does and maps 64bit pci bars above the reserved address space.

Right, that's what I was trying to do - I wanted to see if I could get something
to use the non-existent address space.
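
(As a minimal sketch of the placement Gerd describes - assuming QEMU passes
the end of the reserved hotplug area to the firmware via the
"etc/reserved-memory-end" fw_cfg file and the firmware opens the 64-bit PCI
window above it; the function name and rounding here are illustrative, not
SeaBIOS's actual code:)

    #include <stdint.h>

    /* Start the 64-bit PCI window above both the RAM living over 4G and
     * any area reserved for memory hotplug (maxmem).  With maxmem=128T
     * this lands far beyond what a 46-bit host can address. */
    static uint64_t pci64_window_start(uint64_t ram_over_4g_end,
                                       uint64_t reserved_mem_end)
    {
        uint64_t start = ram_over_4g_end;
        if (reserved_mem_end > start)
            start = reserved_mem_end;            /* hotplug area wins */
        /* align up to 1G so large BARs stay naturally aligned */
        return (start + (1ULL << 30) - 1) & ~((1ULL << 30) - 1);
    }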

> >    -vga none -device 
> > qxl-vga,bus=pcie.0,ram_size_mb=2048,vram64_size_mb=2048 -vnc 0.0.0.0:0 
> > /home/vms/7.2a.qcow2 -chardev stdio,mux=on,id=mon -mon 
> > chardev=mon,mode=readline -cpu host,phys-bits=48
> > 
> > it will happily map the qxl VRAM right up high, but if I lower
> > the phys-bits down to 46 it won't.
> 
> I suspect the linux kernel remaps the bar because the seabios mapping is
> unreachable.  Check dmesg.

Right, and that is dependent on physbits; if I run with:

./x86_64-softmmu/qemu-system-x86_64 -machine q35,accel=kvm,usb=off \
    -m 4096,slots=16,maxmem=128T -vga none \
    -device qxl-vga,bus=pcie.0,ram_size_mb=2048,vram64_size_mb=2048 \
    -vnc 0.0.0.0:0 /home/vms/7.2a.qcow2 \
    -chardev stdio,mux=on,id=mon -mon chardev=mon,mode=readline \
    -cpu host,phys-bits=48

(on a 46-bit Xeon) it happily maps that 64-bit BAR somewhere that shouldn't
be accessible:

[    0.266183] pci_bus 0000:00: root bus resource [mem 0x800480000000-0x8004ffffffff]
[    0.321611] pci 0000:00:02.0: reg 0x20: [mem 0x800480000000-0x8004ffffffff 64bit pref]
[    0.423257] pci_bus 0000:00: resource 8 [mem 0x800480000000-0x8004ffffffff]
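
Spelling out the arithmetic (a trivial standalone check, not from the
original logs): 0x800480000000 has bit 47 set, so reaching it takes 48
physical address bits, while a 46-bit host tops out at 0x3fffffffffff:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint64_t bar = 0x800480000000ULL;       /* BAR base from dmesg above */
        int bits = 64 - __builtin_clzll(bar);   /* highest set bit + 1 */
        printf("BAR needs %d phys bits; a 46-bit host ends at 0x%llx\n",
               bits, (unsigned long long)((1ULL << 46) - 1));
        return 0;
    }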

lspci -v:

00:02.0 VGA compatible controller: Red Hat, Inc. QXL paravirtual graphic card (rev 04) (prog-if 00 [VGA controller])
        Subsystem: Red Hat, Inc QEMU Virtual Machine
        Flags: fast devsel, IRQ 22
        Memory at c0000000 (32-bit, non-prefetchable) [size=512M]
        Memory at e0000000 (32-bit, non-prefetchable) [size=64M]
        Memory at e4070000 (32-bit, non-prefetchable) [size=8K]
        I/O ports at c080 [size=32]
        Memory at 800480000000 (64-bit, prefetchable) [size=2G]
        Expansion ROM at e4060000 [disabled] [size=64K]
        Kernel driver in use: qxl

So that's mapped at an address beyond the host's phys-bits. And it hasn't
failed/crashed etc. - but I guess maybe nothing is actually using that 2G
space?

If I change the phys-bits=48 to 46 the kernel avoids it:
    [    0.414867] acpi PNP0A08:00: host bridge window [0x800480000000-0x8004ffffffff] (ignored, not CPU addressable)
    [    0.683134] pci 0000:00:02.0: can't claim BAR 4 [mem 0x800480000000-0x8004ffffffff 64bit pref]: no compatible bridge window
    [    0.703948] pci 0000:00:02.0: BAR 4: [mem size 0x80000000 64bit pref] conflicts with PCI mem [mem 0x00000000-0x3fffffffffff]
    [    0.703951] pci 0000:00:02.0: BAR 4: failed to assign [mem size 0x80000000 64bit pref]

lspci shows:
    Memory at <ignored> (64-bit, prefetchable)

(Although, interestingly, qemu's 'info pci' still shows it.)

The 'ignored, not CPU addressable' message comes from
acpi_pci_root_validate_resources() in the kernel's drivers/acpi/pci_root.c,
which checks against a limit set in arch/x86/kernel/setup.c:
    iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1;
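
(x86_phys_bits itself is read from CPUID at boot: leaf 0x80000008 reports the
physical address width in EAX bits 7:0. A small userspace equivalent, using
GCC's cpuid.h:)

    #include <stdio.h>
    #include <cpuid.h>

    int main(void)
    {
        unsigned int eax, ebx, ecx, edx;
        /* CPUID leaf 0x80000008: EAX[7:0] = physical address bits */
        if (__get_cpuid(0x80000008, &eax, &ebx, &ecx, &edx))
            printf("physical address bits: %u\n", eax & 0xff);
        return 0;
    }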

So at least the Linux kernel does sanity-check the mapping against the
phys_bits value.
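
The shape of that check is roughly the following - a paraphrase of the
validation, not the kernel's exact code:

    #include <stdint.h>
    #include <stdbool.h>

    /* A host bridge window whose end lies beyond what the CPU can
     * address gets dropped with "ignored, not CPU addressable". */
    static bool window_cpu_addressable(uint64_t window_end, int phys_bits)
    {
        uint64_t iomem_end = (1ULL << phys_bits) - 1;
        return window_end <= iomem_end;
    }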

Obviously 128T is a bit silly for maxmem at the moment; what I was worrying
about is what happens with 36/39/40-bit hosts, where it's not unusual to pick
a maxmem that's a few TB even if the VMs you're initially creating are only a
handful of GB. (oVirt/RHEV seems to use a 4TB default for maxmem.)
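
To put numbers on that (my own back-of-the-envelope arithmetic, using the
same highest-set-bit calculation as above): with a 4T maxmem the 64-bit PCI
window starts above roughly 4G of low RAM plus the 4T hotplug area, so any
BAR placed there already needs 43 physical address bits - beyond a
36/39/40-bit host:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        /* rough window start: ~4G of RAM plus a 4T hotplug area */
        uint64_t window_start = (4ULL << 30) + (4ULL << 40);
        int bits = 64 - __builtin_clzll(window_start); /* highest set bit + 1 */
        printf("BARs above 0x%llx need >= %d phys bits\n",
               (unsigned long long)window_start, bits);
        return 0;
    }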

Still, this only becomes a problem if you hit the combination of:
   a) you use large PCI BARs,
   b) on a 36/39/40-bit host (there's a quick check below), and
   c) with a large maxmem that forces those PCI BARs up to something silly.
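
A quick way to check (b) on the host - /proc/cpuinfo reports the
CPUID-derived widths:

    grep -m1 'address sizes' /proc/cpuinfo

which prints something like "address sizes : 46 bits physical, 48 bits
virtual" on the Xeon above.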

Dave

> 
> cheers,
>   Gerd
> 
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


