Re: [Qemu-devel] A question about PCI device address spaces


From: David Gibson
Subject: Re: [Qemu-devel] A question about PCI device address spaces
Date: Fri, 23 Dec 2016 11:02:28 +1100
User-agent: Mutt/1.7.1 (2016-10-04)

On Thu, Dec 22, 2016 at 05:42:40PM +0800, Peter Xu wrote:
> Hello,
> 
> Since this is a general topic, I picked it out from the VT-d
> discussion and put it here, just to make it clearer.
> 
> The issue is whether we have exposed too much address space to
> emulated PCI devices.
> 
> Now, each PCI device has PCIDevice::bus_master_as as its
> device-visible address space, which is derived from
> pci_device_iommu_address_space():
> 
> AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
> {
>     PCIBus *bus = PCI_BUS(dev->bus);
>     PCIBus *iommu_bus = bus;
> 
>     while(iommu_bus && !iommu_bus->iommu_fn && iommu_bus->parent_dev) {
>         iommu_bus = PCI_BUS(iommu_bus->parent_dev->bus);
>     }
>     if (iommu_bus && iommu_bus->iommu_fn) {
>         return iommu_bus->iommu_fn(bus, iommu_bus->iommu_opaque, dev->devfn);
>     }
>     return &address_space_memory;
> }
> 
> By default (in the no-IOMMU case), it points to the system memory
> space, which includes MMIO, and that looks wrong - a PCI device
> should not be able to write to MMIO regions.

Sorry, I've realized my earlier comments were a bit misleading.

I'm pretty sure the inbound (==DMA) window(s) will be less than the
full 64-bit address space.  However, that doesn't necessarily mean
they won't cover *any* MMIO.

Plus, of course, any MMIO that's provided by PCI (or legacy ISA)
devices - and on the PC platform, that's nearly everything - will also
be visible in PCI space, since it doesn't need to go through the
inbound window for that at all.  Strictly speaking, PCI-provided MMIO
need not appear at the same address in PCI space as it does in the
system memory space, but on PC it does: by platform convention the
outbound windows are also identity mappings.

Part of the reason I was misleading was that I was thinking of non-PC
platforms, which often have more "native" MMIO devices on the CPU side
of the PCI host bridge.

> As an example, if we dump a PCI device address space in detail on an
> x86_64 system, we can see (this is the address space for a
> virtio-net-pci device on a Q35 machine with 6G memory):
> 
>     0000000000000000-000000000009ffff (prio 0, RW): pc.ram
>     00000000000a0000-00000000000affff (prio 1, RW): vga.vram
>     00000000000b0000-00000000000bffff (prio 1, RW): vga-lowmem
>     00000000000c0000-00000000000c9fff (prio 0, RW): pc.ram
>     00000000000ca000-00000000000ccfff (prio 0, RW): pc.ram
>     00000000000cd000-00000000000ebfff (prio 0, RW): pc.ram
>     00000000000ec000-00000000000effff (prio 0, RW): pc.ram
>     00000000000f0000-00000000000fffff (prio 0, RW): pc.ram
>     0000000000100000-000000007fffffff (prio 0, RW): pc.ram
>     00000000b0000000-00000000bfffffff (prio 0, RW): pcie-mmcfg-mmio
>     00000000fd000000-00000000fdffffff (prio 1, RW): vga.vram
>     00000000fe000000-00000000fe000fff (prio 0, RW): virtio-pci-common
>     00000000fe001000-00000000fe001fff (prio 0, RW): virtio-pci-isr
>     00000000fe002000-00000000fe002fff (prio 0, RW): virtio-pci-device
>     00000000fe003000-00000000fe003fff (prio 0, RW): virtio-pci-notify
>     00000000febd0400-00000000febd041f (prio 0, RW): vga ioports remapped
>     00000000febd0500-00000000febd0515 (prio 0, RW): bochs dispi interface
>     00000000febd0600-00000000febd0607 (prio 0, RW): qemu extended regs
>     00000000febd1000-00000000febd102f (prio 0, RW): msix-table
>     00000000febd1800-00000000febd1807 (prio 0, RW): msix-pba
>     00000000febd2000-00000000febd2fff (prio 1, RW): ahci
>     00000000fec00000-00000000fec00fff (prio 0, RW): kvm-ioapic
>     00000000fed00000-00000000fed003ff (prio 0, RW): hpet
>     00000000fed1c000-00000000fed1ffff (prio 1, RW): lpc-rcrb-mmio
>     00000000fee00000-00000000feefffff (prio 4096, RW): kvm-apic-msi
>     00000000fffc0000-00000000ffffffff (prio 0, R-): pc.bios
>     0000000100000000-00000001ffffffff (prio 0, RW): pc.ram
> 
> So, are the "pc.ram" regions here the only ones that we should
> expose to PCI devices? (the exposed space should contain all of
> them, including the low-mem ones and the >=4g one)
> 
> And, should this rule work for all platforms? Or, say, would it be a
> problem if I directly changed address_space_memory in
> pci_device_iommu_address_space() into something else that only
> contains RAM? (of course this won't affect any platform that has an
> IOMMU, i.e., a customized PCIBus::iommu_fn)

No, the arrangement of both inbound and outbound windows is certainly
platform dependent (strictly speaking, dependent on the model and
configuration of the host bridge, but that tends to be tied strongly
to the platform).  I think address_space_memory is the closest
approximation we're going to get that works for multiple platforms -
having both inbound and outbound windows identity mapped is pretty
common, I believe, even if they don't strictly speaking cover the
whole address space.
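
For reference, the per-platform hook Peter mentions is the one
installed with pci_setup_iommu(); a platform that wants to give its
devices a narrower DMA view returns its own AddressSpace from there
instead of falling back to address_space_memory.  A rough sketch only,
with invented names like "my_dma_root" and "my_host_bridge_init" -
this mirrors what the IOMMU-capable platforms do, it is not code taken
from any of them:

    static MemoryRegion my_dma_root;
    static MemoryRegion my_ram_alias;
    static AddressSpace my_dma_as;

    static AddressSpace *my_iommu_fn(PCIBus *bus, void *opaque, int devfn)
    {
        /* Every device behind this host bridge gets the same
         * restricted view. */
        return &my_dma_as;
    }

    static void my_host_bridge_init(PCIBus *bus, MemoryRegion *ram)
    {
        /* Root region containing only what DMA is allowed to reach;
         * here just an alias of RAM mapped at offset 0. */
        memory_region_init(&my_dma_root, NULL, "my-dma-root", UINT64_MAX);
        memory_region_init_alias(&my_ram_alias, NULL, "my-dma-ram",
                                 ram, 0, memory_region_size(ram));
        memory_region_add_subregion(&my_dma_root, 0, &my_ram_alias);
        address_space_init(&my_dma_as, &my_dma_root, "my-dma-as");

        /* From now on pci_device_iommu_address_space() returns
         * &my_dma_as for devices on this bus instead of
         * &address_space_memory. */
        pci_setup_iommu(bus, my_iommu_fn, NULL);
    }

But, as above, which regions such an AddressSpace ought to contain is
a per-platform question.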

> (btw, I'd appreciate it if anyone has a quick answer on why we have
>  lots of contiguous "pc.ram" regions in the low 2g range - from
>  can_merge() I guess they seem to have different dirty_log_mask,
>  romd_mode, etc., but I'd still like to know why they have these
>  differences. Anyway, this is totally an "optional question" just to
>  satisfy my own curiosity :)

I don't know PC well enough to be sure, but I suspect those low
regions have special meaning for the BIOS.

Note also the large gap between the pc.ram at 1M..2G and 4G..up.  This
is the so-called "memory hole".  You'll notice that all the IO regions
are in that range - that's for backwards compatibility with
32-bit machines where there was obviously nowhere else to put them.
Many 64-bit native platforms (including PAPR) don't have such a thing
and instead have RAM contiguous at 0 and the IO well above 4G in CPU
address space.
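
Concretely, for the 6G guest in the dump above - and assuming the
usual Q35 arrangement of at most 2G of RAM below the hole, which the
dump looks consistent with - the split works out roughly as:

    /* Worked example only; the 2G low-RAM limit is an assumption
     * about the machine's configuration, not read from the code. */
    uint64_t ram_size = 6ULL << 30;          /* -m 6G                 */
    uint64_t lowmem   = 2ULL << 30;          /* RAM below the hole    */
    uint64_t highmem  = ram_size - lowmem;   /* 4G, remapped above 4G */
    /* => pc.ram filling 0..0x7fffffff (minus the legacy carve-outs
     *    below 1M) plus 0x100000000..0x1ffffffff, as in the dump;
     *    0x80000000..0xffffffff is the hole where the IO lives. */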

The PC PCI host bridge must clearly have an outgoing IO window from
2G..4G (mapping to the same addresses in PCI space) to handle these
devices.  I'm pretty sure there must also be another window much
higher up, to handle 64-bit PCI devices with really big BARs (which
you probably don't have any of on your example system).
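
In QEMU's memory API terms, such an identity-mapped outbound window is
usually just an alias of part of the PCI memory space mapped into the
system memory space at the same offset.  A rough sketch only -
"host_bridge" and "pci_memory" below are invented names, not fields of
any particular host bridge model:

    /* Illustrative identity-mapped outbound (CPU -> PCI) window for
     * the 2G..4G hole; not taken from a real host bridge model. */
    MemoryRegion *pci_window = g_new(MemoryRegion, 1);

    /* Alias the 2G..4G chunk of the PCI memory space... */
    memory_region_init_alias(pci_window, OBJECT(host_bridge),
                             "pci-hole", pci_memory,
                             0x80000000ULL, 0x80000000ULL);

    /* ...and map it at the same guest-physical offset, so CPU
     * accesses in 2G..4G are forwarded out to PCI unchanged. */
    memory_region_add_subregion(get_system_memory(), 0x80000000ULL,
                                pci_window);

A non-identity window would just use a different offset in the alias
than in the add_subregion() call, and the higher 64-bit window would
be a second alias placed above the top of RAM.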

What I don't know is whether the 2G..4G range in PCI space will be
specifically excluded from the incoming (DMA) windows on the host
bridge.  It might be that it is, or it might just be that the host
bridge will forward things to the CPU bus only if they don't get
picked up by a device BAR first.  And I guess it's further complicated
by the fact that on PCI-E "up-bound" and "down-bound" transactions can
be distinguished, and the fact that at least some PCI-to-PCI or
PCIe-to-PCI bridges also have configurable inbound and outbound
windows.  I'm not sure if that includes the implicit bridges in PCIe
root ports or switch ports.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
