Re: [PATCH v5 03/18] pci: isolated address space for PCI bus

From: Jag Raman
Subject: Re: [PATCH v5 03/18] pci: isolated address space for PCI bus
Date: Thu, 27 Jan 2022 17:43:46 +0000
> On Jan 26, 2022, at 1:13 PM, Dr. David Alan Gilbert <dgilbert@redhat.com>
> wrote:
>
> * Jag Raman (jag.raman@oracle.com) wrote:
>>
>>
>>> On Jan 25, 2022, at 1:38 PM, Dr. David Alan Gilbert <dgilbert@redhat.com>
>>> wrote:
>>>
>>> * Jag Raman (jag.raman@oracle.com) wrote:
>>>>
>>>>
>>>>> On Jan 19, 2022, at 7:12 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>
>>>>> On Wed, Jan 19, 2022 at 04:41:52PM -0500, Jagannathan Raman wrote:
>>>>>> Allow PCI buses to be part of isolated CPU address spaces. This has a
>>>>>> niche usage.
>>>>>>
>>>>>> TYPE_REMOTE_MACHINE allows multiple VMs to house their PCI devices in
>>>>>> the same machine/server. This would cause address space collision as
>>>>>> well as be a security vulnerability. Having separate address spaces for
>>>>>> each PCI bus would solve this problem.
>>>>>
>>>>> Fascinating, but I am not sure I understand. Any examples?
>>>>
>>>> Hi Michael!
>>>>
>>>> multiprocess QEMU and vfio-user implement a client-server model to allow
>>>> out-of-process emulation of devices. The client QEMU, which makes ioctls
>>>> to the kernel and runs VCPUs, could attach devices running in a server
>>>> QEMU. The server QEMU needs access to parts of the client’s RAM to
>>>> perform DMA.
>>>
>>> Do you ever have the opposite problem? i.e. when an emulated PCI device
>>
>> That’s an interesting question.
>>
>>> exposes a chunk of RAM-like space (frame buffer, or maybe a mapped file)
>>> that the client can see. What happens if two emulated devices need to
>>> access each other's emulated address space?
>>
>> In this case, the kernel driver would map the destination’s chunk of
>> internal RAM into the DMA space of the source device. Then the source
>> device could write to that mapped address range, and the IOMMU should
>> direct those writes to the destination device.
>
> Are all devices mappable like that?
If there is an IOMMU that supports DMA-Remapping (DMAR), that would be
possible - the kernel could configure the DMAR to facilitate such a mapping.
If there is no DMAR, the kernel/CPU could buffer the writes between devices.
--
Jag
>
>> I would like to take a closer look at the IOMMU implementation on how
>> to achieve this, and get back to you. I think the IOMMU would handle
>> this. Could you please point me to the IOMMU implementation you have
>> in mind?
>
> I didn't have one in mind; I was just hitting a similar problem on
> Virtiofs DAX.
>
> Dave
>
>> Thank you!
>> --
>> Jag
>>
>>>
>>> Dave
>>>
>>>> In the case where multiple clients attach devices that are running on the
>>>> same server, we need to ensure that each device has isolated memory
>>>> ranges. This ensures that the memory space of one device is not visible
>>>> to other devices in the same server.
>>>>
>>>>>
>>>>> I also wonder whether this special type could be modelled like a special
>>>>> kind of iommu internally.
>>>>
>>>> Could you please provide some more details on the design?
>>>>
>>>>>
>>>>>> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
>>>>>> Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
>>>>>> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
>>>>>> ---
>>>>>> include/hw/pci/pci.h | 2 ++
>>>>>> include/hw/pci/pci_bus.h | 17 +++++++++++++++++
>>>>>> hw/pci/pci.c | 17 +++++++++++++++++
>>>>>> hw/pci/pci_bridge.c | 5 +++++
>>>>>> 4 files changed, 41 insertions(+)
>>>>>>
>>>>>> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
>>>>>> index 023abc0f79..9bb4472abc 100644
>>>>>> --- a/include/hw/pci/pci.h
>>>>>> +++ b/include/hw/pci/pci.h
>>>>>> @@ -387,6 +387,8 @@ void pci_device_save(PCIDevice *s, QEMUFile *f);
>>>>>> int pci_device_load(PCIDevice *s, QEMUFile *f);
>>>>>> MemoryRegion *pci_address_space(PCIDevice *dev);
>>>>>> MemoryRegion *pci_address_space_io(PCIDevice *dev);
>>>>>> +AddressSpace *pci_isol_as_mem(PCIDevice *dev);
>>>>>> +AddressSpace *pci_isol_as_io(PCIDevice *dev);
>>>>>>
>>>>>> /*
>>>>>> * Should not normally be used by devices. For use by sPAPR target
>>>>>> diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h
>>>>>> index 347440d42c..d78258e79e 100644
>>>>>> --- a/include/hw/pci/pci_bus.h
>>>>>> +++ b/include/hw/pci/pci_bus.h
>>>>>> @@ -39,9 +39,26 @@ struct PCIBus {
>>>>>> void *irq_opaque;
>>>>>> PCIDevice *devices[PCI_SLOT_MAX * PCI_FUNC_MAX];
>>>>>> PCIDevice *parent_dev;
>>>>>> +
>>>>>> MemoryRegion *address_space_mem;
>>>>>> MemoryRegion *address_space_io;
>>>>>>
>>>>>> + /**
>>>>>> + * Isolated address spaces - these allow the PCI bus to be part
>>>>>> + * of an isolated address space as opposed to the global
>>>>>> + * address_space_memory & address_space_io.
>>>>>
>>>>> Are you sure address_space_memory & address_space_io are
>>>>> always global? even in the case of an iommu?
>>>>
>>>> On the CPU side of the Root Complex, I believe address_space_memory
>>>> & address_space_io are global.
>>>>
>>>> In the vfio-user case, devices on the same machine (TYPE_REMOTE_MACHINE)
>>>> could be attached to different client VMs. Each client would have its
>>>> own address space for its CPUs. With isolated address spaces, we ensure
>>>> that the devices see the address space of the CPUs they’re attached to.
>>>>
>>>> Not sure if it’s OK to share weblinks in this mailing list, please
>>>> let me know if that’s not preferred. But I’m referring to the
>>>> terminology used in the following block diagram:
>>>> https://en.wikipedia.org/wiki/Root_complex#/media/File:Example_PCI_Express_Topology.svg
>>>>
>>>>>
>>>>>> This allows the
>>>>>> + * bus to be attached to CPUs from different machines. The
>>>>>> + * following is not used commonly.
>>>>>> + *
>>>>>> + * TYPE_REMOTE_MACHINE allows emulating devices from multiple
>>>>>> + * VM clients,
>>>>>
>>>>> what are VM clients?
>>>>
>>>> It’s the client in the client-server model explained above.
>>>>
>>>> Thank you!
>>>> --
>>>> Jag
>>>>
>>>>>
>>>>>> as such it needs the PCI buses in the same machine
>>>>>> + * to be part of different CPU address spaces. The following is
>>>>>> + * useful in that scenario.
>>>>>> + *
>>>>>> + */
>>>>>> + AddressSpace *isol_as_mem;
>>>>>> + AddressSpace *isol_as_io;
>>>>>> +
>>>>>> QLIST_HEAD(, PCIBus) child; /* this will be replaced by qdev later */
>>>>>> QLIST_ENTRY(PCIBus) sibling;/* this will be replaced by qdev later */
>>>>>>
>>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>>>> index 5d30f9ca60..d5f1c6c421 100644
>>>>>> --- a/hw/pci/pci.c
>>>>>> +++ b/hw/pci/pci.c
>>>>>> @@ -442,6 +442,8 @@ static void pci_root_bus_internal_init(PCIBus *bus,
>>>>>> DeviceState *parent,
>>>>>> bus->slot_reserved_mask = 0x0;
>>>>>> bus->address_space_mem = address_space_mem;
>>>>>> bus->address_space_io = address_space_io;
>>>>>> + bus->isol_as_mem = NULL;
>>>>>> + bus->isol_as_io = NULL;
>>>>>> bus->flags |= PCI_BUS_IS_ROOT;
>>>>>>
>>>>>> /* host bridge */
>>>>>> @@ -2676,6 +2678,16 @@ MemoryRegion *pci_address_space_io(PCIDevice *dev)
>>>>>> return pci_get_bus(dev)->address_space_io;
>>>>>> }
>>>>>>
>>>>>> +AddressSpace *pci_isol_as_mem(PCIDevice *dev)
>>>>>> +{
>>>>>> + return pci_get_bus(dev)->isol_as_mem;
>>>>>> +}
>>>>>> +
>>>>>> +AddressSpace *pci_isol_as_io(PCIDevice *dev)
>>>>>> +{
>>>>>> + return pci_get_bus(dev)->isol_as_io;
>>>>>> +}
>>>>>> +
>>>>>> static void pci_device_class_init(ObjectClass *klass, void *data)
>>>>>> {
>>>>>> DeviceClass *k = DEVICE_CLASS(klass);
>>>>>> @@ -2699,6 +2711,7 @@ static void pci_device_class_base_init(ObjectClass
>>>>>> *klass, void *data)
>>>>>>
>>>>>> AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
>>>>>> {
>>>>>> + AddressSpace *iommu_as = NULL;
>>>>>> PCIBus *bus = pci_get_bus(dev);
>>>>>> PCIBus *iommu_bus = bus;
>>>>>> uint8_t devfn = dev->devfn;
>>>>>> @@ -2745,6 +2758,10 @@ AddressSpace
>>>>>> *pci_device_iommu_address_space(PCIDevice *dev)
>>>>>> if (!pci_bus_bypass_iommu(bus) && iommu_bus && iommu_bus->iommu_fn) {
>>>>>> return iommu_bus->iommu_fn(bus, iommu_bus->iommu_opaque, devfn);
>>>>>> }
>>>>>> + iommu_as = pci_isol_as_mem(dev);
>>>>>> + if (iommu_as) {
>>>>>> + return iommu_as;
>>>>>> + }
>>>>>> return &address_space_memory;
>>>>>> }
>>>>>>
>>>>>> diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
>>>>>> index da34c8ebcd..98366768d2 100644
>>>>>> --- a/hw/pci/pci_bridge.c
>>>>>> +++ b/hw/pci/pci_bridge.c
>>>>>> @@ -383,6 +383,11 @@ void pci_bridge_initfn(PCIDevice *dev, const char
>>>>>> *typename)
>>>>>> sec_bus->address_space_io = &br->address_space_io;
>>>>>> memory_region_init(&br->address_space_io, OBJECT(br), "pci_bridge_io",
>>>>>> 4 * GiB);
>>>>>> +
>>>>>> + /* This PCI bridge puts the sec_bus in its parent's address space */
>>>>>> + sec_bus->isol_as_mem = pci_isol_as_mem(dev);
>>>>>> + sec_bus->isol_as_io = pci_isol_as_io(dev);
>>>>>> +
>>>>>> br->windows = pci_bridge_region_init(br);
>>>>>> QLIST_INIT(&sec_bus->child);
>>>>>> QLIST_INSERT_HEAD(&parent->child, sec_bus, sibling);
>>>>>> --
>>>>>> 2.20.1
>>>>
>>> --
>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>