Re: [PATCH v1 0/4] vfio: report NUMA nodes for device memory


From: Cédric Le Goater
Subject: Re: [PATCH v1 0/4] vfio: report NUMA nodes for device memory
Date: Fri, 15 Sep 2023 16:19:29 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0

Hello Ankit,

On 9/15/23 04:45, ankita@nvidia.com wrote:
From: Ankit Agrawal <ankita@nvidia.com>

For devices which allow the CPU to access their memory cache-coherently,
it is sensible to expose such memory as NUMA nodes separate from
the sysmem node. QEMU currently does not provide a mechanism for creating
NUMA nodes associated with a vfio-pci device.

Implement a mechanism to create and associate a set of unique NUMA nodes
with a vfio-pci device.

The NUMA nodes are created by inserting a series of unique proximity
domains (PXM) into the VM SRAT ACPI table. The kernel reads the ACPI tables
once at boot time to determine the NUMA configuration, which cannot be
changed afterwards. Hence this feature is incompatible with
device hotplug. The added node range associated with the device is
communicated through ACPI DSD and can be fetched by the VM kernel or
kernel modules. QEMU's VM SRAT and DSD builder code is modified
accordingly.
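
As a rough illustration of what one SRAT entry per proximity domain looks like, the Python sketch below packs the 40-byte Memory Affinity Structure as laid out in the ACPI specification. The helper names, the zero base/length placeholder, and the choice of flags are assumptions made for illustration; they are not taken from the patch series.

```python
import struct

# ACPI SRAT Memory Affinity Structure, little-endian, 40 bytes:
# type(1) length(1) proximity_domain(4) reserved(2)
# base_address(8) length(8) reserved(4) flags(4) reserved(8)
SRAT_MEM_AFFINITY_FMT = "<BBIHQQII8x"
SRAT_TYPE_MEMORY = 1
SRAT_MEM_ENABLED = 1 << 0  # flags bit 0: entry is enabled


def srat_memory_affinity(pxm, base, length, flags=SRAT_MEM_ENABLED):
    """Pack one Memory Affinity Structure for proximity domain `pxm`."""
    size = struct.calcsize(SRAT_MEM_AFFINITY_FMT)
    return struct.pack(SRAT_MEM_AFFINITY_FMT,
                       SRAT_TYPE_MEMORY, size, pxm, 0,
                       base, length, 0, flags)


def device_node_entries(first_pxm, count):
    """Hypothetical helper: one placeholder entry per device PXM.

    Assumption: base and length are left at zero, since the guest is
    expected to learn the actual device memory range by other means.
    """
    return [srat_memory_affinity(first_pxm + i, 0, 0) for i in range(count)]
```

Calling `device_node_entries(2, 8)` would yield eight entries covering PXMs 2 to 9, which a guest kernel would then see as additional nodes at boot.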

New command line parameters are introduced to give the admin control over
the NUMA node assignment.

This approach seems to bypass the NUMA framework in place in QEMU and
will be a challenge for the upper layers. QEMU is generally used from
libvirt when dealing with KVM guests.

Typically, a command line for a virt machine with NUMA nodes would look
like:

  -object memory-backend-ram,id=ram-node0,size=1G \
  -numa node,nodeid=0,memdev=ram-node0 \
  -object memory-backend-ram,id=ram-node1,size=1G \
  -numa node,nodeid=1,cpus=0-3,memdev=ram-node1

which defines 2 nodes, one with memory and all CPUs and a second with
only memory.

  # numactl -H
  available: 2 nodes (0-1)
  node 0 cpus: 0 1 2 3
  node 0 size: 1003 MB
  node 0 free: 734 MB
  node 1 cpus:
  node 1 size: 975 MB
  node 1 free: 968 MB
  node distances:
  node   0   1
    0:  10  20
    1:  20  10
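
To show how an upper layer might generate such a command line, here is a minimal Python sketch that builds the `-object`/`-numa` argument pairs from a list of node sizes. The function name and its parameters are hypothetical; only the option syntax is taken from the example above.

```python
def numa_args(node_sizes, cpus_by_node=None):
    """Build QEMU -object/-numa arguments, one memory backend per node.

    node_sizes:   list of size strings, indexed by node id (e.g. "1G")
    cpus_by_node: optional dict mapping node id to a CPU range string
    """
    cpus_by_node = cpus_by_node or {}
    args = []
    for nodeid, size in enumerate(node_sizes):
        memdev = f"ram-node{nodeid}"
        args += ["-object", f"memory-backend-ram,id={memdev},size={size}"]
        numa = f"node,nodeid={nodeid}"
        if nodeid in cpus_by_node:
            # Nodes without an entry here are memory-only.
            numa += f",cpus={cpus_by_node[nodeid]}"
        numa += f",memdev={memdev}"
        args += ["-numa", numa]
    return args


if __name__ == "__main__":
    # Reproduces the two-node example: all CPUs on one node.
    print(" \\\n  ".join(
        " ".join(pair) for pair in zip(*[iter(numa_args(["1G", "1G"], {1: "0-3"}))] * 2)
    ))
```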

Could it be a new type of host memory backend? Have you considered
this approach?

Thanks,

C.


The vfio-pci driver is expected to expose this feature through
sysfs. The presence of the feature is checked before enabling these code changes.

Applied over v8.1.0-rc4.

Ankit Agrawal (4):
   vfio: new command line params for device memory NUMA nodes
   vfio: assign default values to node params
   hw/arm/virt-acpi-build: patch guest SRAT for NUMA nodes
   acpi/gpex: patch guest DSDT for dev mem information

  hw/arm/virt-acpi-build.c    |  54 +++++++++++++
  hw/pci-host/gpex-acpi.c     |  69 +++++++++++++++++
  hw/vfio/pci.c               | 146 ++++++++++++++++++++++++++++++++++++
  hw/vfio/pci.h               |   2 +
  include/hw/pci/pci_device.h |   3 +
  5 files changed, 274 insertions(+)




