RE: [PATCH] hostmem-file: add the 'hmem' option
From: Luo, Zhigang
Subject: RE: [PATCH] hostmem-file: add the 'hmem' option
Date: Tue, 10 Dec 2024 19:32:42 +0000
[AMD Official Use Only - AMD Internal Distribution Only]
Hi David,
Please check my comments inline.
Thanks,
Zhigang
> -----Original Message-----
> From: David Hildenbrand <david@redhat.com>
> Sent: Monday, December 9, 2024 4:11 PM
> To: Luo, Zhigang <Zhigang.Luo@amd.com>; qemu-devel@nongnu.org
> Cc: kraxel@redhat.com; Igor Mammedov <imammedo@redhat.com>
> Subject: Re: [PATCH] hostmem-file: add the 'hmem' option
>
> On 06.12.24 18:58, Luo, Zhigang wrote:
> > [AMD Official Use Only - AMD Internal Distribution Only]
> >
> > Hi David,
>
> Hi!
>
> >
> > Thanks for your comments.
> > Let me give you some background for this patch.
> > I am currently engaged in a project that requires passing EFI_MEMORY_SP
> > (Special Purpose Memory) type memory from the host to a virtual machine
> > within QEMU. This memory needs to be EFI_MEMORY_SP type in the virtual
> > machine as well.
> > This particular memory type is essential for the functionality of my
> > project.
>
> Which exact guest memory will be backed by this memory? All guest-memory?
[Luo, Zhigang] Not all guest memory, only the memory reserved for a specific
device.
>
> And, what is the guest OS going to do with this memory?
[Luo, Zhigang] The device driver in the guest will use this reserved memory.
>
> Usually, this SP memory (dax, cxl, ...) is not used as boot memory. Like on a
> bare metal system, one would expect that only CXL memory will be marked as
> special and put aside to the cxl driver, such that the OS can boot on ordinary
> DIMMs, such that cxl can online it etc.
>
> So maybe you would want to expose this memory using CXL-mem device to the
> VM? Or a DIMM?
>
> I assume the alternative is to tell the VM on the Linux kernel cmdline to set
> EFI_MEMORY_SP on this memory. I recall that there is a way to achieve that.
>
[Luo, Zhigang] I know this option, but it requires the end user to know where
the memory is located on the guest side (start address, size).
> > In Linux, the SPM memory will be claimed by the hmem-dax driver by default.
> > With this patch I can use the following config to pass the SPM memory to
> > the guest VM.
> > -object memory-backend-file,size=30G,id=m1,mem-path=/dev/dax0.0,prealloc=on,align=1G,hmem=on
> >
> > I was thinking of changing the option name from "hmem" to "spm" to avoid
> > confusion.
>
> Likely it should be specified elsewhere, that you want specific guest RAM
> ranges to be EFI_MEMORY_SP. For a DIMM, it could be a property, similarly
> maybe for CXL-mem devices (no expert on that).
>
> For boot memory / machine memory it could be a machine property. But I'll
> first have to learn which ranges you actually want to expose that way, and
> what the VM will do with that information.
[Luo, Zhigang] We want to expose the SPM memory reserved for a specific device.
We will pass both the SPM memory and the device to the guest, so that the
device driver can use the SPM memory on the guest side.
>
> >
> > Do you have any suggestions for achieving this in a more reasonable way?
>
> The problem with qemu_ram_foreach_block() is that you would indicate also
> DIMMs, virtio-mem, ... and even RAMBlocks that are not even used for backing
> anything to the VM as EFI_MEMORY_SP, which is wrong.
[Luo, Zhigang] qemu_ram_foreach_block() lists all memory blocks, but in
pc_update_hmem_memory() only the memory blocks with the "hmem" flag are marked
as SPM memory.
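A minimal standalone sketch of that filtering, for illustration only. It does
not use the QEMU headers: MockRAMBlock, MOCK_RAM_HMEM, mock_ram_foreach_block()
and mock_update_hmem() are hypothetical stand-ins for RAMBlock, RAM_HMEM,
qemu_ram_foreach_block() and pc_update_hmem_memory(), and the address
translation from the patch is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the QEMU definitions. */
#define MOCK_RAM_HMEM           (1u << 0)
#define MOCK_E820_SOFT_RESERVED 0xEFFFFFFFu  /* value taken from the patch */
#define MOCK_E820_MAX           8

typedef struct MockRAMBlock {
    const char *name;
    uint64_t offset;
    uint64_t used_length;
    uint32_t flags;
} MockRAMBlock;

typedef struct MockE820Entry {
    uint64_t address;
    uint64_t length;
    uint32_t type;
} MockE820Entry;

typedef struct MockE820Table {
    MockE820Entry entries[MOCK_E820_MAX];
    size_t count;
} MockE820Table;

typedef int (*MockRAMBlockIterFunc)(const MockRAMBlock *rb, void *opaque);

/* Visit every block; stop early if the callback returns non-zero. */
int mock_ram_foreach_block(const MockRAMBlock *blocks, size_t n,
                           MockRAMBlockIterFunc func, void *opaque)
{
    for (size_t i = 0; i < n; i++) {
        int ret = func(&blocks[i], opaque);
        if (ret) {
            return ret;
        }
    }
    return 0;
}

/* Callback: record a soft-reserved entry only for blocks with the flag set. */
int mock_update_hmem(const MockRAMBlock *rb, void *opaque)
{
    MockE820Table *table = opaque;

    if (!(rb->flags & MOCK_RAM_HMEM)) {
        return 0;  /* block is visited, but left untouched */
    }

    assert(table->count < MOCK_E820_MAX);
    table->entries[table->count++] = (MockE820Entry){
        .address = rb->offset,
        .length = rb->used_length,
        .type = MOCK_E820_SOFT_RESERVED,
    };
    return 0;
}
```

With three blocks of which only one carries the flag, the table ends up with a
single entry; the other two blocks are visited but skipped.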
>
> >
> > Thanks,
> > Zhigang
> >
> > -----Original Message-----
> > From: David Hildenbrand <david@redhat.com>
> > Sent: Friday, December 6, 2024 5:08 AM
> > To: Luo, Zhigang <Zhigang.Luo@amd.com>; qemu-devel@nongnu.org
> > Cc: kraxel@redhat.com; Igor Mammedov <imammedo@redhat.com>
> > Subject: Re: [PATCH] hostmem-file: add the 'hmem' option
> >
> > On 04.12.24 18:11, Zhigang Luo wrote:
> >> This boolean option 'hmem' allows users to set a memory region from
> >> memory-backend-file as heterogeneous memory. If 'hmem=on', QEMU will
> >> set the flag RAM_HMEM in the RAM block of the corresponding memory
> >> region and set the e820 type to E820_SOFT_RESERVED for this region.
> >>
> >
> > Hi,
> >
> > ./scripts/get_maintainer.pl is your friend to figure out whom to CC on
> > patches.
> >
> > In general: not a fan. You seem to be abusing memory backend properties
> > + RAM flags to merely modify how memory is going to be exposed in the
> > memory map on x86.
> >
> > It's not even clear why heterogeneous memory should be exposed like
> > that, and how reasonable it is to essentially expose all of guest RAM
> > as E820_SOFT_RESERVED.
> >
> >
> > Note that the whole "pmem=on" case was very different, because it
> > required mmap() modifications.
> >
> >> Signed-off-by: Zhigang Luo <Zhigang.Luo@amd.com>
> >> ---
> >> backends/hostmem-file.c | 23 +++++++++++++++++++++++
> >> hw/i386/e820_memory_layout.h | 1 +
> >> hw/i386/pc.c | 16 ++++++++++++++++
> >> include/exec/cpu-common.h | 1 +
> >> include/exec/memory.h | 3 +++
> >> qapi/qom.json | 4 ++++
> >> system/physmem.c | 7 ++++++-
> >> 7 files changed, 54 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
> >> index 7e5072e33e..5ddfdbaf86 100644
> >> --- a/backends/hostmem-file.c
> >> +++ b/backends/hostmem-file.c
> >> @@ -32,6 +32,7 @@ struct HostMemoryBackendFile {
> >> uint64_t offset;
> >> bool discard_data;
> >> bool is_pmem;
> >> + bool is_hmem;
> >> bool readonly;
> >> OnOffAuto rom;
> >> };
> >> @@ -88,6 +89,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
> >> ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> >> ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
> >> ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
> >> + ram_flags |= fb->is_hmem ? RAM_HMEM : 0;
> >> ram_flags |= RAM_NAMED_FILE;
> >> return memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
> >> backend->size, fb->align, ram_flags,
> >> @@ -256,6 +258,25 @@ static void file_memory_backend_set_rom(Object *obj, Visitor *v,
> >> visit_type_OnOffAuto(v, name, &fb->rom, errp);
> >> }
> >>
> >> +static bool file_memory_backend_get_hmem(Object *o, Error **errp)
> >> +{
> >> + return MEMORY_BACKEND_FILE(o)->is_hmem;
> >> +}
> >> +
> >> +static void file_memory_backend_set_hmem(Object *o, bool value, Error **errp)
> >> +{
> >> + HostMemoryBackend *backend = MEMORY_BACKEND(o);
> >> + HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
> >> +
> >> + if (host_memory_backend_mr_inited(backend)) {
> >> + error_setg(errp, "cannot change property 'hmem' of %s.",
> >> + object_get_typename(o));
> >> + return;
> >> + }
> >> +
> >> + fb->is_hmem = value;
> >> +}
> >> +
> >> static void file_backend_unparent(Object *obj)
> >> {
> >> HostMemoryBackend *backend = MEMORY_BACKEND(obj);
> >> @@ -295,6 +316,8 @@ file_backend_class_init(ObjectClass *oc, void *data)
> >> object_class_property_add_bool(oc, "pmem",
> >> file_memory_backend_get_pmem, file_memory_backend_set_pmem);
> >> #endif
> >> + object_class_property_add_bool(oc, "hmem",
> >> + file_memory_backend_get_hmem, file_memory_backend_set_hmem);
> >> object_class_property_add_bool(oc, "readonly",
> >> file_memory_backend_get_readonly,
> >> file_memory_backend_set_readonly);
> >> diff --git a/hw/i386/e820_memory_layout.h b/hw/i386/e820_memory_layout.h
> >> index b50acfa201..8af6a9cfac 100644
> >> --- a/hw/i386/e820_memory_layout.h
> >> +++ b/hw/i386/e820_memory_layout.h
> >> @@ -15,6 +15,7 @@
> >> #define E820_ACPI 3
> >> #define E820_NVS 4
> >> #define E820_UNUSABLE 5
> >> +#define E820_SOFT_RESERVED 0xEFFFFFFF
> >>
> >> struct e820_entry {
> >> uint64_t address;
> >> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> >> index 317aaca25a..41e9cc276c 100644
> >> --- a/hw/i386/pc.c
> >> +++ b/hw/i386/pc.c
> >> @@ -785,6 +785,21 @@ static hwaddr pc_max_used_gpa(PCMachineState *pcms, uint64_t pci_hole64_size)
> >> return pc_above_4g_end(pcms) - 1;
> >> }
> >>
> >> +static int pc_update_hmem_memory(RAMBlock *rb, void *opaque)
> >> +{
> >> + X86MachineState *x86ms = opaque;
> >> + ram_addr_t offset;
> >> + ram_addr_t length;
> >> +
> >> + if (qemu_ram_is_hmem(rb)) {
> >> + offset = qemu_ram_get_offset(rb) + (0x100000000ULL - x86ms->below_4g_mem_size);
> >> + length = qemu_ram_get_used_length(rb);
> >> + e820_add_entry(offset, length, E820_SOFT_RESERVED);
> >> + }
> >
> > I am pretty sure this will break in NUMA setups, where we have
> > multiple memory backends mapped in different locations.
> >
> > The whole "(0x100000000ULL - x86ms->below_4g_mem_size)" looks hacky.
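To make the arithmetic in the quoted hunk concrete, here is a small sketch. It
assumes a simplified model that is not taken from the patch: one contiguous RAM
block whose first below_4g_mem_size bytes are mapped at guest physical address
0 and whose remainder is mapped starting at 4 GiB. sketch_offset_to_gpa() is a
hypothetical helper name; its second branch is the expression from the hunk.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_FOUR_GIB 0x100000000ULL

/*
 * Assumed model (illustrative only): a single contiguous RAM block that is
 * split around the 4 GiB boundary. Offsets below below_4g_mem_size map 1:1;
 * offsets at or above it are shifted up so that they start at 4 GiB.
 */
uint64_t sketch_offset_to_gpa(uint64_t ram_offset, uint64_t below_4g_mem_size)
{
    assert(below_4g_mem_size <= SKETCH_FOUR_GIB);

    if (ram_offset < below_4g_mem_size) {
        return ram_offset;
    }
    /* Same expression as in the hunk: offset + (4 GiB - below_4g_mem_size). */
    return ram_offset + (SKETCH_FOUR_GIB - below_4g_mem_size);
}
```

With below_4g_mem_size = 2 GiB, offset 0x80000000 lands at 0x100000000 and
offset 0xC0000000 lands at 0x140000000. With several backends mapped at
different guest addresses, as in the NUMA case raised above, this single-block
model may no longer describe where a block actually ends up.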
> >
> > --
> > Cheers,
> >
> > David / dhildenb
> >
>
>
> --
> Cheers,
>
> David / dhildenb