qemu-devel

Re: [PATCH] hostmem-file: add the 'hmem' option


From: Igor Mammedov
Subject: Re: [PATCH] hostmem-file: add the 'hmem' option
Date: Mon, 16 Dec 2024 15:40:05 +0100

On Tue, 10 Dec 2024 21:51:40 +0000
"Luo, Zhigang" <Zhigang.Luo@amd.com> wrote:

> [AMD Official Use Only - AMD Internal Distribution Only]
> 
> > -----Original Message-----
> > From: David Hildenbrand <david@redhat.com>
> > Sent: Tuesday, December 10, 2024 2:55 PM
> > To: Luo, Zhigang <Zhigang.Luo@amd.com>; qemu-devel@nongnu.org
> > Cc: kraxel@redhat.com; Igor Mammedov <imammedo@redhat.com>
> > Subject: Re: [PATCH] hostmem-file: add the 'hmem' option
> >
> > On 10.12.24 20:32, Luo, Zhigang wrote:  
> > > Hi David,
> > >  
> >
> > Hi,
> >  
> > >>>
> > >>> Thanks for your comments.
> > >>> Let me give you some background for this patch.
> > >>> I am currently engaged in a project that requires passing
> > >>> EFI_MEMORY_SP (Special Purpose Memory) type memory from the host to a
> > >>> virtual machine within QEMU. This memory needs to be EFI_MEMORY_SP
> > >>> type in the virtual machine as well.
> > >>> This particular memory type is essential for the functionality of my
> > >>> project.
> > >>
> > >> Which exact guest memory will be backed by this memory? All
> > >> guest-memory?
> > > [Luo, Zhigang] Not all guest memory. Only the memory reserved for a
> > > specific device.
> >
> > Can you show me an example QEMU cmdline, and how you would pass that
> > hostmem-file object to the device?
> >  
> [Luo, Zhigang] The following is an example. m1 is the reserved memory for
> PCI device "0000:03:00.0". Both the memory and the PCI device are assigned
> to the same NUMA node.
> 
> -object memory-backend-ram,size=8G,id=m0 \
> -object memory-backend-file,size=16G,id=m1,mem-path=/dev/dax0.0,prealloc=on,align=1G,hmem=on \
> -numa node,nodeid=0,memdev=m0 -numa node,nodeid=1,memdev=m1 \
> -device pxb-pcie,id=pcie.1,numa_node=1,bus_nr=2,bus=pcie.0 \
> -device ioh3420,id=pcie_port1,bus=pcie.1,chassis=1 \
> -device vfio-pci,host=0000:03:00.0,id=hostdev0,bus=pcie_port1

Is /dev/dax0.0 part of the host device 0000:03:00.0 that you pass through to
the guest using vfio?



> 
> > >  
> > >>
> > >> And, what is the guest OS going to do with this memory?  
> > > [Luo, Zhigang] the device driver in guest will use this reserved memory.  
> >
> > Okay, so just like CXL memory.
> >  
> > >  
> > >>
> > >> Usually, this SP memory (dax, cxl, ...) is not used as boot memory.
> > >> Like on a bare metal system, one would expect that only CXL memory
> > >> will be marked as special and put aside to the cxl driver, such that
> > >> the OS can boot on ordinary DIMMs, such that cxl can online it etc.
> > >>
> > >> So maybe you would want to expose this memory using CXL-mem device to
> > >> the VM? Or a DIMM?
> > >>
> > >> I assume the alternative is to tell the VM on the Linux kernel
> > >> cmdline to set EFI_MEMORY_SP on this memory. I recall that there is a
> > >> way to achieve that.
> > >>
> > > [Luo, Zhigang] I know this option, but it requires the end user to know
> > > where the memory is located on the guest side (start address, size).
> >
> > Right.
> >  
> > >
> > >  
> > >>> In Linux, the SPM memory will be claimed by the hmem-dax driver by
> > >>> default. With this patch I can use the following config to pass the
> > >>> SPM memory to the guest VM.
> > >>> -object memory-backend-file,size=30G,id=m1,mem-path=/dev/dax0.0,prealloc=on,align=1G,hmem=on
> > >>>
> > >>> I was thinking of changing the option name from "hmem" to "spm" to
> > >>> avoid confusion.
> > >>
> > >> Likely it should be specified elsewhere, that you want specific guest
> > >> RAM ranges to be EFI_MEMORY_SP. For a DIMM, it could be a property,
> > >> similarly maybe for CXL-mem devices (no expert on that).
> > >>
> > >> For boot memory / machine memory it could be a machine property. But
> > >> I'll first have to learn which ranges you actually want to expose
> > >> that way, and what the VM will do with that information.  
> > > [Luo, Zhigang] We want to expose the SPM memory reserved for a specific
> > > device. We will pass the SPM memory and the device to the guest. Then
> > > the device driver can use the SPM memory on the guest side.
> >
> > Then the device driver should likely have a way to configure that, not the
> > memory backend.
> >
> > After all, the device driver will map it somehow into guest physical
> > address space (how?).
> >  
> [Luo, Zhigang] From the guest's point of view, it is still system memory,
> but marked as SPM. So QEMU will map the memory into the guest physical
> address space. The device driver just claims the SPM memory on the guest side.
> 
> > >  
> > >>  
> > >>>
> > >>> Do you have any suggestions for achieving this in a more reasonable way?
> > >>
> > >> The problem with qemu_ram_foreach_block() is that you would indicate
> > >> also DIMMs, virtio-mem, ... and even RAMBlocks that are not even used
> > >> for backing anything to the VM as EFI_MEMORY_SP, which is wrong.  
> > > [Luo, Zhigang] qemu_ram_foreach_block() will list all memory blocks, but
> > > in pc_update_hmem_memory(), only the memory blocks with the "hmem" flag
> > > will be updated to SPM memory.
> >
> > Yes, but imagine a user passing such a memory backend to a
> > DIMM/virtio-mem/boot memory etc. It will have very undesired side effects.
> >  
> [Luo, Zhigang] The user should know what he/she is doing when setting the
> flag for the memory region.
> 
> 
> > --
> > Cheers,
> >
> > David / dhildenb  
> 



