
Re: [Qemu-devel] Questions about vNVDIMM on qemu/KVM


From: Yasunori Goto
Subject: Re: [Qemu-devel] Questions about vNVDIMM on qemu/KVM
Date: Wed, 06 Jun 2018 10:44:45 +0900

> Hi,
>  
> > 
> > > I'm investigating the status of vNVDIMM on qemu/KVM,
> > > and I have some questions about it. I would be glad if anyone could answer them.
> > > 
> > > In my understanding, qemu/KVM can already expose an NFIT to the guest,
> > > and this patch set will further update it with platform capability information.
> > > https://lists.gnu.org/archive/html/qemu-devel/2018-05/msg04756.html
> > > 
> > > And libvirt also supports this feature with <memory model='nvdimm'>
> > > https://libvirt.org/formatdomain.html#elementsMemory
> > > 
> > > 
> > > However, virtio-pmem is now under development, and it is a better fit
> > > for architectures that must detect NVDIMM regions without ACPI (like s390x).
> > > In addition, it is also needed to flush guest contents on a vNVDIMM
> > > that has a backend file.
> > > 
> > > 
> > > Q1) Does the ACPI.NFIT bus of qemu/kvm remain with virtio-pmem?
>         No.
> 
> > >     What role will each of them play if both NFIT and virtio-pmem are
> > >     available?
> 
>         There are two main use cases:
> 
>         1] DAX memory region pass-through to guest:
>         -------------------------------------------
>         As this region is present in an actual physical NVDIMM device and exposed to the guest,
>         the ACPI/NFIT way is used. If all the persistent memory is used only this way, we
>         don't need 'virtio-pmem'.
> 
>         2] Emulated DAX memory region in host passed to guest:
>         --------------------------------------------------------
>         If this type of region is exposed to the guest, it will be preferable to use
>         'virtio-pmem'.
> 
>         This is regular host memory which is mmapped into the guest address space to emulate
>         persistent memory. Guest writes are present in the host page cache and are not assured
>         to be written to the backing disk without an explicit flush/sync call.
>  
>         'virtio-pmem' will solve the problem of flushing guest writes present in the host page cache.
>         Host filesystems that use journaling (like ext4, xfs) automatically call 'fsync' at regular
>         intervals, but there is still no 100% assurance that all writes are persistent until an
>         explicit flush is done from the guest. So we need an additional fsync to flush guest writes
>         to the backing disk. We are using this approach to avoid using the guest page cache and to
>         keep page cache management of all the guests on the host side.
>         
>         If both ACPI NFIT and virtio-pmem are present, each will have its own memory regions,
>         identified by the memory type "Persistent shared Memory" for virtio-pmem and
>         "Persistent Memory" for ACPI NVDIMM. This distinguishes the two memory types.

Ok.
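
To make the flush requirement in case 2] concrete, here is a minimal sketch in
Python. It is not QEMU or virtio-pmem code; it only illustrates, from an
ordinary process, that a store into a shared file mapping reaches the page
cache first and needs an explicit msync/fsync before it can be assumed to be
on the backing disk. The function name write_and_flush and the backing file
path are hypothetical, chosen just for this illustration.

```python
import mmap
import os
import tempfile


def write_and_flush(path: str, payload: bytes, size: int = 4096) -> bytes:
    """Store payload through a shared file mapping, then flush it explicitly.

    Illustration only: this mimics the "file-backed memory" situation with a
    plain file and mmap; it does not involve a guest or virtio-pmem.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, size)  # give the backing file a fixed size
        with mmap.mmap(fd, size, access=mmap.ACCESS_WRITE) as mm:
            # The store below only dirties pages in the page cache.
            mm[:len(payload)] = payload
            # Write the mapping's dirty pages back to the file (msync on Unix).
            mm.flush()
        # Ask the kernel to commit the file's data to the backing device.
        os.fsync(fd)
        # Read back through the file descriptor to confirm the contents.
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, len(payload))
    finally:
        os.close(fd)


if __name__ == "__main__":
    backing = os.path.join(tempfile.mkdtemp(), "backing.img")
    print(write_and_flush(backing, b"guest data"))
```

Without the flush and fsync calls the data would usually still reach the disk
eventually through writeback, but the writer would have no point at which
persistence is guaranteed. That gap is what an explicit flush request from
the guest is meant to close.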

> 
> > >     If my understanding is correct, both NFIT and virtio-pmem are used to
> > >     detect vNVDIMM regions, but only one seems to be necessary....
> > > 
> > >     Otherwise, is the NFIT bus kept just for compatibility,
> > >     and virtio-pmem the promising way?
> > > 
> > >     
> > > Q2) What bus is (will be?) created for virtio-pmem?
> > >     I could confirm the NFIT bus is created with <memory model='nvdimm'>,
> 
>         For virtio-pmem it is also the nvdimm bus.
> 
> > >     and I heard another bus will be created for virtio-pmem, but I could not
> > >     find concretely which bus is created.
> > >     ---
> > >       # ndctl list -B
> > >       {
> > >          "provider":"ACPI.NFIT",
> > >          "dev":"ndbus0"
> > >       }
> > >     ---
> > >    
> > >     I think it affects what operations the user will be able to perform, and
> > >     what notifications are necessary for vNVDIMM.
> > >     ACPI defines some operations like namespace control, and notifications
> > >     for NVDIMM health status and others.
> > >     (I suppose that other status notifications might be necessary for vNVDIMM,
> > >      but I'm not sure yet...)
> 
>          For virtio-pmem, we are not providing advanced features like namespaces and
>          various other features which ACPI/NVDIMM hardware provides. This is just to
>          keep the paravirt device simple.

Hmm, I see.
Thank you for your explanation.
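
For reference, the bus check from Q2 can be scripted. The following is a
minimal sketch in Python that parses the JSON printed by "ndctl list -B" and
maps each bus device to its provider. The function name bus_providers is
hypothetical, and the handling of a JSON array (for more than one bus) is an
assumption on my side; only the single-object form shown above was actually
observed. What provider string a virtio-pmem bus would report is not known
yet, so the sample uses only the ACPI.NFIT output.

```python
import json


def bus_providers(ndctl_json: str) -> dict:
    """Map bus device name -> provider from `ndctl list -B` JSON text.

    Assumption: the output is either a single JSON object (as seen above)
    or a JSON array of such objects when several buses exist.
    """
    data = json.loads(ndctl_json)
    buses = data if isinstance(data, list) else [data]
    return {bus["dev"]: bus["provider"] for bus in buses}


if __name__ == "__main__":
    # Sample taken from the `ndctl list -B` output quoted in this thread.
    sample = '{ "provider":"ACPI.NFIT", "dev":"ndbus0" }'
    print(bus_providers(sample))
```

The text can come from running ndctl in the guest and capturing its standard
output; the sketch deliberately takes a string so that it does not depend on
ndctl being installed.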

Bye.
---
Yasunori Goto


> 
>          Moreover, I have not yet looked at the ndctl side of things. I am not 100% sure
>          how ndctl will handle 'virtio-pmem'.
> 
>         Adding 'Dan' to the loop; he can add his thoughts.
> 
> Thanks,
> Pankaj
> 
