Re: [Qemu-devel] KVM "fake DAX" flushing interface - discussion


From: Dan Williams
Subject: Re: [Qemu-devel] KVM "fake DAX" flushing interface - discussion
Date: Wed, 1 Nov 2017 08:20:06 -0700

> On 11/01/2017 12:25 PM, Dan Williams wrote:
[..]
>> It's not persistent memory if it requires a hypercall to make it
>> persistent. Unless memory writes can be made durable purely with cpu
>> instructions it's dangerous for it to be treated as a PMEM range.
>> Consider a guest that tried to map it with device-dax which has no
>> facility to route requests to a special flushing interface.
>>
>
> Can we separate the concept of the flush interface from persistent memory?
> Say there are two APIs: one indicates the memory type (i.e.,
> /proc/iomem) and the other indicates the flush interface.
>
> So for existing nvdimm hardware:
> 1: Persist-memory + CLFLUSH
> 2: Persist-memory + flush-hint-table (I know Intel does not use it)
>
> and for the virtual nvdimm which is backed by normal storage:
> Persist-memory + virtual flush interface

I see the flush interface as fundamental to identifying the media
properties. It's not byte-addressable persistent memory if the
application needs to call a sideband interface to manage writes. This
is why we have pushed for something like the MAP_SYNC interface to
make filesystem-dax actually behave in a way that applications can
safely treat it as persistent memory, and this is also the guarantee
that device-dax provides. Changing the flush interface makes it
distinct and unusable for applications that want to manage data
persistence in userspace.
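
To make the distinction concrete, here is a minimal userspace sketch, assuming the MAP_SYNC and MAP_SHARED_VALIDATE mmap flags as they exist in mainline Linux. The struct pmem_map type and the map_file() helper are hypothetical names used only for illustration. The point is that an application may rely on CPU instructions alone for durability only when the MAP_SYNC mapping succeeds; otherwise it has to fall back to a sideband call such as msync().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fallbacks for older libc headers; values are from the asm-generic
 * UAPI headers (used by x86 among others). */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

/* Hypothetical helper type: a mapping plus how it may be persisted. */
struct pmem_map {
    void *addr;        /* NULL if both mapping attempts failed (errno set) */
    int cpu_flush_ok;  /* 1: CPU cache flushes suffice; 0: msync() required */
};

/* Try a MAP_SYNC mapping first, then fall back to a plain shared mapping. */
static struct pmem_map map_file(int fd, size_t len)
{
    struct pmem_map m = { NULL, 0 };
    void *p;

    assert(len > 0);

    /* MAP_SHARED_VALIDATE makes the kernel reject flags it cannot honor,
     * so a successful return means MAP_SYNC is really in effect. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
    if (p != MAP_FAILED) {
        m.addr = p;
        m.cpu_flush_ok = 1;
        return m;
    }

    /* No MAP_SYNC here: writes must be persisted through msync()/fsync(). */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p != MAP_FAILED)
        m.addr = p;
    return m;
}
```

A caller would check cpu_flush_ok before choosing its persistence path. A range that needs a host-side flush belongs in the second category, which is why it should not be presented to applications as plain persistent memory.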

>>>
>>>> In what way is this "more complicated"? It was trivial to add support
>>>> for the "volatile" NFIT range; this will not be any more complicated
>>>> than that.
>>>>
>>>
>>> Introducing a memory type is easy indeed; however, a new flush interface
>>> definition is inevitable, i.e., we need a standard way to discover the
>>> MMIO regions used to communicate with the host.
>>
>>
>> Right, the proposed way to do that for x86 platforms is a new SPA
>> Range GUID type in the NFIT.
>>
>
> So this SPA is used for both the persistent memory region and the flush
> interface? Maybe I missed it in previous mails; could you please detail
> how to do it?

Yes, the GUID will specifically identify this range as "Virtio Shared
Memory" (or whatever name survives after a bikeshed debate). The
libnvdimm core then needs to grow a new region type that mostly
behaves the same as a "pmem" region, but drivers/nvdimm/pmem.c grows a
new flush interface to perform the host communication. Device-dax
would be disallowed from attaching to this region type, or we could
grow a new device-dax type that does not allow the raw device to be
mapped, but allows a filesystem mounted on top to manage the flush
interface.
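
As a rough model of that split (not actual libnvdimm code), the sketch below shows one way the region type could gate both the flush path and device-dax attachment. Every name in it (enum region_type, struct region, device_dax_may_attach(), region_flush(), counting_host_flush()) is hypothetical, and the host communication is reduced to a callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical region types; the names are illustrative only. */
enum region_type { REGION_PMEM, REGION_VIRTIO_SHMEM };

struct region {
    enum region_type type;
    int (*host_flush)(void *ctx);  /* used only by REGION_VIRTIO_SHMEM */
    void *ctx;                     /* opaque argument for host_flush */
};

/* Device-dax hands the raw range to userspace, which has no way to reach
 * host_flush, so only true pmem regions may be attached. */
static bool device_dax_may_attach(const struct region *r)
{
    assert(r != NULL);
    return r->type == REGION_PMEM;
}

/* Flush path a pmem-like driver would take when asked to make data durable. */
static int region_flush(const struct region *r)
{
    assert(r != NULL);
    switch (r->type) {
    case REGION_PMEM:
        /* Real pmem: CPU cache flushes are enough; modelled as a no-op. */
        return 0;
    case REGION_VIRTIO_SHMEM:
        /* Host-backed range: durability requires asking the host. */
        return r->host_flush ? r->host_flush(r->ctx) : -1;
    }
    return -1;
}

/* Stand-in for the host communication: counts how often it was invoked. */
static int counting_host_flush(void *ctx)
{
    ++*(int *)ctx;
    return 0;
}
```

With a region of type REGION_VIRTIO_SHMEM, region_flush() goes through the callback and device_dax_may_attach() returns false; a REGION_PMEM region never calls the host.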

> BTW, please note that a hypercall is not acceptable for a standard; MMIO/PIO
> regions are. (Oh, yes, it depends on Paolo. :))

MMIO/PIO regions work for me; that's not the part of the proposal I'm
concerned about.


