Re: [Qemu-devel] KVM "fake DAX" flushing interface - discussion


From: David Hildenbrand
Subject: Re: [Qemu-devel] KVM "fake DAX" flushing interface - discussion
Date: Thu, 18 Jan 2018 17:53:03 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2

On 24.11.2017 13:40, Pankaj Gupta wrote:
> 
> Hello,
> 
> Thank you all for all the useful suggestions.
> I want to summarize the discussions so far in the
> thread. Please see below:
> 
>>>>
>>>>> We can go with the "best" interface for what
>>>>> could be a relatively slow flush (fsync on a
>>>>> file on ssd/disk on the host), which requires
>>>>> that the flushing task wait on completion
>>>>> asynchronously.
>>>>
>>>>
>>>> I'd like to clarify the interface of "wait on completion
>>>> asynchronously" and KVM async page fault a bit more.
>>>>
>>>> The current design of async page fault only works on RAM rather
>>>> than MMIO, i.e., if the page fault is caused by accessing the
>>>> device memory of an emulated device, it needs to go to userspace
>>>> (QEMU), which emulates the operation in the vCPU's thread.
>>>>
>>>> As I mentioned before, the memory region used for the vNVDIMM
>>>> flush interface should be MMIO, and considering its support on
>>>> other hypervisors, we had better push this async mechanism into
>>>> the flush interface design itself rather than depend on KVM
>>>> async-page-fault.
>>>
>>> I would expect this interface to be virtio-ring based to queue flush
>>> requests asynchronously to the host.
>>
>> Could we reuse the virtio-blk device, only with a different device id?
> 
> As per previous discussions, there were suggestions on two main parts of the
> project:
> 
> 1] Expose vNVDIMM memory range to KVM guest.
> 
>    - Add a flag in the ACPI NFIT table for this new memory type. Do we need
>      NVDIMM spec changes for this?
> 
>    - The guest should be able to add this memory to its system memory map. The
>      name of the added memory in '/proc/iomem' should be different (shared
>      memory?) from persistent memory, as it does not satisfy the exact
>      definition of persistent memory (it requires an explicit flush).
> 
>    - The guest should not allow 'device-dax' and other fancy features which
>      are not virtualization friendly.
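
On the '/proc/iomem' naming point, a minimal guest-side sketch of what this
could look like - the resource name and helper below are placeholders I made
up, not an agreed-upon interface:

#include <linux/device.h>
#include <linux/ioport.h>

/* Placeholder name so the range shows up in /proc/iomem as something
 * other than "Persistent Memory" (it needs an explicit flush). */
#define SHARED_PMEM_RES_NAME "Shared Persistent Memory"

static int claim_shared_pmem(struct device *dev, struct resource *res)
{
        if (!devm_request_mem_region(dev, res->start, resource_size(res),
                                     SHARED_PMEM_RES_NAME))
                return -EBUSY;
        return 0;
}
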
> 
> 2] Flushing interface to persist guest changes.
> 
>    - As per the suggestion by ChristophH (CCed), we explored options other
>      than virtio, like MMIO etc. Most of these options are not use-case
>      friendly, as we want to do an fsync on a file on an ssd/disk on the host
>      and we cannot make guest vCPUs wait for that time.
> 
>    - Adding a new driver (virtio-pmem) looks like repeated work and is not
>      needed, so we can go with the existing pmem driver and add a flush
>      specific to this new memory type.
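
To make the asynchronous flush above a bit more concrete, here is a rough
guest-side sketch of a flush that goes over a virtqueue instead of CPU cache
flush instructions, so the flushing task just sleeps while the host does the
slow fsync. The request layout and all names are assumptions for illustration,
not a proposed ABI:

#include <linux/completion.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>
#include <linux/virtio.h>

/* Hypothetical request layout - made up for illustration only. */
struct pmem_flush_req {
        __le32 type;            /* e.g. 0 == FLUSH */
        __le32 ret;             /* filled in by the host */
        struct completion done; /* completed from the vq callback */
};

static int pmem_flush(struct virtqueue *vq)
{
        struct scatterlist sg_out, sg_in, *sgs[2];
        struct pmem_flush_req *req;
        int err;

        req = kmalloc(sizeof(*req), GFP_KERNEL);
        if (!req)
                return -ENOMEM;
        req->type = cpu_to_le32(0);
        init_completion(&req->done);

        sg_init_one(&sg_out, &req->type, sizeof(req->type));
        sg_init_one(&sg_in, &req->ret, sizeof(req->ret));
        sgs[0] = &sg_out;
        sgs[1] = &sg_in;

        /* One out buffer (the request), one in buffer (host's result). */
        err = virtqueue_add_sgs(vq, sgs, 1, 1, req, GFP_KERNEL);
        if (!err) {
                virtqueue_kick(vq);
                /* The vCPU is not stuck in the hypervisor; this task just
                 * sleeps until the host has fsync'ed the backing file and
                 * the vq callback calls complete(&req->done). */
                wait_for_completion(&req->done);
                err = le32_to_cpu(req->ret) ? -EIO : 0;
        }
        kfree(req);
        return err;
}
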

I'd like to emphasize again that I would prefer a virtio-pmem-only
solution.

There are architectures out there (e.g. s390x) that don't support
NVDIMMs - there is no HW interface to expose any such stuff.

However, with virtio-pmem, we could also make it work on architectures
that don't have ACPI and friends.
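
To illustrate: with a virtio transport the guest could discover the region
entirely from virtio config space, with no NFIT/ACPI involved. A minimal
sketch - the config layout below is my assumption, not a spec:

#include <linux/virtio.h>
#include <linux/virtio_config.h>

/* Assumed config layout - not a spec. */
struct virtio_pmem_config {
        __le64 start;   /* guest physical address of the region */
        __le64 size;    /* length of the region in bytes */
};

static void virtio_pmem_get_region(struct virtio_device *vdev,
                                   u64 *pa, u64 *len)
{
        /* Everything comes from virtio config space, so this works the
         * same on s390x, or ARM without ACPI, etc. */
        virtio_cread(vdev, struct virtio_pmem_config, start, pa);
        virtio_cread(vdev, struct virtio_pmem_config, size, len);
}
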

> 
>    - The suggestion by Paolo & Stefan (previously) to use virtio-blk makes
>      sense if we just want a flush vehicle to send guest commands to the host
>      and get a reply after asynchronous execution. There was a previous
>      discussion [1] with Rik & Dan on this.
> 
>     [1] https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg08373.html 
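
Whatever the transport ends up being, the host-side requirement is simply that
the fsync runs outside the vCPU thread and the request is completed afterwards.
A generic POSIX sketch (not QEMU's actual AIO/thread-pool code; all names are
placeholders):

#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical per-request context; in a real device model this would
 * point back at the virtio element that has to be completed. */
struct flush_job {
        int fd;                                  /* backing file on ssd/disk */
        void (*complete)(void *opaque, int ret); /* pushes the virtio reply */
        void *opaque;
};

/* Runs in a worker thread, so the vCPU thread never blocks on storage. */
static void *flush_worker(void *arg)
{
        struct flush_job *job = arg;
        int ret = fsync(job->fd);                /* may take a long time */

        job->complete(job->opaque, ret);
        free(job);
        return NULL;
}

/* Called when a guest flush request is popped from the queue. */
static int submit_flush(struct flush_job *job)
{
        pthread_t tid;

        if (pthread_create(&tid, NULL, flush_worker, job))
                return -1;
        return pthread_detach(tid);
}
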
> 
> Is my understanding correct here?
> 
> Thanks,
> Pankaj  
>  
> 


-- 

Thanks,

David / dhildenb


