
From: Xiao Guangrong
Subject: Re: [Qemu-devel] [RFC PATCH 0/4] nvdimm: enable flush hint address structure
Date: Thu, 6 Apr 2017 19:46:01 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0



On 04/06/2017 05:58 PM, Haozhong Zhang wrote:
On 04/06/17 17:39 +0800, Xiao Guangrong wrote:


On 31/03/2017 4:41 PM, Haozhong Zhang wrote:
This patch series constructs the flush hint address structures for
nvdimm devices in QEMU.

It's of course not for 2.9. I'm sending it out early in order to get
comments on one point I'm uncertain about (see the detailed
explanation below). Thanks in advance for any comments!


Background
---------------
Flush hint address structure is a substructure of NFIT and specifies
one or more addresses, namely Flush Hint Addresses. Software can write
to any one of these flush hint addresses to cause any preceding writes
to the NVDIMM region to be flushed out of the intervening platform
buffers to the targeted NVDIMM. More details can be found in ACPI Spec
6.1, Section 5.2.25.8 "Flush Hint Address Structure".


Why is it RFC?
---------------
It's an RFC because I'm not sure whether the way this patch series
allocates the guest flush hint addresses is right.

QEMU needs to trap guest accesses (at least writes) to the flush
hint addresses in order to perform the necessary flush on the host
backing store. Therefore, QEMU needs to create IO memory regions that
cover those flush hint addresses. In order to create those IO memory
regions, QEMU needs to know the flush hint addresses, or their offsets
from other known memory regions, in advance. So far, so good.

Flush hint addresses are in the guest address space. Looking at how
the current NVDIMM ACPI code in QEMU allocates the DSM buffer, it
would be natural to handle flush hint addresses the same way, i.e. let
the guest firmware allocate them from free addresses and patch them
into the flush hint address structure. (*Please correct me if my
understanding below is wrong.*)
However, the current allocation and pointer patching are transparent
to QEMU, so QEMU would be unaware of the flush hint addresses, and
consequently would have no way to create the corresponding IO memory
regions in order to trap guest accesses.

Er, that is awkward, and the flush hint table is static, so it may
not be easily patched.


Alternatively, this patch series moves the allocation of flush hint
addresses to QEMU:

1. (Patch 1) We reserve an address range after the end address of each
   nvdimm device. Its size is specified by the user via a new pc-dimm
   option 'reserved-size'.


Should we make it work only for nvdimm?


Yes, we can check whether the machine option 'nvdimm' is present when
plugging the nvdimm.

   For the following example,
        -object memory-backend-file,id=mem0,size=4G,...
        -device nvdimm,id=dimm0,memdev=mem0,reserved-size=4K,...
        -device pc-dimm,id=dimm1,...
   if dimm0 is allocated to address N ~ N+4G, the address of dimm1
   will start from N+4G+4K or higher. N+4G ~ N+4G+4K is reserved for
   dimm0.

2. (Patch 4) When NVDIMM ACPI code builds the flush hint address
   structure for each nvdimm device, it will allocate them from the
   above reserved area, e.g. the flush hint addresses of above dimm0
   are allocated in N+4G ~ N+4G+4K. The addresses are known to QEMU in
   this way, so QEMU can easily create IO memory regions for them.

   If the reserved area is not present or too small, QEMU will report
   an error.


Should we make 'reserved-size' always page-aligned and transparent to
the user, i.e., automatically reserve 4K if 'flush-hint' is
specified?


4K alignment is already enforced by the current memory hotplug code.

Regarding the automatic reservation: is a non-zero default value
acceptable under QEMU design conventions in general?

There's no need to make it a user-visible parameter; a field in the
pc-dimm or nvdimm device struct that indicates the reserved size
would be enough.




