[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v2 08/18] nvdimm: init backend memory mapping an
From: |
Igor Mammedov |
Subject: |
Re: [Qemu-devel] [PATCH v2 08/18] nvdimm: init backend memory mapping and config data area |
Date: |
Thu, 10 Sep 2015 12:35:13 +0200 |
On Tue, 8 Sep 2015 21:38:17 +0800
Xiao Guangrong <address@hidden> wrote:
>
>
> On 09/07/2015 10:11 PM, Igor Mammedov wrote:
> > On Fri, 14 Aug 2015 22:52:01 +0800
> > Xiao Guangrong <address@hidden> wrote:
> >
> >> The parameter @file is used as backed memory for NVDIMM which is
> >> divided into two parts if @dataconfig is true:
> >> - first parts is (0, size - 128K], which is used as PMEM (Persistent
> >> Memory)
> >> - 128K at the end of the file, which is used as Config Data Area, it's
> >> used to store Label namespace data
> >>
> >> The @file supports both regular file and block device, of course we
> >> can assign any these two kinds of files for test and emulation, however,
> >> in the real word for performance reason, we usually used these files as
> >> NVDIMM backed file:
> >> - the regular file in the filesystem with DAX enabled created on NVDIMM
> >> device on host
> >> - the raw PMEM device on host, e,g /dev/pmem0
> >
> > A lot of code in this series could reuse what QEMU already
> > uses for implementing pc-dimm devices.
> >
> > here is common concepts that could be reused.
> > - on physical system both DIMM and NVDIMM devices use
> > the same slots. We could share QEMU's '-m slots' option between
> > both devices. An alternative to not sharing would be to introduce
> > '-machine nvdimm_slots' option.
> > And yes, we need to know number of NVDIMMs to describe
> > them all in ACPI table (taking in amount future hotplug
> > include in this possible NVDIMM devices)
> > I'd go the same way as on real hardware on make them share the same
> > slots.
>
> I'd prefer sharing slots for pc-dimm and nvdimm, it's easier to reuse the
> logic of slot-assignment and plug/unplug.
>
> > - they share the same physical address space and limits
> > on how much memory system can handle. So I'd suggest sharing existing
> > '-m maxmem' option and reuse hotplug_memory address space.
>
> Sounds good to me.
>
> >
> > Essentially what I'm suggesting is to inherit NVDIMM's implementation
> > from pc-dimm reusing all of its code/backends and
> > just override parts that do memory mapping into guest's address space to
> > accommodate NVDIMM's requirements.
>
> Good idea!
>
> We have to differentiate pc-dimm and nvdimm in the common code and nvdimm
> has different points with pc-dimm (for example, its has reserved-region, and
> need support live migration of label data). How about rename 'pc-nvdimm' to
> 'memory-device' and make it as a common device type, then build pc-dimm and
> nvdimm on top of it?
sound good, only I'd call it just 'dimm' as 'memory-device' is too broad.
Also I'd make base class abstract.
>
> Something like:
> static TypeInfo memory_device_info = {
> .name = TYPE_MEM_DEV,
> .parent = TYPE_DEVICE,
> };
>
> static TypeInfo memory_device_info = {
> .name = TYPE_PC_DIMM,
> .parent = TYPE_MEM_DEV,
> };
>
> static TypeInfo memory_device_info = {
> .name = TYPE_NVDIMM,
> .parent = TYPE_MEM_DEV,
> };
>
> It also make CONIFG_NVDIMM and CONFIG_HOT_PLUG be independent.
>
> >
> >>
> >> Signed-off-by: Xiao Guangrong <address@hidden>
> >> ---
> >> hw/mem/nvdimm/pc-nvdimm.c | 109
> >> ++++++++++++++++++++++++++++++++++++++++++++-
> >> include/hw/mem/pc-nvdimm.h | 7 +++
> >> 2 files changed, 115 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/hw/mem/nvdimm/pc-nvdimm.c b/hw/mem/nvdimm/pc-nvdimm.c
> >> index 7a270a8..97710d1 100644
> >> --- a/hw/mem/nvdimm/pc-nvdimm.c
> >> +++ b/hw/mem/nvdimm/pc-nvdimm.c
> >> @@ -22,12 +22,20 @@
> >> * License along with this library; if not, see
> >> <http://www.gnu.org/licenses/>
> >> */
> >>
> >> +#include <sys/mman.h>
> >> +#include <sys/ioctl.h>
> >> +#include <linux/fs.h>
> >> +
> >> +#include "exec/address-spaces.h"
> >> #include "hw/mem/pc-nvdimm.h"
> >>
> >> -#define PAGE_SIZE (1UL << 12)
> >> +#define PAGE_SIZE (1UL << 12)
> >> +
> >> +#define MIN_CONFIG_DATA_SIZE (128 << 10)
> >>
> >> static struct nvdimms_info {
> >> ram_addr_t current_addr;
> >> + int device_index;
> >> } nvdimms_info;
> >>
> >> /* the address range [offset, ~0ULL) is reserved for NVDIMM. */
> >> @@ -37,6 +45,26 @@ void pc_nvdimm_reserve_range(ram_addr_t offset)
> >> nvdimms_info.current_addr = offset;
> >> }
> >>
> >> +static ram_addr_t reserved_range_push(uint64_t size)
> >> +{
> >> + uint64_t current;
> >> +
> >> + current = ROUND_UP(nvdimms_info.current_addr, PAGE_SIZE);
> >> +
> >> + /* do not have enough space? */
> >> + if (current + size < current) {
> >> + return 0;
> >> + }
> >> +
> >> + nvdimms_info.current_addr = current + size;
> >> + return current;
> >> +}
> > You can't use all memory above hotplug_memory area since
> > we have to tell guest where 64-bit PCI window starts,
> > and currently it should start at reserved-memory-end
> > (but it isn't due to a bug: I've just posted fix to qemu-devel
> > "[PATCH 0/2] pc: fix 64-bit PCI window clashing with memory hotplug
> > region"
> > )
>
> Ah, got it, thanks for you pointing it out.
>
> >
> >> +
> >> +static uint32_t new_device_index(void)
> >> +{
> >> + return nvdimms_info.device_index++;
> >> +}
> >> +
> >> static char *get_file(Object *obj, Error **errp)
> >> {
> >> PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj);
> >> @@ -48,6 +76,11 @@ static void set_file(Object *obj, const char *str,
> >> Error **errp)
> >> {
> >> PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj);
> >>
> >> + if (memory_region_size(&nvdimm->mr)) {
> >> + error_setg(errp, "cannot change property value");
> >> + return;
> >> + }
> >> +
> >> if (nvdimm->file) {
> >> g_free(nvdimm->file);
> >> }
> >> @@ -76,13 +109,87 @@ static void pc_nvdimm_init(Object *obj)
> >> set_configdata, NULL);
> >> }
> >>
> >> +static uint64_t get_file_size(int fd)
> >> +{
> >> + struct stat stat_buf;
> >> + uint64_t size;
> >> +
> >> + if (fstat(fd, &stat_buf) < 0) {
> >> + return 0;
> >> + }
> >> +
> >> + if (S_ISREG(stat_buf.st_mode)) {
> >> + return stat_buf.st_size;
> >> + }
> >> +
> >> + if (S_ISBLK(stat_buf.st_mode) && !ioctl(fd, BLKGETSIZE64, &size)) {
> >> + return size;
> >> + }
> >> +
> >> + return 0;
> >> +}
> > All this file stuff I'd leave to already existing backends like
> > memory-backend-file or even memory-backend-ram which already do
> > above and more allowing to configure persistent and volatile
> > NVDIMMs without changing NVDIMM fronted code.
> >
>
> The current memory backends use all memory size and map it to guest's
> address space. However, nvdimm needs a reserved region for its label
> data which is only accessed in Qemu.
>
> How about introduce two parameters, "reserved_size" and "reserved_addr"
> to TYPE_MEMORY_BACKEND, then only the memory region
> [0, size - reserved_size) is mapped to guest and the remain part is
> pointed by "reserved_addr"?
Looks like only "reserved_size" is sufficient if there aren't any plans
/reasons to keep labels area not at the end of NVDIMM.
Also keeping reserved area in the same backend => MemoryRegion should
automatically migrate it during live migration.
To separate and map guest visible part data part of backend's MemoryRegion
you could use memory_region_init_alias().