From: Gregory Price
Subject: Re: [RFC v4 3/3] hw/cxl: Multi-Region CXL Type-3 Devices (Volatile and Persistent)
Date: Mon, 19 Dec 2022 11:12:34 -0500

On Mon, Dec 19, 2022 at 12:42:11PM +0000, Jonathan Cameron wrote:
> As a process thing, when reworking a patch I picked up for the
> CXL qemu gitlab tree, drop the SOB that I added as it's not relevant
> to the new patch.
> 

ack

> Still no need to post a new version unless you particularly
> want to or there are other changes to make.

ack

> > +Deprecations
> > +------------
> > +
> > +The Type 3 device [memdev] attribute has been deprecated in favor
> > +of the [persistent-memdev] and [volatile-memdev] attributes. [memdev]
> 
> That's not quite correct as it sort of implies we could previously use
> memdev for the volatile case.

An early version of the patch explicitly errored out / warned the user
if they attempted to do this, but that got rebuffed.  This could be
added back in.

> > -    return address_space_read(&ct3d->hostmem_as, dpa_offset, attrs, data, size);
> > +    if (vmr) {
> > +        if (dpa_offset <= int128_get64(vmr->size)) {
> > +            as = &ct3d->hostvmem_as;
> 
> As a follow up it might be worth exploring if we can combine the two
> address spaces. I'm not keen to do it yet, because of no simple way
> to test it and it's less obviously correct than just having separate
> address spaces.
> 
> Would involve mapping the two hostmem regions into a container memory
> region which would be the one we use to build the address space.
> Advantage would be that we wouldn't need special handling for which
> region it was in here or the write path as QEMU's normal hierarchical
> memory regions would handle that for us.
> I'm not 100% sure it would work though!
> 

Originally I had tried putting them both into one, with the thought
that since it's technically one device address space, there should
only be one way to access it instead of two.

After investigating, I found that an address space has to be
initialized with a memdev, and an address space only supports a single
memdev, so I settled on two address spaces in order to keep the
memdevs separate (considering each region may have different
attributes).
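For context, the setup I landed on looks roughly like this (a sketch,
not the exact patch code; hostpmem_as is my guess at the name of the
persistent counterpart to hostvmem_as):

    /* Sketch: one address space per memdev (hostpmem_as name assumed). */
    if (ct3d->hostvmem) {
        MemoryRegion *vmr = host_memory_backend_get_memory(ct3d->hostvmem);
        address_space_init(&ct3d->hostvmem_as, vmr, "cxl-type3-volatile-dpa");
    }
    if (ct3d->hostpmem) {
        MemoryRegion *pmr = host_memory_backend_get_memory(ct3d->hostpmem);
        address_space_init(&ct3d->hostpmem_as, pmr, "cxl-type3-persistent-dpa");
    }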

This opens the question as to how to actually represent a persistent
memory device that can be partitioned as volatile.

Consider the following setup:

device[pmem 512mb, volatile 512mb]
produces:
    memdev(512mb, pmem) && memdev(512mb, volatile)

But if I partition the device to 256p/768v, then when I access data in
the range [256mb,512mb), volatile data goes into the persistent memory
store, simply because that capacity is located in the pmem memdev.
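To make the mismatch concrete, here is a toy standalone program (all
names are made up; the layout follows the example above, with the pmem
memdev at DPA [0,512mb) and the volatile memdev at [512mb,1gb)):

    /* Toy illustration only; not patch code. */
    #include <stdint.h>
    #include <stdio.h>

    #define MB (1024ULL * 1024)

    int main(void)
    {
        uint64_t pmem_memdev_size = 512 * MB; /* fixed backing-store split */
        uint64_t pmem_partition   = 256 * MB; /* runtime partition point   */

        for (uint64_t dpa = 0; dpa < 1024 * MB; dpa += 128 * MB) {
            const char *store   = dpa < pmem_memdev_size ? "pmem memdev"
                                                         : "volatile memdev";
            const char *logical = dpa < pmem_partition ? "persistent"
                                                       : "volatile";
            printf("DPA %4llumb: logically %-10s backed by %s\n",
                   (unsigned long long)(dpa / MB), logical, store);
        }
        /* DPAs in [256mb,512mb) print "volatile ... pmem memdev":
         * exactly the mismatch described above. */
        return 0;
    }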

An alternative would be to require a vmem region the same size as the
pmem region (+ whatever additional volatile capacity), but this
means using 2x the resources to represent the real capacity of the
device. That's not good.

Another alternative would be to create a wrapper memdev that
encompasses persistent and volatile operations, and then build the
partitioning logic on top of it, so that the device can use a single
memdev while handling whatever semantics apply to each individual
region.

The tl;dr here:  Going down to a single address space would probably
require writing a wrapper memdev that abstracts away all the
partitioning logic.  Maybe that's worth it?
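For reference, the container approach would look something like the
sketch below (untested; combined_mr and combined_as are hypothetical
fields, and it assumes both backends exist and are realized):

    /* Untested sketch of the container-region idea, not patch code. */
    static void ct3d_init_combined_as(CXLType3Dev *ct3d)
    {
        MemoryRegion *vmr = host_memory_backend_get_memory(ct3d->hostvmem);
        MemoryRegion *pmr = host_memory_backend_get_memory(ct3d->hostpmem);
        uint64_t vsize = memory_region_size(vmr);

        /* Pure container; the subregions supply the actual RAM. */
        memory_region_init(&ct3d->combined_mr, OBJECT(ct3d), "cxl-combined",
                           vsize + memory_region_size(pmr));
        memory_region_add_subregion(&ct3d->combined_mr, 0, vmr);
        memory_region_add_subregion(&ct3d->combined_mr, vsize, pmr);

        /* One address space; QEMU's region hierarchy resolves which
         * backend a given DPA hits, no special-casing in read/write. */
        address_space_init(&ct3d->combined_as, &ct3d->combined_mr,
                           "cxl-type3-combined-dpa");
    }

Note that this still wouldn't solve the partitioning problem above,
since the split between the two subregions stays fixed at the memdev
sizes.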

> > @@ -744,30 +891,35 @@ static void validate_lsa_access(MemoryRegion *mr, uint64_t size,
> >  static uint64_t get_lsa(CXLType3Dev *ct3d, void *buf, uint64_t size,
> >                      uint64_t offset)
> >  {
> > -    return size;
> > +    return 0;
> 
> Hmm. Maybe this should return an error, but then we can't use the
> uint64_t as a return value.  As this function would never be called
> with !ct3d->lsa let's leave it as it stands.
> 
> >  }

I originally wanted to do that, but I chose to keep the current contract
semantics so as to minimize the change set.  I agree though that this
should probably return an error.
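
Something like the following is what I had in mind (sketch only; the
int return and the out parameter are a hypothetical reworked contract,
not the code as posted):

    /* Hypothetical contract: negative errno on failure, byte count via
     * an out parameter instead of overloading the uint64_t return. */
    static int get_lsa(CXLType3Dev *ct3d, void *buf, uint64_t size,
                       uint64_t offset, uint64_t *copied)
    {
        MemoryRegion *mr;
        void *lsa;

        if (!ct3d->lsa) {
            return -ENODEV; /* callers can tell "no LSA" from "0 bytes" */
        }
        mr = host_memory_backend_get_memory(ct3d->lsa);
        validate_lsa_access(mr, size, offset);

        lsa = memory_region_get_ram_ptr(mr) + offset;
        memcpy(buf, lsa, size);
        *copied = size;
        return 0;
    }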


