qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] hostmem-file: add a property 'notrunc' to avoid


From: Igor Mammedov
Subject: Re: [Qemu-devel] [PATCH] hostmem-file: add a property 'notrunc' to avoid data corruption
Date: Fri, 21 Oct 2016 13:07:00 +0200

On Fri, 21 Oct 2016 15:22:10 +0800
Haozhong Zhang <address@hidden> wrote:

> On 10/20/16 17:35 +0200, Igor Mammedov wrote:
> >On Thu, 20 Oct 2016 12:47:34 -0200
> >Eduardo Habkost <address@hidden> wrote:
> >  
> >> On Thu, Oct 20, 2016 at 04:15:21PM +0200, Igor Mammedov wrote:  
> >> > On Thu, 20 Oct 2016 11:56:10 -0200
> >> > Eduardo Habkost <address@hidden> wrote:
> >> >  
> >> > > On Thu, Oct 20, 2016 at 03:42:15PM +0200, Igor Mammedov wrote:  
> >> > > > On Thu, 20 Oct 2016 21:11:38 +0800
> >> > > > Haozhong Zhang <address@hidden> wrote:
> >> > > >  
> >> > > > > On 10/20/16 14:34 +0200, Igor Mammedov wrote:  
> >> > > > > >On Thu, 20 Oct 2016 14:13:01 +0800
> >> > > > > >Haozhong Zhang <address@hidden> wrote:
> >> > > > > >  
> >> > > > > >> If a file is used as the backend of memory-backend-file and its 
> >> > > > > >> size is
> >> > > > > >> not identical to the property 'size', the file will be 
> >> > > > > >> truncated. For a
> >> > > > > >> file used as the backend of vNVDIMM, its data is expected to be
> >> > > > > >> persistent and the truncation may corrupt the existing data.  
> >> > > > > >I wonder if it's possible just skip 'size' property in your case 
> >> > > > > >instead
> >> > > > > >'notrunc' property. That way if size is not present one'd get 
> >> > > > > >actual size
> >> > > > > >using get_file_size() and set 'size' to it?
> >> > > > > >And if 'size' is provided and 'size' != file_size then error out.
> >> > > > > >  
> >> > > > >
> >> > > > > I don't know how this can be implemented in QEMU. Specially, how 
> >> > > > > does
> >> > > > > the memory-backend-file know it's used for vNVDIMM, so that it can
> >> > > > > skip the 'size' property?  
> >> > > > Does memory-backend-file needs to know that it's used by NVDIMM?
> >> > > > Looking at nvdimm_realize it doesn't as it's assumes
> >> > > >   hostemem_size == pmem_size + label_size
> >> > > >
> >> > > > I'd make hostmem_file.size optional and take size from file
> >> > > > and if 'size' is specified explictly require it to mach file size.
> >> > > > It's generic and has nothing to do with nvdimm.  
> >> > >
> >> > > We can take size from file, or take size from the
> >> > > host_memory_backend_get_memory() callers.
> >> > >
> >> > > Enumerating all sizes that QEMU can use as input:
> >> > >
> >> > > A) Backend file size
> >> > > B) memory backend "size" option
> >> > > C) frontend-provided size (-numa size, -m, or pc-dimm "size"
> >> > >     property)  
> >> > -numa size affect only anon memory not backend backed one, for
> >> > backend baked memory we use memdev where size comes from backend
> >> >
> >> > pc-dimm.size is readonly and isn't supposed to influence backend.size
> >> >
> >> > I'd drop C option  
> >>
> >> If C is not present, it should be, as it affects the guest ABI
> >> (and the ABI must never depend on the host you are running or
> >> backend configuration, only on the frontend configuration).  
> >I've meant that C should not affect behavior of backend.
> >  
> >> If we are dropping -numa size in favor of the
> >> memory-backend-provided size, that's a bug.  
> >-numa size is not applicable here as it's not using backends,
> >when backends are used it's -numa memdev instead in which case
> >   numa_info[nodenr].node_mem = object_property_get_int(o, "size", NULL);
> >  
> >>  
> >> >  
> >> > >
> >> > > My suggestion is:
> >> > > * B should be optional.
> >> > > * If B is omitted, we should never truncate the file to a smaller
> >> > >   size.  
> >> > i.e. derive backend.size from filesize if possible (i.e. not hugepages)
> >> >  
> >> > > * If B is omitted, we can use C as the size when mapping the
> >> > >   file.  
> >> > frontend size is the size that's mapped into guest address space.
> >> > it should not influence backend's size in backward direction.
> >> > You may notice pc-dimm.size is not user settable (readonly) property.  
> >>
> >> Frontend size will not influence backend size, it will just
> >> affect the size of the memory region the frontend code will ask
> >> the backend to provide.
> >>
> >> In other words: I believe host_memory_backend_get_memory() needs
> >> a 'size' argument, and that memory allocation could be optionally
> >> delayed to the host_memory_backend_get_memory() call. This way,
> >> we don't need a backend size at all, unless we want the backend
> >> to truncate files or preallocate memory for us.  
> >To not complicate things I'd keep current behavior of
> >  host_memory_backend_get_memory()
> >i.e return MR for all the baked memory and then frontend
> >can split and map it into guest address space as it sees fit
> >using aliases for non trivial cases taking in account frontend's
> >own size/label_size/whatnot properties.
> >That's what NVDIMM does now, it gets MR for whole file and
> >the splits it on data and label areas and maps into GA only
> >data part using memory_region_init_alias().
> >  
> 
> As NVDIMM label is mentioned here, I realize a case that extending
> file size would fail:
> 
> NVDIMM labels record the namespace information of the data area and
> should be persistently stored. The current vNVDIMM implementation in
> QEMU always stores and finds those labels at the end of the given
> backend storage. If a larger size B is given and the backend file is
> extended, QEMU will not be able to find the labels stored on the
> original file, and hence cannot present the same namespaces to guest.
>
> This problem can fixed by deferring the allocation to
> host_memory_backend_get_memory(), and adding a parameter 'notrunc' to
> it to specify the truncation behavior. If no truncation is specified
> and B > A, QEMU will report error and stop.
Lets not go there (I mean extending/deferring) and keep thing simple,
backend object should be completely initialized upon completion of
host_memory_backend_memory_complete() (including creation/mapping of
whatever memory it's backed by) or fail if it can't do so.


How about following behavior:

 1) memory-backend-file,mem-path=/some_dir,size=2G
    - uses truncate to extend temporary file created in "mem-path" to 'size'
    for this case to work 'size' is mandatory

 2) memory-backend-file,mem-path=/existing_file,size=2G
    - uses truncate to extend/shrink file "mem-path" to 'size'
    for this case 'size' could be made optional,
    if we take in account that backend could be used as persistent
    storage then shrinking or extending "mem-path" would be corruption
    as backend has no idea about internal layout if mapped file.
    We can do something like this here:

    if (is_size_opt_provided and size_of(mem-path) != 0) {
       error_out with "mem-path=foo size XXX doesn't match 'size=xxx' option"
    } else if (is_size_opt_provided and size_of(mem-path) == 0) {
       // may be we don't need this case and
       // just fold this in above error case
       truncate(mem-path) // extend/shrink
    } else {
      set_size_opt(size_of(mem-path))
    }



reply via email to

[Prev in Thread] Current Thread [Next in Thread]