qemu-devel


From: Kevin Wolf
Subject: Re: [Qemu-devel] [Nbd] [RFC 1/1] nbd (specification): add NBD_CMD_WRITE_ZEROES command
Date: Fri, 4 Mar 2016 10:54:13 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On 04.03.2016 at 09:49, Wouter Verhelst wrote:
> Hi folks,
> 
> (sorry about the lateness of this reply, was busy for the last few weeks)
> 
> On Thu, Feb 18, 2016 at 11:34:04AM +0300, Denis V. Lunev wrote:
> > On 02/18/2016 11:09 AM, Alex Bligh wrote:
> > > On 17 Feb 2016, at 18:10, Denis V. Lunev <address@hidden> wrote:
> > >
> > >> The currently available NBD_CMD_TRIM command cannot be used, as the
> > >> specification explicitly says that "a client MUST NOT make any
> > >> assumptions about the contents of the export affected by this
> > >> [NBD_CMD_TRIM] command, until overwriting it again with `NBD_CMD_WRITE`"
> > > Would a flag to NBD_CMD_TRIM that says "ensure the written
> > > data is zeroed" not be an easier solution than adding another
> > > very similar command?
> > >
> > > Or (cough) changing the spec?
> > >
> > From the receiver's point of view (as I see it) the situation could
> > be different. Let us assume that we are writing to a plain
> > file.
> > 
> > There are 2 types of queries:
> > - pls make the target sparse, i.e. perform FALLOC_FL_PUNCH_HOLE
> >    and there is no problem that the operation could not be performed,
> >    this is a hint;
> 
> This is what NBD_CMD_TRIM does, currently.
> 
> The reason this is a hint, is that there is no guarantee that the
> underlying operating system or storage even supports
> FALLOC_FL_PUNCH_HOLE (or similar). We could have made NBD_CMD_TRIM fail
> with a "not possible on this export" kind of error in that case, but it
> was chosen not to do that (for reasons I don't remember; maybe we just
> didn't consider this enough).
> 
> This could be remedied if the client could somehow ask what the result
> of a TRIM command would be; i.e., if the server has support for
> FALLOC_FL_PUNCH_HOLE, it could set a flag which would let the client
> know that NBD_CMD_TRIM will zero out bytes. If the server doesn't set
> that flag and the client requires zeroes, it could then just issue a
> WRITE command, followed (maybe) by a TRIM for the same region (which
> would be less optimal, but have the same result with older servers)

NBD_CMD_TRIM covers the case "I don't need this data any more, you can
throw it away", and I think treating that purely as a hint is perfectly
fine.

> > - pls write the following amount of zeroes in either way (even calling
> >    write directly), i.e. ensure that the data is zeroed and the space on
> >    the file system is allocated for that.
> 
> IOW, you *don't* want to have a sparse file in that case? Or do I
> misunderstand things here?

I think what we're looking for is more like "zero out this area, feel
free to use whatever method is most efficient to achieve that".

So if the server knows that the backing store supports an efficient way
to write zeros (e.g. FALLOC_FL_ZERO_RANGE), it will use that. Otherwise,
if TRIM works and we know that the result is zeroed space instead of
undefined contents, the server is free to use it. And if even that
fails, it just falls back to an explicit write of a zeroed buffer.

If we want, we can give the client a little more control over whether
or not discarding in the process is allowed (or maybe even preferred).
qemu's interface for writing zeros has a BDRV_REQ_MAY_UNMAP flag, for
example.
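
To make the fallback order above concrete, here is a rough sketch for a
server that backs the export with a plain file on Linux. This is only an
illustration, not code from any existing NBD server: zero_range(),
write_zero_buffer() and the may_unmap parameter are hypothetical names,
and the sketch assumes the range lies inside the current file size. The
punch-hole step relies on the fact that, for a regular file, a punched
hole reads back as zeros; a backend where discarded space has undefined
contents would have to skip that step.

```c
#define _GNU_SOURCE           /* exposes fallocate() in <fcntl.h> on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/falloc.h>     /* FALLOC_FL_* flags (Linux-specific) */

/* Hypothetical helper: last resort, write an explicit zeroed buffer. */
static int write_zero_buffer(int fd, off_t off, off_t len)
{
    static const char zeros[65536];   /* static storage is zero-initialized */

    while (len > 0) {
        size_t chunk = len < (off_t)sizeof(zeros) ? (size_t)len : sizeof(zeros);
        ssize_t n = pwrite(fd, zeros, chunk, off);
        if (n < 0 && errno == EINTR)
            continue;                 /* interrupted, retry the same chunk */
        if (n <= 0)
            return -1;                /* real error (or no progress) */
        off += n;
        len -= n;
    }
    return 0;
}

/*
 * Hypothetical helper: make [off, off+len) read back as zeros, using the
 * cheapest method that works. may_unmap says whether the caller allows
 * the space to be deallocated in the process (an assumption of this
 * sketch, in the spirit of a "may unmap" request flag).
 * Returns 0 on success, -1 on failure.
 */
static int zero_range(int fd, off_t off, off_t len, int may_unmap)
{
    assert(off >= 0 && len >= 0);
    if (len == 0)
        return 0;

    /* 1. Efficient zeroing, if the file system supports it. */
    if (fallocate(fd, FALLOC_FL_ZERO_RANGE, off, len) == 0)
        return 0;

    /* 2. Discard, but only where the result is known to be zeros. */
    if (may_unmap &&
        fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  off, len) == 0)
        return 0;

    /* 3. Fall back to writing zeros explicitly. */
    return write_zero_buffer(fd, off, len);
}
```

A caller would use something like zero_range(fd, offset, length, 1) when
discarding is acceptable, and pass 0 for may_unmap when the space should
stay allocated if at all possible. For simplicity the sketch falls
through to the next method on any fallocate() failure rather than
inspecting errno.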

Kevin


