
Re: [Qemu-devel] [PATCH v2 1/1] NBD proto: add WRITE_ZEROES extension


From: Alex Bligh
Subject: Re: [Qemu-devel] [PATCH v2 1/1] NBD proto: add WRITE_ZEROES extension
Date: Thu, 31 Mar 2016 15:27:52 +0100

On 31 Mar 2016, at 14:55, Paolo Bonzini <address@hidden> wrote:

> On 31/03/2016 15:53, Alex Bligh wrote:
>>>> +    If the flag `NBD_CMD_FLAG_MAY_TRIM` was set by the client in the command
>>>> +    flags field, the server MAY use trimming to zero out the area, but it
>>>> +    MUST ensure that the data reads back as zero.
>>>> +
>> Can you give an example of a situation where the client would not set this
>> and it would be undesirable for the server to create a 'hole' using
>> 'trim' type technology, even when the client doesn't specify it?
>> I suspect there are already some backends (e.g. ceph on qemu-nbd) which
>> will effectively do a 'trim' if you write 4k of zeroes even under
>> current circumstances.
>> 
>> IE why not always permit trimming PROVIDED the data always reads back
>> as zero? This would be far simpler.
> 
> Because trimming can make future operations more expensive and cause
> fragmentation (which may not be as bad as it used to be at the media
> level, but it is still somewhat bad at the filesystem level).
> 
> So if you want a fully-provisioned file, the simplest way to do so is to
> write zeroes to it, and trimming is undesirable.

But isn't the server in a better position to know this than the
client? For example, if the server has a back-end implementation (as I
suspect Ceph on qemu-nbd does) which never actually stores all-zero
blocks, the flag won't make a difference, and conceivably you're
generating a whole pile of I/O to avoid sparseness when sparseness
might be faster. Take for example a persistent memory interface, where
fragmentation is irrelevant, and writing piles of zeroes to memory is
a waste of time.

And on the same subject:

On 31 Mar 2016, at 15:08, Eric Blake <address@hidden> wrote:
> Yes, I can see situations where the client REQUIRES that the server
> write actual zeroes, rather than trimming.  The biggest reason is that
> in an environment where storage can be oversubscribed (multiple sparse
> files that in name occupy more data than the underlying storage
> contains), explicitly writing zeroes without punching a hole guarantees
> that YOUR file has storage allocated to it (whereas if YOUR file is
> trimmed, some other file can then use enough allocation to prevent you
> from actually writing data in place of the hole).  Of course, the client
> can still achieve this by sticking with NBD_CMD_WRITE, but that requires
> more network traffic.

Ditto, the server is surely in a better position to know this. Perhaps
the server KNOWS it doesn't oversubscribe.
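
To make the hole-versus-real-zeroes distinction concrete, here is a
minimal sketch (Python, assuming a POSIX-like system where st_blocks is
reported in 512-byte units, as on Linux). The helper names are mine and
purely illustrative. Both files read back as all zeroes; only the
allocation the filesystem reports may differ, and on a backend that
elides or compresses zero blocks even that difference can vanish.

```python
import os
import tempfile


def make_hole(path, size):
    """Create a `size`-byte file without writing data (a hole on most filesystems)."""
    with open(path, "wb") as f:
        f.truncate(size)


def write_zeroes(path, size, chunk=64 * 1024):
    """Create a `size`-byte file by explicitly writing zero bytes."""
    zeros = bytes(chunk)
    with open(path, "wb") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(zeros[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())


def allocated_bytes(path):
    """Allocation as reported by the filesystem (assumes 512-byte st_blocks units)."""
    return os.stat(path).st_blocks * 512


def compare(size=1024 * 1024):
    """Return (contents_identical, hole_allocation, zeroes_allocation)."""
    with tempfile.TemporaryDirectory() as d:
        sparse = os.path.join(d, "sparse.img")
        full = os.path.join(d, "full.img")
        make_hole(sparse, size)
        write_zeroes(full, size)
        with open(sparse, "rb") as f:
            sparse_data = f.read()
        with open(full, "rb") as f:
            full_data = f.read()
        identical = sparse_data == full_data == bytes(size)
        return identical, allocated_bytes(sparse), allocated_bytes(full)


if __name__ == "__main__":
    same, hole_alloc, zero_alloc = compare()
    print("reads identical:", same)
    print("allocated (hole):", hole_alloc)
    print("allocated (zeroes):", zero_alloc)
```

The allocation figures are deliberately only printed, not asserted on:
what they show depends entirely on the filesystem underneath, which is
rather the point.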

On the other hand, a third reason for wanting real zeroes written, I
suppose, could be security.

In any case, the implication that a server may never use a trim-type
operation unless NBD_CMD_FLAG_MAY_TRIM is specified seems pretty
draconian to me. I'd prefer this as NBD_CMD_FLAG_NO_TRIM
(as Eric sets out below), and to make it a 'hint', saying the
data SHOULD actually be written out as zeroes for security and
to maintain allocation and lack of sparseness.

A good example of why this can only be a 'SHOULD' would be
a file system that is itself CoW (or perhaps journals
data). Either way, you aren't going to get your space back, you
aren't going to get secure overwriting, and sparseness doesn't
mean much.

> However, having written that, I'm thinking we have the wrong sense for
> the flag.  I think it makes more sense to allow trim/hole-punching by
> default (but ONLY when the server can guarantee that reads will still be
> zeroes), and make the flag NBD_CMD_FLAG_NO_TRIM to explicitly specify
> the cases where the server MUST NOT trim but allocate and write actual
> zeroes.  I suspect that explicit allocation requests are less common,
> and also less efficient; so having the default state of the flag geared
> towards efficiency (both in the sense that punching holes can be faster
> than writing zeroes, and that most people LIKE the storage savings of
> sparse files).

I agree with the sense reversal, but I think it should be a SHOULD NOT
(for the reasons set out above), and explaining why would be helpful.
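
A rough sketch of how a server might treat the reversed flag as a hint
rather than a hard requirement. Everything here is an assumption made
for illustration: the bit value of NBD_CMD_FLAG_NO_TRIM, the Backend
description and the function name are hypothetical, and nothing in it
is taken from an existing implementation.

```python
from dataclasses import dataclass

# Hypothetical bit value: the thread proposes only the flag's name, not its encoding.
NBD_CMD_FLAG_NO_TRIM = 1 << 1


@dataclass
class Backend:
    """Hypothetical summary of what a server's storage backend can do."""
    can_punch_holes: bool      # a trim / hole-punch primitive is available
    holes_read_as_zero: bool   # trimmed ranges are guaranteed to read back as zero
    honours_allocation: bool   # writing zeroes really allocates and overwrites
                               # (False for CoW or zero-eliding stores)


def choose_zero_strategy(flags, backend):
    """Return "trim" or "write" for a write-zeroes request."""
    # Trimming is only ever acceptable if the data still reads back as zero.
    may_trim = backend.can_punch_holes and backend.holes_read_as_zero
    if not may_trim:
        return "write"
    # SHOULD NOT trim: honour the hint when writing zeroes achieves something.
    if (flags & NBD_CMD_FLAG_NO_TRIM) and backend.honours_allocation:
        return "write"
    # Default, or a backend where real zeroes buy nothing: take the cheap path.
    return "trim"


if __name__ == "__main__":
    plain = Backend(can_punch_holes=True, holes_read_as_zero=True, honours_allocation=True)
    cow = Backend(can_punch_holes=True, holes_read_as_zero=True, honours_allocation=False)
    print(choose_zero_strategy(0, plain))
    print(choose_zero_strategy(NBD_CMD_FLAG_NO_TRIM, plain))
    print(choose_zero_strategy(NBD_CMD_FLAG_NO_TRIM, cow))
```

The only hard rule in the sketch is the read-back-as-zero guarantee; the
flag merely steers the choice where the backend can act on it, which is
what distinguishes SHOULD NOT from MUST NOT.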

-- 
Alex Bligh



