From: Daniel P. Berrange
Subject: Re: [Qemu-devel] SUMMARY: Re: [RFC 1/1] nbd (specification): add NBD_CMD_WRITE_ZEROES command
Date: Fri, 19 Feb 2016 09:11:44 +0000
User-agent: Mutt/1.5.24 (2015-08-30)

On Fri, Feb 19, 2016 at 10:12:09AM +0300, Denis V. Lunev wrote:
> On 02/18/2016 08:23 PM, Denis V. Lunev wrote:
> >On 02/18/2016 07:35 PM, Eric Blake wrote:
> >>On 02/18/2016 02:18 AM, Roman Kagan wrote:
> >>>On Wed, Feb 17, 2016 at 01:58:47PM -0700, Eric Blake wrote:
> >>>>On 02/17/2016 11:10 AM, Denis V. Lunev wrote:
> >>>>>@@ -446,6 +448,11 @@ The following request types exist:
> >>>>>      about the contents of the export affected by this command, until
> >>>>>      overwriting it again with `NBD_CMD_WRITE`.
> >>>>>  +* `NBD_CMD_WRITE_ZEROES` (6)
> >>>>>+
> >>>>>+    A request to write zeroes. The command is functional equivalent of
> >>>>>+    the NBD_WRITE_COMMAND but without payload sent through the channel.
> >>>>This lets us push holes during writes. Do we have the converse
> >>>>operation, that is, an easy way to query if a block of data will read
> >>>>as all zeroes, and therefore the client can bypass reading that
> >>>>portion of the disk (in other words, an equivalent to
> >>>>lseek(SEEK_HOLE/SEEK_DATA))?
> >>>The spec doesn't have anything like that.
> >>>
> >>>OTOH, unlike the write case, where you have all the information and
> >>>just choose whether to send normal write or zero write, the extra
> >>>round-trip of a separate SEEK_HOLE/SEEK_DATA request may lead to
> >>>actually degrading the overall throughput.
> >>>
> >>>Rather it may be a better idea to add something like sparse read where
> >>>the server would, instead of sending the full length of data in the
> >>>response payload, send a smarter variable-length package with a
> >>>scatter-gather list or a bitmap of used blocks in the beginning, and
> >>>let the client decode it and fill the gaps with zeros.
> >>Sure, that would work too, and sounds nicer.  Either way, the point is
> >>that we should strongly consider improving the NBD protocol to allow
> >>more efficient handling of sparse files, in both the push and in the
> >>pull direction.  Qemu already has a desire to use both directions of
> >>improvements, but there are more programs, both clients and servers,
> >>outside of qemu, that could benefit from such protocol improvements.
> >>
> >OK
> >
> >Here is a short summary of the features that seem necessary from the
> >QEMU point of view:
> >- the ability to avoid sending zeroes during a write operation. The
> >  proposal is in the thread-starter letter
> >- the ability to request block status (allocated/not allocated) from the
> >  server. This seems useful for preserving the "sparseness" of the
> >  transferred data
> >- the ability to skip zeroes during a read operation, i.e. something like
> >  a READ2 command that returns a vector of chunks as a reply
> >
> >All 3 features seem usable for generic NBD use-cases, not only for QEMU.
> >
> >If there are no objections I'll sum this up and come up with a
> >specification draft.
> >
> >Den
> >
> >P.S. I have added here all parties who have participated in the
> >conversation in different threads on the QEMU side.
> 
> An interesting point came up in a verbal discussion with one of my
> friends. Protocol-level compression could eliminate the need to think
> about zeroes in the channel, from either the read or the write point of
> view, and would also reduce the amount of data to transfer.

With compression you have thrown away information about sparseness which
you really want to have when writing out the file on the other end. It
forces you to do memcmp detection of zero regions after decompression
which is CPU intensive.
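
To make that cost concrete, here is a minimal sketch in C of the kind of
scan the receiving side would have to run over every decompressed buffer.
The names region_is_zero() and count_zero_blocks(), and the caller-chosen
block size, are assumptions made up for this illustration; they are not
taken from QEMU or from any NBD implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if all len bytes of buf are zero.
 * If buf[0] is zero and every byte equals its successor, the whole
 * buffer is zero, so one memcmp of the buffer against itself shifted
 * by one byte is enough. Every byte still has to be read.
 */
bool region_is_zero(const unsigned char *buf, size_t len)
{
    if (len == 0)
        return true;
    return buf[0] == 0 && memcmp(buf, buf + 1, len - 1) == 0;
}

/*
 * Hypothetical helper: count how many block_size chunks of buf are
 * entirely zero. A trailing partial chunk is checked as well.
 */
size_t count_zero_blocks(const unsigned char *buf, size_t len,
                         size_t block_size)
{
    size_t zero_blocks = 0;

    assert(block_size > 0);
    for (size_t off = 0; off < len; off += block_size) {
        size_t n = len - off < block_size ? len - off : block_size;

        if (region_is_zero(buf + off, n))
            zero_blocks++;
    }
    return zero_blocks;
}
```

The point of the sketch is that the work is proportional to the full
decompressed size, no matter how little real data the image holds.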

Compression is fine as a concept, but it is not a replacement for
handling sparseness directly in the protocol.
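
By contrast, if the layout arrives explicitly, the receiver never needs to
inspect the payload. A rough sketch in C, assuming a hypothetical
struct extent descriptor (nothing like it is defined in the NBD spec
quoted above) and a made-up write_sparse_file() helper:

```c
#define _POSIX_C_SOURCE 200809L /* for pwrite() and ftruncate() */
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical description of one range of the image. */
struct extent {
    off_t offset;              /* start of the range in the image */
    size_t length;             /* length of the range in bytes */
    bool is_hole;              /* true: reads as zeroes, no payload */
    const unsigned char *data; /* payload, NULL for holes */
};

/*
 * Create (or truncate) path and write only the data extents.
 * Returns 0 on success, -1 on error.
 */
int write_sparse_file(const char *path, const struct extent *ext,
                      size_t count, off_t total_size)
{
    int fd;

    assert(ext != NULL || count == 0);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Set the final size up front; unwritten ranges read as zeroes. */
    if (ftruncate(fd, total_size) < 0)
        goto fail;

    for (size_t i = 0; i < count; i++) {
        size_t done = 0;

        if (ext[i].is_hole)
            continue; /* nothing to write, nothing to scan */

        while (done < ext[i].length) {
            ssize_t n = pwrite(fd, ext[i].data + done,
                               ext[i].length - done,
                               ext[i].offset + (off_t)done);
            if (n <= 0)
                goto fail;
            done += (size_t)n;
        }
    }
    return close(fd);

fail:
    close(fd);
    return -1;
}
```

Ranges that are never written read back as zeroes; whether they end up as
real holes on disk depends on the filesystem.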

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


