qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [Nbd] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extensi


From: Eric Blake
Subject: Re: [Qemu-devel] [Nbd] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extension
Date: Thu, 24 Mar 2016 09:25:27 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

On 03/24/2016 05:55 AM, Paolo Bonzini wrote:
>> As Eric noted, please expand LBA at least once.
> 
> Let's just use "block" (e.g. NBD_CMD_GET_BLOCK_STATUS).

Yes, avoiding the term LBA and using BLOCK everywhere also nicely solves
the problem of introducing yet more terminology.

> 
>>> +      - 32 bits, length of parameter data that follow (unsigned)
>>> +      - zero or more LBA status descriptors, each having the following
>>> +        structure:
>>> +
>>> +        * 64 bits, offset (unsigned)
>>> +        * 32 bits, length (unsigned)
>>> +        * 16 bits, status (unsigned)
>>> +
>>> +    unless an error condition has occurred.
>>> +
> 
> Can we just return one descriptor?  That would simplify the protocol a bit.

As in, the return is exactly one descriptor, consisting of:

* 32 bits, length (unsigned): must be > 0, <= the client's length
* 16 bits, status (unsigned): status of that block

Of course, it means more traffic. The nice part about returning an array
of descriptors is that I can learn the status of 1G of the file, even if
the file alternates every 512 bytes between extent status, in just one
client call. But returning only a single descriptor at a time means I'd
have to make 2M client calls to learn the same pattern of allocation.
Fortunately, in the common case, allocation patterns tend to not be that
disjoint.

On the other hand, returning only one descriptor at a time (for possibly
less length than the client requested) may be easier when using
lseek(SEEK_DATA/HOLE) as the mechanism for determining the bounds of
each extent, since the server only has to search once per command,
instead of dynamically construct the entire reply.

I don't have any strong opinions on which would be better, but it is
definitely food for thought.

> 
> However, let's make these bits, so that
> 
> NBD_STATE_ALLOCATED (0x1), LBA extent is present on the block device
> NBD_STATE_ZERO (0x2), LBA extent will read as zeroes

Should we flip the sense and call this NBD_STATE_UNALLOCATED (0 means
allocated, 1 means not present), so that an overall status of 0 is a
safe default?  (That is, it should always be safe to state a sector is
allocated when it is not, and always safe to state a sector is not known
to read as zeroes even if that happens to be its contents - all that we
lose by reporting this safe default state is that the client will be
unable to optimize for skipping holes).

>> Either the spec should define what it means for a block to be in a dirty
>> state, or it should not talk about it.
> 
> Here is my attempt:
> 
>     This command is meant to operate in tandem with other (non-NBD)
>     channels to the server.  Generally, a "dirty" block is a block that
>     has been written to by someone, but the exact meaning of "has been
>     written" is left to the implementation.  For example, a virtual
>     machine monitor could provide a (non-NBD) command to start tracking
>     blocks written by the virtual machine.  A backup client then can
>     connect to an NBD server provided by the virtual machine monitor
>     and use NBD_CMD_GET_BLOCK_STATUS only read blocks that the virtual

s/only/to only/

>     machine has changed.

s/changed/changed since it started tracking/

> 
>     An implementation that doesn't track the "dirtiness" state of blocks
>     MUST either fail this command with EINVAL, or mark all blocks as
>     dirty in the descriptor that it returns.

Is it feasible to return zero/allocated/dirty status all at the same
time, or do we want to strictly require two different modes of
operation?  That is, if we are returning zero and allocated as two bits,
can we also return a third bit for dirty/clean?  Should we flip the
sense of the bit, where 0 means dirty and 1 means clean, again so that a
server can always return a status of 0 as the safe default?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]