qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC v2 1/8] block: add bdrv_measure() API


From: Max Reitz
Subject: Re: [Qemu-devel] [RFC v2 1/8] block: add bdrv_measure() API
Date: Thu, 16 Mar 2017 05:02:26 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.1

On 16.03.2017 04:38, Stefan Hajnoczi wrote:
> On Thu, Mar 16, 2017 at 02:01:03AM +0100, Max Reitz wrote:
>> On 15.03.2017 10:29, Stefan Hajnoczi wrote:
>>> bdrv_measure() provides a conservative maximum for the size of a new
>>> image.  This information is handy if storage needs to be allocated (e.g.
>>> a SAN or an LVM volume) ahead of time.
>>>
>>> Signed-off-by: Stefan Hajnoczi <address@hidden>
>>> ---
>>>  qapi/block-core.json      | 19 +++++++++++++++++++
>>>  include/block/block.h     |  4 ++++
>>>  include/block/block_int.h |  2 ++
>>>  block.c                   | 33 +++++++++++++++++++++++++++++++++
>>>  4 files changed, 58 insertions(+)
>>>
>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>> index 786b39e..673569d 100644
>>> --- a/qapi/block-core.json
>>> +++ b/qapi/block-core.json
>>> @@ -463,6 +463,25 @@
>>>             '*dirty-bitmaps': ['BlockDirtyInfo'] } }
>>>  
>>>  ##
>>> +# @BlockMeasureInfo:
>>> +#
>>> +# Image size calculation information.  This structure describes the size
>>> +# requirements for creating a new image.
>>> +#
>>> +# @required-bytes: Amount of space required for image creation.  This 
>>> value is
>>> +#                  the host file size including sparse file regions.  A 
>>> new 5
>>> +#                  GB raw file therefore has a required size of 5 GB, not 0
>>> +#                  bytes.
>>
>> This should probably note that it's a conservative estimation (and I
>> agree that it should be). It's nice to have it in the commit message but
>> few people are going to run git blame on the QAPI documentation to find
>> out the rest of its story. :-)
> 
> Will fix.
> 
>>> +#
>>> +# @fully-allocated-bytes: Space required once data has been written to all
>>> +#                         sectors
>>> +#
>>> +# Since: 2.10
>>> +##
>>> +{ 'struct': 'BlockMeasureInfo',
>>> +  'data': {'required-bytes': 'int', 'fully-allocated-bytes': 'int'} }
>>> +
>>> +##
>>>  # @query-block:
>>>  #
>>>  # Get a list of BlockInfo for all virtual block devices.
>>> diff --git a/include/block/block.h b/include/block/block.h
>>> index 5149260..43c789f 100644
>>> --- a/include/block/block.h
>>> +++ b/include/block/block.h
>>> @@ -298,6 +298,10 @@ int bdrv_truncate(BdrvChild *child, int64_t offset);
>>>  int64_t bdrv_nb_sectors(BlockDriverState *bs);
>>>  int64_t bdrv_getlength(BlockDriverState *bs);
>>>  int64_t bdrv_get_allocated_file_size(BlockDriverState *bs);
>>> +void bdrv_measure(BlockDriver *drv, QemuOpts *opts,
>>> +                  BlockDriverState *in_bs,
>>> +                  BlockMeasureInfo *info,
>>> +                  Error **errp);
>>>  void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr);
>>>  void bdrv_refresh_limits(BlockDriverState *bs, Error **errp);
>>>  int bdrv_commit(BlockDriverState *bs);
>>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>>> index 6c699ac..45a7fbe 100644
>>> --- a/include/block/block_int.h
>>> +++ b/include/block/block_int.h
>>> @@ -201,6 +201,8 @@ struct BlockDriver {
>>>      int64_t (*bdrv_getlength)(BlockDriverState *bs);
>>>      bool has_variable_length;
>>>      int64_t (*bdrv_get_allocated_file_size)(BlockDriverState *bs);
>>> +    void (*bdrv_measure)(QemuOpts *opts, BlockDriverState *in_bs,
>>> +                         BlockMeasureInfo *info, Error **errp);
>>>  
>>>      int coroutine_fn (*bdrv_co_pwritev_compressed)(BlockDriverState *bs,
>>>          uint64_t offset, uint64_t bytes, QEMUIOVector *qiov);
>>> diff --git a/block.c b/block.c
>>> index cb57370..532a4d1 100644
>>> --- a/block.c
>>> +++ b/block.c
>>> @@ -3260,6 +3260,39 @@ int64_t 
>>> bdrv_get_allocated_file_size(BlockDriverState *bs)
>>>      return -ENOTSUP;
>>>  }
>>>  
>>> +/*
>>> + * bdrv_measure:
>>> + * @drv: Format driver
>>> + * @opts: Creation options
>>> + * @in_bs: Existing image containing data for new image (may be NULL)
>>> + * @info: Result object
>>> + * @errp: Error object
>>> + *
>>> + * Calculate file size required to create a new image.
>>> + *
>>> + * If @in_bs is given then space for allocated clusters and zero clusters
>>> + * from that image are included in the calculation.  If @opts contains a
>>> + * backing file that is shared by @in_bs then backing clusters are omitted
>>> + * from the calculation.
>>
>> This seems to run a bit contrary to the documentation of
>> BlockMeasureInfo.required-bytes, and I don't fully understand it either.
>>
>> What does "space for zero clusters" mean? Do zero clusters take space?
>> Does it depend on the image format? (i.e. would they take space for raw
>> but not for qcow2?)
> 
> Yes, zero clusters are an image format-specific feature.  A contrived
> example:
> 
>   in_bs: qcow2 version=3 with zero clusters
>   Output format: qcow2 version=2 (zero clusters not supported!)
>   Image creation options: backing file given
> 
> We must take care to allocate clusters in the new image for zero
> clusters in in_bs.  We cannot simply skip allocating those zero clusters
> since there is a backing file and the contents of the backing file
> must not be visible where there is a zero cluster.
> 
> This is a scenario where zero clusters must be included in the
> size calculation.
> 
> Perhaps this is an internal detail and it shouldn't be mentioned in the
> doc comment?

Probably, yes, or it should be phrased differently so it's clear that
zero clusters do not contribute to the required space if it is possible
to represent them efficiently.

(Or explicitly state that the reason zero clusters may contribute to the
allocation size is that they may have to be converted to allocated
clusters.)

>> And is space for unallocated clusters included or not? Do unallocated
>> clusters without a backing image count as zero clusters?
> 
> This depends on the output image format.  The raw format requires space
> even for unallocated regions.  The qcow2 format is compact and only
> requires space for allocated clusters.

Right, that's what I would have thought.

>> If that space is not included, then it would run contrary to the QAPI
>> documentation which states that it should be included.
> 
> Sorry, the raw format example in the QAPI doc is misleading without a
> qcow2 example.  The point of the raw format example was not to state
> that unallocated regions are included for *all* image formats.  It was
> just to show how the raw format behaves.

Oh, now I see. The "sparse file regions" bit was confusing for me. I
thought about format-specific sparse regions but it's actually about the
protocol level. Maybe that should be made more clear.

> I'll reword things in the next revision.
> 
>> Finally, how are you supposed to check whether the backing file in @opts
>> is shared by @in_bs?
> 
> qemu-img convert -B <base> and -o backing_file=<base> simply do not
> check.  They rely on good old-fashioned^W^W^Wthe bad practice of
> trusting the user to provide valid input.
> 
> qemu-img measure will work like this:
> 1. If the new image has a backing file, use has_zero_init=false
>    semantics.
> 2. Do *not* rely on bdrv_get_block_status_above() because it's hard to
>    check how the backing chains of in_bs and the new image compare.
> 
> This means the result will be conservative - perhaps clusters could have
> been shared with the backing file.

OK, that's what I had hoped. I was asking because this comment seems to
imply that somebody actually checks whether the backing file is shared.

>>> + *
>>> + * If @in_bs is NULL then the calculation includes no allocated clusters
>>> + * unless a preallocation option is given in @opts.
>>
>> But the BlockMeasureInfo.required-bytes documentation states that a new
>> 5 GB raw image should still report 5 GB of required space.
> 
> Even with 0 allocated clusters, the raw format always reports the
> virtual disk size (5 GB).  There is no contradiction here.

Yes, right. For some reason I thought about "allocated clusters" in the
new image rather than assumed allocated clusters in the input. For raw,
every cluster is always fully allocated.

Max

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]