qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH 16/17] block: Make bdrv_is_allocated() byte-base


From: Eric Blake
Subject: Re: [Qemu-block] [PATCH 16/17] block: Make bdrv_is_allocated() byte-based
Date: Wed, 10 May 2017 17:33:49 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.0

On 04/19/2017 04:40 PM, John Snow wrote:
> 
> 
> On 04/19/2017 05:12 PM, Eric Blake wrote:
>> On 04/19/2017 03:32 PM, John Snow wrote:
>>
>>>> @@ -279,9 +280,9 @@ static int mig_save_device_bulk(QEMUFile *f, 
>>>> BlkMigDevState *bmds)
>>>>          /* Skip unallocated sectors; intentionally treats failure as
>>>>           * an allocated sector */
>>>>          while (cur_sector < total_sectors &&
>>>> -               !bdrv_is_allocated(blk_bs(bb), cur_sector,
>>>> -                                  MAX_IS_ALLOCATED_SEARCH, &nr_sectors)) {
>>>> -            cur_sector += nr_sectors;
>>>> +               !bdrv_is_allocated(blk_bs(bb), cur_sector * 
>>>> BDRV_SECTOR_SIZE,
>>>> +                                  MAX_IS_ALLOCATED_SEARCH, &count)) {
>>>> +            cur_sector += DIV_ROUND_UP(count, BDRV_SECTOR_SIZE);
>>>>          }
>>>
>>> Let's try asking again with the right vocabulary:
>>>
>>> OK, so I guess the idea is that we definitely shouldn't ever have a
>>> partially allocated sector; so I suppose this ROUND_UP is just a precaution?
>>
>> For now, yes - but the part 3 series (bdrv_get_block_status) makes
>> returning a sub-sector value possible.  For most drivers, it will only
>> happen at end-of-file (as status doesn't change mid-sector unless your
>> file is not a multiple of a sector size) - in fact, I found that I had
>> to patch the block layer to round up and sub-sector return that
>> coincides with end-of-file back to a round sector amount in order to
>> avoid breaking other code that assumed everything is allocated in sectors.
> 
> I'm sorry, I'm having difficulty parsing your last sentence.

We should NEVER have an allocation smaller than
bds->bl.request_alignment (for example, if you can't write in chunks
smaller than 512, then how can your allocation be smaller than that?).
But when we eventually have byte-granularity support, and for a device
where request_alignment is 1, it is feasible that we could track
mid-sector alignment changes.  In practice, we DO have an
easy-to-observe mid-sector alignment change: on a regular file not
aligned to sector boundaries, where request_alignment IS 1, we get a
mid-sector transition from data to hole at EOF.  So it turned out to be
easier in my later series on bdrv_get_block_status() to intentionally
round a partial sector at end-of-file up as if the entire sector was
allocated, rather than reporting the mid-sector alignment (matching the
fact that bdrv_getlength() rounds up to sector boundaries).

If, someday, we fix bdrv_getlength() to report a non-sector-aligned
value, we'll have to revisit the rounding (again, it feels like
bds->bl.request_alignment is the obvious place to ensure that we never
report a mid-alignment result, but where the alignment size is now
driver-dependent instead of hard-coded to 512).  In fact, even with
non-sector-aligned bdrv_getlength(), we may STILL be able to enforce
rules such as mid-sector transitions from hole->data are not possible
(regardless of request_alignment), and mid-sector transitions from
data->hole are possible only at the end of the file.

> I agree with you that there is absolutely definitely no real problem
> here right now, but I do secretly wonder if we'll invent a way to screw
> ourselves over.
> 
> Presumably it will get converted at SOME point like everything else, and
> it won't be a concern anymore.
> 
> Presumably Presumably that will even happen before we invent a way to
> screw ourselves over.

Here's hoping we are careful enough, and have enough asserts in the
right place so that if we guessed wrong we get a nice assert (rather
than risking inf-looping because we rounded a sub-sector down to 0, or
causing data corruption that doesn't show up until much later where we
rounded up and missed copying something); at the same time, that the
asserts are accurate enough that we can't trip them from legitimate users.

One thing I plan to add to my v2 posting (for the part 3
bdrv_get_block_status series) is to remove the 'qemu-io alloc' sector
alignment restrictions, to have a tool that makes it easier to prove
that I am properly handling rounding at the block layer out to
request_alignment limits without triggering any asserts.  And my recent
blkdebug enhancements are probably going to be helpful in validating
things, even if I have to add more iotests.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]