qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 4/9] block: Return -ENOTSUP rather than asser


From: Eric Blake
Subject: Re: [Qemu-devel] [PATCH v2 4/9] block: Return -ENOTSUP rather than assert on unaligned discards
Date: Fri, 18 Nov 2016 16:48:50 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0

On 11/17/2016 04:01 PM, Max Reitz wrote:
> On 17.11.2016 21:13, Eric Blake wrote:
>> Right now, the block layer rounds discard requests, so that
>> individual drivers are able to assert that discard requests
>> will never be unaligned.  But there are some ISCSI devices
>> that track and coalesce multiple unaligned requests, turning it
>> into an actual discard if the requests eventually cover an
>> entire page, which implies that it is better to always pass
>> discard requests as low down the stack as possible.
>>
>> In isolation, this patch has no semantic effect, since the
>> block layer currently never passes an unaligned request through.
>> But the block layer already has code that silently ignores
>> drivers that return -ENOTSUP for a discard request that cannot
>> be honored (as well as drivers that return 0 even when nothing
>> was done).  But the next patch will update the block layer to
>> fragment discard requests, so that clients are guaranteed that
>> they are either dealing with an unaligned head or tail, or an
>> aligned core, making it similar to the block layer semantics of
>> write zero fragmentation.
>>

>> +++ b/block/iscsi.c
>> @@ -1083,7 +1083,9 @@ coroutine_fn iscsi_co_pdiscard(BlockDriverState *bs, 
>> int64_t offset, int count)
>>      struct IscsiTask iTask;
>>      struct unmap_list list;
>>
>> -    assert(is_byte_request_lun_aligned(offset, count, iscsilun));
>> +    if (!is_byte_request_lun_aligned(offset, count, iscsilun)) {
>> +        return -ENOTSUP;
>> +    }
>>
>>      if (!iscsilun->lbp.lbpu) {
>>          /* UNMAP is not supported by the target */
> 
> Next line is:
> 
>>          return 0;
> 
> Hmm... -ENOTSUP would be the obvious return value here, too. That might
> interfere with your next patch, though.

Shouldn't interfere. I guess no one value is better than the other; I
can respin to pick a consistent value (I'd lean towards -ENOTSUP) if you
think it is worth it; but I'd rather get this into 2.8 without worrying
about it.

> 
>> diff --git a/block/qcow2.c b/block/qcow2.c
>> index e22f6dc..7cfcd84 100644
>> --- a/block/qcow2.c
>> +++ b/block/qcow2.c
>> @@ -2491,6 +2491,11 @@ static coroutine_fn int 
>> qcow2_co_pdiscard(BlockDriverState *bs,
>>      int ret;
>>      BDRVQcow2State *s = bs->opaque;
>>
>> +    if (!QEMU_IS_ALIGNED(offset | count, s->cluster_size)) {
> 
> Ha! I like "offset | count".

It only works because we know that qcow2 guarantees that s->cluster_size
is a power of 2 (it does not work at the block layer, where the
bs->bl.pdiscard_align need not be a power of 2).

> 
>> +        assert(count < s->cluster_size);
> 
> Maybe add a comment for this assertion? E.g. "The block layer will only
> generate unaligned discard requests that are smaller than the alignment".

Sure, if the maintainer wants a respin.

> 
>> +        return -ENOTSUP;
>> +    }
>> +
>>      qemu_co_mutex_lock(&s->lock);
>>      ret = qcow2_discard_clusters(bs, offset, count >> BDRV_SECTOR_BITS,
>>                                   QCOW2_DISCARD_REQUEST, false);
>> diff --git a/block/sheepdog.c b/block/sheepdog.c
>> index 1fb9173..4c9af89 100644
>> --- a/block/sheepdog.c
>> +++ b/block/sheepdog.c
>> @@ -2829,8 +2829,9 @@ static coroutine_fn int 
>> sd_co_pdiscard(BlockDriverState *bs, int64_t offset,
>>      iov.iov_len = sizeof(zero);
>>      discard_iov.iov = &iov;
>>      discard_iov.niov = 1;
>> -    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
>> -    assert((count & (BDRV_SECTOR_SIZE - 1)) == 0);
>> +    if (!QEMU_IS_ALIGNED(offset | count, BDRV_SECTOR_SIZE)) {

Again, works because this is a power of 2.

>> +        return -ENOTSUP;
>> +    }
> 
> Out of interest: Where does sheepdog tell the block layer that requests
> have to be aligned to this value?

It's Magic !  Don't tell anyone that I told you :)

Actually, it's because sheepdog still uses .bdrv_co_readv
(sector-based), and not .bdrv_co_preadv (byte-based), so the block layer
automatically sets bs->bl.request_alignment to BDRV_SECTOR_SIZE for
sheepdog and all other old-school drivers.  The block layer then treats
bs->bl.request_alignment as the very minimum that can be changed in the
image at a time, so it only makes sense that sheepdog can't react to a
discard request aligned below those limits.

This code is weakening from an assertion to an early error return, and
then 5/9 is what starts calling the code even with something aligned
smaller than a sector.

Someday the sheepdog driver may be relaxed to implement byte-based
callbacks, and then we may want to delete the early error return of
-ENOTSUP for requests smaller than 512; but that depends on whether
sheepdog uses bytes or sectors over the wire.

> 
> With this patch, it doesn't matter though, it only did before, so:
> 
> Reviewed-by: Max Reitz <address@hidden>
> 
>>      acb = sd_aio_setup(bs, &discard_iov, offset >> BDRV_SECTOR_BITS,
>>                         count >> BDRV_SECTOR_BITS);
>>      acb->aiocb_type = AIOCB_DISCARD_OBJ;
>>
> 
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]