From: ronnie sahlberg
Subject: Re: [Qemu-devel] [PATCH 4/4] qemu-img: conditionally discard target on convert
Date: Wed, 17 Jul 2013 08:54:46 -0700

On Wed, Jul 17, 2013 at 3:25 AM, Kevin Wolf <address@hidden> wrote:
> On 17.07.2013 at 11:58, Paolo Bonzini wrote:
>> On 17/07/2013 10:46, Kevin Wolf wrote:
>> > On 15.07.2013 at 12:49, Peter Lieven wrote:
>> >> if a destination has has_zero_init = 0, but it supports
>> >> discard zeroes, use discard to convert the target
>> >> into an all-zero device.
>> >>
>> >> Signed-off-by: Peter Lieven <address@hidden>
>> >
>> > Wouldn't it be better to use bdrv_write_zeroes() and extend the
>> > implementation of that to use discard internally in those block drivers
>> > where it makes sense?
>> >
>> > Because here you're not really discarding (i.e. don't care about the
>> > sectors any more), but you want them to be zeroed.
>>
>> I thought the same yesterday when reviewing the series, but I'm not
>> convinced.
>>
>> Discarding is not always the right way to write zeroes, because it can
>> disrupt performance.  It may be fine when you are already going to write
>> a sparse image (as is the case for qemu-img convert), but not in
>> general.  So if you just used write_zeroes, it would have to fall under
>> yet another -drive option (or an extension to "-drive discard").
>
> Maybe we need a flag to bdrv_write_zeroes() that specifies whether to
> use discard or not, kind of like the unmap bit in WRITE SAME.
>
> On the other hand, I think you're right that this is really policy,
> and as such shouldn't be hardcoded based on our guesses, but be
> configurable.
>
>> I think what Peter did is a good compromise in the end.
>
> At the very least it must become a separate function. img_convert() is
> already too big and too deeply nested.
>
>> BTW, Peter and Ronnie: we were assuming that UNMAP with LBPRZ=1 always
>> zeroes blocks, but is that true for unaligned operations?
>
> SBC-3 Rev. 35e, 5.16 READ CAPACITY (16) command:
>
>     "A logical block provisioning read zeros (LBPRZ) bit set to one
>     indicates that, for read commands specifying an unmapped LBA (see
>     4.7.4.5), the device server returns user data set to zero [...]"
>
> So it depends on the block provisioning state of the LBA, not on the
> operations that were performed on it.
>
> 5.28 UNMAP command:
>
>     If the ANCHOR bit in the CDB is set to zero, and the logical unit is
>     thin provisioned (see 4.7.3.3), then the logical block provisioning
>     state for each specified LBA:
>
>     a) should become deallocated;
>     b) may become anchored; or
>     c) may remain unchanged.
>
> So with UNMAP, I think you don't have any guarantees that the LBA
> becomes unmapped and therefore zeroed. It could just keep its current
> data. No matter whether your request was aligned or not.
>

If you check the Block Limits VPD page, it has:

---
The UNMAP GRANULARITY ALIGNMENT field indicates the LBA of the first
logical block to which the OPTIMAL UNMAP GRANULARITY field applies.
The unmap granularity alignment is used to calculate an optimal unmap
request starting LBA as follows:

optimal unmap request starting LBA =
    (n × optimal unmap granularity) + unmap granularity alignment

where:
n is zero or any positive integer value;
optimal unmap granularity is the value in the OPTIMAL UNMAP GRANULARITY
field; and
unmap granularity alignment is the value in the UNMAP GRANULARITY
ALIGNMENT field.

An unmap request with a starting LBA that is not optimal may result in
unmap operations on fewer LBAs than requested.
---

So if Peter checks OPTIMAL_UNMAP_GRANULARITY and
UNMAP_GRANULARITY_ALIGNMENT, makes sure that the starting LBA is
(n * OUG) + UGA, and that the number of blocks is a multiple of OUG,
then you should be fine. Those unmaps should not silently fail.
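
(Just to illustrate the alignment math; this is not code from Peter's
patch, and the helper name and signature are made up for the example.)

#include <stdint.h>

/*
 * Clamp [lba, lba + num_blocks) to an "optimal" UNMAP request as
 * described by the Block Limits VPD page: start at (n * OUG) + UGA and
 * span a multiple of OUG blocks.  Returns the aligned length in blocks,
 * or 0 if nothing in the range can be unmapped optimally.
 */
static uint64_t align_unmap_request(uint64_t lba, uint64_t num_blocks,
                                    uint32_t oug, uint32_t uga,
                                    uint64_t *aligned_lba)
{
    uint64_t start = lba;
    uint64_t end = lba + num_blocks;
    uint64_t len;

    if (oug == 0) {
        /* No granularity reported; treat the whole request as optimal. */
        *aligned_lba = lba;
        return num_blocks;
    }

    /* Round the start up to the next (n * OUG) + UGA boundary. */
    if (start < uga) {
        start = uga;
    } else if ((start - uga) % oug) {
        start += oug - (start - uga) % oug;
    }

    if (start >= end) {
        return 0;
    }

    /* Round the length down to a whole number of OUG-sized chunks. */
    len = ((end - start) / oug) * oug;

    *aligned_lba = start;
    return len;
}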


I think it is reasonable to assume that IF LBPRZ==1 and IF it is an
"optimal unmap request", then the blocks will become unmapped and will
read back as 0.
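
(For reference, SBC-3 returns the LBPME and LBPRZ bits in byte 14 of
the READ CAPACITY (16) parameter data; a minimal sketch of pulling them
out, assuming the parameter data has already been read into "buf", would
look like this.)

#include <stdbool.h>
#include <stdint.h>

struct lbp_info {
    bool lbpme;   /* logical block provisioning (thin provisioning) enabled */
    bool lbprz;   /* unmapped blocks read back as zero */
};

/*
 * Parse the provisioning bits out of READ CAPACITY (16) parameter data
 * ("buf" must hold at least the first 16 bytes returned by the device).
 */
static struct lbp_info parse_read_capacity_16(const uint8_t *buf)
{
    struct lbp_info info;

    /* Byte 14: bit 7 = LBPME, bit 6 = LBPRZ. */
    info.lbpme = (buf[14] & 0x80) != 0;
    info.lbprz = (buf[14] & 0x40) != 0;

    return info;
}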


(Using this check should also make it safe to UNMAP on 4k-block
devices that emulate 512-byte blocks. Those devices should report
OUG==8 accordingly, so you don't need to check what
LogicalBlocksPerPhysicalBlockExponent is.)
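
(Concretely: 4096-byte physical blocks exposed as 512-byte logical
blocks give 4096 / 512 = 8 logical blocks per physical block, so
OUG==8 means an optimal unmap request always covers whole physical
blocks.)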

regards
ronnie sahlberg


