[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 2/6] block: truncate: Don't make backing file data visible
From: |
Max Reitz |
Subject: |
Re: [PATCH v2 2/6] block: truncate: Don't make backing file data visible |
Date: |
Thu, 21 Nov 2019 16:25:13 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.1.1 |
On 21.11.19 15:33, Kevin Wolf wrote:
> Am 21.11.2019 um 13:21 hat Max Reitz geschrieben:
>> On 21.11.19 12:34, Kevin Wolf wrote:
>>> Am 21.11.2019 um 09:59 hat Max Reitz geschrieben:
>>>> On 20.11.19 19:44, Kevin Wolf wrote:
>>>>> When extending the size of an image that has a backing file larger than
>>>>> its old size, make sure that the backing file data doesn't become
>>>>> visible in the guest, but the added area is properly zeroed out.
>>>>>
>>>>> Consider the following scenario where the overlay is shorter than its
>>>>> backing file:
>>>>>
>>>>> base.qcow2: AAAAAAAA
>>>>> overlay.qcow2: BBBB
>>>>>
>>>>> When resizing (extending) overlay.qcow2, the new blocks should not stay
>>>>> unallocated and make the additional As from base.qcow2 visible like
>>>>> before this patch, but zeros should be read.
>>>>>
>>>>> A similar case happens with the various variants of a commit job when an
>>>>> intermediate file is short (- for unallocated):
>>>>>
>>>>> base.qcow2: A-A-AAAA
>>>>> mid.qcow2: BB-B
>>>>> top.qcow2: C--C--C-
>>>>>
>>>>> After commit top.qcow2 to mid.qcow2, the following happens:
>>>>>
>>>>> mid.qcow2: CB-C00C0 (correct result)
>>>>> mid.qcow2: CB-C--C- (before this fix)
>>>>>
>>>>> Without the fix, blocks that previously read as zeros on top.qcow2
>>>>> suddenly turn into A.
>>>>>
>>>>> Signed-off-by: Kevin Wolf <address@hidden>
>>>>> ---
>>>>> block/io.c | 32 ++++++++++++++++++++++++++++++++
>>>>> 1 file changed, 32 insertions(+)
>>>>
>>>> Zeroing the intersection may take some time. So is it right for QMP’s
>>>> block_resize to do this, seeing it is a synchronous operation?
>>>
>>> What else would be right? Returning an error?
>>
>> Going through a deprecation cycle.
>
> And keeping the buggy behaviour for two more releases?
This sounds like you don’t care about adding another bug when it means
fixing this bug. I do care. And so all I was saying is that it seemed
problematic to me to fix the problem in this way, because clearly this
would make block_resize block the monitor in some cases and clearly that
would be a problem, if not a bug.
(And that’s precisely what can be said about the current block_resize
behavior of revealing previous backing file data: It is a problem, but I
wouldn’t call it a full-on bug. It just seems to me that nobody has
ever really thought about it.)
Also, I don’t see this as a really pressing issue for block_resize at
least, because this isn’t a recent regression or anything. It was just
the behavior we had, I believe everyone knew it, we just never thought
about whether it really is the best kind of behavior. So, yes, I’m in
absolutely no hurry to fix this for block_resize. (Commit is a
different story, but then again commit is a job already, so it doesn’t
suffer from the blocking issue.)
But of course this wasn’t all that you’re saying, you give a very good
point next.
>>> Common cases (raw and qcow2 v3 without external data files) are quick
>>> anyway.
>>
>> Well, but quick enough for a fully blocking operation?> For qcow2,
>> block_resize can already block for a relatively long time
> while metadata tables are resized, clusters are discarded etc. I just
> don't really see the difference in quality between that and allocating
> some zero clusters in a corner case like having a short overlay.
Discarding cropped clusters when shrinking is a good point. I didn’t
think of that. So block_resize already has the very problem I was
afraid you’d introduce, because of course discarding isn’t very
different from quick-zeroing.
> Would you feel more comfortable if we set BDRV_REQ_NO_FALLBACK and
> return an error if we can't zero out the area? We would have to
> advertise that flag in bs->supported_zero_flags for qcow2 then (but
> probably we should do that anyway?)
Hm. I don’t feel that bad about disallowing this edge case (growing a
shrunken overlay) for potentially all non-qcow2v3 formats. I don’t know
how bad it would be for users other than block_resize, though.
(And I suppose we want a resize job anyway then, and that would work for
those cases?)
>>>> As far as I can tell, jobs actually have the same problem. I don’t
>>>> think mirror or commit have a pause point before truncating, so they
>>>> still block the monitor there, don’t they?
>>>
>>> Do you really need a pause point? They call bdrv_co_truncate() from
>>> inside the job coroutine, so it will yield. I would expect that this
>>> is enough.
>>
>> OK then.
>>
>>> But in fact, all jobs have a pause point before even calling .run(), so
>>> even if that made a difference, it should still be fine.
>>
>> Good.
>>
>> But I believe this is still a problem for block_resize. I don’t see why
>> this needs to be fixed in 4.2-rc3. What’s the problem with going
>> through a proper deprecation cycle other than the fact that we can’t
>> start it in 4.2 because we don’t have a resize job yet?
>
> That the behaviour is wrong.
I know a couple of wrong behaviors that won’t be fixed in 4.2.
> For commit it's an image corruptor.
That’s a reason why we need it in 4.2. It’s no reason why we need it
for block_resize.
> For resize, I'll admit that it's
> just wrong behaviour that is probably harmless in most cases, but it's
> still wrong behaviour. It would be a corruptor in the same way as commit
> if it allowed resizing intermediate nodes, but I _think_ the old-style
> op blockers still forbid this. We'd have to double-check this if we
> leave things broken for block_resize.
>> Anyway, so are you suggesting adding another parameter to
> bdrv_co_truncate() that enables wrong, but quicker semantics, and that
> would only be used by block_resize?
That would certainly be a possibility, yes.
I like your suggestion of only allowing it with NO_FALLBACK, but I
suppose we’d want the same parameter for that, too, because users other
than block_resize probably prefer a slow zeroing over an error.
So to me the question then is whether the added complexity is worth it
for an rc3.
Max
signature.asc
Description: OpenPGP digital signature
- [PATCH for-4.2? v2 0/6] block: Fix resize (extending) of short overlays, Kevin Wolf, 2019/11/20
- [PATCH v2 1/6] block: bdrv_co_do_pwrite_zeroes: 64 bit 'bytes' parameter, Kevin Wolf, 2019/11/20
- [PATCH v2 2/6] block: truncate: Don't make backing file data visible, Kevin Wolf, 2019/11/20
- Re: [PATCH v2 2/6] block: truncate: Don't make backing file data visible, Eric Blake, 2019/11/20
- Re: [PATCH v2 2/6] block: truncate: Don't make backing file data visible, Max Reitz, 2019/11/21
- Re: [PATCH v2 2/6] block: truncate: Don't make backing file data visible, Vladimir Sementsov-Ogievskiy, 2019/11/21
- Re: [PATCH v2 2/6] block: truncate: Don't make backing file data visible, Kevin Wolf, 2019/11/21
- Re: [PATCH v2 2/6] block: truncate: Don't make backing file data visible, Max Reitz, 2019/11/21
- Re: [PATCH v2 2/6] block: truncate: Don't make backing file data visible, Kevin Wolf, 2019/11/21
- Re: [PATCH v2 2/6] block: truncate: Don't make backing file data visible,
Max Reitz <=
- Re: [PATCH v2 2/6] block: truncate: Don't make backing file data visible, Alberto Garcia, 2019/11/22
Re: [PATCH v2 2/6] block: truncate: Don't make backing file data visible, Alberto Garcia, 2019/11/22
[PATCH v2 3/6] iotests: Add qemu_io_log(), Kevin Wolf, 2019/11/20
[PATCH v2 5/6] iotests: Support job-complete in run_job(), Kevin Wolf, 2019/11/20
[PATCH v2 4/6] iotests: Fix timeout in run_job(), Kevin Wolf, 2019/11/20
[PATCH v2 6/6] iotests: Test committing to short backing file, Kevin Wolf, 2019/11/20