qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] qcow2: Check the L1 table parameters from all i


From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH] qcow2: Check the L1 table parameters from all internal snapshots
Date: Fri, 9 Feb 2018 14:48:29 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2

On 2018-02-09 14:35, Alberto Garcia wrote:
> On Fri 09 Feb 2018 02:04:06 PM CET, Max Reitz wrote:
>> I don't like completely refusing to open a qcow2 image if one of the
>> snapshots is invalid, without giving the user any way of fixing it.
>>
>> With this patch, the final two images created in 080 cannot be opened
>> at all (not even with qemu-img info).  Without it, you can, you just
>> can't load the snapshots.  (Well, in one case.  In the other, you can
>> even do that, but that's a bug, as you correctly claim.)
>>
>> More importantly, though, without this patch, you can delete the
>> snapshots.  qemu-img will complain a bit, and you'll have leaks
>> afterwards, but that's nothing qemu-img check can't fix.  With this
>> patch, you can't, because the image cannot be opened at all so it's
>> basically gone for good (unless you get a hex editor).
> 
> What you're saying is correct. The alternative however is to add checks
> everywhere the snapshot table is used.

Yes.

> There's for example qcow2_check_metadata_overlap() that will happily
> allocate a buffer of (l1_size * 8) bytes, that can be up to 32 GB.

It will only do that with QCOW2_OL_INACTIVE_L2, though, which is not
enabled by default and will cost a ton of performance anyway, when used.

(And by the way, I'd be OK with dropping that branch.  Or with
revisiting my good old series that strived to make the overlap checks
faster in general.)

> qcow2_expand_zero_clusters() has the same problem but there g_realloc()
> is used, aborting the process on failure (we should actually use
> g_try_realloc() instead, I'll leave that for a different patch).

And this is only used for qemu-img amend from v3 to v2, offline.  While
aborts are bad, in this case they're not even that bad (because this is
not a common operation at all, and because it's fully offline).

And I'm still not sure how this patch would make anything better.
Without it, qemu-img amend aborts.  Yes, too bad.  So you can still do
qemu-img convert, losing all snapshots, but at least you got something.
(Or we just fix qemu-img amend, that tells you there's something bad
going on, and then you delete the snapshot in question.)

With this patch, your whole data is gone because qemu now refuses to
look at anything in the image.

> If we want to allow opening images with corrupted snapshots perhaps we
> could still do the checks in qcow2_read_snapshots() but instead of
> returning an error add a 'corrupted' flag to QCowSnapshot.

If you think that's simpler than having the proper checks everywhere we
access a snapshot, be my guest.  It does sound reasonable and it might
be nicer for users than having to figure out which snapshot to delete
themselves.

> That solution is not without problems because we'd still need to check
> for that flag everytime the snapshot is used, and that's calling for
> errors in the future.

Well, errors if your file is corrupted.  Not a good thing but I still
prefer it over rejecting files altogether because some non-vital data
structure is corrupted, with no chance of allowing the user to fix the
issue.

An intermediate thing would be to mark the whole image corrupt if some
snapshot is broken and then adding functionality to qemu-img check to
fix that (by deleting the snapshot, if the user has agreed to that).
But whatever we do, there needs to be a way for the user to get all
non-corrupted data off the image.

Max

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]