qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [RFC PATCH 1/2] qcow2: Allow checking and repairing cor


From: Alberto Garcia
Subject: Re: [Qemu-block] [RFC PATCH 1/2] qcow2: Allow checking and repairing corrupted internal snapshots
Date: Tue, 27 Feb 2018 13:41:39 +0100
User-agent: Notmuch/0.18.2 (http://notmuchmail.org) Emacs/24.4.1 (i586-pc-linux-gnu)

On Mon 26 Feb 2018 02:40:08 PM CET, Max Reitz wrote:
>> +                        "L1 table is too large; snapshot table entry 
>> corrupted\n",
>> +                        (fix & BDRV_FIX_ERRORS) ? "Deleting" : "ERROR",
>> +                        sn->id_str, sn->name, sn->l1_size);
>> +                found_corruption = true;
>> +            }
>
> This code assumes the snapshot table itself has been valid.  Why should
> it be when it contains garbage entries?
>
>> +
>> +            if (found_corruption) {
>> +                result->corruptions++;
>> +                sn->l1_size = 0; /* Prevent this L1 table from being used */
>> +                if (fix & BDRV_FIX_ERRORS) {
>> +                    ret = qcow2_snapshot_delete(bs, sn->id_str, sn->name, 
>> NULL);
>
> So calling this is actually very dangerous.  It modifies the snapshot
> table which I wouldn't trust is actually just a snapshot table.  It
> could intersect any other structure in the qcow2 image.  Yes, we do an
> overlap check, but that only protects metadata, and I don't really
> want to see an overlap check corruption when repairing the image;
> especially since this means you cannot fix the corruption.

I think you're right. One thing that could be done is check that the
area where the snapshot table is supposed to be doesn't overlap with any
other data structures.

> I don't quite know myself what to do instead, but I guess my main
> point would be: Before any (potentially) destructive changes are made,
> the user should have the chance of still opening the image read-only
> and copying all the data off somewhere else.  Which of course again
> means we shouldn't prevent the user from opening an image because a
> snapshot is broken.

Ok, you've convinced me. I'll see what I can come up with. I guess
initially we can simply add checks for sn->l1_table_offset and
sn->l1_size in the places where they're used.

Berto



reply via email to

[Prev in Thread] Current Thread [Next in Thread]