


From: Markus Armbruster
Subject: Re: [Qemu-devel] Image probing: how it can be insecure, and what we could do about it
Date: Tue, 11 Nov 2014 09:28:21 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Jeff Cody <address@hidden> writes:

> On Mon, Nov 10, 2014 at 11:30:25AM +0100, Markus Armbruster wrote:
>> Kevin Wolf <address@hidden> writes:
>> 
>> > Am 10.11.2014 um 09:12 hat Markus Armbruster geschrieben:
>> >> Jeff Cody <address@hidden> writes:
>> >> > So that would mean .img would always require format=, right?
>> >> >
>> >> > That also implies to me that the only extensions for raw that might
>> >> > not require format= would be .iso and .raw.
>> >> 
>> >> .img means what we choose it to mean.
>> >> 
>> >> If we choose "can mean anything, including raw", then .img always
>> >> requires an explicit format with this approach.
>> >> 
>> >> If we choose "means raw", then same as above, except you can omit
>> >> format=raw, and you become prone to opening existing non-raw formats
>> >> raw, which can be bad.
>> >
>> > My current thoughts about .img are that we need to consider that
>> >
>> > (a) it is occasionally used for multiple image formats and making it
>> >     raw unconditionally is going to cause corruption.
>> >
>> > (b) looking at file extensions is absolutely useless if we exclude
>> >     .img from the automatic detection because it's still the main
>> >     extension for raw.
>> >
>> > The common case could probably be covered by bringing probing back into
>> > the game: If an .img file successfully probes for a non-raw format,
>> > error out, avoiding the corruption. If it doesn't, assume raw.
>> 
>> I think this is the "refuse to use a format without an explicit format=
>> when any other non-raw format probe accepts" idea, which should combine
>> nicely with all the other ideas proposed so far.
>> 
>> Combining it with the hybrid approach we're discussing here gets us
>> something like this, assuming .img means raw:
>> 
>> 1. Guess possible formats from trusted meta-data
>> 
>>    For foo.img, this yields { raw }.
>> 
>> 2. Probe to pick one
>> 
>>    Since { raw } has just one member, pick it without probing.
>> 
>> 3. Now probe all other non-raw formats, and error out if any accepts
>> 
>>    Protects users from opening a non-raw foo.img raw.
>>
>
> With the exception of fixed-size VHD.
>
> But this problem exists today, so there isn't anything 'new' broken
> there.  And we can fix that if we want by upgrading probing, to pass
> both the first 512 bytes, and the last 512 bytes of the image file.

Careful.  If the condition "for every member of the set, the image
contents examined by the members' probes lie within that member's trust
boundary" is satisfied, then probing can obviously be trusted to pick a
member of the set.  Else, we don't know without further reasoning.

Current code plus Kevin's patch passes the first 512 bytes to the
probing methods.  Actual methods use only parts of it, but what parts
exactly is currently opaque.  As long as our reasoning on probes doesn't
pierce that opacity, only formats that cannot store untrusted data
within the first 512 bytes satisfy the condition.  "raw" surely violates
it.  Whether there are any others I can't say without a review of all
probing methods.  Throwing in the last 512 bytes is prone to make many
more formats violate the condition.

To check the condition, we need to know each format's trust boundary and
what each probe examines.  If we declare both in code, we can check
automatically.  Or with useful approximations, check a somewhat tighter
condition.  That's why I keep harping on this condition.
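
A minimal sketch of what such an automatic check could look like, in
Python for brevity.  Everything in it is an assumption made up for
illustration: the FORMATS table, the helper names, and the byte ranges
are not taken from QEMU's block drivers.  "trusted" stands for the part
of the image the guest cannot write, "probed" for what the format's
probe examines.

```python
from typing import Dict, Iterable, List, Tuple

Range = Tuple[int, int]  # half-open byte range [start, end)

# Hypothetical declarations, NOT real QEMU data.
FORMATS: Dict[str, Dict[str, List[Range]]] = {
    # raw: the guest controls every byte; the probe examines nothing.
    "raw": {"trusted": [], "probed": []},
    # Illustrative header formats: first sector assumed guest-inaccessible.
    "qcow2": {"trusted": [(0, 512)], "probed": [(0, 8)]},
    "vmdk": {"trusted": [(0, 512)], "probed": [(0, 4)]},
}


def covered(r: Range, trusted: List[Range]) -> bool:
    """Return True if range r lies entirely within the union of trusted."""
    start, end = r
    for t_start, t_end in sorted(trusted):
        if t_start > start:
            break  # gap before the next trusted range
        if t_end > start:
            start = t_end  # covered up to t_end, continue from there
        if start >= end:
            return True
    return start >= end


def probing_trustworthy(candidates: Iterable[str]) -> bool:
    """Check the sufficient condition for a set of candidate formats:
    everything any member's probe examines must lie within every
    member's trust boundary."""
    members = list(candidates)
    probed = [r for f in members for r in FORMATS[f]["probed"]]
    return all(covered(r, FORMATS[f]["trusted"])
               for f in members for r in probed)


print(probing_trustworthy({"qcow2", "vmdk"}))  # condition holds
print(probing_trustworthy({"raw", "qcow2"}))   # raw violates it
```

With these made-up declarations, any set containing both raw and a
format whose probe reads the first sector fails the check, because raw
trusts none of those bytes.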

Let me stress again: the condition is sufficient for trust, but not
necessary.  Here's an example of how probing can trustworthily examine
data outside the intersection of all formats' trusted data: consider a
two-stage probe, where the first stage reliably detects a family of
related formats, and the second stage disambiguates the family members.
Can be trusted even when the second stage examines data outside the
intersection.

The automatic checker sketched above can deal with this: simply declare
only the first stage's data.

>> 1+2 are the hybrid approach, 3 is the "refuse" idea.
>> 
>> Adding 3 is an improvement, from "this usage will now break at runtime,
>> possibly corrupting data" to "this usage will now be rejected cleanly".
>> 
>> 3 should also help catch insufficiently selective probe methods.
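
For concreteness, steps 1 to 3 quoted above could be sketched roughly
like this, again in Python for brevity.  This is only a sketch under
stated assumptions: the extension table, the PROBES table, pick_format()
and NeedExplicitFormat are hypothetical names, and the two magic-number
checks are simplified stand-ins for real probe methods.

```python
import os


class NeedExplicitFormat(Exception):
    """Raised when the format cannot be chosen safely without format=."""


# Step 1 input: hypothetical mapping from trusted meta-data (here, the
# file extension) to possible formats.  Assumes ".img means raw".
EXTENSION_FORMATS = {
    ".img": {"raw"},
    ".iso": {"raw"},
    ".raw": {"raw"},
    ".qcow2": {"qcow2"},
    ".vmdk": {"vmdk"},
}

# Simplified stand-ins for non-raw probe methods: magic number only.
PROBES = {
    "qcow2": lambda head: head[:4] == b"QFI\xfb",
    "vmdk": lambda head: head[:4] == b"KDMV",
}


def pick_format(filename: str, head: bytes) -> str:
    # 1. Guess possible formats from trusted meta-data.
    ext = os.path.splitext(filename)[1].lower()
    candidates = EXTENSION_FORMATS.get(ext, set())
    if not candidates:
        raise NeedExplicitFormat(f"{filename}: unknown extension")

    # 2. Probe to pick one; a single candidate is picked without probing.
    if len(candidates) == 1:
        chosen = next(iter(candidates))
    else:
        accepted = [f for f in sorted(candidates)
                    if f in PROBES and PROBES[f](head)]
        if len(accepted) != 1:
            raise NeedExplicitFormat(f"{filename}: ambiguous probe result")
        chosen = accepted[0]

    # 3. Probe all other non-raw formats, and error out if any accepts.
    for fmt, probe in PROBES.items():
        if fmt != chosen and probe(head):
            raise NeedExplicitFormat(
                f"{filename}: guessed {chosen}, but probes as {fmt}")
    return chosen
```

With these tables, pick_format("foo.img", head) yields "raw" for a
sector of zeros, and raises NeedExplicitFormat when head starts with
the qcow2 magic, which is the "rejected cleanly" behaviour of step 3.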


