qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Image probing: how it can be insecure, and what we coul


From: Markus Armbruster
Subject: Re: [Qemu-devel] Image probing: how it can be insecure, and what we could do about it
Date: Fri, 07 Nov 2014 10:57:36 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Max Reitz <address@hidden> writes:

> On 2014-11-06 at 13:26, Markus Armbruster wrote:
>> Max Reitz <address@hidden> writes:
>>
>>> On 2014-11-04 at 19:45, Markus Armbruster wrote:
>>>> I'll try to explain all solutions fairly.  Isn't easy when you're as
>>>> biased towards one of them as I am.  Please bear with me.
>>>>
>>>>
>>>> = The trust boundary between image contents and meta-data =
>>>>
>>>> A disk image consists of image contents and meta-data.
>>>>
>>>> Example: all of a raw image's contents is image contents.  Leaves just
>>>> file name and attributes for meta-data.
>>>>
>>>> Example: QCOW2 meta-data includes header, header extensions, L1 table,
>>>> L2 tables, ...  The meta-data defines where in the image the actual
>>>> contents is stored.
>>>>
>>>> A guest can access the image contents, not the meta-data.
>>>>
>>>> Image contents you've let an untrusted guest write is untrusted.
>>>>
>>>> Therefore, there's a trust boundary between image contents and
>>>> meta-data.  QEMU has to trust image meta-data, but shouldn't trust image
>>>> contents.  The exact location of the trust boundary depends on the image
>>>> format.
>>>>
>>>>
>>>> = How we instruct QEMU what to trust =
>>>>
>>>> By configuring QEMU to use an image, the user instructs QEMU to trust
>>>> the image's meta-data.
>>>>
>>>> When the user's configuration specifies the image format explicitly, the
>>>> trust boundary is clear.
>>>>
>>>> Else, the trust boundary is ambigous when more than one format is
>>>> possible.
>>>>
>>>> QEMU resolves this ambiguity by picking the first format with the
>>>> highest "score".  Raw format is always possible, and always has the
>>>> lowest score.
>>>>
>>>>
>>>> = How this lets the guest escape isolation =
>>>>
>>>> Unfortunately, this lets the guest shift the trust boundary and escape
>>>> isolation, as follows:
>>>>
>>>> * Expose a raw image to the guest (whether you specify the format=raw or
>>>>     let QEMU guess it doesn't matter).  The complete contents becomes
>>>>     untrusted.
>>>>
>>>> * Reuse the image *without* specifying the raw format.  QEMU guesses the
>>>>     format based on untrusted image contents.  Now QEMU guesses a format
>>>>     chosen by the guest, with meta-data chosen by the guest.  By
>>>>     controlling image meta-data, the malicious guest can access arbitrary
>>>>     files as QEMU, enlarge its storage, and more.  A non-malicious guest
>>>>     can accidentally DoS itself, by writing a pattern probing recognizes.
>>> Thank you for bringing that to my attention. This means that I'm even
>>> more in favor of using Kevin's patches because in fact they don't
>>> break anything.
>> They break things differently.  The difference may or may not matter.
>>
>> Example: innocent guest writes a recognized pattern.
>>
>>    Now: next restart fails, guest DoSed itself until host operator gets
>>    around to adding format=raw to the configuration.  Consequence:
>>    downtime (probably lengthy), but no data corruption.
>>
>>    With Kevin's patch: write fails, guest may or may not handle the
>>    failure gracefully.  Consequences can range from "guest complains to
>>    its logs (who cares)" via "guest stops whatever it's doing and refuses
>>    to continue until its hardware gets fixed (downtime as above)" to
>>    "data corruption".
>
> You somehow seem convinced that writing to sector 0 is a completely
> normal operation. For x86, it isn't, though.
>
> There are only a couple of programs which do that, I can only think of
> partitioning and setting up boot loaders. There's not a myriad of
> programs which would increase the probability of one both writing a
> recognizable pattern *and* not handling EPERM correctly.
>
> I see the probability of both happening at the same time as extremely
> low, not least because there are only a handful of programs which
> access that sector.
>
>> Example: innocent guest first writes a recognized pattern, then
>> overwrites it with a non-recognized pattern.
>>
>>    Now: works.
>>
>>    With Kevin's patch: as above.
>
> Not really, if the guest overwrites the data with a non-recognized
> pattern after EPERM it works as well. The difference here is that you
> won't have the intended data in the meantime.

That's roughly "guest complains to its logs (who cares)" from the text
referenced by "as above".  The other consequences are possibly here,
too.

>> Again, I'm not claiming the differences are serious in practice, only
>> that they exist.
>
> True.
>
>>>> This is CVE-2008-2004.
>>>>
>>>>
>>>> = Aside: other trust boundaries =
>>>>
>>>> Of course, this is not the only trust boundary that matters.  For
>>>> instance, there's normally one between your host and somebody else's
>>>> computers.  Telling QEMU to trust meta-data of some image you got "from
>>>> the internet" violates it.  There's nothing QEMU can do about that.
>>>>
>>>>
>>>> = Insecure usage is easy, secure usage is hard =
>>>>
>>>> The oldest stratum of user interfaces doesn't let you specify the image
>>>> format.  Use of raw images with these is insecure by design.  These
>>>> interfaces are still recommended for human users.
>>>>
>>>> Example of insecure usage: -hda foo.img, where foo.img is raw.
>>>>
>>>> With the next generation of interfaces, specifying the image format is
>>>> optional.  Use of raw images with these is insecure by default.
>>>>
>>>> Example of insecure usage: -drive file=foo.img,index=0,media=cdrom,
>>>> where foo.img is raw.  The -hda above is actually sugar for this.
>>>>
>>>> Equivalent secure usage: add format=raw.
>>>>
>>>> Note that specifying just the top image's format is not enough, you also
>>>> have to specify any backing images' formats.  QCOW2 can optionally store
>>>> the backing image format in the image.  The other COW formats can't.
>>> Well, they can, with "json:". *cough*
>> Point coughingly taken.
>>
>>>> Example of insecure usage: -hda bar.vmdk, where bar.vmdk is a VMDK image
>>>> with a raw backing file.
>>> Yesterday I found out that doesn't seem possible. You apparently can
>>> only use VMDK with VMDK backing files.
>> I figure you're referring to this code in vmdk_create():
>>
>>          if (strcmp(bs->drv->format_name, "vmdk")) {
>>              bdrv_unref(bs);
>>              ret = -EINVAL;
>>              goto exit;
>>          }
>
> Yes. Of course it does depend on probing, which figures, I guess.
>
>>>                                         Other than that, we only have
>>> qcow1 and qed as COW formats which should not be used anyway.
>> qemu-doc.texi calls them "old image format", and qemu-img.texi has them
>> under "Other", "for compatibility with older QEMU versions".  I guess we
>> could do better job telling users they "should not be used anyway".
>>
>> Even in old stuff retained just for compatibility, we should make an
>> effort to plug security holes, make secure usage easy, and guide users
>> away from insecure usage.
>
> Of course, I'm just pointing out that it seems to be only qcow1 which
> is fully subject to this. qcow2 is partially, as you're pointing out
> next.
>
>> Now back to the point I was trying to make in my original message.
>>
>> Replacement example of insecure usage: -hda bar.qcow2, where bar.qcow2
>> is a QCOW2 image with a raw backing file and no backing image format,
>> i.e. created without "-o backing_format=".
>>
>> Then the next paragraph applies:
>>
>>>> Equivalent secure usage: Beats me.  Maybe there's a funky -drive
>>>> backing.whatever to specify the backing image's format.
>> See Kevin's reply for equivalent secure usage.
>>
>>>> With the latest interface blockdev-add, specifying the format is
>>>> mandatory.  Secure, but not really suitable for humans.
>>>>
>>>> Example of secure usage:
>>>>
>>>>       { "execute": "blockdev-add",
>>>>         "arguments": {'options': {
>>>>             'driver': 'raw', 'id':'foo',
>>>>             'file': { 'driver': 'file', 'filename': 'foo.img' } } } }
>>>>
>>>> Insecure usage is easy, secure usage is *hard*.  Even for sophisticated
>>>> users like libvirt developers.  Evidence: libvirt CVE-2010-2237,
>>>> CVE-2010-2238, CVE-2010-2239, and more that didn't get a CVE, like the
>>>> recent accidental probing in drive-mirror.
>>>>
>>>>
>>>> = How can we better guard the trust boundary in QEMU? =
>>>>
>>>> The guest can violate the trust boundary only because
>>>>
>>>> (a) QEMU supports both raw images and image formats, and
>>>>
>>>> (b) QEMU guesses image format from raw image contents, and
>>>>
>>>> (c) given a raw image, guests can change its contents and control a
>>>> future QEMU's format guess.
>>>>
>>>> We can attack one ore more of these three conditions:
>>> I'd like to attack more because any of these steps might be carried
>>> out in another program which thus either becomes vulnerable itself
>>> (which we don't really have to care about, but I'd like to either way)
>>> or which makes qemu vulnerable.
>>>
>>> Having an external program with (a) and (c), this makes qemu
>>> vulnerable if we don't try to forbid (b) or at least make it work
>>> better. Having an external program with (a) and (b), not doing
>>> anything against (c) in qemu makes that external program vulnerable.
>>>
>>>> (a) Outlaw raw images
>>>>
>>>> (b) Don't guess format from untrusted image contents
>>>>
>>>> (c) Prevent "bad" guest writes
>>>>
>>>> Nobody is seriously suggesting we do (a).  It's clearly too late for
>>>> that.  Let's explore the other two in more detail.
>>> And thus I prefer to find and implement solutions for *both* (b) and (c).
>> Quoting myself: "We can attack one ore more of these three conditions".
>
> Yes, and I referred to that by saying "I'd like to attack more".

Understood.

>>>> == Don't guess format from untrusted image contents ==
>>>>
>>>> Several variations of the theme.
>>>>
>>>> Guessing only happens when the user doesn't specify a format, so the
>>>> simplest way to avoid it would be requiring users to always specify the
>>>> format.
>>>>
>>>> PRO: Simple, plugs the hole.
>>>>
>>>> CON: Breaks a lot of existing usage.  Bye -hda, hello extra typing.
>>>>
>>>> Variation: command line option to opt out of probing completely.
>>>>
>>>> PRO: Simple, plugs the hole.
>>>>
>>>> CON: Insecure by default.
>>>>
>>>> CON: In addition to opting out, you have to update your usage
>>>> accordingly.  I guess libvirt would do it anyway, to guard against
>>>> accidental probing once and for all.
>>>>
>>>> Variation: Stefan proposed to make format= mandatory just for -drive.  I
>>>> guess he rather meant mandatory for anything but -hda and other selected
>>>> convenience interfaces.
>>>>
>>>> PRO: No extra typing in the cases where it makes the most difference.
>>>>
>>>> CON: Breaks existing usage in the other cases, extra typing.
>>>>
>>>> CON: Doesn't fully plug the hole.  Users of convenience interfaces may
>>>>       remain insecure out of ignorance.  We could add a warning to guide
>>>>       them to more secure usage, but then that warning would annoy users
>>>>       who don't care for security (sometimes with reason), and we can't
>>>>       have that.  So we silently leave the users who would care if they
>>>>       knew insecure.
>>>>
>>>> I proposed something less radical, namely to keep guessing the image
>>>> format, but base the guess on trusted meta-data only: file name and
>>>> attributes.
>>> You actually want to completely abolish probing? I thought we just
>>> wanted to use the filename as a second source of information and warn
>>> if the contents and the extension don't seem to match; and in the
>>> future, maybe make it an error (but we don't have to discuss that
>>> second part now, I think).
>> Yes, I propose to ditch it completely, after a suitable grace period.  I
>> tried to make that quite clear in my PATCH RFC 2/2.
>>
>> Here's why.
>>
>> Now: 1. probe
>>       4. open, error out if meta-data is bad
>>
>> With my proposed patch:
>>       1. probe
>>       2. guess from trusted meta-data
>>       3. warn unless 1 and 2 match
>>       4. open, error out if meta-data is bad
>>
>> Now make the warning an error:
>>       1. probe
>>       2. guess from trusted meta-data
>>       3. error out unless 1 and 2 match
>>       4. open, error out if meta-data is bad
>>
>> I figure the following is equivalent, but simpler:
>>
>>       2. guess from trusted meta-data
>>       4. open, error out if meta-data is bad
>>
>> because open should detect all the errors the previous step 3 caught.
>> If not, things are broken with explicit format=.
>
> You're right, it seems be equivalent. One difference will probably be
> error messages ("Bad signature" vs. "Filename extension and format do
> not match").

Yes.

> The other difference is that you have a problem if you cannot
> distinguish between two formats by extensions, as we've seen
> already. .qcow could mean both qcow1 and qcow2; a similar problem
> appeared with .vhd. Opening the image using all possible formats seems
> bad to me. Better swap 1 and 2: Guess from commonly trusted metadata
> (i.e. the filename extension) and then probe all possible formats.

Fair enough.

>>>> Block and character special files are raw.  For other
>>>> files, find the file name extension, and look up the format claiming it.
>>>>
>>>> PRO: Plugs the hole.
>>> You mean "plugs hole (b)".
>> What I (airily) call "the hole" is the scenario I described above: guest
>> escaping isolation by subverting qemu-system-*.
>>
>> (b) is not a hole, it's a condition for "the hole".
>
> Your "the hole" is only one hole. There are more holes. You only
> consider the case where there's only qemu, which is technically fine
> for discussion on qemu-devel, but I'd like to consider having other
> tools beside qemu as well.
>
> You're right, though, it should be "prevents condition (b)".
>
>> I guess what you want to say is "attacking just condition (b) doesn't
>> plug some other holes I care about".  That's true.  There are indeed
>> other holes.
>>
>> For instance, guest attacking QEMU's utility programs qemu-img, ...  Or
>> guest attacking other host software.  I'm not trying to discount any of
>> them.  But tangling all problems up into a hairball won't do us any
>> good.
>
> We have two approaches and I think using both will help to plug more
> holes. I'm not saying we should consider every possible hole with
> every host configuration there may be. I'm just saying using only of
> the approaches clearly only plugs "the hole" if there's only qemu.
>
>> So your point that my analysis is narrow is taken.  In my defense, I can
>> say that my narrow analysis was difficult enough to write up, and
>> probably produced enough text to deter readers.
>>
>>>> CON: Breaks existing usage when the new guess differs from the old
>>>>       guess.  Common usage should be fine:
>>>>
>>>>       * -hda test.qcow2
>>>>
>>>>         Fine as long as test.qcow2 is really QCOW2 (as it should!), and
>>>>         either specifies a backing format (as it arguably should), or the
>>>>         backing file name is sane.
>>>>
>>>>       * -hda disk.img
>>>>
>>>>         Fine as long as disk.img is really a disk image (as it should).
>>>>
>>>>       * -hda /dev/mapper/vg0-virtdisk
>>>>
>>>>         Fine as long as the logical volume is raw.
>>>>
>>>>       Less common usage can break:
>>>>
>>>>       * -hda nbd://localhost
>>>>
>>>>         Socket provides no clue, so no guess.
>>> nbd should be raw. If it isn't, you're most likely doing something
>>> wrong. See https://bugzilla.redhat.com/show_bug.cgi?id=1090713 what
>>> happens when you're doing it wrong.
>> Okay.
>>
>> My RFC PATCH is too simplistic to exploit that, because it only looks at
>> file name extensions.  But "user asked for nbd protocol" is quite
>> obviously trusted meta-data.  We could have a less simplistic patch
>> putting it to use.  Or adopt the variation below.
>>
>>>>       Weird usage can conceivably break hard:
>>>>
>>>>       * -hdd disk.img
>>>>
>>>>         Breaks hard when disk.img is actually QCOW2, the guest boots
>>>>         anyway from another drive, then proceeds to overwrite this one.
>>> Then why not continue to do probing and using the extension as a safeguard?
>>>
>>>> Mitigation: lengthy transition period where we warn "this usage is
>>>> insecure, and we'll eventually break it; here's a hint on secure usage".
>>>>
>>>> CON: We delay plugging the hole one more time.  But at least we no
>>>> longer expose our users to it silently.
>>>>
>>>> Jeff pointed out that we want to retain probing in things like qemu-img
>>>> info.
>>>>
>>>> Richard asked for a way for users to ask for insecure probing, say
>>>> format=insecure-probe.  I have no problem with giving users rope when
>>>> they ask for it.
>>>>
>>>> Variation: if file name and attributes are unavailable or provide no
>>>> clue, guess raw.  Same PRO and CON as above, only it avoids breaking a
>>>> few more cases.  For instance, "-hda nbd://localhost" keeps working as
>>>> long as the server serves a raw image.
>>> Which it should be.
>>>
>>>> Smells a bit like too much magic
>>>> to me.
>>>>
>>>> My proposal replaces probing wholesale.  I like that because it results
>>>> in simple, predictable guessing.  Here's a hybrid approach: first guess
>>>> raw vs. non-raw based on trusted meta-data only, and if non-raw, probe.
>>>>
>>>> Nothing the guest writes can affect the raw vs. non-raw decision.  Once
>>>> an image is raw, only the user can make it non-raw, by changing its name
>>>> or attributes.
>>>>
>>>> Two variations: 1. guess raw without a clue, and 2. guess non-raw then.
>>>>
>>>> Again, same PRO and CON as above, only it doesn't break when users give
>>>> their non-raw images weird names.
>>>>
>>>> == Prevent "bad" guest writes ==
>>>>
>>>> Again, several variations, but this time, only the last one is serious,
>>>> the others are just for illustration.
>>>>
>>>> Fail guest writes to those parts of the image that probing may examine
>>>> Can fail only writes to the first few sectors (at worst) of raw images.
>>>>
>>>> PRO: Plugs the hole.
>>>>
>>>> CON: The virtual hardware is defective.  Breaks common guest software
>>>> that writes to the first few sectors, such as boot loaders and
>>>> partitioning tools.  Breaks guest software using the whole device, which
>>>> isn't common, but certainly not unheard of.
>>>>
>>>> Variation: fail only writes of patterns that actually can make probing
>>>> guess something other than raw.
>>>>
>>>> PRO: Still plugs the hole.
>>> You mean "plugs hole (c)".
>> My reply to "plugs hole (b)" applies.
>>
>>>> CON: Except when you upgrade to a version that recognizes more patterns.
>>> Which is better than not plugging hole (c) at all.
>>>
>>>> CON: The virtual hardware is still defective, but the defects are
>>>> minimized.
>>> As you pointed out to us it's already defective and I don't think
>>> anybody ever noticed accidentally.
>> You're right in that probed raw is already defective, with defects
>> delayed to the next restart.  Preventing "bad" guest writes changes the
>> nature of the defects subtly, as I described above.
>
> Sector 0 is rarely ever written. It's not that people write some
> recognizable sequences there all the time but don't notice because
> they are quickly overwritten again.
>
> If nobody hit the problem accidentally until now, I'm certain that
> means that nobody ever wrote any recognizable sequence there while
> using qemu and raw (which is not too rare a combination).
>
>> This and the previous variation also extends them to non-probed raw.
>> The following variations avoid the extension.
>>
>>>> We can hope that partition tables, boot sectors and such
>>>> won't match the patterns, so common guest software hopefully works.
>>> It's worked in the past, that's good enough for me.
>>>
>>>> Guest software using the whole device still breaks, only now it breaks
>>>> later rather than sooner.
>>>>
>>>> Variation: fail writes only on *probed* raw images.
>>>>
>>>> CON: Doesn't fully plug the hole: mixing probed usage (user doesn't
>>>> specify format) with non-probed usage (user specifies format) remains
>>>> insecure.  The guest's write succeeds in non-probed usage, and the guest
>>>> escapes isolation in the next probed usage.
>>>>
>>>> CON: The virtual hardware is still defective, but it now comes with a
>>>> "defective on/off" switch, factory default "defective on".  We could add
>>>> a warning to guide users to switch defective off but then that warning
>>>> would annoy people who don't care to switch it off (sometimes with
>>>> reason), and we can't have that.  So we leave users who would care if
>>>> they knew in the dark.
>>>>
>>>> The two variations can be combined.  This is Kevin's proposal.
>>>>
>>>> CON: Doesn't fully plug the hole: union of both variations' flaws.
>>>>
>>>> CON: The virtual hardware is still defective: interesection of both
>>>> variations' defects.
>>>>
>>>>
>>>> = Conclusion =
>>>>
>>>> This is *my* conclusion.  Yours may be different, and that's okay.  It's
>>>> business, not personal ;)
>>>>
>>>> Secure usage should not be hard.
>>>>
>>>> If we permit insecure usage for convenience or compatibility, we should
>>>> warn the user, unless he clearly asked for it.  Even if that warning
>>>> annoys Kevin and Max.
>>> A warning does not annoy me as long as I know what it means.
>> Good :)
>>
>>>> If you want to suppress it with configure
>>>> --reckless or something, no objections.
>>>>
>>>> Same for defective virtual hardware.
>>>>
>>>> Kevin's proposal to prevent "bad" guest writes has relatively small
>>>> compatibility issues.  It improves things from "insecure by default" to
>>>> "slightly defective by default" as long as you use raw images either
>>>> always probed or always non-probed.  It doesn't help at all when you
>>>> alternate probed and non-probed usage.  Improvement of sorts, but it
>>>> still fails my "secure usage should not be hard" requirement, and that
>>>> requirement is a hard one for me.
>>>>
>>>> My proposal to ditch image contents probing entirely has more serious
>>>> compatibility issues.  In particular, we'd have to forgo sugared
>>>> convenience syntax for a number of less common things.  It definitely
>>>> needs a grace period where all usage we're going to break warns.  On the
>>>> up side, it will actually be secure by default when it's done.
>>>>
>>>> If this is not acceptable, my second choice is actually the command line
>>>> option to opt out of probing completely.  This doesn't address "insecure
>>>> by default" (sadly), but it does at least satisfy my "secure usage
>>>> should not be hard" requirement.
>>>>
>>>> If we should adopt Kevin's proposal against my objections, then I very
>>>> badly want the opt out option on top of it, opting out of both probing
>>>> and "bad" write prevention.
>>>>
>>>> Thanks for hearing me out.
>>> My conclusion: Don't ditch probing. It increases entropy, why would
>>> you ditch probing? Just combine it with the extension and if both
>>> don't seem to match, that's an error.
>> How does probing add value?
>>
>> Compare
>>
>>       1. probe
>>       2. guess from trusted meta-data
>>       3. error out unless 1 and 2 match
>>       4. open, error out if meta-data is bad
>>
>> to just
>>
>>       2. guess from trusted meta-data
>>       4. open, error out if meta-data is bad
>>
>> Let P be the driver recommended by probe, and G be the driver
>> recommended by guess.
>>
>> If P == G, same result: we open with the same driver.
>>
>> Else, if open with G fails, equivalent result: error out in step 3
>> vs. error out in step 4.
>>
>> Else, we have an odd case: one driver's probe accepts (P's), yet a
>> different driver's open succeeds (G's).
>>
>>      If G's probe rejects: is this a bug?  Shouldn't open always fail
>>      when probe rejects?
>>
>>      If G's probe accepts, then probing chose P over G.  Maybe it
>>      shouldn't have.  Or maybe the probing functions are imprecise.
>>      Anyway, keeping probing around makes this an error.  Should it be
>>      one?
>>
>> Am I missing something?
>
> No, see my reply above. I failed to consider that opening an image
> basically is advanced probing.
>
> The only thing I really see which could be missing is the problem of
> having ambiguous filename extensions, and that problem can easily be
> solved by keeping probing.
>
>>> Also, holes (b) and (c) are two different holes. We should fix
>>> both. We should fix (b) so qemu isn't vulnerable and we should fix (c)
>>> so qemu doesn't make other programs which do probe vulnerable.
>> Adjusting terminology, but hopefully not your intent: "the hole" isn't
>> the only hole that matters.  The conditions enable other holes.
>> Therefore, we should attack both (b) and (c).  Attacking (a) is not
>> practical.
>
> Yes. that's what I meant. There are various holes and preventing only
> one of the conditions (b) and (c) only plugs one (or a subset) of
> them, and I can easily see unplugged holes which can be fixed by
> preventing both.

No misunderstanding then.

>> Points taken, except I think we could attack (a) if we really wanted.
>
> Of course, but for that we'd either need some flat qcow2 mode or
> better support for an image format that does have such a flat mode.

Yes.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]