qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] The unholy encrypted image key mess


From: Markus Armbruster
Subject: Re: [Qemu-devel] The unholy encrypted image key mess
Date: Wed, 05 Mar 2014 09:15:08 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.2 (gnu/linux)

Eric Blake <address@hidden> writes:

> On 02/28/2014 02:01 PM, Markus Armbruster wrote:
>> The encrypted images Saturday afternoon hack is a gift that keeps on
>> giving.  I wish we could rip it out and bury it deep.  Or that I could
>> continue to ignore it.  Unfortunately, it looks like I need to touch it
>> to clean up error propagation in the monitor.  So here goes.
>> 
>> A naive user might expect QEMU to require the password right when he
>> asks QEMU to open an encrypted image.  But that's not how it works.
>> 
>> Two bits in BlockDriverState apply: encrypted and valid_key.
>> 
>> Both are unset when BlockDriverState has no image, i.e. after bdrv_new()
>> and bdrv_close().
>> 
>> Opening an image sets encrypted iff the image is encrypted, in driver
>> method bdrv_open().  valid_key remains unset.  Two states: NOKEY and
>> NEEDKEY.  NEEDKEY is the troublemaker.
>> 
>> bdrv_set_key() sets a key.  Fails unless encrypted is set, i.e. it
>> requires state NEEDKEY.  Good.  Also fails if the block driver rejects
>> the password, but no driver does that.  Otherwise, it sets valid_key.
>> Call this state GOTKEY.
>> 
>> The user can open an image via command line, HMP commands change,
>> usb_add (legacy), drive_add, and QMP command blockdev-add.  They all
>> create BlockDriverStates in states NOKEY or NEEDKEY.
>
> Uggh - so there's no current way to hot-plug a device in state GOTKEY
> short of using a two-command sequence?  It would be nicer if hot-plug
> had a way to fail to add encrypted devices unless the user also passes
> the password at the same time, creating the device directly into the
> GOTKEY state.

I can't see why QMP commands would ever want to create in state NEEDKEY.
We could easily avoid it there: give QMP commands creating
BlockDriverStates an optional password parameter, fail the command if
the BDS is encrypted and the password parameter is missing.

QMP command change is crap, and should be replaced rather than fixed.

QMP command blockdev-add is new.  Can we still fix it?  Kevin?

For HMP, we need to make up our minds how to do passwords.

The current way is to tie NEEDKEY to "guest paused".  I hate that.

Another way is to make the commands adding BDS prompt for necessary
passwords.  We still have to deal with state NEEDKEY while we're waiting
for the user's reply.  Need to take care to hide the new BDS.  Create it
anonymous, and publish it only after setting the key?

We'd have to do the same for the command line, of course.

Incompatible change, but since this stuff doesn't really work and really
shouldn't be used...

>> The user can set a key with HMP / QMP command block_passwd.  If it
>> succeeds, state goes from NEEDKEY to GOTKEY.
>> 
>> You can't unpause a guest while a BlockDriverState is in state NEEDKEY:
>> 
>> * QMP command cont fails if any BlockDriverState is in state NEEDKEY.
>> 
>> * HMP command cont tries to be cute then: it picks a BlockDriverState in
>>   state NEEDKEY and prompts for its key.  If this is the only one in
>>   state NEEDKEY, it unpauses the guest as ordered.  Else it silently
>>   doesn't.  The user is expected to know to rerun cont for the next key
>>   prompt.
>
> Arguably, the HMP behavior could be made smarter, to do the loop itself.

Yes, if we stick to the "NEEDKEY tied to guest paused" model, we should
do that.

>> This might make you think that the guest is protected from ever having a
>> block device backed by a BlockDriverState in state NEEDKEY.  Not true!
>> Simply open an image while the guest runs, and hot-plug a device backed
>> by it.  The guest will happily read encrypted garbage from it, and if it
>> writes anything to it, it's not so encrypted anymore.  To add insult to
>> injury, the "protection" kicks in next time you pause the guest.

This one is distinct from the change flaws below, and at least as
serious.

>> Changing media has its wrinkles, too:
>> 
>> * HMP command change first closes the old image, then opens the new
>>   image, and if it's encrypted, it asks for a key.  Okay.
>> 
>> * QMP command change first opens the image, then errors out if it's
>>   encrypted and the password argument is missing, or it's not encrypted
>>   and the password argument is present.  These two errors aren't really
>>   failures; the change succeeds just fine.
>
> Indeed, the 'change' command specifically documents that using 'change'
> on a running guest with an encrypted image is ill-advised, because
> there's no password provided along with the 'change' action, so the user
> is forced to do a 2-command sequence where any window with the guest
> running is hosed before the followup 'block_passwd'.

Mitigating factor: the device model (and thus the guest) is notified of
the eject right away, but it's notified of the load only when the key
gets set.  I figure this reduces, but does not eliminate the risk of
disaster.

> Question: when changing from one encrypted image to another, does the
> second image inherit the password from the first (whether or not it
> decrypts the second), or do we always clean state back to having no
> known password?  The way you worded it, it sounds like the password is
> inherited from the first media to the second?

Clean slate.  Ejecting media puts the BDS into a "has no image" state.
"Both [encrypted and valid_key] are unset when BlockDriverState has no
image".

>>   Clients can detect the former non-failure error (it's class
>>   "DeviceEncrypted"), and use block_passwd to go to state GOTKEY.
>
> This makes sense whether you go from unencrypted to encrypted media, or
> if going to an encrypted media always wipes out the old password state.

Clients need to know that DeviceEncrypted is not a failure in the sense
that changing media failed.  It succeeded, but the new media still needs
a key.

Good command design avoids failing after a visible state transition.

>>   Clients can't reliably distinguish the latter non-failure error from
>>   real errors, because it's class "GenericError".
>
> It seems like this error (providing a password argument for a
> non-encrypted disk) is not possible via the 'change' command.  How
> exactly is this state even triggered?

You're right; this code in qmp_bdrv_open_encrypted() is currently
unreachable.

>> In any case, callback bdrv_dev_change_media_cb() is delayed for
>> encrypted images until we enter state GOTKEY.  As afar as I can tell,
>> nothing stops the device model from reading or writing before it gets
>> the callback, but that would be a device model bug.
>
> And one that the qapi docs specifically warn against.
>
>> 
>> The final funny is device model usb-storage (another one that
>> desperately needs to be buried deep).  Its init() callback
>> usb_msd_initfn_storage() tries to be cute when it detects state NEEDKEY.
>> 
>> If it's running in monitor context (say in HMP/QMP command device_add),
>> it attempts to ask for a key.  In HMP context, it unplugs itself when
>> this fails (I think).  In QMP context, it behaves similar to change: it
>> works, but you get a "DeviceEncrypted" error, and the backend remains in
>> state NEEDKEY.
>> 
>> If it's not running in monitor context, it clears autostart.  No idea
>> why it does that, and I'm not sure it has any effect.  Opening an
>> encrypted image clears autostart already, in blockdev_init().

Can we get rid of this, and make usb-storage behave like any other
hot-pluggable block device model?

>> Thankfully, usb-storage is the only device model that messes with keys.
>> 
>> Questions:
>> 
>> 1. Should we protect guests from state NEEDKEY?
>
> Ideally, yes.
>
>> 
>> 2. If yes, how?
>> 
>>    Pause the guest when something enters state NEEDKEY?  I'd hate that.
>> 
>>    Fail device_add in state NEEDKEY?  Takes care of hot-plug, and
>>    cold-plug is already protected by cont.
>
> The existing QMP 'change' command is stupid.  Add a new 'disk-change'
> command that takes a password argument, and require that the new media
> either be unencrypted (and error if a password was provided) or fully
> reach the GOTKEY state (and error if no password was provided) for the
> command to successfully change the media.

The basic operations are "eject" (open tray, do not touch medium),
"load" (close tray, do not touch medium), "remove-medium" (tray must be
open) and "insert-medium" (likewise).

If more convenience is wanted in QMP, build a command on top that
combines the three into whatever behavior is deemed convenient.

Agree that insert-medium should fail if the BDS is encrypted and the
password argument is missing.  The command should have no effect when it
fails that way.  Easy, because insert-medium does just one thing.  Not
so easy for a command that does several things in a sequence.

I don't care for rejecting a superfluous password argument.  If we do
reject it, I again want "command has no effect when it fails".

>                                            Similarly for hotplug.  That
> is, enforce that hot-plug can never make a NEEDKEY block device visible
> to the guest.

Thus, blockdev-add should match insert-medium: optional parameter
password, fail when the BDS is encrypted and the password is missing,
fail when the BDS is unencrypted and the password is present (if we want
that), command has no effect when it fails.

>> Other bright ideas?
>
> Use the fact that we are calling the next release "2.0" to actually kill
> qemu disk encryption as a horribly botched feature, on the grounds that
> we are doing users a favor by not letting them use broken encryption?

I'd love to wipe the encrypted image slate clean and start over!

Keep encrypted images working in qemu-img, of course, so users can
salvage their "encrypted" data.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]