qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH for-2.9 v2] virtio-crypto: zeroize the key mater


From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH for-2.9 v2] virtio-crypto: zeroize the key material before free
Date: Fri, 9 Dec 2016 14:54:28 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1

On 09.12.2016 02:42, Gonglei (Arei) wrote:
> Hi,
> 
>>
>> Subject: Re: [Qemu-devel] [PATCH for-2.9 v2] virtio-crypto: zeroize the key
>> material before free
>>
>> On 08.12.2016 16:23, Eric Blake wrote:
>>> On 12/07/2016 08:28 PM, Gonglei (Arei) wrote:
>>>
>>>>> As far as I'm aware, other projects usually have a special memset
>>>>> variation for doing this. That is because compilers may choose to
>>>>> "optimize" memset(p, ...) + free(p) to just the free(p). Having a
>>>>
>>>> Actually, I googled this, but I didn't find a definite answer. And
>>>>
>>>> The Linux kernel uses kzfree instead of memset + kfree
>> (mm/slab_common.c).
>>
>> Well, I personally don't mind whether we use a custom zfree() or a
>> custom memset_s() + free(). I only mind that this patch actually always
>> does what it is intended to do.
>>
> Yes, but why linux kernel to think about the compiler optimization for
> memset for sensitive data?

I'm afraid I don't quite (syntactically) understand this question. Do
you mean to ask why the Linux kernel would have to think about this
optimization? My answer to that would be because the optimization of
memset() + free() is known, and they probably want to protect against
the compiler optimizing it even with -ffreestanding and differently
called functions -- you never know.

Related example: gcc detects code that basically does a memset() and
replaces it with a call to memset(). There were a couple of versions
where it did that even with -ffreestanding or -fno-builtin, which is bad
if you want to actually write a memset(). They fixed it by now (so that
-ffreestanding and -fno-builtin will imply
-fno-tree-loop-distribute-patterns, which will disable that optimization.

Now imagine the same case here: Maybe at some point gcc is able to
detect that kfree() is basically free() and will then optimize the
memset() before kfree() away. A couple of versions later, somebody will
notice that -- but at that point, the damage has been done already and
there are compiled versions out which leak sensitive data somewhere.

I think this is why the Linux kernel decides to be proactive and use
kzfree() from the start, and this is also why I'm proposing to not delay
this issue in qemu until some compiler actually makes it a real issue.

>>> If we're worried about cleaning things without allowing the compiler a
>>> chance to optimize, then writing our own qemu_zfree() wrapper may indeed
>>> make sense.  But that won't cover the case in Daniel's earlier patch
>>> (referenced elsewhere in this thread), as that was zeroizing stack
>>> memory (before it went out of scope) rather than heap memory (before
>>> free).  So you'd still need some sort of 'write this memory no matter
>>> what' primitive that would be directly usable on stack memory and
>>> indirectly used as part of the qemu_zfree() wrapper.
>>>
> If we wrap a secure_memset(), then both stack memory and heap
> memory can use it.
> 
> Pls see:
> 
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1381.pdf
> 
> But the secure_memset() is not as efficient as possible due to the nature of 
> the volatile
> type qualifier preventing the compiler from optimizing the code at all.
> 
> It will cause huge performance regression at hot path of data plane...

I would suggest we implement an own secure_memset() or secure_bzero()
which falls back on what we have:
- memset_s(), if that is available;
- explizit_bzero() if we have libbsd;
- SecureZeroMemory() on Windows

Or we have to write it ourselves:
https://sourceware.org/ml/libc-alpha/2014-12/msg00506.html suggests that
putting a full memory barrier after the memset() should be enough.

>>> But I wouldn't worry about it for now, unless someone proves we actually
>>> have a compiler optimizing away the cleanups.
>>
>> It's true that we don't have to worry about it *now*, but the approach
>> of waiting until the compiler breaks it does not seem right to me.
>>
>> If at some point some compiler recognizes g_free() as being the same
>> function as free() and thus optimizes memset() + g_free(), we simply
>> won't notice because nobody will keep track of the generated assembly
>> output all the time.
>>
>> There is a reason C11 introduced memset_s(), and it is this.
>>
> ...does it make sense if we introduce secure_memset() now ? 
> 
> Waiting for your feedback. Thanks!

I would propose so; or better something like secure_bzero() or
secure_memzero(), because we can then fall back on existing
implementations like explizit_bzero() or SecureZeroMemory().

We don't have to implement this immediately, but we should do it before
the 2.9 release.

Max

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]