From: Corey Bryant
Subject: Re: [Qemu-devel] [RFC] [PATCHv2 2/2] Adding basic calls to libseccomp in vl.c
Date: Tue, 19 Jun 2012 12:51:21 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120605 Thunderbird/13.0

On 06/19/2012 11:37 AM, Will Drewry wrote:
> On Tue, Jun 19, 2012 at 8:35 AM, Corey Bryant <address@hidden> wrote:
>> On 06/18/2012 06:14 PM, Will Drewry wrote:
>>> [-all]
>>>
>>> On Mon, Jun 18, 2012 at 4:53 PM, Corey Bryant <address@hidden> wrote:
>>>> On 06/18/2012 04:18 PM, Blue Swirl wrote:
>>>>> On Mon, Jun 18, 2012 at 3:22 PM, Corey Bryant <address@hidden> wrote:
>>>>>> On 06/18/2012 04:33 AM, Daniel P. Berrange wrote:
>>>>>>> On Fri, Jun 15, 2012 at 07:04:45PM +0000, Blue Swirl wrote:
>>>>>>>> On Wed, Jun 13, 2012 at 8:33 PM, Daniel P. Berrange <address@hidden> wrote:
>>>>>>>>> On Wed, Jun 13, 2012 at 07:56:06PM +0000, Blue Swirl wrote:
>>>>>>>>>> On Wed, Jun 13, 2012 at 7:20 PM, Eduardo Otubo <address@hidden> wrote:
>>>>>>>>>>> I added a syscall struct using priority levels as described in the libseccomp man page. The priority numbers are based on the frequency with which they appear in a sample strace from a regular qemu guest run under libvirt.
>>>>>>>>>>>
>>>>>>>>>>> Libseccomp generates linear BPF code to filter system calls; those rules are read one after another. The priority system places the most common rules first in order to reduce the overhead when processing them.
>>>>>>>>>>>
>>>>>>>>>>> Also, since this is just a first RFC, the whitelist is a little raw. We might need your help to improve, test and fine tune the set of system calls.
>>>>>>>>>>>
>>>>>>>>>>> v2: Fixed some style issues
>>>>>>>>>>>     Removed code from vl.c and created qemu-seccomp.[ch]
>>>>>>>>>>>     Now using ARRAY_SIZE macro
>>>>>>>>>>>     Added more syscalls without priority/frequency set yet
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Eduardo Otubo <address@hidden>
>>>>>>>>>>> ---
>>>>>>>>>>>  qemu-seccomp.c |   73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>  qemu-seccomp.h |    9 +++++++
>>>>>>>>>>>  vl.c           |    7 ++++++
>>>>>>>>>>>  3 files changed, 89 insertions(+)
>>>>>>>>>>>  create mode 100644 qemu-seccomp.c
>>>>>>>>>>>  create mode 100644 qemu-seccomp.h
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/qemu-seccomp.c b/qemu-seccomp.c
>>>>>>>>>>> new file mode 100644
>>>>>>>>>>> index 0000000..048b7ba
>>>>>>>>>>> --- /dev/null
>>>>>>>>>>> +++ b/qemu-seccomp.c
>>>>>>>>>>> @@ -0,0 +1,73 @@
>>>>>>>>>>
>>>>>>>>>> Copyright and license info missing.
>>>>>>>>>>
>>>>>>>>>>> +#include <stdio.h>
>>>>>>>>>>> +#include <seccomp.h>
>>>>>>>>>>> +#include "qemu-seccomp.h"
>>>>>>>>>>> +
>>>>>>>>>>> +static struct QemuSeccompSyscall seccomp_whitelist[] = {
>>>>>>>>>>
>>>>>>>>>> 'const'
>>>>>>>>>>
>>>>>>>>>>> +    { SCMP_SYS(timer_settime), 255 },
>>>>>>>>>>> +    { SCMP_SYS(timer_gettime), 254 },
>>>>>>>>>>> +    { SCMP_SYS(futex), 253 },
>>>>>>>>>>> +    { SCMP_SYS(select), 252 },
>>>>>>>>>>> +    { SCMP_SYS(recvfrom), 251 },
>>>>>>>>>>> +    { SCMP_SYS(sendto), 250 },
>>>>>>>>>>> +    { SCMP_SYS(read), 249 },
>>>>>>>>>>> +    { SCMP_SYS(brk), 248 },
>>>>>>>>>>> +    { SCMP_SYS(clone), 247 },
>>>>>>>>>>> +    { SCMP_SYS(mmap), 247 },
>>>>>>>>>>> +    { SCMP_SYS(mprotect), 246 },
>>>>>>>>>>> +    { SCMP_SYS(ioctl), 245 },
>>>>>>>>>>> +    { SCMP_SYS(recvmsg), 245 },
>>>>>>>>>>> +    { SCMP_SYS(sendmsg), 245 },
>>>>>>>>>>> +    { SCMP_SYS(accept), 245 },
>>>>>>>>>>> +    { SCMP_SYS(connect), 245 },
>>>>>>>>>>> +    { SCMP_SYS(bind), 245 },
>>>>>>>>>>
>>>>>>>>>> It would be nice to avoid connect() and bind(). Perhaps seccomp init should be postponed to after all sockets have been created?
>>>>>>>>>
>>>>>>>>> If you want to migrate your guest, you need to be able to call connect() at an arbitrary point in the QEMU process' lifecycle. So you can't avoid allowing connect(). Similarly if you want to allow hotplug of NICs (and their backends) then you need to have both bind() + connect() available.
>>>>>>>>
>>>>>>>> That's bad. Migration could conceivably be extended to use file descriptor passing, but hotplug is more tricky.
>>>>>>>
>>>>>>> As with execve(), I'm reporting this on the basis that on the previous patch posting I was told we must whitelist any syscalls QEMU can conceivably use to avoid any loss in functionality.
>>>>>>
>>>>>> Thanks for pointing out syscalls needed for the whitelist.
>>>>>>
>>>>>> As Paul has already mentioned, it was recommended that we restrict all of QEMU (as a single process) from the start of execution. This is opposed to other options of restricting QEMU from the time that vCPUs start, further restricting based on syscall parameters, or decomposing QEMU into multiple processes that are individually restricted with their own seccomp whitelists.
>>>>>
>>>>> Can each thread have separate seccomp whitelists? For example CPU threads should not need pretty much anything but the I/O thread needs I/O.
>>>>
>>>> No, seccomp filters are defined and enforced at the process level.
>>>
>>> I'll keep lurking :) especially since I don't know the internals of qemu well, but you can do per-thread seccomp filters since processes==threads on linux. The real risk is that threads share so much that an attack on the CPU thread may be able to parlay that into a syscall proxy on another thread.
>>>
>>> Probably what would make sense in that way is a loose global filter, then have each sub-thread install a functionality specific second filter.
>>>
>>> I may be way off base though, so feel free to just tell me to keep lurking :) Thanks again for all the support and for pushing hard to get this functionality in qemu!
>>
>> Please keep lurking! I appreciate the input and education. :)
>>
>> So whether it's a thread or process, I assume it will have its own task_struct, allowing us to set a filter per thread or per process. The difference being that threads share more resources than processes. Sort of thinking out loud here to see if I'm right.
>
> Exactly!
>
>> It doesn't seem ideal vs process separation, but it's do-able.
>
> Yep -- so for something like qemu, you could install a global baseline policy (e.g., union of all needed syscalls) then for each thread, they can install a more restrictive set. The actual security guarantees will be the total synthesis because of cross-thread attacks, but it would make exploitation pretty painful. If you want better guarantees, then process separation is needed. One option is even doing brokering for complex syscalls using either ptrace or a sigsys handler, but that is likely too much to get into while establishing a baseline.

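As an aside on the priority scheme in Eduardo's patch description: below is a toy C model, not libseccomp code, of why rule order matters when a filter is evaluated linearly. The `rule` struct is a hypothetical stand-in for `QemuSeccompSyscall`, and `order_rules` and `comparisons_for` are names made up for this sketch. It only counts comparisons; the real cost of generated BPF will differ.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the patch's QemuSeccompSyscall: a syscall
 * number plus a priority (higher = expected to be called more often). */
struct rule {
    int num;
    unsigned char priority;
};

/* qsort comparator: highest priority first. */
static int by_priority_desc(const void *a, const void *b)
{
    const struct rule *ra = a;
    const struct rule *rb = b;
    return (int)rb->priority - (int)ra->priority;
}

/* Order the rules so that the most frequent syscalls are checked first. */
static void order_rules(struct rule *rules, size_t n)
{
    qsort(rules, n, sizeof(rules[0]), by_priority_desc);
}

/* Model of a linear filter: return how many comparisons are made before
 * `num` matches, or n if it is not in the list at all. */
static size_t comparisons_for(const struct rule *rules, size_t n, int num)
{
    for (size_t i = 0; i < n; i++) {
        if (rules[i].num == num) {
            return i + 1;
        }
    }
    return n;
}
```

With three rules at priorities 245, 255 and 249, the priority-255 rule matches after one comparison once sorted, while the priority-245 rule needs three.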
In response to "Can each thread have separate seccomp whitelists?", please take a look at the exchange above with Will Drewry. seccomp filters *can* be installed per thread. However, because threads share so much state, this gives weaker isolation than per-process seccomp filters.
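
To make the layering concrete, here is a toy C model of the "loose global filter plus stricter per-thread filter" idea. It assumes the usual seccomp filter-mode behaviour: filters stack, a new thread starts with a copy of its creator's filters, and a syscall passes only if every installed filter allows it. The syscall IDs, `task_filters`, `install_filter`, `spawn_thread` and `filters_allow` are all hypothetical names; nothing here talks to the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical syscall IDs for the model; not real syscall numbers. */
enum { CALL_READ, CALL_WRITE, CALL_IOCTL, CALL_CONNECT, CALL_FUTEX };

#define ALLOW(n) (UINT32_C(1) << (n))
#define MAX_FILTERS 4

/* Per-task filter stack: each entry is a bitmask of allowed syscalls. */
struct task_filters {
    uint32_t masks[MAX_FILTERS];
    int count;
};

/* Push one more filter; filters can be added but never removed. */
static bool install_filter(struct task_filters *t, uint32_t allow_mask)
{
    if (t->count >= MAX_FILTERS) {
        return false;
    }
    t->masks[t->count++] = allow_mask;
    return true;
}

/* A new thread starts with a copy of its creator's filter stack. */
static struct task_filters spawn_thread(const struct task_filters *parent)
{
    return *parent;
}

/* A syscall is permitted only if every installed filter allows it. */
static bool filters_allow(const struct task_filters *t, int sys)
{
    for (int i = 0; i < t->count; i++) {
        if (!(t->masks[i] & ALLOW(sys))) {
            return false;
        }
    }
    return true;
}
```

In a real program each thread would install its filter with `prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)` followed by `prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog)`. The model only captures the resulting allow/deny logic, not the cross-thread attacks through shared memory that Will describes.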
-- Regards, Corey
>> You don't mind if I share your input with the others, do you?
>
> Of course not!
> cheers!
>
>> --
>> Regards,
>> Corey
>
>>>>>> I think this approach is a good starting point that can be further tuned in the future. And as with most security measures, defense in depth improves the cause (e.g. combining seccomp with DAC or MAC).
>>>>>
>>>>> Agreed.
>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Corey