[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v2] Add argument filters to the seccomp sandbox
From: |
Eduardo Otubo |
Subject: |
Re: [Qemu-devel] [PATCH v2] Add argument filters to the seccomp sandbox |
Date: |
Thu, 24 Sep 2015 11:59:52 +0200 |
User-agent: |
Mutt/1.5.23 (2014-03-12) |
On Thu, Sep 10, 2015 at 08=54=28PM -0400, address@hidden wrote:
> > The current intention of the seccomp filter in QEMU, is that /all/ existing
> > QEMU features continue to work unchanged. So even if a flag is used in a
> > seemingly uncommon code path, we still need to allow that in a seccomp
> > filter.
> It already doesn't work very well, e.g. with -chroot, it fails because
> chroot()
> is not whitelisted, same with -runas because setgid() etc isn't whitelisted.
> Maybe there should be an extra option for -sandbox, like "-sandbox
> experimental"
> which does argument filtering, which of course may break something, and the
> old
> behavior would do plain syscall filtering without caring about arguments,
> because
> that's so much easier to guarantee to work, even if it provides little
> security.
Can you point out which exact use case breaks if you don't whitelist the
below mentioned system calls' flags?
>
> We could also change the default behavior from SCMP_ACT_KILL (which kills the
> entire thing as soon as a single violation occurs) to SCMP_ACT_ERRNO(EPERM),
> which
> will just return EPERM for a syscall with a violation. The software will be
> much
> more capable of handling a permission denied error without crashing. Although
> of
> course that violates the principle of fast-fail.
We thought about this in beggining of the development of seccomp on
qemu. Some feature like allow all, which would print to stderr all
illegal hits and a another argument like
-sandbox_add="syscall1,syscall2", but this would be against the concept
of the whole security schema. We don't want the user to take full
control of it, and if you're a developer, you know what to do.
>
> > So we need to add DODUMP, DONTDUMP, UNMERGABLE and WILLNEED here. That
> > is still stricter than the previous allow-everything rule, so a net
> > win.
> And MADV_INVALID too I assume? That was one of the others I got with grep.
>
> > > +
> > > + /* shmget */
> > > + rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(shmget), 2,
> > > + SCMP_A0(SCMP_CMP_EQ, IPC_PRIVATE),
> > > + SCMP_A2(SCMP_CMP_EQ, IP_CREAT|0777));
> >
> > I'm not familiar with semantics of these seccomp rules, but is this
> > saying that the second arg must be exactly equal to IP_CREAT|0777 ?
> > If the app passes IP_CREAT|0600, would that be permitted instead ?
> > The latter is what I see gtk2 source code passing for mode.
> Argument 2 must be exactly equal to IP_CREAT|0777, yes, otherwise Qemu dies
> with
> SIGSYS. I did check the Qemu source and saw 0777 was harcoded. Does the 0600
> mask
> in GTK2 ever get called in Qemu? Anyway I added the MADV flags and the 0600
> mask
> in the v2 patch.
>
> Signed-off-by: Namsun Ch'o <address@hidden>
> ---
> diff --git a/qemu-seccomp.c b/qemu-seccomp.c
> index f9de0d3..a353ef9 100644
> --- a/qemu-seccomp.c
> +++ b/qemu-seccomp.c
> @@ -14,6 +14,8 @@
> */
> #include <stdio.h>
> #include <seccomp.h>
> +#include <linux/ipc.h>
> +#include <asm-generic/mman-common.h>
> #include "sysemu/seccomp.h"
>
> struct QemuSeccompSyscall {
> @@ -105,7 +107,6 @@ static const struct QemuSeccompSyscall
> seccomp_whitelist[] = {
> { SCMP_SYS(rt_sigreturn), 245 },
> { SCMP_SYS(sync), 245 },
> { SCMP_SYS(pread64), 245 },
> - { SCMP_SYS(madvise), 245 },
> { SCMP_SYS(set_robust_list), 245 },
> { SCMP_SYS(lseek), 245 },
> { SCMP_SYS(pselect6), 245 },
> @@ -224,11 +225,9 @@ static const struct QemuSeccompSyscall
> seccomp_whitelist[] = {
> { SCMP_SYS(arch_prctl), 240 },
> { SCMP_SYS(mkdir), 240 },
> { SCMP_SYS(fchmod), 240 },
> - { SCMP_SYS(shmget), 240 },
> { SCMP_SYS(shmat), 240 },
> { SCMP_SYS(shmdt), 240 },
> { SCMP_SYS(timerfd_create), 240 },
> - { SCMP_SYS(shmctl), 240 },
> { SCMP_SYS(mlockall), 240 },
> { SCMP_SYS(mlock), 240 },
> { SCMP_SYS(munlock), 240 },
> @@ -264,6 +263,60 @@ int seccomp_start(void)
> }
> }
>
> + /* madvise */
> + static const int madvise_flags[] = {
> + MADV_DODUMP,
> + MADV_DONTDUMP,
> + MADV_INVALID,
> + MADV_UNMERGEABLE,
> + MADV_WILLNEED,
> + MADV_DONTFORK,
> + MADV_DONTNEED,
> + MADV_HUGEPAGE,
> + MADV_MERGEABLE,
> + };
> + for (i = 0; i < ARRAY_SIZE(madvise_flags); i++) {
> + rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(madvise), 1,
> + SCMP_A2(SCMP_CMP_EQ, madvise_flags[i]));
> + if (rc < 0) {
> + goto seccomp_return;
> + }
> + }
> + rc = seccomp_syscall_priority(ctx, SCMP_SYS(madvise), 245);
> + if (rc < 0) {
> + goto seccomp_return;
> + }
> +
> + /* shmget */
> + rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(shmget), 2,
> + SCMP_A0(SCMP_CMP_EQ, IPC_PRIVATE),
> + SCMP_A2(SCMP_CMP_EQ, IP_CREAT|0777));
> + if (rc < 0) {
> + goto seccomp_return;
> + }
> + rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(shmget), 2,
> + SCMP_A0(SCMP_CMP_EQ, IPC_PRIVATE),
> + SCMP_A2(SCMP_CMP_EQ, IP_CREAT|0600));
Isn't it IPC_CREAT? Or am I missing something?
Can you resend a v3 describing the changes you did from v1 to v2 and v3?
This helps keep tracking of ideas and discussions.
Thanks a lot for the contribution!
> + if (rc < 0) {
> + goto seccomp_return;
> + }
> + rc = seccomp_syscall_priority(ctx, SCMP_SYS(shmget), 240);
> + if (rc < 0) {
> + goto seccomp_return;
> + }
> +
> + /* shmctl */
> + rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(shmctl), 2,
> + SCMP_A1(SCMP_CMP_EQ, IPC_RMID),
> + SCMP_A2(SCMP_CMP_EQ, 0));
> + if (rc < 0) {
> + goto seccomp_return;
> + }
> + rc = seccomp_syscall_priority(ctx, SCMP_SYS(shmctl), 240);
> + if (rc < 0) {
> + goto seccomp_return;
> + }
> +
> rc = seccomp_load(ctx);
>
> seccomp_return:
--
Eduardo Otubo
ProfitBricks GmbH
signature.asc
Description: Digital signature