|
Message-Id: <20120410125443.3ab1a277.akpm@linux-foundation.org> Date: Tue, 10 Apr 2012 12:54:43 -0700 From: Andrew Morton <akpm@...ux-foundation.org> To: "Indan Zupancic" <indan@....nu> Cc: "Will Drewry" <wad@...omium.org>, linux-kernel@...r.kernel.org, linux-security-module@...r.kernel.org, linux-arch@...r.kernel.org, linux-doc@...r.kernel.org, kernel-hardening@...ts.openwall.com, netdev@...r.kernel.org, x86@...nel.org, arnd@...db.de, davem@...emloft.net, hpa@...or.com, mingo@...hat.com, oleg@...hat.com, peterz@...radead.org, rdunlap@...otime.net, mcgrathr@...omium.org, tglx@...utronix.de, luto@....edu, eparis@...hat.com, serge.hallyn@...onical.com, djm@...drot.org, scarybeasts@...il.com, pmoore@...hat.com, corbet@....net, eric.dumazet@...il.com, markus@...omium.org, coreyb@...ux.vnet.ibm.com, keescook@...omium.org, jmorris@...ei.org Subject: Re: [PATCH v17 08/15] seccomp: add system call filtering using BPF On Mon, 9 Apr 2012 04:22:40 +1000 "Indan Zupancic" <indan@....nu> wrote: > On Sat, April 7, 2012 06:23, Andrew Morton wrote: > > hm, I'm surprised that we don't have a zero-returning implementation of > > is_compat_task() when CONFIG_COMPAT=n. Seems silly. Blames Arnd. > > It's sneakily hidden at the end of compat.h. > > >> +/** > >> + * get_u32 - returns a u32 offset into data > >> + * @data: a unsigned 64 bit value > >> + * @index: 0 or 1 to return the first or second 32-bits > >> + * > >> + * This inline exists to hide the length of unsigned long. > >> + * If a 32-bit unsigned long is passed in, it will be extended > >> + * and the top 32-bits will be 0. If it is a 64-bit unsigned > >> + * long, then whatever data is resident will be properly returned. > >> + */ > >> +static inline u32 get_u32(u64 data, int index) > >> +{ > >> + return ((u32 *)&data)[index]; > >> +} > > > > This seems utterly broken on big-endian machines. If so: fix. If not: > > add comment explaining why? > > It's not a bug, it's intentional. Well it looks like a bug, which is why I suggest that it be clearly commented. > > > >> + if (total_insns > MAX_INSNS_PER_PATH) > >> + return -ENOMEM; > >> + > >> + /* > >> + * Installing a seccomp filter requires that the task have > >> + * CAP_SYS_ADMIN in its namespace or be running with no_new_privs. > >> + * This avoids scenarios where unprivileged tasks can affect the > >> + * behavior of privileged children. > >> + */ > >> + if (!current->no_new_privs && > >> + security_capable_noaudit(current_cred(), current_user_ns(), > >> + CAP_SYS_ADMIN) != 0) > >> + return -EACCES; > >> + > >> + /* Allocate a new seccomp_filter */ > >> + filter = kzalloc(sizeof(struct seccomp_filter) + fp_size, GFP_KERNEL); > > > > I think this gives userspace an easy way of causing page allocation > > failure warnings, by permitting large kmalloc() attempts. Add > > __GFP_NOWARN? > > Max is 32kb. sk_attach_filter() in net/core/filter.c is worse, > it allocates up to 512kb before even checking the length. An order-3 allocation attempt is pretty fragile. This will sometimes fail. > What about using GFP_USER (and adding __GFP_NOWARN to GFP_USER) instead? Let's be conventional and use the open-coded __GFP_NOWARN. __GFP_NOWARN says "this is a big allocation which will sometimes fail and I have carefully reviewed the failure paths and runtime tested them". Please carefully review the failure paths and runtime test them ;) > >> + /* Check and rewrite the fprog via the skb checker */ > >> + ret = sk_chk_filter(filter->insns, filter->len); > >> + if (ret) > >> + goto fail; > >> + > >> + /* Check and rewrite the fprog for seccomp use */ > >> + ret = seccomp_chk_filter(filter->insns, filter->len); > > > > "check" is spelled "check"! > > Yes, it is and he did spell "check" as "Check". > > seccomp_chk_filter() mirrors sk_chk_filter(). So it refers to > "chk", not "check". bah. Two poor identifiers isn't better than one. Whatever.
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.