|
Message-ID: <c1977923-f68d-6ea9-2fd1-d2939848eec6@nmatt.com> Date: Sat, 22 Apr 2017 13:09:42 -0400 From: Matt Brown <matt@...tt.com> To: "Serge E. Hallyn" <serge@...lyn.com> Cc: jmorris@...ei.org, gregkh@...uxfoundation.org, jslaby@...e.com, akpm@...ux-foundation.org, jannh@...gle.com, keescook@...omium.org, kernel-hardening@...ts.openwall.com, linux-security-module@...r.kernel.org, linux-kernel@...r.kernel.org Subject: Re: [PATCH] make TIOCSTI ioctl require CAP_SYS_ADMIN On 04/21/2017 01:24 AM, Serge E. Hallyn wrote: > On Fri, Apr 21, 2017 at 01:09:59AM -0400, Matt Brown wrote: >> On 04/20/2017 01:41 PM, Serge E. Hallyn wrote: >>> Quoting matt@...tt.com (matt@...tt.com): >>>> On 2017-04-20 11:19, Serge E. Hallyn wrote: >>>>> Quoting Matt Brown (matt@...tt.com): >>>>>> On 04/19/2017 07:53 PM, Serge E. Hallyn wrote: >>>>>>> Quoting Matt Brown (matt@...tt.com): >>>>>>>> On 04/19/2017 12:58 AM, Serge E. Hallyn wrote: >>>>>>>>> On Tue, Apr 18, 2017 at 11:45:26PM -0400, Matt Brown wrote: >>>>>>>>>> This patch reproduces GRKERNSEC_HARDEN_TTY functionality from the grsecurity >>>>>>>>>> project in-kernel. >>>>>>>>>> >>>>>>>>>> This will create the Kconfig SECURITY_TIOCSTI_RESTRICT and the corresponding >>>>>>>>>> sysctl kernel.tiocsti_restrict that, when activated, restrict all TIOCSTI >>>>>>>>>> ioctl calls from non CAP_SYS_ADMIN users. >>>>>>>>>> >>>>>>>>>> Possible effects on userland: >>>>>>>>>> >>>>>>>>>> There could be a few user programs that would be effected by this >>>>>>>>>> change. >>>>>>>>>> See: <https://codesearch.debian.net/search?q=ioctl%5C%28.*TIOCSTI> >>>>>>>>>> notable programs are: agetty, csh, xemacs and tcsh >>>>>>>>>> >>>>>>>>>> However, I still believe that this change is worth it given that the >>>>>>>>>> Kconfig defaults to n. This will be a feature that is turned on for the >>>>>>>>> >>>>>>>>> It's not worthless, but note that for instance before this was fixed >>>>>>>>> in lxc, this patch would not have helped with escapes from privileged >>>>>>>>> containers. >>>>>>>>> >>>>>>>> >>>>>>>> I assume you are talking about this CVE: >>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1411256 >>>>>>>> >>>>>>>> In retrospect, is there any way that an escape from a privileged >>>>>>>> container with the this bug could have been prevented? >>>>>>> >>>>>>> I don't know, that's what I was probing for. Detecting that the pgrp >>>>>>> or session - heck, the pid namespace - has changed would seem like a >>>>>>> good indicator that it shouldn't be able to push. >>>>>>> >>>>>> >>>>>> pgrp and session won't do because in the case we are discussing >>>>>> current->signal->tty is the same as tty. >>>>>> >>>>>> This is the current check that is already in place: >>>>>> | if ((current->signal->tty != tty) && !capable(CAP_SYS_ADMIN)) >>>>>> | return -EPERM; >>>>> >>>>> Yeah... >>>>> >>>>>> The only thing I could find to detect the tty message coming from a >>>>>> container is as follows: >>>>>> | task_active_pid_ns(current)->level >>>>>> >>>>>> This will be zero when run on the host, but 1 when run inside a >>>>>> container. However this is very much a hack and could probably break >>>>>> some userland stuff where there are multiple levels of namespaces. >>>>> >>>>> Yes. This is also however why I don't like the current patch, because >>>>> capable() will never be true in a container, so nested containers >>>>> break. >>>>> >>>> >>>> What do you mean by "capable() will never be true in a container"? >>>> My understanding >>>> is that if a container is given CAP_SYS_ADMIN then >>>> capable(CAP_SYS_ADMIN) will return >>>> true? >>> >>> No, capable(X) checks for X with respect to the initial user namespace. >>> So for root-owned containers it will be true, but containers running in >>> non-initial user namespaces cannot pass that check. >>> >>> To check for privilege with respect to another user namespace, you need >>> to use ns_capable. But for that you need a user_ns to target. >>> >> >> How about: ns_capable(current_user_ns(),CAP_SYS_ADMIN) ? >> >> current_user_ns() was found in include/linux/cred.h > > Any user can create a new user namespace and pass the above check. What we > want is to find the user namespace which opened the tty. > I believe I have a working solution that I can show in the next version of the patch later today, but I just want to run the logic by you first. I added: "struct user_namespace *owner_user_ns;" as a field in tty_struct (include/linux/tty.h) Note: I am totally open to suggestions for a better name. Then I added "tty->owner_user_ns = current_user_ns();" to the alloc_tty_struct function. (drivers/tty/tty_io.c) When testing with a docker container, running in a different user namespace, I printed out current_user_ns()->level, which returned 1, and tty->owner_user_ns->level, which returned 0. This seems to prove that I am correctly storing the user namespace which opened the tty. Please let me know if there are any edge cases that I am missing with this approach.
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.