Message-ID: <20201102175526.eu4npm4v2ggicvaf@wittgenstein>
Date: Mon, 2 Nov 2020 18:55:26 +0100
From: Christian Brauner <christian.brauner@...ntu.com>
To: Alexey Gladkov <gladkov.alexey@...il.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
	Linux Containers <containers@...ts.linux-foundation.org>,
	Kernel Hardening <kernel-hardening@...ts.openwall.com>,
	Alexey Gladkov <legion@...nel.org>,
	"Eric W . Biederman" <ebiederm@...ssion.com>,
	Kees Cook <keescook@...omium.org>,
	Christian Brauner <christian@...uner.io>
Subject: Re: [RFC PATCH v1 0/4] Per user namespace rlimits

On Mon, Nov 02, 2020 at 05:50:29PM +0100, Alexey Gladkov wrote:
> Preface
> -------
> These patches are for binding the rlimits to a user in the user namespace.
> This patch set can be applied on top of:
> 
> git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git v5.8-2-g43e210d68200
> 
> Problem
> -------
> Some rlimits are set per user: RLIMIT_NPROC, RLIMIT_MEMLOCK, RLIMIT_SIGPENDING,
> RLIMIT_MSGQUEUE. When several containers are created by the same user, the
> processes inside those containers influence each other.
> 
> Eric W. Biederman mentioned this issue [1][2][3].
> 
> Introduced changes
> ------------------
> To fix this problem, you can bind the counter of the specified rlimits to the
> user within the user namespace. By default, to preserve backward compatibility,
> only the initial user namespace is used. This patch adds one more prctl
> parameter to change the binding to the user namespace.
> 
> This will not cause the user to take more resources than allowed in the parent
> user namespace because it only virtualizes the rlimit counter. Limits in all
> parent user namespaces are taken into account.
> 
> For example, this allows us to run multiple containers as the same user and
> set RLIMIT_NPROC to 1 inside each of them.
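
The counting scheme described in the quoted text can be illustrated with a
small userspace model. This is only a sketch under stated assumptions, not the
code from the patch set: struct ns_counter, charge_process() and
uncharge_process() are hypothetical names, the counters are plain longs rather
than atomics, and there is no locking. It shows one counter per
(user namespace, user) pair, with every charge checked against the limit at
each level up to the initial namespace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a per-(user namespace, user) process counter.
 * This is NOT the kernel's struct ucounts; it only models the idea.
 */
struct ns_counter {
	struct ns_counter *parent; /* same user's counter in the parent namespace, NULL at the root */
	long count;                /* processes currently charged at this level */
	long max;                  /* limit enforced at this level */
};

/*
 * Charge one process at every level from c up to the root.
 * If any level would exceed its limit, undo the partial charge
 * and report failure, so a child can never exceed a parent's limit.
 */
static bool charge_process(struct ns_counter *c)
{
	struct ns_counter *iter, *undo;

	for (iter = c; iter; iter = iter->parent) {
		if (iter->count + 1 > iter->max)
			goto fail;
		iter->count++;
	}
	return true;

fail:
	/* Roll back only the levels that were already incremented. */
	for (undo = c; undo != iter; undo = undo->parent)
		undo->count--;
	return false;
}

/* Release one process at every level from c up to the root. */
static void uncharge_process(struct ns_counter *c)
{
	for (; c; c = c->parent) {
		assert(c->count > 0);
		c->count--;
	}
}
```

In this model, two containers whose counters both have max set to 1 and share
one parent counter can each charge exactly one process independently of the
other, while a charge in any container still fails once the parent's own max
is reached.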

Thanks for picking this up and working on it. This would definitely fix
many issues for folks running unprivileged containers with a single id
map, which is the default behavior for LXC/LXD, so it is very valuable
to us.

Christian

> 
> ToDo
> ----
> * RLIMIT_MEMLOCK, RLIMIT_SIGPENDING and RLIMIT_MSGQUEUE are not implemented.
> * No documentation.
> * No tests.
> 
> [1] https://lore.kernel.org/containers/87imd2incs.fsf@x220.int.ebiederm.org/
> [2] https://lists.linuxfoundation.org/pipermail/containers/2020-August/042096.html
> [3] https://lists.linuxfoundation.org/pipermail/containers/2020-October/042524.html
> 
> Changelog
> ---------
> v1:
> * After discussion with Eric W. Biederman, I increased the size of ucounts to
>   atomic_long_t.
> * Added ucount_max to avoid the fork bomb.
> 
> --
> 
> Alexey Gladkov (4):
>   Increase size of ucounts to atomic_long_t
>   Move the user's process counter to ucounts
>   Do not allow fork if RLIMIT_NPROC is exceeded in the user namespace
>     tree
>   Allow to change the user namespace in which user rlimits are counted
> 
>  fs/exec.c                      | 13 ++++++---
>  fs/io-wq.c                     | 25 +++++++++++++-----
>  fs/io-wq.h                     |  1 +
>  fs/io_uring.c                  |  1 +
>  include/linux/cred.h           |  8 ++++++
>  include/linux/sched.h          |  3 +++
>  include/linux/sched/user.h     |  1 -
>  include/linux/user_namespace.h | 12 +++++++--
>  include/uapi/linux/prctl.h     |  5 ++++
>  kernel/cred.c                  | 44 ++++++++++++++++++++++++-------
>  kernel/exit.c                  |  2 +-
>  kernel/fork.c                  | 13 ++++++---
>  kernel/sys.c                   | 26 ++++++++++++++++--
>  kernel/ucount.c                | 48 +++++++++++++++++++++++++++++-----
>  kernel/user.c                  |  3 ++-
>  kernel/user_namespace.c        |  3 +++
>  16 files changed, 171 insertions(+), 37 deletions(-)
> 
> -- 
> 2.25.4
> 
