|
|
Message-ID: <CAGXu5j+d1nWs=HjMudPx9P=rbE5dSs3on3Gex=UeBcBNB7pEgA@mail.gmail.com>
Date: Fri, 10 Jun 2016 14:09:49 -0700
From: Kees Cook <keescook@...omium.org>
To: "kernel-hardening@...ts.openwall.com" <kernel-hardening@...ts.openwall.com>
Cc: Kees Cook <keescook@...omium.org>, Brad Spengler <spender@...ecurity.net>,
PaX Team <pageexec@...email.hu>, Casey Schaufler <casey.schaufler@...el.com>,
Rik van Riel <riel@...hat.com>, Christoph Lameter <cl@...ux.com>, Pekka Enberg <penberg@...nel.org>,
David Rientjes <rientjes@...gle.com>, Joonsoo Kim <iamjoonsoo.kim@....com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v2 2/4] usercopy: avoid direct copying to userspace
On Wed, Jun 8, 2016 at 2:11 PM, Kees Cook <keescook@...omium.org> wrote:
> Some non-whitelisted heap memory has small areas that need to be copied
> to userspace. For these cases, explicitly copy the needed contents out
> to stack first before sending to userspace. This lets their respective
> caches remain un-whitelisted (i.e. no SLAB_USERCOPY), since the bulk of
> their contents should not be exposed to userspace.
I've spent some time thinking about these kinds of
non-whitelisted-slab-workaround changes, and I would like to see if we
can design a better solution. So, to that end, here's what I see:
- HARDENED_USERCOPY verifies object addresses and sizes
- whitelisted caches (via HARDENED_USERCOPY_WHITELIST's SLAB_USERCOPY)
are intentionally rare
- Some code uses small parts of non-whitelisted cache memory for
userspace work (I think the auxv ("mm_struct") and signal frames
("task_struct") are good examples of this: neither cache should be
entirely exposed to userspace, yet tiny pieces are sent to userspace.)
- non-whitelist-workarounds are open-coded
- non-whitelist-workarounds require a double-copy
- non-whitelist-workarounds have explicit size maximums (e.g.
AT_VECTOR_SIZE, sizeof(sigset_t))
- non-whitelist-workarounds _bypass_ HARDENED_USERCOPY object address checking
So, while the workarounds do have a max-size sanity-check, they
actually lack the object address checking that would normally happen
with the usercopy checks. I think to solve the open-coding and
double-copy problems without compromising on the whitelisting or the
explicit size checking, we could also gain back the address checking
if we created something like:
copy_to_user_n(user, kernel, dynamic-size, const-max-size);
If "const-max-size" isn't detected as a builtin_constant it could fail
to build. When run, it would a) verify dynamic-size wasn't larger that
const-max-size, and b) perform the regular usercopy checks (without
the SLAB_USERCOPY check).
So, for the auxv example, instead of the new stack variable, the
memcpy, etc, it could just be a one-line change replacing the existing
copy_to_user() call:
copy_to_user_n(sp, elf_info, ei_index * sizeof(elf_addr_t), AT_VECTOR_SIZE);
(Bike-shedding: copy_to_user_bounded(), ..._limited(), ..._whitelist_hole(), ?)
What do people think?
>
> These changes, based on code by Brad Spengler and PaX Team, were extracted
> from grsecurity/PaX on a case-by-case basis as I ran into errors during
> testing of CONFIG_HARDENED_USERCOPY_WHITELIST:
>
> Changed exec to use copy of auxv instead of copying from "mm_struct" cache.
>
> Changed signal handling to use an on-stack copy of signal frames instead
> of direct copying from "task_struct" cache.
>
> Changed name_to_handle to use put_user (which operates on fixed sizes)
> instead of copy_to_user from "mnt_cache" cache.
>
> Changed readlink to use stack for in-struct names to avoid copying from
> filesystem cache (e.g. "ext4_inode_cache")
>
> Changed sg ioctl to bounce commands through stack instead of directly
> copying to/from "blkdev_requests" cache.
>
> Signed-off-by: Kees Cook <keescook@...omium.org>
> ---
> arch/x86/kernel/signal.c | 6 +++---
> block/scsi_ioctl.c | 27 +++++++++++++++++++++++++--
> fs/binfmt_elf.c | 8 +++++++-
> fs/fhandle.c | 3 +--
> fs/namei.c | 11 ++++++++++-
> 5 files changed, 46 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
> index 22cc2f9f8aec..479fb2b9afcd 100644
> --- a/arch/x86/kernel/signal.c
> +++ b/arch/x86/kernel/signal.c
> @@ -680,8 +680,8 @@ static int
> setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
> {
> int usig = ksig->sig;
> - sigset_t *set = sigmask_to_save();
> - compat_sigset_t *cset = (compat_sigset_t *) set;
> + sigset_t sigcopy = *sigmask_to_save();
> + compat_sigset_t *cset = (compat_sigset_t *)&sigcopy;
>
> /* Set up the stack frame */
> if (is_ia32_frame()) {
> @@ -692,7 +692,7 @@ setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
> } else if (is_x32_frame()) {
> return x32_setup_rt_frame(ksig, cset, regs);
> } else {
> - return __setup_rt_frame(ksig->sig, ksig, set, regs);
> + return __setup_rt_frame(ksig->sig, ksig, &sigcopy, regs);
> }
> }
>
> [...]
> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> index e158b22ef32f..84edd23d7fcc 100644
> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -167,6 +167,8 @@ create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
> int ei_index = 0;
> const struct cred *cred = current_cred();
> struct vm_area_struct *vma;
> + unsigned long saved_auxv[AT_VECTOR_SIZE];
> + size_t saved_auxv_size;
>
> /*
> * In some cases (e.g. Hyper-Threading), we want to avoid L1
> @@ -324,9 +326,13 @@ create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
> return -EFAULT;
> current->mm->env_end = p;
>
> + saved_auxv_size = ei_index * sizeof(elf_addr_t);
> + BUG_ON(saved_auxv_size > sizeof(saved_auxv));
> + memcpy(saved_auxv, elf_info, saved_auxv_size);
> +
> /* Put the elf_info on the stack in the right place. */
> sp = (elf_addr_t __user *)envp + 1;
> - if (copy_to_user(sp, elf_info, ei_index * sizeof(elf_addr_t)))
> + if (copy_to_user(sp, saved_auxv, saved_auxv_size))
> return -EFAULT;
> return 0;
> }
> [...]
-Kees
--
Kees Cook
Chrome OS & Brillo Security
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.