Message-ID: <CABCJKudAiafvGk60oOjcZwcSHV69vGYZYpHaDD9HRgAuEx4jBw@mail.gmail.com>
Date: Mon, 4 Nov 2019 10:25:24 -0800
From: Sami Tolvanen <samitolvanen@...gle.com>
To: Mark Rutland <mark.rutland@....com>
Cc: Will Deacon <will@...nel.org>,
    Catalin Marinas <catalin.marinas@....com>,
    Steven Rostedt <rostedt@...dmis.org>,
    Masami Hiramatsu <mhiramat@...nel.org>,
    Ard Biesheuvel <ard.biesheuvel@...aro.org>,
    Dave Martin <Dave.Martin@....com>,
    Kees Cook <keescook@...omium.org>,
    Laura Abbott <labbott@...hat.com>,
    Marc Zyngier <maz@...nel.org>,
    Nick Desaulniers <ndesaulniers@...gle.com>,
    Jann Horn <jannh@...gle.com>,
    Miguel Ojeda <miguel.ojeda.sandonis@...il.com>,
    Masahiro Yamada <yamada.masahiro@...ionext.com>,
    clang-built-linux <clang-built-linux@...glegroups.com>,
    Kernel Hardening <kernel-hardening@...ts.openwall.com>,
    linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
    LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 05/17] add support for Clang's Shadow Call Stack (SCS)

On Mon, Nov 4, 2019 at 4:31 AM Mark Rutland <mark.rutland@....com> wrote:
> > +/*
> > + * In testing, 1 KiB shadow stack size (i.e. 128 stack frames on a 64-bit
> > + * architecture) provided ~40% safety margin on stack usage while keeping
> > + * memory allocation overhead reasonable.
> > + */
> > +#define SCS_SIZE 1024
>
> To make it easier to reason about type promotion rules (and avoid that
> we accidentally mask out high bits when using this to generate a mask),
> can we please make this 1024UL?

Sure.

> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -6013,6 +6013,8 @@ void init_idle(struct task_struct *idle, int cpu)
> >     raw_spin_lock_irqsave(&idle->pi_lock, flags);
> >     raw_spin_lock(&rq->lock);
> >
> > +   scs_task_reset(idle);
>
> Could we please do this next to the kasan_unpoison_task_stack() call,
> either just before, or just after?
>
> They're both addressing the same issue where previously live stack is
> being reused, and in general I'd expect them to occur at the same time
> (though I understand idle will be a bit different).

Good point, I'll move this.

> > --- a/kernel/sched/sched.h
> > +++ b/kernel/sched/sched.h
> > @@ -58,6 +58,7 @@
> >  #include <linux/profile.h>
> >  #include <linux/psi.h>
> >  #include <linux/rcupdate_wait.h>
> > +#include <linux/scs.h>
> >  #include <linux/security.h>
> >  #include <linux/stop_machine.h>
> >  #include <linux/suspend.h>
>
> This include looks extraneous.

I added this to sched.h, because most of the includes used in
kernel/sched appear to be there, but I can move this to
kernel/sched/core.c instead.

> > +static inline void *__scs_base(struct task_struct *tsk)
> > +{
> > +   /*
> > +    * We allow architectures to use the shadow_call_stack field in
> > +    * struct thread_info to store the current shadow stack pointer
> > +    * during context switches.
> > +    *
> > +    * This allows the implementation to also clear the field when
> > +    * the task is active to avoid keeping pointers to the current
> > +    * task's shadow stack in memory. This can make it harder for an
> > +    * attacker to locate the shadow stack, but also requires us to
> > +    * compute the base address when needed.
> > +    *
> > +    * We assume the stack is aligned to SCS_SIZE.
> > +    */
>
> How about:
>
>     /*
>      * To minimize the risk of exposure, architectures may clear a
>      * task's thread_info::shadow_call_stack while that task is
>      * running, and only save/restore the active shadow call stack
>      * pointer when the usual register may be clobbered (e.g. across
>      * context switches).
>      *
>      * The shadow call stack is aligned to SCS_SIZE, and grows
>      * upwards, so we can mask out the low bits to extract the base
>      * when the task is not running.
>      */
>
> ... which I think makes the lifetime and constraints a bit clearer.

Sounds good to me, thanks.

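To illustrate the masking described in the proposed comment, here is a minimal user-space sketch, not the kernel implementation. It assumes a 1 KiB region aligned to SCS_SIZE and a platform where sizeof(unsigned long) == sizeof(void *). The names scs_demo, scs_base_of and scs_base_after_pushes are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>

#define SCS_SIZE 1024UL /* size and UL suffix as discussed in the review */

/* Assumption: pointers fit in unsigned long, as the kernel relies on. */
_Static_assert(sizeof(unsigned long) == sizeof(void *),
               "sketch assumes sizeof(unsigned long) == sizeof(void *)");

/* Hypothetical stand-in for one task's shadow call stack, aligned to SCS_SIZE. */
static _Alignas(1024) unsigned long scs_demo[SCS_SIZE / sizeof(unsigned long)];

/* Hypothetical helper: clear the low bits of a pointer into the region. */
static void *scs_base_of(const void *scs_ptr)
{
	return (void *)((unsigned long)scs_ptr & ~(SCS_SIZE - 1));
}

/*
 * Simulate pushing `frames` return addresses (the stack grows upwards),
 * then recover the base from the resulting stack pointer alone.
 */
static void *scs_base_after_pushes(size_t frames)
{
	unsigned long *sp = scs_demo;

	/* Keep sp strictly inside the region so the mask stays valid. */
	assert(frames < SCS_SIZE / sizeof(unsigned long));

	for (size_t i = 0; i < frames; i++)
		*sp++ = 0; /* placeholder for a saved return address */

	return scs_base_of(sp);
}
```

In this sketch, scs_base_after_pushes(5) yields the address of scs_demo, because the region is SCS_SIZE-aligned and the pointer never leaves it.
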
> > +   return (void *)((uintptr_t)task_scs(tsk) & ~(SCS_SIZE - 1));
>
> We usually use unsigned long rather than uintptr_t. Could we please use
> that for consistency?
>
> The kernel relies on sizeof(unsigned long) == sizeof(void *) tree-wide,
> so that doesn't cause issues for us here.
>
> Similarly, as suggested above, it would be easier to reason about this
> knowing that SCS_SIZE is an unsigned long. While IIUC we'd get sign
> extension here when it's promoted, giving the definition a UL suffix
> minimizes the scope for error.

OK, I'll switch to unsigned long.

> > +/* Keep a cache of shadow stacks */
> > +#define SCS_CACHE_SIZE 2
>
> How about:
>
> /* Matches NR_CACHED_STACKS for VMAP_STACK */
> #define NR_CACHED_SCS 2
>
> ... which explains where the number came from, and avoids confusion that
> the SIZE is a byte size rather than number of elements.

Agreed, that sounds better.

> > +static void scs_free(void *s)
> > +{
> > +   int i;
> > +
> > +   for (i = 0; i < SCS_CACHE_SIZE; i++)
> > +           if (this_cpu_cmpxchg(scs_cache[i], 0, s) == 0)
> > +                   return;
>
> Here we should compare to NULL rather than 0.

Ack.

> > +void __init scs_init(void)
> > +{
> > +   cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "scs:scs_cache", NULL,
> > +           scs_cleanup);
>
> We probably want to do something if this call fails. It looks like we'd
> only leak two pages (and we'd be able to use them if/when that CPU is
> brought back online). A WARN_ON() is probably fine.

fork_init() in kernel/fork.c lets this fail quietly, but adding a
WARN_ON seems fine.

I will include these changes in v5.

Sami
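
The type-promotion concern raised in the review can be seen in a small user-space sketch. It assumes a 32-bit int and uses uint64_t as a stand-in for a 64-bit unsigned long. All function names here are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Plain 1024 is an int: ~(1024 - 1) is the int value -1024, which is
 * sign-extended when converted to a 64-bit unsigned type.
 */
static uint64_t mask_from_int(void)
{
	return (uint64_t)~(1024 - 1);
}

/*
 * 1024U is an unsigned int: the complement is computed in 32 bits and
 * then zero-extended, so the upper 32 bits of the mask are clear.
 */
static uint64_t mask_from_unsigned(void)
{
	return (uint64_t)~(1024U - 1);
}

/* A 64-bit constant (the analogue of 1024UL) needs no promotion at all. */
static uint64_t mask_from_wide(void)
{
	return ~(UINT64_C(1024) - 1);
}

/* Apply a mask to an address-like value. */
static uint64_t apply_mask(uint64_t addr, uint64_t mask)
{
	return addr & mask;
}

/* Sanity checks for the two safe variants. */
static void check_safe_masks(void)
{
	assert(mask_from_int() == mask_from_wide());
	assert((mask_from_wide() & 1023) == 0);
}
```

Under these assumptions the int and 64-bit variants give the same mask, which matches the remark about sign extension, while an unsigned 32-bit constant would drop the high bits of an address. Making the constant wide from the start removes the dependence on which case applies.
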