kernel-hardening - Re: [PATCH v12 2/6] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls

Message-ID: <20180518065349.GA10080@gmail.com>
Date: Fri, 18 May 2018 08:53:49 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Alexander Popov <alex.popov@...ux.com>
Cc: kernel-hardening@...ts.openwall.com, Kees Cook <keescook@...omium.org>,
	PaX Team <pageexec@...email.hu>,
	Brad Spengler <spender@...ecurity.net>,
	Andy Lutomirski <luto@...nel.org>, Tycho Andersen <tycho@...ho.ws>,
	Laura Abbott <labbott@...hat.com>,
	Mark Rutland <mark.rutland@....com>,
	Ard Biesheuvel <ard.biesheuvel@...aro.org>,
	Borislav Petkov <bp@...en8.de>,
	Richard Sandiford <richard.sandiford@....com>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H . Peter Anvin" <hpa@...or.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	"Dmitry V . Levin" <ldv@...linux.org>,
	Emese Revfy <re.emese@...il.com>, Jonathan Corbet <corbet@....net>,
	Andrey Ryabinin <aryabinin@...tuozzo.com>,
	"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
	Thomas Garnier <thgarnie@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Alexei Starovoitov <ast@...nel.org>, Josef Bacik <jbacik@...com>,
	Masami Hiramatsu <mhiramat@...nel.org>,
	Nicholas Piggin <npiggin@...il.com>,
	Al Viro <viro@...iv.linux.org.uk>,
	"David S . Miller" <davem@...emloft.net>,
	Ding Tianhong <dingtianhong@...wei.com>,
	David Woodhouse <dwmw@...zon.co.uk>,
	Josh Poimboeuf <jpoimboe@...hat.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Dominik Brodowski <linux@...inikbrodowski.net>,
	Juergen Gross <jgross@...e.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Dan Williams <dan.j.williams@...el.com>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Mathias Krause <minipli@...glemail.com>,
	Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
	Kyle Huey <me@...ehuey.com>,
	Dmitry Safonov <dsafonov@...tuozzo.com>,
	Will Deacon <will.deacon@....com>, Arnd Bergmann <arnd@...db.de>,
	Florian Weimer <fweimer@...hat.com>,
	Boris Lukashev <blukashev@...pervictus.com>,
	Andrey Konovalov <andreyknvl@...gle.com>, x86@...nel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v12 2/6] x86/entry: Add STACKLEAK erasing the kernel
 stack at the end of syscalls


* Alexander Popov <alex.popov@...ux.com> wrote:

> --- a/arch/x86/entry/calling.h
> +++ b/arch/x86/entry/calling.h
> @@ -329,8 +329,22 @@ For 32-bit we have the following conventions - kernel is built with
>  
>  #endif
>  
> +.macro ERASE_KSTACK_NOCLOBBER
> +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
> +	PUSH_AND_CLEAR_REGS
> +	call erase_kstack
> +	POP_REGS
> +#endif
> +.endm
> +
>  #endif /* CONFIG_X86_64 */
>  
> +.macro ERASE_KSTACK
> +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
> +	call erase_kstack
> +#endif
> +.endm

Please use a well-organized, common namespace that is visually easy to ignore.

For example:

	STACKLEAK_ERASE_NOCLOBBER

> @@ -298,6 +300,7 @@ ENTRY(ret_from_fork)
>  	/* When we fork, we trace the syscall return in the child, too. */
>  	movl    %esp, %eax
>  	call    syscall_return_slowpath
> +	ERASE_KSTACK

Ditto:

	STACKLEAK_ERASE

etc.

> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -32,6 +32,7 @@ struct vm86;
>  #include <linux/err.h>
>  #include <linux/irqflags.h>
>  #include <linux/mem_encrypt.h>
> +#include <linux/stackleak.h>

>  	mm_segment_t		addr_limit;
>  
> +	struct lowest_stack	lowest_stack;

This too should be something more organized and more opaque, like:

	struct stackleak_info	stackleak_info;

And the field name should not be a meaningless 'val', but 'lowest_stack'.

I.e. "p->stackleak_info.lowest_stack", which is so much more informative ...
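
A rough userspace sketch of the layout I have in mind. The reduced
thread_struct/task_struct stand-ins and the accessor name are assumptions
for illustration only, not something taken from the patch:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: one feature-named struct holding the state. */
struct stackleak_info {
	unsigned long lowest_stack;	/* lowest stack address seen so far */
};

/* Stand-in for the kernel's thread_struct, reduced to the relevant field. */
struct thread_struct {
	struct stackleak_info stackleak_info;
};

/* Stand-in for task_struct. */
struct task_struct {
	struct thread_struct thread;
};

/* Hypothetical accessor; call sites then name both the feature and the field. */
static inline unsigned long stackleak_lowest_stack(const struct task_struct *p)
{
	assert(p != NULL);	/* model-only sanity check */
	return p->thread.stackleak_info.lowest_stack;
}
```

With that, a reader who greps for "stackleak_" finds every piece of state
and every user in one go.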

> --- a/arch/x86/kernel/process_32.c
> +++ b/arch/x86/kernel/process_32.c
> @@ -136,6 +136,11 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
>  	p->thread.sp0 = (unsigned long) (childregs+1);
>  	memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps));
>  
> +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
> +	p->thread.lowest_stack.val = (unsigned long)end_of_stack(p) +
> +						sizeof(unsigned long);
> +#endif

This should use an inline helper:

	stackleak_task_init(p);

> +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
> +	p->thread.lowest_stack.val = (unsigned long)end_of_stack(p) +
> +						sizeof(unsigned long);
> +#endif

Beyond the lower visual impact, the inline helper will remove this duplication 
as well.
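
Roughly like the following userspace model. The struct stand-ins, the
simplified end_of_stack() and the explicit CONFIG define are assumptions
made so that the sketch compiles on its own; in the kernel the real
definitions would be used:

```c
#include <assert.h>
#include <stddef.h>

#define CONFIG_GCC_PLUGIN_STACKLEAK 1	/* normally set by Kconfig */

struct stackleak_info { unsigned long lowest_stack; };
struct thread_struct { struct stackleak_info stackleak_info; };
struct task_struct {
	void *stack;			/* lowest address of the task's stack */
	struct thread_struct thread;
};

/* Simplified model of end_of_stack() for a downward-growing stack. */
static inline unsigned long *end_of_stack(const struct task_struct *p)
{
	return (unsigned long *)p->stack;
}

#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
static inline void stackleak_task_init(struct task_struct *p)
{
	unsigned long lowest = (unsigned long)end_of_stack(p);

	assert(p->stack != NULL);	/* model-only sanity check */

	/* Same value as in the quoted hunks: one word above the stack end. */
	lowest += sizeof(unsigned long);
	p->thread.stackleak_info.lowest_stack = lowest;
}
#else
/* Empty stub, so callers need no #ifdef of their own. */
static inline void stackleak_task_init(struct task_struct *p) { }
#endif
```

Both copy_thread_tls() call sites then shrink to a single line, and the
#ifdef lives in exactly one place.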

> +++ b/kernel/stackleak.c
> @@ -0,0 +1,72 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * This code fills the used part of the kernel stack with a poison value
> + * before returning to the userspace. It's a part of the STACKLEAK feature
> + * ported from grsecurity/PaX.
> + *
> + * Author: Alexander Popov <alex.popov@...ux.com>
> + *
> + * STACKLEAK reduces the information which kernel stack leak bugs can
> + * reveal and blocks some uninitialized stack variable attacks. Moreover,
> + * STACKLEAK blocks stack depth overflow caused by alloca (aka Stack Clash
> + * attack).
> + */

s/alloca/alloca()/

> +#include <linux/bug.h>
> +#include <linux/sched.h>
> +#include <linux/stackleak.h>
> +#include <asm/linkage.h>

Yeah, so since processor.h includes stackleak.h I strongly doubt the stackleak.h 
inclusion is necessary here. Please review every header inclusion line and remove 
the unnecessary ones.

> +
> +asmlinkage void erase_kstack(void)

This too should be in the stackleak_*() namespace.

> +{
> +	/*
> +	 * It would be nice not to have p and boundary on the stack.
> +	 * Setting the register specifier for them is the best we can do.
> +	 */
> +	register unsigned long p = current->thread.lowest_stack.val;
> +	register unsigned long boundary = p & ~(THREAD_SIZE - 1);

Does the 'register' keyword actually have any effect on the generated code?

> +	unsigned long poison = 0;
> +	const unsigned long check_depth = STACKLEAK_POISON_CHECK_DEPTH /
> +							sizeof(unsigned long);

Please don't break lines in such an ugly fashion!

Also, 'poison' is a very weird name for something that looks like an index.

Plus, since it's bounded by "check_depth", is the 'unsigned long' justified,
or could it be 32-bit?

> +
> +	/*
> +	 * Let's search for the poison value in the stack.
> +	 * Start from the lowest_stack and go to the bottom.
> +	 */
> +	while (p > boundary && poison <= check_depth) {
> +		if (*(unsigned long *)p == STACKLEAK_POISON)
> +			poison++;
> +		else
> +			poison = 0;
> +
> +		p -= sizeof(unsigned long);
> +	}

This comment would be so much easier to read if the initialization was done right 
before the first use, i.e.:

	/*
	 * Let's search for the poison value in the stack.
	 * Start from the lowest_stack and go to the bottom:
	 */

	p = current->thread.lowest_stack.val;
	boundary = p & ~(THREAD_SIZE - 1);

	while (p > boundary && poison <= check_depth) {
		if (*(unsigned long *)p == STACKLEAK_POISON)
			poison++;
		else
			poison = 0;
	...
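
Putting the naming and type comments together, the search could look
something like this userspace model. The function name, 'kstack_ptr',
'poison_count' and both constant values are placeholders of mine and are
not taken from the patch:

```c
#include <assert.h>
#include <stddef.h>

#define STACKLEAK_POISON		0xdeadbeefUL	/* placeholder value */
#define STACKLEAK_POISON_CHECK_DEPTH	128		/* bytes, assumed */

/*
 * Walk down from 'lowest_stack' towards 'boundary'. Stop once more than
 * 'check_depth' consecutive poison words were seen, or at the boundary.
 * Returns the address at which the walk stopped.
 */
static unsigned long stackleak_find_poison(unsigned long lowest_stack,
					   unsigned long boundary)
{
	const unsigned int word = sizeof(unsigned long);
	const unsigned int check_depth = STACKLEAK_POISON_CHECK_DEPTH / word;
	unsigned int poison_count = 0;
	unsigned long kstack_ptr = lowest_stack;

	while (kstack_ptr > boundary && poison_count <= check_depth) {
		if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)
			poison_count++;
		else
			poison_count = 0;

		kstack_ptr -= sizeof(unsigned long);
	}

	return kstack_ptr;
}
```

Note that the counter is a plain 'unsigned int' here, which is enough for
anything bounded by check_depth.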

> +
> +	/*
> +	 * One long int at the bottom of the thread stack is reserved and
> +	 * should not be poisoned (see CONFIG_SCHED_STACK_END_CHECK).
> +	 */
> +	if (p == boundary)
> +		p += sizeof(unsigned long);

Please put types into quotes where it's ambiguous. I first read this sentence as 
"One long ..." and went "wtf". It's a totally unnecessary disruption of the 
reading flow.

> +	/*
> +	 * So let's write the poison value to the kernel stack.
> +	 * Start from the address in p and move up till the new boundary.
> +	 * We assume that the stack pointer doesn't change when we write poison.
> +	 */

Here too 'p' is easier to read.

But 'p' is a very weird name: in the kernel it's usually some sort of process 
pointer. Please rename it to something more descriptive, such as "kstack_ptr" or 
so.

> +	if (on_thread_stack())
> +		boundary = current_stack_pointer;
> +	else
> +		boundary = current_top_of_stack();
> +
> +	BUG_ON(boundary - p >= THREAD_SIZE);

Please make this:

	if (WARN_ON_ONCE(boundary - p >= THREAD_SIZE))
		return;

... or so, so that if this code is buggy we get actual useful user reports, not 
just "my machine froze, help!"...
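
To illustrate the pattern outside the kernel: below, warn_on_once() is a
crude userspace stand-in for WARN_ON_ONCE() (the real macro also dumps a
backtrace), and stackleak_range_ok(), the counter and the THREAD_SIZE
value are hypothetical names and numbers chosen for the sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define THREAD_SIZE 16384UL	/* assumed value, for the model only */

static unsigned int warnings_printed;

/* Report the first failure only, but always return the condition. */
static bool warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		warnings_printed++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Returns false instead of crashing when the range looks bogus. */
static bool stackleak_range_ok(unsigned long kstack_ptr, unsigned long boundary)
{
	static bool warned;

	if (warn_on_once(boundary - kstack_ptr >= THREAD_SIZE, &warned,
			 "stackleak: erase range exceeds THREAD_SIZE"))
		return false;

	return true;
}
```

The caller bails out on 'false', the machine keeps running, and the log
contains a warning that a user can actually report.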

> +	/* Reset the lowest_stack value for the next syscall */
> +	current->thread.lowest_stack.val = current_top_of_stack() - 256;

Magic, unexplained '256' literal.
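
At minimum give it a name and a comment. A sketch of what I mean, where
both the constant name and the helper name are hypothetical, since the
patch does not say what the 256 bytes stand for:

```c
#include <assert.h>

/*
 * Hypothetical name: how far below the top of the stack the next search
 * starts. The comment here is where the choice of 256 should be explained.
 */
#define STACKLEAK_SEARCH_START_OFFSET	256

/* Computes the value lowest_stack is reset to before the next syscall. */
static inline unsigned long stackleak_reset_value(unsigned long top_of_stack)
{
	return top_of_stack - STACKLEAK_SEARCH_START_OFFSET;
}
```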

Thanks,

	Ingo