kernel-hardening - Re: [PATCH v5 14/32] x86/mm/64: Enable vmapped stacks

Message-ID: <20160714083411.GA15437@gmail.com>
Date: Thu, 14 Jul 2016 10:34:11 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Andy Lutomirski <luto@...capital.net>
Cc: Andy Lutomirski <luto@...nel.org>, X86 ML <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	linux-arch <linux-arch@...r.kernel.org>,
	Borislav Petkov <bp@...en8.de>, Nadav Amit <nadav.amit@...il.com>,
	Kees Cook <keescook@...omium.org>, Brian Gerst <brgerst@...il.com>,
	"kernel-hardening@...ts.openwall.com" <kernel-hardening@...ts.openwall.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Josh Poimboeuf <jpoimboe@...hat.com>, Jann Horn <jann@...jh.net>,
	Heiko Carstens <heiko.carstens@...ibm.com>
Subject: Re: [PATCH v5 14/32] x86/mm/64: Enable vmapped stacks


* Andy Lutomirski <luto@...capital.net> wrote:

> On Wed, Jul 13, 2016 at 12:53 AM, Ingo Molnar <mingo@...nel.org> wrote:
> >
> > * Andy Lutomirski <luto@...nel.org> wrote:
> >
> >> This allows x86_64 kernels to enable vmapped stacks.  There are a
> >> couple of interesting bits.
> >
> >> --- a/arch/x86/Kconfig
> >> +++ b/arch/x86/Kconfig
> >> @@ -92,6 +92,7 @@ config X86
> >>       select HAVE_ARCH_TRACEHOOK
> >>       select HAVE_ARCH_TRANSPARENT_HUGEPAGE
> >>       select HAVE_EBPF_JIT                    if X86_64
> >> +     select HAVE_ARCH_VMAP_STACK             if X86_64
> >
> > So what is the performance impact?
> 
> Seems to be a very slight speedup (0.5 µs or so) on my silly benchmark
> (pthread_create, pthread_join in a loop). [...]

Music to my ears - although TBH there are probably two opposing forces: advantages
from the stack cache versus (possibly very minor, if measurable at all)
disadvantages from the 4K mapping granularity.

> [...]  It should be a small slowdown on workloads that create many threads all 
> at once, thus defeating the stack cache.  It should be a *large* speedup on any 
> workload that would trigger compaction on clone() to satisfy the high-order 
> allocation.
> 
> >
> > Because I think we should consider enabling this feature by default on x86 - but
> > the way it's selected here it will be default-off.
> >
> > On the plus side: the debuggability and reliability improvements are real and
> > making it harder for exploits to use kernel stack overflows is a nice bonus as
> > well. There's two performance effects:
> 
> Agreed.  At the very least, I want to wait until after net-next gets
> pulled to flip the default to y.  I'm also a bit concerned about more
> random driver issues that I haven't found yet.  I suppose we could
> flip the default to y for a few -rc releases and see what, if
> anything, shakes loose.

So I'd prefer the following approach: apply it to a v4.8-rc1 base in ~2 weeks
and keep it default-y for much of the next development cycle. If no serious
problems are found in those ~2 months, then send it to Linus in that form. We
can still turn it off by default (or re-spin the whole approach) if it turns out
to be too risky.

Exposing it as default-n for even a small amount of time will massively reduce the 
testing we'll get, as most people will just use the N setting (often without 
noticing).

Plus this also gives net-next and other preparatory patches applied directly to 
maintainer trees time to trickle upstream.

Thanks,

	Ingo
