|
Message-ID: <49abb4c2-e638-9657-f007-0c18ef6d0f28@linux.com> Date: Sat, 10 Jun 2017 02:00:18 +0300 From: Alexander Popov <alex.popov@...ux.com> To: Kees Cook <keescook@...omium.org> Cc: "kernel-hardening@...ts.openwall.com" <kernel-hardening@...ts.openwall.com>, PaX Team <pageexec@...email.hu>, Brad Spengler <spender@...ecurity.net>, Tycho Andersen <tycho@...ker.com> Subject: Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls On 09.06.2017 20:28, Kees Cook wrote: > On Fri, Jun 9, 2017 at 7:30 AM, Alexander Popov <alex.popov@...ux.com> wrote: >> Hello, >> >> My employer Positive Technologies kindly allowed me to spend a part of my >> working time on helping KSPP. So I join the initiative of porting STACKLEAK >> to the mainline, which was started by Tycho Andersen. > > Great, thanks for working on this! Hello Kees, thanks for your prompt reply. >> STACKLEAK is a security feature developed by Grsecurity/PaX (kudos to them!), >> which can mitigate the damage from kernel stack leak bugs (see CVE-2010-2963, >> CVE-2016-4569 and CVE-2016-4578). > > It might help to distinguish that this covers at least two classes of > issues: stack content exposure (the latter two CVEs above), and it can > block some uninitialized variable attacks (like the mentioned > CVE-2010-2963). In other words, this can be more than just an info > exposure defense. Thanks, I'll improve the description in the next version. >> I carefully extracted STACKLEAK from the last public patch of Grsecurity/PaX >> and do my best to understand it. So I added some comments describing that >> understanding. You are welcome to discuss it. > > Great, adding more comments to the gcc plugin will help others who are > trying to get up to speed on it (like me). :) Currently I annotated the arch-specific asm part of the STACKLEAK feature. However, there are a few questions in my comments marked with TODO. Playing with the gcc plugin is the next step. >> Here are the results of a brief performance test on x86_64: >> >> Hardware: Intel Core i7-4770, 16 GB RAM >> >> Test #1: stress-ng --cpu 4 --io 2 --vm 2 --vm-bytes 2G --timeout 300s >> --metrics-brief >> Result on 4.11-rc8: >> cpu bogo ops 269955 >> iosync bogo ops 9809985 >> vm bogo ops 17093608 >> Result on 4.11-rc8+stackleak: >> cpu bogo ops 270106 (+0.6%) >> iosync bogo ops 9474535 (-3.42%) >> vm bogo ops 17093608 (the same) >> >> Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P >> Average result on 4.11-rc8: 8.71s >> Average result on 4.11-rc8+stackleak: 9.08s (+4.29%) >> >> Test #3: building the Linux kernel with Ubuntu config (time make -j9) >> Result on 4.11-rc8: >> real 32m14.893s >> user 237m30.886s >> sys 11m12.273s >> Result on 4.11-rc8+stackleak: >> real 32m26.881s (+0.62%) >> user 238m38.926s (+0.48%) >> sys 11m36.426s (+3.59%) > > Awesome, and thanks for the benchmarks! That should really help people > understand the trade-offs for using this feature (and is likely worth > mentioning in the Kconfig). Seems like less than 4% overhead, maybe > much less? Real time on build times seems like a tiny difference, but > hackbench shows 4%. Yes, the performance penalty of STACKLEAK differs a lot depending on the kind of load. Do you have any idea which test can give a bigger slowdown? It should be some rapid syscall exhausting the kernel stack hard. >> There is an important aspect of STACKLEAK on i386. PaX Team might know >> about it, I can't say for sure. Dear Grsecurity, is it fine to talk with >> you about such things in that mailing list? Should I do it differently? >> >> The STACKLEAK gcc plugin does not instrument the kernel code for i386. I've >> checked that for the last public patch of Grsecurity/PaX and see the same >> behaviour on my machines. The reason: the ix86_cmodel for the Linux kernel >> on i386 is not CM_KERNEL. So the STACKLEAK plugin seems to skip the >> instrumentation for that platform. As a result, on i386 erase_kstack() >> always starts to search for the bottom of the stack from the top minus 128. > > What version of gcc did you use for these builds, btw? Perhaps > something changed there? I tried to compile the Linux kernel for i386 with two different versions of gcc: #gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 6.3.0-18ubuntu2~16.04' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 6.3.0 20170519 (Ubuntu/Linaro 6.3.0-18ubuntu2~16.04) gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-linux-gnu/5/lto-wrapper Target: i686-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.4.0-6ubuntu1~16.04.4' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-i386/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-i386 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-i386 --with-arch-directory=i386 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-targets=all --enable-multiarch --disable-werror --with-arch-32=i686 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=i686-linux-gnu --host=i686-linux-gnu --target=i686-linux-gnu Thread model: posix gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) Both of them compiled it with ix86_cmodel=0, so the STACKLEAK instrumentation was skipped. >> See the results of the same performance tests on i386: >> >> Test #1: stress-ng --cpu 4 --io 2 --vm 2 --vm-bytes 2G --timeout 300s >> --metrics-brief >> Result on 4.11-rc8: >> cpu bogo ops 207754 >> iosync bogo ops 9442815 >> vm bogo ops 8546804 >> Result on 4.11-rc8+stackleak: >> cpu bogo ops 206061 (-0.81%) >> iosync bogo ops 9435139 (-0.08%) >> vm bogo ops 8546804 (the same) >> >> Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P >> Average result on 4.11-rc8: 8.197s >> Average result on 4.11-rc8+stackleak: 9.134s (+11.43%) >> >> Test #3: building the Linux kernel with Ubuntu config (time make -j9) >> Result on 4.11-rc8: >> real 18m15.372s >> user 129m58.169s >> sys 8m27.884s >> Result on 4.11-rc8+stackleak: >> real 18m34.244s (+1.72%) >> user 132m33.843s (+2.00%) >> sys 8m37.658s (+1.92%) >> >> More things to be done: >> - understand how the STACKLEAK gcc plugin works; >> - develop tests for STACKLEAK. > > Since this is mostly an anti-exposure defense, LKDTM is probably not a > good match (i.e. organizing a test for the uninit variable case can be > very fragile). I think something similar to test_user_copy.c would be > better. Ok, thanks for the clue. >> From d22af45233b2f6d657a29dcb1815b35a5a45c539 Mon Sep 17 00:00:00 2001 >> From: Alexander Popov <alex.popov@...ux.com> >> Date: Fri, 9 Jun 2017 15:21:16 +0300 >> Subject: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the >> kernel stack at the end of syscalls >> >> The stackleak feature erases the kernel stack before returning from syscalls. >> That reduces the information which a kernel stack leak bug can reveal. >> >> This feature consists of: >> - the architecture-specific code filling the used part of the kernel stack >> with a poison value before returning to the userspace (currently only >> for i386 and x86_64); >> - the gcc plugin for tracking the lowest border of the kernel stack. It >> instruments the kernel code inserting the track_stack() function call if >> a stack frame size is over a specified limit. >> >> The stackleak feature is ported from grsecurity/PaX. For more information see: >> https://grsecurity.net/ >> https://pax.grsecurity.net/ >> >> This code is verbatim from Brad Spengler/PaX Team's code in the last public > > Maybe "nearly verbatim" since Kconfigs and such were (correctly) > renamed/expanded, and comments were added. Agree, I'll fix it. >> patch of grsecurity/PaX based on our understanding of the code. Changes or >> omissions from the original code are ours and don't reflect the original >> grsecurity/PaX code. >> >> Signed-off-by: Alexander Popov <alex.popov@...ux.com> >> Signed-off-by: Tycho Andersen <tycho@...ker.com> > > This S-o-b is not right, I think it should just be yours. Perhaps say > something like "Continued upstreaming work started by Tycho Anderson." > in the commit log. Ok, excuse me. Tycho looked at this patch and tested it. I had to use "Tested-by". >> --- >> arch/Kconfig | 20 ++ >> arch/x86/Kconfig | 1 + >> arch/x86/entry/common.c | 17 +- >> arch/x86/entry/entry_32.S | 71 +++++++ >> arch/x86/entry/entry_64.S | 99 ++++++++++ >> arch/x86/entry/entry_64_compat.S | 8 + >> arch/x86/include/asm/processor.h | 4 + >> arch/x86/kernel/asm-offsets.c | 9 + >> arch/x86/kernel/dumpstack_32.c | 12 ++ >> arch/x86/kernel/dumpstack_64.c | 33 ++++ >> arch/x86/kernel/process_32.c | 5 + >> arch/x86/kernel/process_64.c | 5 + >> fs/exec.c | 17 ++ >> scripts/Makefile.gcc-plugins | 3 + >> scripts/gcc-plugins/stackleak_plugin.c | 342 +++++++++++++++++++++++++++++++++ >> 15 files changed, 643 insertions(+), 3 deletions(-) >> create mode 100644 scripts/gcc-plugins/stackleak_plugin.c >> >> diff --git a/arch/Kconfig b/arch/Kconfig >> index cd211a1..a209bd5 100644 >> --- a/arch/Kconfig >> +++ b/arch/Kconfig >> @@ -436,6 +436,26 @@ config GCC_PLUGIN_STRUCTLEAK_VERBOSE >> initialized. Since not all existing initializers are detected >> by the plugin, this can produce false positive warnings. >> >> +config ARCH_HAS_STACKLEAK >> + def_bool n > > This can just be "bool" (it will default off). Ok. >> + help >> + An architecture should select this if it has the code which >> + fills the used part of the kernel stack with a poison value >> + before returning from system calls. > > Maybe specifically mention the -0xBEEF value? Ok. Should I create some macro for it? >> + >> +config STACKLEAK > > I would follow the naming of the others, and call this GCC_PLUGIN_STACKLEAK It seems to me that GCC_PLUGIN_STACKLEAK is not a right name since the whole feature consists of two parts: the arch-specific asm code actually cleaning the kernel stack and the gcc plugin which helps to do it faster and more reliable. What do you think? >> + bool "Erase the kernel stack before returning from syscalls" >> + depends on GCC_PLUGINS >> + depends on ARCH_HAS_STACKLEAK >> + help >> + This option makes the kernel erase the kernel stack before it >> + returns from a system call. That reduces the information which >> + a kernel stack leak bug can reveal. >> + >> + This plugin was ported from grsecurity/PaX. More information at: >> + * https://grsecurity.net/ >> + * https://pax.grsecurity.net/ > > Is Kconfig smart enough to include this in the gcc plugins sub-page > when the ARCH_HAS_STACKLEAK config is between this and the others? You are right, it's outside gcc plugins subpage. If we finally choose the name GCC_PLUGIN_STACKLEAK, I'll put it inside. >> + >> config HAVE_CC_STACKPROTECTOR >> bool >> help >> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig >> index cc98d5a..5988a5f 100644 >> --- a/arch/x86/Kconfig >> +++ b/arch/x86/Kconfig >> @@ -56,6 +56,7 @@ config X86 >> select ARCH_HAS_PMEM_API if X86_64 >> select ARCH_HAS_SET_MEMORY >> select ARCH_HAS_SG_CHAIN >> + select ARCH_HAS_STACKLEAK >> select ARCH_HAS_STRICT_KERNEL_RWX >> select ARCH_HAS_STRICT_MODULE_RWX >> select ARCH_HAS_UBSAN_SANITIZE_ALL >> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c >> index 370c42c..8ff0a84 100644 >> --- a/arch/x86/entry/common.c >> +++ b/arch/x86/entry/common.c >> @@ -43,6 +43,12 @@ __visible inline void enter_from_user_mode(void) >> static inline void enter_from_user_mode(void) {} >> #endif >> >> +#ifdef CONFIG_STACKLEAK >> +asmlinkage void erase_kstack(void); >> +#else >> +static void erase_kstack(void) {} >> +#endif >> + >> static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch) >> { >> #ifdef CONFIG_X86_64 >> @@ -79,8 +85,10 @@ static long syscall_trace_enter(struct pt_regs *regs) >> emulated = true; >> >> if ((emulated || (work & _TIF_SYSCALL_TRACE)) && >> - tracehook_report_syscall_entry(regs)) >> + tracehook_report_syscall_entry(regs)) { >> + erase_kstack(); >> return -1L; >> + } >> >> if (emulated) >> return -1L; >> @@ -114,9 +122,11 @@ static long syscall_trace_enter(struct pt_regs *regs) >> sd.args[5] = regs->bp; >> } >> >> - ret = __secure_computing(&sd); >> - if (ret == -1) >> + ret = secure_computing(&sd); >> + if (ret == -1) { >> + erase_kstack(); >> return ret; >> + } >> } >> #endif > > Since this is x86-specific, maybe Cc x86@...nel.org and/or Andy > Lutomirski on follow-up versions. I'm sure they'll have opinions about > changes to the entry code. :) Thanks! Of course, I'll do it for the next version. > It seems like it shouldn't be too hard to add on-user-return erasure > code to other architectures too. Yes, maybe. At the same time the gcc plugin might somehow depend on x86. I'll investigate that. Best regards, Alexander
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.