Message-ID: <201906271026.383D4F5@keescook>
Date: Thu, 27 Jun 2019 10:26:55 -0700
From: Kees Cook <keescook@...omium.org>
To: Andy Lutomirski <luto@...nel.org>
Cc: x86@...nel.org, LKML <linux-kernel@...r.kernel.org>,
	Florian Weimer <fweimer@...hat.com>, Jann Horn <jannh@...gle.com>,
	Borislav Petkov <bp@...en8.de>,
	Kernel Hardening <kernel-hardening@...ts.openwall.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH v2 2/8] x86/vsyscall: Add a new vsyscall=xonly mode

On Wed, Jun 26, 2019 at 09:45:03PM -0700, Andy Lutomirski wrote:
> With vsyscall emulation on, we still expose a readable vsyscall page
> that contains syscall instructions that validly implement the
> vsyscalls.  We need this because certain dynamic binary
> instrumentation tools attempt to read the call targets of call
> instructions in the instrumented code.  If the instrumented code
> uses vsyscalls, then the vsyscal page needs to contain readable
> code.
> 
> Unfortunately, leaving readable memory at a deterministic address
> can be used to help various ASLR bypasses, so we gain some hardening
> value if we disallow vsyscall reads.
> 
> Given how rarely the vsyscall page needs to be readable, add a
> mechanism to make the vsyscall page execute-only.
> 
> Cc: Kees Cook <keescook@...omium.org>
> Cc: Borislav Petkov <bp@...en8.de>
> Cc: Kernel Hardening <kernel-hardening@...ts.openwall.com>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Signed-off-by: Andy Lutomirski <luto@...nel.org>

Reviewed-by: Kees Cook <keescook@...omium.org>

-Kees

> ---
>  .../admin-guide/kernel-parameters.txt         |  7 +++-
>  arch/x86/Kconfig                              | 33 ++++++++++++++-----
>  arch/x86/entry/vsyscall/vsyscall_64.c         | 16 +++++++--
>  3 files changed, 44 insertions(+), 12 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 0082d1e56999..be8c3a680afa 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -5100,7 +5100,12 @@
>  			targets for exploits that can control RIP.
>  
>  			emulate     [default] Vsyscalls turn into traps and are
> -			            emulated reasonably safely.
> +			            emulated reasonably safely.  The vsyscall
> +				    page is readable.
> +
> +			xonly       Vsyscalls turn into traps and are
> +			            emulated reasonably safely.  The vsyscall
> +				    page is not readable.
>  
>  			none        Vsyscalls don't work at all.  This makes
>  			            them quite hard to use for exploits but
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 2bbbd4d1ba31..0182d2c67590 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -2293,23 +2293,38 @@ choice
>  	  it can be used to assist security vulnerability exploitation.
>  
>  	  This setting can be changed at boot time via the kernel command
> -	  line parameter vsyscall=[emulate|none].
> +	  line parameter vsyscall=[emulate|xonly|none].
>  
>  	  On a system with recent enough glibc (2.14 or newer) and no
>  	  static binaries, you can say None without a performance penalty
>  	  to improve security.
>  
> -	  If unsure, select "Emulate".
> +	  If unsure, select "Emulate execution only".
>  
>  	config LEGACY_VSYSCALL_EMULATE
> -		bool "Emulate"
> +		bool "Full emulation"
>  		help
> -		  The kernel traps and emulates calls into the fixed
> -		  vsyscall address mapping. This makes the mapping
> -		  non-executable, but it still contains known contents,
> -		  which could be used in certain rare security vulnerability
> -		  exploits. This configuration is recommended when userspace
> -		  still uses the vsyscall area.
> +		  The kernel traps and emulates calls into the fixed vsyscall
> +		  address mapping. This makes the mapping non-executable, but
> +		  it still contains readable known contents, which could be
> +		  used in certain rare security vulnerability exploits. This
> +		  configuration is recommended when using legacy userspace
> +		  that still uses vsyscalls along with legacy binary
> +		  instrumentation tools that require code to be readable.
> +
> +		  An example of this type of legacy userspace is running
> +		  Pin on an old binary that still uses vsyscalls.
> +
> +	config LEGACY_VSYSCALL_XONLY
> +		bool "Emulate execution only"
> +		help
> +		  The kernel traps and emulates calls into the fixed vsyscall
> +		  address mapping and does not allow reads.  This
> +		  configuration is recommended when userspace might use the
> +		  legacy vsyscall area but support for legacy binary
> +		  instrumentation of legacy code is not needed.  It mitigates
> +		  certain uses of the vsyscall area as an ASLR-bypassing
> +		  buffer.
>  
>  	config LEGACY_VSYSCALL_NONE
>  		bool "None"
> diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
> index d9d81ad7a400..fedd7628f3a6 100644
> --- a/arch/x86/entry/vsyscall/vsyscall_64.c
> +++ b/arch/x86/entry/vsyscall/vsyscall_64.c
> @@ -42,9 +42,11 @@
>  #define CREATE_TRACE_POINTS
>  #include "vsyscall_trace.h"
>  
> -static enum { EMULATE, NONE } vsyscall_mode =
> +static enum { EMULATE, XONLY, NONE } vsyscall_mode =
>  #ifdef CONFIG_LEGACY_VSYSCALL_NONE
>  	NONE;
> +#elif defined(CONFIG_LEGACY_VSYSCALL_XONLY)
> +	XONLY;
>  #else
>  	EMULATE;
>  #endif
> @@ -54,6 +56,8 @@ static int __init vsyscall_setup(char *str)
>  	if (str) {
>  		if (!strcmp("emulate", str))
>  			vsyscall_mode = EMULATE;
> +		else if (!strcmp("xonly", str))
> +			vsyscall_mode = XONLY;
>  		else if (!strcmp("none", str))
>  			vsyscall_mode = NONE;
>  		else
> @@ -357,12 +361,20 @@ void __init map_vsyscall(void)
>  	extern char __vsyscall_page;
>  	unsigned long physaddr_vsyscall = __pa_symbol(&__vsyscall_page);
>  
> -	if (vsyscall_mode != NONE) {
> +	/*
> +	 * For full emulation, the page needs to exist for real.  In
> +	 * execute-only mode, there is no PTE at all backing the vsyscall
> +	 * page.
> +	 */
> +	if (vsyscall_mode == EMULATE) {
>  		__set_fixmap(VSYSCALL_PAGE, physaddr_vsyscall,
>  			     PAGE_KERNEL_VVAR);
>  		set_vsyscall_pgtable_user_bits(swapper_pg_dir);
>  	}
>  
> +	if (vsyscall_mode == XONLY)
> +		gate_vma.vm_flags = VM_EXEC;
> +
>  	BUILD_BUG_ON((unsigned long)__fix_to_virt(VSYSCALL_PAGE) !=
>  		     (unsigned long)VSYSCALL_ADDR);
>  }
> -- 
> 2.21.0
> 
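
For anyone skimming the thread, the mode handling in the quoted hunks can
be sketched in plain userspace C. This is only an illustration, not kernel
code: parse_vsyscall_mode(), mode_installs_pte() and
mode_marks_vma_exec_only() are hypothetical names, and the -1 error return
is an assumption, since the quoted vsyscall_setup() hunk does not show the
final else branch.

```c
#include <stdbool.h>
#include <string.h>

/* Same three modes as the patched enum in vsyscall_64.c. */
enum vsyscall_mode { EMULATE, XONLY, NONE };

/*
 * Hypothetical userspace stand-in for vsyscall_setup(): maps the
 * vsyscall= argument onto a mode.  Returns 0 on success and -1 for a
 * missing or unknown string (the error value is an assumption; the
 * quoted hunk does not show that branch).
 */
int parse_vsyscall_mode(const char *str, enum vsyscall_mode *mode)
{
	if (!str)
		return -1;

	if (!strcmp("emulate", str))
		*mode = EMULATE;
	else if (!strcmp("xonly", str))
		*mode = XONLY;
	else if (!strcmp("none", str))
		*mode = NONE;
	else
		return -1;

	return 0;
}

/* Mirrors the first test in map_vsyscall(): only EMULATE gets a real PTE. */
bool mode_installs_pte(enum vsyscall_mode mode)
{
	return mode == EMULATE;
}

/* Mirrors the second test: only XONLY narrows gate_vma to VM_EXEC. */
bool mode_marks_vma_exec_only(enum vsyscall_mode mode)
{
	return mode == XONLY;
}
```

With vsyscall=xonly on the command line, the sketch selects XONLY, installs
no PTE, and marks the gate VMA execute-only, which matches the comment the
patch adds to map_vsyscall().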

-- 
Kees Cook
