Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <lsq.1520823814.932106069@decadent.org.uk>
Date: Mon, 12 Mar 2018 03:03:34 +0000
From: Ben Hutchings <ben@...adent.org.uk>
To: linux-kernel@...r.kernel.org, stable@...r.kernel.org
CC: akpm@...ux-foundation.org,
 "Dan Williams" <dan.j.williams@...el.com>,
 "Linus Torvalds" <torvalds@...ux-foundation.org>,
 "Jiri Slaby" <jslaby@...e.cz>,
 kernel-hardening@...ts.openwall.com, alan@...ux.intel.com,
 "Jan Beulich" <JBeulich@...e.com>,
 linux-arch@...r.kernel.org,
 "Andy Lutomirski" <luto@...nel.org>,
 "Jinpu Wang" <jinpu.wang@...fitbricks.com>,
 gregkh@...uxfoundation.org,
 "Thomas Gleixner" <tglx@...utronix.de>
Subject: [PATCH 3.2 087/104] x86/syscall: Sanitize syscall table
 de-references under speculation

3.2.101-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Hutchings <ben@...adent.org.uk>

commit 2fbd7af5af8665d18bcefae3e9700be07e22b681 upstream.

The upstream version of this, touching C code, was written by Dan Williams,
with the following description:

> The syscall table base is a user controlled function pointer in kernel
> space. Use array_index_nospec() to prevent any out of bounds speculation.
>
> While retpoline prevents speculating into a userspace directed target it
> does not stop the pointer de-reference, the concern is leaking memory
> relative to the syscall table base, by observing instruction cache
> behavior.

The x86_64 assembly version for 4.4 was written by Jiri Slaby, with
the following description:

> In 4.4.118, we have commit c8961332d6da (x86/syscall: Sanitize syscall
> table de-references under speculation), which is a backport of upstream
> commit 2fbd7af5af86. But it fixed only the C part of the upstream patch
> -- the IA32 sysentry. So it ommitted completely the assembly part -- the
> 64bit sysentry.
>
> Fix that in this patch by explicit array_index_mask_nospec written in
> assembly. The same was used in lib/getuser.S.
>
> However, to have "sbb" working properly, we have to switch from "cmp"
> against (NR_syscalls-1) to (NR_syscalls), otherwise the last syscall
> number would be "and"ed by 0. It is because the original "ja" relies on
> "CF" or "ZF", but we rely only on "CF" in "sbb". That means: switch to
> "jae" conditional jump too.
>
> Final note: use rcx for mask as this is exactly what is overwritten by
> the 4th syscall argument (r10) right after.

In 3.2 the x86_32 syscall table lookup is also written in assembly.
So I've taken Jiri's version and added similar masking in entry_32.S,
using edx as the temporary.  edx is clobbered by SAVE_REGS and seems
to be free at this point.

In 3.2 the x86_64 entry code also lacks syscall masking for x32.

Cc: Dan Williams <dan.j.williams@...el.com>
Cc: Jiri Slaby <jslaby@...e.cz>
Cc: Jan Beulich <JBeulich@...e.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: linux-arch@...r.kernel.org
Cc: kernel-hardening@...ts.openwall.com
Cc: gregkh@...uxfoundation.org
Cc: Andy Lutomirski <luto@...nel.org>
Cc: alan@...ux.intel.com
Cc: Jinpu Wang <jinpu.wang@...fitbricks.com>
Signed-off-by: Ben Hutchings <ben@...adent.org.uk>
---
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -429,6 +429,8 @@ sysenter_past_esp:
 sysenter_do_call:
 	cmpl $(nr_syscalls), %eax
 	jae sysenter_badsys
+	sbb %edx, %edx				/* array_index_mask_nospec() */
+	and %edx, %eax
 	call *sys_call_table(,%eax,4)
 sysenter_after_call:
 	movl %eax,PT_EAX(%esp)
@@ -512,6 +514,8 @@ ENTRY(system_call)
 	cmpl $(nr_syscalls), %eax
 	jae syscall_badsys
 syscall_call:
+	sbb %edx, %edx				/* array_index_mask_nospec() */
+	and %edx, %eax
 	call *sys_call_table(,%eax,4)
 syscall_after_call:
 	movl %eax,PT_EAX(%esp)		# store the return value
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -517,8 +517,10 @@ ENTRY(system_call_after_swapgs)
 	testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags(%rcx)
 	jnz tracesys
 system_call_fastpath:
-	cmpq $__NR_syscall_max,%rax
-	ja badsys
+	cmpq	$NR_syscalls, %rax
+	jae	badsys
+	sbb	%rcx, %rcx			/* array_index_mask_nospec() */
+	and	%rcx, %rax
 	movq %r10,%rcx
 #ifdef CONFIG_RETPOLINE
 	movq	sys_call_table(, %rax, 8), %rax
@@ -646,8 +648,10 @@ tracesys:
 	 */
 	LOAD_ARGS ARGOFFSET, 1
 	RESTORE_REST
-	cmpq $__NR_syscall_max,%rax
-	ja   int_ret_from_sys_call	/* RAX(%rsp) set to -ENOSYS above */
+	cmpq	$NR_syscalls, %rax
+	jae	int_ret_from_sys_call		/* RAX(%rsp) set to -ENOSYS above */
+	sbb	%rcx, %rcx			/* array_index_mask_nospec() */
+	and	%rcx, %rax
 	movq %r10,%rcx	/* fixup for C */
 #ifdef CONFIG_RETPOLINE
 	movq	sys_call_table(, %rax, 8), %rax

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.