Message-ID: <CAKv+Gu-FvfPFQooCie6HwP=mBng3C0jp9p8WMkFwTxctDu4JBA@mail.gmail.com>
Date: Fri, 14 Jul 2017 11:48:20 +0100
From: Ard Biesheuvel <ard.biesheuvel@...aro.org>
To: Mark Rutland <mark.rutland@....com>
Cc: Kernel Hardening <kernel-hardening@...ts.openwall.com>,
    "linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>,
    "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
    Takahiro Akashi <akashi.takahiro@...aro.org>,
    Catalin Marinas <catalin.marinas@....com>,
    Dave Martin <dave.martin@....com>,
    James Morse <james.morse@....com>,
    Laura Abbott <labbott@...oraproject.org>,
    Will Deacon <will.deacon@....com>,
    Kees Cook <keescook@...omium.org>
Subject: Re: Re: [RFC PATCH 6/6] arm64: add VMAP_STACK and detect out-of-bounds SP

On 14 July 2017 at 11:32, Mark Rutland <mark.rutland@....com> wrote:
> On Thu, Jul 13, 2017 at 07:28:48PM +0100, Ard Biesheuvel wrote:
>> On 13 July 2017 at 18:55, Mark Rutland <mark.rutland@....com> wrote:
>> > On Thu, Jul 13, 2017 at 05:10:50PM +0100, Mark Rutland wrote:
>> >> On Thu, Jul 13, 2017 at 12:49:48PM +0100, Ard Biesheuvel wrote:
>> >> > On 13 July 2017 at 11:49, Mark Rutland <mark.rutland@....com> wrote:
>> >> > > On Thu, Jul 13, 2017 at 07:58:50AM +0100, Ard Biesheuvel wrote:
>> >> > >> On 12 July 2017 at 23:33, Mark Rutland <mark.rutland@....com> wrote:
>> >
>> >> > Given that the very first stp in kernel_entry will fault if we have
>> >> > less than S_FRAME_SIZE bytes of stack left, I think we should check
>> >> > that we have at least that much space available.
>> >>
>> >> I was going to reply saying that I didn't agree, but in writing up
>> >> examples, I mostly convinced myself that this is the right thing to do.
>> >> So I mostly agree!
>> >>
>> >> This would mean we treat the first impossible-to-handle exception as
>> >> that fatal case, which is similar to x86's double-fault, triggered when
>> >> the HW can't stack the regs. All other cases are just arbitrary faults.
>> >>
>> >> However, to provide that consistently, we'll need to perform this check
>> >> at every exception boundary, or some of those cases will result in a
>> >> recursive fault first.
>> >>
>> >> So I think there are three choices:
>> >>
>> >> 1) In el1_sync, only check SP bounds, and live with the recursive
>> >>    faults.
>> >>
>> >> 2) in el1_sync, check there's room for the regs, and live with the
>> >>    recursive faults for overflow on other exceptions.
>> >>
>> >> 3) In all EL1 entry paths, check there's room for the regs.
>> >
>> > FWIW, for the moment I've applied (2), as you suggested, to my
>> > arm64/vmap-stack branch, adding an additional:
>> >
>> >     sub x0, x0, #S_FRAME_SIZE
>> >
>> > ... to the entry path.
>> >
>> > I think it's worth trying (3) so that we consistently report these
>> > cases, benchmarks permitting.
>> >
>>
>> OK, so here's a crazy idea: what if we
>> a) carve out a dedicated range in the VMALLOC area for stacks
>> b) for each stack, allocate a naturally aligned window of 2x the stack
>> size, and map the stack inside it, leaving the remaining space
>> unmapped
>
> This is not such a crazy idea. :)
>
> In fact, it was one I toyed with before getting lost on a register
> juggling tangent (see below).
>
>> That way, we can compare SP (minus S_FRAME_SIZE) against a mask that
>> is a build time constant, to decide whether its value points into a
>> stack or not. Of course, it may be pointing into the wrong stack, but
>> that should not prevent us from taking the exception, and we can deal
>> with that later. It would give us a very cheap way to perform this
>> test on the hot paths.
>
> The logical ops (TST) and conditional branches (TB(N)Z, CB(N)Z) operate
> on XZR rather than SP, so to do this we need to get the SP value into a
> GPR.
>
> Previously, I assumed this meant we needed to corrupt a GPR (and hence
> stash that GPR in a sysreg), so I started writing code to free sysregs.
>
> However, I now realise I was being thick, since we can stash the GPR
> in the SP:
>
>     sub sp, sp, x0 // sp = orig_sp - x0
>     add x0, sp, x0 // x0 = (orig_sp - x0) + x0 == orig_sp
>     sub x0, x0, #S_FRAME_SIZE
>     tb(nz) x0, #THREAD_SHIFT, overflow
>     add x0, x0, #S_FRAME_SIZE
>     sub x0, sp, x0
>     add sp, sp, x0
>
> ... so yes, this could work!
>

Nice!

> This means that we have to align the initial task, so the kernel Image
> will grow by THREAD_SIZE. Likewise for IRQ stacks, unless we can rework
> things such that we can dynamically allocate all of those.
>

We can't currently do that for 64k pages, since the segment alignment
is only 64k. But we should be able to patch that up, I think.

>> >> I believe that determining whether the exception was caused by a stack
>> >> overflow is not something we can do robustly or efficiently.
>>
>> Actually, if the stack pointer is within S_FRAME_SIZE of the base, and
>> the faulting address points into the guard page, that is a pretty
>> strong indicator that the stack overflowed. That shouldn't be too
>> costly?
>
> Sure, but that's still a heuristic. For example, that also catches an
> unrelated vmalloc address gone wrong, while SP was close to the end of
> the stack.
>

Yes, but the likelihood that an unrelated stray vmalloc access is
within 16 KB of a stack pointer that is close to its limit is
extremely low, so we should be able to live with the risk of
misidentifying it.

> The important thing is whether we can *safely enter the exception* (i.e.
> stack the regs), or whether this'll push the SP (further) out-of-bounds.
> I think we agree that we can reliably and efficiently check this.
>

Yes.

> The general case of nominal "stack overflows" (e.g. large preidx
> decrements, proxied SP values, unrelated guard-page faults) is a
> semantic minefield. I don't think we should add code to try to
> distinguish these.
>
> For that general case, if we can enter the exception then we can try to
> handle the exception in the usual way, and either:
>
> * The fault code determines the access was bad. We at least kill the
>   thread.
>
> * We overflow the stack while trying to handle the exception, triggering
>   a new fault to triage.
>
> To make it possible to distinguish and debug these, we need to fix the
> backtracing code, but that's it.
>
> Thanks,
> Mark.
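
For anyone who wants to check the arithmetic of the aligned-window test
discussed above, here is a rough C model of it. It is only a sketch under
stated assumptions, not the code from the arm64/vmap-stack branch. The
THREAD_SHIFT and S_FRAME_SIZE values are made up, and struct cpu_regs and
the three helper functions are hypothetical names. It assumes the stack is
mapped in the upper half of its 2x window, so that bit THREAD_SHIFT is set
for every valid stack address and the test becomes a tbz. It also uses an
add-first ordering for stashing x0 in SP, whose restore steps are easy to
check by hand. That ordering is a swapped-in variant, not the exact
sequence quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; the real ones depend on the kernel config. */
#define THREAD_SHIFT 14
#define THREAD_SIZE  (UINT64_C(1) << THREAD_SHIFT)
#define S_FRAME_SIZE UINT64_C(0x140)    /* hypothetical register frame size */

/* Hypothetical model of the two registers the entry sequence juggles. */
struct cpu_regs {
    uint64_t sp;
    uint64_t x0;
};

/*
 * Assumed layout: each stack occupies the upper half of a window aligned
 * to 2 * THREAD_SIZE, and the lower half is left unmapped. Every valid
 * stack address then has bit THREAD_SHIFT set.
 */
static uint64_t stack_base(uint64_t window)
{
    assert((window & (2 * THREAD_SIZE - 1)) == 0);
    return window + THREAD_SIZE;
}

/* Initial (empty-stack) SP value: one past the highest stack byte. */
static uint64_t stack_top(uint64_t window)
{
    return stack_base(window) + THREAD_SIZE;
}

/*
 * Returns true if stacking S_FRAME_SIZE bytes would leave the stack.
 * On the non-overflow path, sp and x0 hold their entry values again.
 * All arithmetic wraps modulo 2^64, as it does in the registers.
 */
static bool entry_would_overflow(struct cpu_regs *r)
{
    r->sp = r->sp + r->x0;          /* add sp, sp, x0 : sp' = sp + x0      */
    r->x0 = r->sp - r->x0;          /* sub x0, sp, x0 : x0' = original sp  */
    r->x0 -= S_FRAME_SIZE;          /* sub x0, x0, #S_FRAME_SIZE           */
    if (!(r->x0 & THREAD_SIZE))     /* tbz x0, #THREAD_SHIFT, overflow     */
        return true;
    r->x0 += S_FRAME_SIZE;          /* add x0, x0, #S_FRAME_SIZE           */
    r->x0 = r->sp - r->x0;          /* sub x0, sp, x0 : x0 = original x0   */
    r->sp = r->sp - r->x0;          /* sub sp, sp, x0 : sp = original sp   */
    return false;
}
```

Subtracting S_FRAME_SIZE before the bit test matters in this layout: an
empty stack has SP equal to the top of the window, where bit THREAD_SHIFT
is clear, so testing SP itself would misreport it. Like the scheme it
models, this only says whether the frame lands in some stack, not in the
right one.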