Message-ID: <CAKv+Gu-5ERHvqgXHR626QzKCZJPc0Cp5Ktv+gwucep5ZGvu9Vw@mail.gmail.com>
Date: Mon, 6 Aug 2018 21:54:32 +0200
From: Ard Biesheuvel <ard.biesheuvel@...aro.org>
To: Kees Cook <keescook@...omium.org>
Cc: Robin Murphy <robin.murphy@....com>,
    Kernel Hardening <kernel-hardening@...ts.openwall.com>,
    Mark Rutland <mark.rutland@....com>,
    Catalin Marinas <catalin.marinas@....com>,
    Will Deacon <will.deacon@....com>,
    Christoffer Dall <christoffer.dall@....com>,
    linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
    Laura Abbott <labbott@...oraproject.org>,
    Julien Thierry <julien.thierry@....com>
Subject: Re: [RFC/PoC PATCH 0/3] arm64: basic ROP mitigation

On 6 August 2018 at 21:50, Kees Cook <keescook@...omium.org> wrote:
> On Mon, Aug 6, 2018 at 12:35 PM, Ard Biesheuvel
> <ard.biesheuvel@...aro.org> wrote:
>> On 6 August 2018 at 20:49, Kees Cook <keescook@...omium.org> wrote:
>>> On Mon, Aug 6, 2018 at 10:45 AM, Robin Murphy <robin.murphy@....com> wrote:
>>>> I guess what I'm getting at is that if the protection mechanism is "always
>>>> return with SP outside TTBR1", there seems little point in going through the
>>>> motions if SP in TTBR0 could still be valid and allow an attack to succeed
>>>> anyway; this is basically just me working through a justification for saying
>>>> the proposed scheme needs "depends on ARM64_PAN || ARM64_SW_TTBR0_PAN",
>>>> making it that much uglier for v8.0 CPUs...
>>>
>>> I think anyone with v8.0 CPUs interested in this mitigation would also
>>> very much want PAN emulation. If a "depends on" isn't desired, what
>>> about "imply" in the Kconfig?
>>>
>>
>> Yes, but actually, using bit #0 is maybe a better alternative in any
>> case. You can never dereference SP with bit #0 set, regardless of
>> whether the address points to user or kernel space, and my concern
>> about reloading sp from x29 doesn't really make sense, given that x29
>> is always assigned from sp right after pushing x29 and x30 in the
>> function prologue, and sp only gets restored from x29 in the epilogue
>> when there is a stack frame to begin with, in which case we add #1 to
>> sp again before returning from the function.
>
> Fair enough! :)
>
>> The other code gets a lot cleaner as well.
>>
>> So for the return we'll have
>>
>>   ldp x29, x30, [sp], #nn
>>>>add sp, sp, #0x1
>>   ret
>>
>> and for the function call
>>
>>   bl <foo>
>>>>mov x30, sp
>>>>bic sp, x30, #1
>>
>> The restore sequence in entry.s:96 (which has no spare registers) gets
>> much simpler as well:
>>
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
>> @@ -95,6 +95,15 @@ alternative_else_nop_endif
>>    */
>>   add sp, sp, x0 // sp' = sp + x0
>>   sub x0, sp, x0 // x0' = sp' - x0 = (sp + x0) - x0 = sp
>> +#ifdef CONFIG_ARM64_ROP_SHIELD
>> + tbnz x0, #0, 1f
>> + .subsection 1
>> +1: sub x0, x0, #1
>> + sub sp, sp, #1
>> + b 2f
>> + .previous
>> +2:
>> +#endif
>>   tbnz x0, #THREAD_SHIFT, 0f
>>   sub x0, sp, x0 // x0'' = sp' - x0' = (sp + x0) - sp = x0
>>   sub sp, sp, x0 // sp'' = sp' - x0 = (sp + x0) - x0 = sp
>
> I get slightly concerned about "add" vs "clear bit", but I don't see a
> real way to chain a lot of "add"s to get to avoid the unaligned
> access. Is "or" less efficient than "add"?
>

Yes. The stack pointer is special on arm64, and can only be used with
a limited set of ALU instructions. So orring #1 would involve 'mov
<reg>, sp ; orr sp, <reg>, #1' like in the 'bic' case above, which
requires a scratch register as well.
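
For readers who want to check the arithmetic, the bit #0 scheme discussed
above can be modelled in a few lines of Python. This is only an
illustrative sketch, not kernel code: the function names are hypothetical,
registers are plain 64-bit integers, and the 16-byte alignment test assumes
that SP alignment checking is enabled. It mirrors the 'add sp, sp, #0x1'
epilogue step, the 'bic' step after the call, and the fixup added to the
entry.S sequence quoted in the patch.

```python
MASK64 = (1 << 64) - 1  # model registers as 64-bit unsigned integers


def poison_on_return(sp: int) -> int:
    """Model of 'add sp, sp, #0x1' placed just before 'ret'."""
    return (sp + 1) & MASK64


def restore_after_call(sp: int) -> int:
    """Model of 'mov x30, sp ; bic sp, x30, #1' placed after 'bl'."""
    return sp & ~1 & MASK64


def sp_usable(sp: int) -> bool:
    """Assumption: an SP-relative access needs a 16-byte aligned SP."""
    return sp % 16 == 0


def entry_fixup(sp: int, x0: int) -> tuple[int, int]:
    """Model of the CONFIG_ARM64_ROP_SHIELD hunk in the quoted diff.

    On input, sp holds (original sp + original x0) and x0 holds the
    original sp, as produced by the add/sub pair just before the hunk.
    If bit #0 of x0 is set, both values are decremented by one.
    """
    if x0 & 1:
        x0 = (x0 - 1) & MASK64
        sp = (sp - 1) & MASK64
    return sp, x0


def entry_sequence(sp: int, x0: int) -> tuple[int, int]:
    """Model of the whole quoted sequence, minus the THREAD_SHIFT test."""
    sp = (sp + x0) & MASK64      # sp'  = sp + x0
    x0 = (sp - x0) & MASK64      # x0'  = sp' - x0 = sp
    sp, x0 = entry_fixup(sp, x0)
    x0 = (sp - x0) & MASK64      # x0'' = sp' - x0' = x0
    sp = (sp - x0) & MASK64      # sp'' = sp' - x0'' = sp
    return sp, x0


if __name__ == "__main__":
    sp = 0xFFFF000012340000      # arbitrary 16-byte aligned example value
    poisoned = poison_on_return(sp)
    print(hex(poisoned), sp_usable(poisoned))
    print(hex(restore_after_call(poisoned)))
    print([hex(v) for v in entry_sequence(poisoned, 0x1234)])
```

In this model, 'bic' is idempotent (clearing bit #0 of an already clean
value changes nothing), whereas repeated 'add' instructions keep moving the
value, which is the asymmetry behind the "add" vs "clear bit" question
raised in the thread.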