Message-ID: <8817DE5F-BCF4-4F6A-A496-E0DB6889D86E@vmware.com>
Date: Thu, 17 Jan 2019 22:39:15 +0000
From: Nadav Amit <namit@...are.com>
To: "H. Peter Anvin" <hpa@...or.com>
CC: Masami Hiramatsu <mhiramat@...nel.org>,
	Rick Edgecombe <rick.p.edgecombe@...el.com>,
	Andy Lutomirski <luto@...nel.org>, Ingo Molnar <mingo@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>, Borislav Petkov <bp@...en8.de>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Damian Tometzki <linux_dti@...oud.com>,
	linux-integrity <linux-integrity@...r.kernel.org>,
	LSM List <linux-security-module@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Kernel Hardening <kernel-hardening@...ts.openwall.com>,
	Linux-MM <linux-mm@...ck.org>, Will Deacon <will.deacon@....com>,
	Ard Biesheuvel <ard.biesheuvel@...aro.org>,
	Kristen Carlson Accardi <kristen@...ux.intel.com>,
	"Dock, Deneen T" <deneen.t.dock@...el.com>,
	Kees Cook <keescook@...omium.org>, Dave Hansen <dave.hansen@...el.com>
Subject: Re: [PATCH 01/17] Fix "x86/alternatives: Lockdep-enforce text_mutex
 in text_poke*()"

> On Jan 17, 2019, at 1:15 PM, hpa@...or.com wrote:
>
> On January 16, 2019 10:47:01 PM PST, Masami Hiramatsu <mhiramat@...nel.org> wrote:
>> On Wed, 16 Jan 2019 16:32:43 -0800
>> Rick Edgecombe <rick.p.edgecombe@...el.com> wrote:
>>
>>> From: Nadav Amit <namit@...are.com>
>>>
>>> text_mutex is currently expected to be held before text_poke() is
>>> called, but we kgdb does not take the mutex, and instead *supposedly*
>>> ensures the lock is not taken and will not be acquired by any other core
>>> while text_poke() is running.
>>>
>>> The reason for the "supposedly" comment is that it is not entirely clear
>>> that this would be the case if gdb_do_roundup is zero.
>>>
>>> This patch creates two wrapper functions, text_poke() and
>>> text_poke_kgdb() which do or do not run the lockdep assertion
>>> respectively.
>>>
>>> While we are at it, change the return code of text_poke() to something
>>> meaningful. One day, callers might actually respect it and the existing
>>> BUG_ON() when patching fails could be removed. For kgdb, the return
>>> value can actually be used.
>>
>> Looks good to me.
>>
>> Reviewed-by: Masami Hiramatsu <mhiramat@...nel.org>
>>
>> Thank you,
>>
>>> Cc: Andy Lutomirski <luto@...nel.org>
>>> Cc: Kees Cook <keescook@...omium.org>
>>> Cc: Dave Hansen <dave.hansen@...el.com>
>>> Cc: Masami Hiramatsu <mhiramat@...nel.org>
>>> Fixes: 9222f606506c ("x86/alternatives: Lockdep-enforce text_mutex in text_poke*()")
>>> Suggested-by: Peter Zijlstra <peterz@...radead.org>
>>> Acked-by: Jiri Kosina <jkosina@...e.cz>
>>> Signed-off-by: Nadav Amit <namit@...are.com>
>>> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@...el.com>
>>> ---
>>>  arch/x86/include/asm/text-patching.h |  1 +
>>>  arch/x86/kernel/alternative.c        | 52 ++++++++++++++++++++--------
>>>  arch/x86/kernel/kgdb.c               | 11 +++---
>>>  3 files changed, 45 insertions(+), 19 deletions(-)
>>>
>>> diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h
>>> index e85ff65c43c3..f8fc8e86cf01 100644
>>> --- a/arch/x86/include/asm/text-patching.h
>>> +++ b/arch/x86/include/asm/text-patching.h
>>> @@ -35,6 +35,7 @@ extern void *text_poke_early(void *addr, const void *opcode, size_t len);
>>>   * inconsistent instruction while you patch.
>>>   */
>>>  extern void *text_poke(void *addr, const void *opcode, size_t len);
>>> +extern void *text_poke_kgdb(void *addr, const void *opcode, size_t len);
>>>  extern int poke_int3_handler(struct pt_regs *regs);
>>>  extern void *text_poke_bp(void *addr, const void *opcode, size_t len, void *handler);
>>>  extern int after_bootmem;
>>> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
>>> index ebeac487a20c..c6a3a10a2fd5 100644
>>> --- a/arch/x86/kernel/alternative.c
>>> +++ b/arch/x86/kernel/alternative.c
>>> @@ -678,18 +678,7 @@ void *__init_or_module text_poke_early(void *addr, const void *opcode,
>>>  	return addr;
>>>  }
>>>
>>> -/**
>>> - * text_poke - Update instructions on a live kernel
>>> - * @addr: address to modify
>>> - * @opcode: source of the copy
>>> - * @len: length to copy
>>> - *
>>> - * Only atomic text poke/set should be allowed when not doing early patching.
>>> - * It means the size must be writable atomically and the address must be aligned
>>> - * in a way that permits an atomic write. It also makes sure we fit on a single
>>> - * page.
>>> - */
>>> -void *text_poke(void *addr, const void *opcode, size_t len)
>>> +static void *__text_poke(void *addr, const void *opcode, size_t len)
>>>  {
>>>  	unsigned long flags;
>>>  	char *vaddr;
>>> @@ -702,8 +691,6 @@ void *text_poke(void *addr, const void *opcode, size_t len)
>>>  	 */
>>>  	BUG_ON(!after_bootmem);
>>>
>>> -	lockdep_assert_held(&text_mutex);
>>> -
>>>  	if (!core_kernel_text((unsigned long)addr)) {
>>>  		pages[0] = vmalloc_to_page(addr);
>>>  		pages[1] = vmalloc_to_page(addr + PAGE_SIZE);
>>> @@ -732,6 +719,43 @@ void *text_poke(void *addr, const void *opcode, size_t len)
>>>  	return addr;
>>>  }
>>>
>>> +/**
>>> + * text_poke - Update instructions on a live kernel
>>> + * @addr: address to modify
>>> + * @opcode: source of the copy
>>> + * @len: length to copy
>>> + *
>>> + * Only atomic text poke/set should be allowed when not doing early patching.
>>> + * It means the size must be writable atomically and the address must be aligned
>>> + * in a way that permits an atomic write. It also makes sure we fit on a single
>>> + * page.
>>> + */
>>> +void *text_poke(void *addr, const void *opcode, size_t len)
>>> +{
>>> +	lockdep_assert_held(&text_mutex);
>>> +
>>> +	return __text_poke(addr, opcode, len);
>>> +}
>>> +
>>> +/**
>>> + * text_poke_kgdb - Update instructions on a live kernel by kgdb
>>> + * @addr: address to modify
>>> + * @opcode: source of the copy
>>> + * @len: length to copy
>>> + *
>>> + * Only atomic text poke/set should be allowed when not doing early patching.
>>> + * It means the size must be writable atomically and the address must be aligned
>>> + * in a way that permits an atomic write. It also makes sure we fit on a single
>>> + * page.
>>> + *
>>> + * Context: should only be used by kgdb, which ensures no other core is running,
>>> + *	    despite the fact it does not hold the text_mutex.
>>> + */
>>> +void *text_poke_kgdb(void *addr, const void *opcode, size_t len)
>>> +{
>>> +	return __text_poke(addr, opcode, len);
>>> +}
>>> +
>>>  static void do_sync_core(void *info)
>>>  {
>>>  	sync_core();
>>> diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
>>> index 5db08425063e..1461544cba8b 100644
>>> --- a/arch/x86/kernel/kgdb.c
>>> +++ b/arch/x86/kernel/kgdb.c
>>> @@ -758,13 +758,13 @@ int kgdb_arch_set_breakpoint(struct kgdb_bkpt *bpt)
>>>  	if (!err)
>>>  		return err;
>>>  	/*
>>> -	 * It is safe to call text_poke() because normal kernel execution
>>> +	 * It is safe to call text_poke_kgdb() because normal kernel execution
>>>  	 * is stopped on all cores, so long as the text_mutex is not locked.
>>>  	 */
>>>  	if (mutex_is_locked(&text_mutex))
>>>  		return -EBUSY;
>>> -	text_poke((void *)bpt->bpt_addr, arch_kgdb_ops.gdb_bpt_instr,
>>> -		  BREAK_INSTR_SIZE);
>>> +	text_poke_kgdb((void *)bpt->bpt_addr, arch_kgdb_ops.gdb_bpt_instr,
>>> +		       BREAK_INSTR_SIZE);
>>>  	err = probe_kernel_read(opc, (char *)bpt->bpt_addr, BREAK_INSTR_SIZE);
>>>  	if (err)
>>>  		return err;
>>> @@ -783,12 +783,13 @@ int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
>>>  	if (bpt->type != BP_POKE_BREAKPOINT)
>>>  		goto knl_write;
>>>  	/*
>>> -	 * It is safe to call text_poke() because normal kernel execution
>>> +	 * It is safe to call text_poke_kgdb() because normal kernel execution
>>>  	 * is stopped on all cores, so long as the text_mutex is not locked.
>>>  	 */
>>>  	if (mutex_is_locked(&text_mutex))
>>>  		goto knl_write;
>>> -	text_poke((void *)bpt->bpt_addr, bpt->saved_instr, BREAK_INSTR_SIZE);
>>> +	text_poke_kgdb((void *)bpt->bpt_addr, bpt->saved_instr,
>>> +		       BREAK_INSTR_SIZE);
>>>  	err = probe_kernel_read(opc, (char *)bpt->bpt_addr, BREAK_INSTR_SIZE);
>>>  	if (err || memcmp(opc, bpt->saved_instr, BREAK_INSTR_SIZE))
>>>  		goto knl_write;
>>> --
>>> 2.17.1
>
> If you are reorganizing this code, please do so so that the caller doesn’t
> have to worry about if it should call text_poke_bp() or text_poke_early().
> Right now the caller had to know that, which makes no sense.

Did you look at "[11/17] x86/jump-label: remove support for custom poker"?

https://lore.kernel.org/patchwork/patch/1032857/

If this is not what you regard, please be more concrete. text_poke_early()
is still used directly on init and while modules are loaded, which might not
be great, but is outside of the scope of this patch-set.
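
For readers following the thread, the wrapper split described in the quoted
patch can be sketched in user space roughly as below. This is only an
illustration under stated assumptions, not the kernel code: every demo_*
name is hypothetical, a plain bool stands in for text_mutex, assert() stands
in for lockdep_assert_held(), and memcpy() stands in for the real page
remapping done by __text_poke().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for text_mutex; it only records whether it is held. */
static bool demo_text_mutex_held;

/* Shared worker, analogous to __text_poke(): does the write, checks no lock. */
static void *demo_poke_common(void *addr, const void *opcode, size_t len)
{
	memcpy(addr, opcode, len);
	return addr;
}

/* Normal entry point: the caller must hold the lock (cf. lockdep_assert_held()). */
static void *demo_poke(void *addr, const void *opcode, size_t len)
{
	assert(demo_text_mutex_held);
	return demo_poke_common(addr, opcode, len);
}

/* Debugger entry point: no assertion, because its caller never takes the lock. */
static void *demo_poke_debugger(void *addr, const void *opcode, size_t len)
{
	return demo_poke_common(addr, opcode, len);
}

/* Mirrors the kgdb call sites in the quoted diff: back off if the lock is held. */
static int demo_set_breakpoint(unsigned char *addr, unsigned char brk)
{
	if (demo_text_mutex_held)
		return -EBUSY;
	demo_poke_debugger(addr, &brk, sizeof(brk));
	return 0;
}

/* Exercises both paths; returns 0 if every step behaved as expected. */
static int demo_run(void)
{
	unsigned char text[2] = { 0x90, 0x90 };
	unsigned char ret_insn = 0xc3;

	/* With the lock held, the normal path works and the debugger path backs off. */
	demo_text_mutex_held = true;
	if (demo_set_breakpoint(&text[1], 0xcc) != -EBUSY)
		return 1;
	demo_poke(&text[0], &ret_insn, sizeof(ret_insn));
	demo_text_mutex_held = false;

	/* With the lock free, the debugger path is allowed to patch. */
	if (demo_set_breakpoint(&text[1], 0xcc) != 0)
		return 2;
	return (text[0] == 0xc3 && text[1] == 0xcc) ? 0 : 3;
}
```

Calling demo_run() walks through both entry points. The point of the split is
that the locking rule is checked in exactly one wrapper, while the caller that
cannot take the lock gets a separately named function whose exemption is
visible at the call site.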