Message-ID: <20190726163142.GA6757@pi3.com.pl>
Date: Fri, 26 Jul 2019 18:31:42 +0200
From: Adam Zabrocki <pi3@....com.pl>
To: lkrg-users@...ts.openwall.com
Subject: Re: LKRG 0.7 CI & ED bypass

Hi,

I managed to fix the PoC and reproduce the issue. The original PoC generates a fatal exception (on my VMs), most likely because of a #PF during a user-mode page reference. Since the int3 instruction generates a kprobe exception, we get a #PF inside int3 handling, which ends in a fatal exception. Nevertheless, I managed to fix the PoC so that the #PF is not generated at all, and then I reproduced the entire scenario. Moreover, I've improved the PoC in various ways so that it works on SMEP machines as well. However, this PoC does not leave the machine in a stable state and has some limitations:

- If SMEP is enabled, it works around 60%-70% of the time (at least on my various test machines). Otherwise LKRG has a chance to detect it, or other types of crashes may occur. The 60%-70% figure may differ depending on the environment, so I would not make strong assumptions based on it. However, the PoC does not work reliably every time.
- 'text_mutex' is never released (to block CI) and the machine is very slow:
  a. All of my machines are stuck with 99.9%+ CPU usage, e.g.
     %Cpu(s): 0.0 us,100.0 sys
  b. Some of my machines are spitting OOM, depending on how overloaded the machine is
  c. You can't unload any kernel module
  d. If you try to load any kernel module, the machine will freeze
  e. None of the kernel functionality which relies on that lock will work, e.g. tracing, perf, etc.
- The kernel tries to recover from the 'bad state' and to kill 'stuck' threads. You are spammed in the logs with e.g.:

Jul 25 12:10:47 pi3-ubuntu kernel: INFO: task kworker/u480:1:47 blocked for more than 120 seconds.
Jul 25 12:10:47 pi3-ubuntu kernel: Tainted: G OE 4.8.0-53-generic #56~16.04.1-Ubuntu
Jul 25 12:10:47 pi3-ubuntu kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 25 12:10:47 pi3-ubuntu kernel: kworker/u480:1 D ffff8a2dff777cf8 0 47 2 0x00000000
Jul 25 12:10:47 pi3-ubuntu kernel: Workqueue: events_unbound p_check_integrity [p_lkrg]
Jul 25 12:10:47 pi3-ubuntu kernel: ffff8a2dff777cf8 ffff8a2dff4d56c0 ffffffff8d60d500 ffff8a2dff4d4c40
Jul 25 12:10:47 pi3-ubuntu kernel: 0000000000000286 ffff8a2dff778000 ffffffff8d649da4 ffff8a2dff4d4c40
Jul 25 12:10:47 pi3-ubuntu kernel: 00000000ffffffff ffffffff8d649da8 ffff8a2dff777d10 ffffffff8d096045
Jul 25 12:10:47 pi3-ubuntu kernel: Call Trace:
Jul 25 12:10:47 pi3-ubuntu kernel: [<ffffffff8d096045>] schedule+0x35/0x80
Jul 25 12:10:47 pi3-ubuntu kernel: [<ffffffff8d0962ee>] schedule_preempt_disabled+0xe/0x10
Jul 25 12:10:47 pi3-ubuntu kernel: [<ffffffff8d097f49>] __mutex_lock_slowpath+0xb9/0x130
Jul 25 12:10:47 pi3-ubuntu kernel: [<ffffffff8d097fdf>] mutex_lock+0x1f/0x30
Jul 25 12:10:47 pi3-ubuntu kernel: [<ffffffffc06d9c52>] p_check_integrity+0xe2/0x1360 [p_lkrg]
Jul 25 12:10:47 pi3-ubuntu kernel: [<ffffffff8c89d89b>] process_one_work+0x16b/0x4a0
Jul 25 12:10:47 pi3-ubuntu kernel: [<ffffffff8c89dc1b>] worker_thread+0x4b/0x500
Jul 25 12:10:47 pi3-ubuntu kernel: [<ffffffff8c89dbd0>] ? process_one_work+0x4a0/0x4a0
Jul 25 12:10:47 pi3-ubuntu kernel: [<ffffffff8c89dbd0>] ? process_one_work+0x4a0/0x4a0
Jul 25 12:10:47 pi3-ubuntu kernel: [<ffffffff8c8a3fb8>] kthread+0xd8/0xf0
Jul 25 12:10:47 pi3-ubuntu kernel: [<ffffffff8d09aa9f>] ret_from_fork+0x1f/0x40
Jul 25 12:10:47 pi3-ubuntu kernel: [<ffffffff8c8a3ee0>] ? kthread_create_on_node+0x1e0/0x1e0

  a. Depending on the kernel configuration, this might happen more or less often. You can configure the machine not to generate these messages.
  b. The machine can also be configured to invoke panic() if a task is 'stuck' / hung, as in this situation. This is controlled by the "/proc/sys/kernel/hung_task_panic" interface. Some distros enable panic on hung tasks by default.
- If you do not restore the mutexes to a valid state, your machine will eventually crash (it is on the slow DoS path). You can also see it in the process list (a lot of tasks):

2176 root 20 0 0 0 0 R 8.6 0.0 2:29.08 kworker/u480:5
2185 root 20 0 0 0 0 R 8.6 0.0 1:06.75 kworker/u480:11
6 root 20 0 0 0 0 R 8.3 0.0 2:46.26 kworker/u480:0
2178 root 20 0 0 0 0 R 8.3 0.0 2:16.42 kworker/u480:6
2182 root 20 0 0 0 0 R 8.3 0.0 1:38.66 kworker/u480:8
2190 root 20 0 0 0 0 R 8.3 0.0 0:54.86 kworker/u480:15
2200 root 20 0 0 0 0 R 8.3 0.0 0:46.68 kworker/u480:25
2207 root 20 0 0 0 0 R 8.3 0.0 0:36.62 kworker/u480:27
2212 root 20 0 0 0 0 R 8.3 0.0 0:17.43 kworker/u480:32
2213 root 20 0 0 0 0 R 8.3 0.0 0:27.97 kworker/u480:33
...
...
2221 root 20 0 0 0 0 R 7.0 0.0 0:14.28 kworker/u480:41
2233 root 20 0 0 0 0 R 7.0 0.0 0:10.17 kworker/u480:43

We were aware of the possibility of attacking the synchronization mechanism, as it is documented (e.g. here: https://www.openwall.com/presentations/CONFidence2018-LKRG-Under-The-Hood/slide-39.html). How the machine reacts to that type of attack matches what I saw during initial LKRG development.

LKRG's CI should verify the SMEP / WP CPU bits, but currently it does not. That is wrong, so I've prepared a simple patch which verifies critical CPU bits, on every CPU core, whenever CI is invoked and before any mutex/spinlock is taken:

https://bitbucket.org/Adam_pi3/lkrg-main/commits/13a9b5c3a93549b5f0ac1f8317ced3baefbfa501

This patch always stops the current PoC (on machines with SMEP). As a workaround you can also enable /proc/sys/kernel/hung_task_panic and tune the timeout value.

Thanks,
Adam

On Thu, Jul 25, 2019 at 03:25:37PM +0400, Ilya Matveychikov wrote:
>
>
> > On Jul 22, 2019, at 11:40 PM, Adam Zabrocki <pi3@....com.pl> wrote:
> >
> >> CI timer is a periodic job with 15 seconds period by default so I don't see the reason why
> >> it isn't possible to launch the exploit when CI is not yet started.
> >> Lucky you, but it works
> >> well on my VM :-)
> >
> > CI is not only triggered on timer. I've made a test where I've completely
> > disabled timer, and still LKRG's CI was able to catch that. Mostly, because
> > LKRG's CI can also be executed on the random events in the system which are
> > generated by the nature of the bug.
> >
> > Nevertheless, I've tried to reproduce your environment by disabling SMEP,
> > disabling CI timer and also disabling CI on random events in the system. I
> > still was not able to reproduce your bypass instead I'm getting critical kernel
> > panic (usually fatal exception in interrupt). Can you share a screenshot from
> > your tests where LKRG is running?
>
> Here is a demo:
> https://mega.nz/#!g6gnzK4B!5VEgZA3JgnZeCwmjkhJcyf45RTDWM_yOcgW6WAqAUa8
>
> >
> > Thanks,
> > Adam
> >
> > --
> > pi3 (pi3ki31ny) - pi3 (at) itsec pl
> > http://pi3.com.pl
>
>

--
pi3 (pi3ki31ny) - pi3 (at) itsec pl
http://pi3.com.pl
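
Illustrative note: the per-core check described in the message (verifying the SMEP / WP bits on every CPU core before any mutex or spinlock is taken) could be sketched in user space roughly as follows. This is not the code from the linked patch. The struct, the function names and the idea of passing in already-sampled register values are hypothetical and chosen only for illustration. The only facts relied on are the architectural bit positions: CR0.WP is bit 16 and CR4.SMEP is bit 20.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural x86 bit positions (macro names are local to this sketch). */
#define CR0_WP_BIT   (UINT64_C(1) << 16)  /* CR0.WP: write protect */
#define CR4_SMEP_BIT (UINT64_C(1) << 20)  /* CR4.SMEP: supervisor-mode execution prevention */

/* Hypothetical snapshot of one core's control registers. */
struct cpu_ctrl_regs {
    uint64_t cr0;
    uint64_t cr4;
};

/*
 * Returns the index of the first core whose WP bit (or SMEP bit, when the
 * CPU supports SMEP) is cleared, or -1 if every core looks intact.
 */
static int first_cpu_with_cleared_bits(const struct cpu_ctrl_regs *cpus,
                                       size_t ncpus, bool smep_supported)
{
    for (size_t i = 0; i < ncpus; i++) {
        if (!(cpus[i].cr0 & CR0_WP_BIT))
            return (int)i;
        if (smep_supported && !(cpus[i].cr4 & CR4_SMEP_BIT))
            return (int)i;
    }
    return -1;
}

/* Usage example with made-up register values: core 1 has SMEP cleared. */
static void example_check(void)
{
    struct cpu_ctrl_regs cpus[2] = {
        { CR0_WP_BIT, CR4_SMEP_BIT },
        { CR0_WP_BIT, 0 },
    };

    assert(first_cpu_with_cleared_bits(cpus, 2, true) == 1);
    /* Without SMEP support only WP is checked, so nothing is flagged. */
    assert(first_cpu_with_cleared_bits(cpus, 2, false) == -1);
}
```

In a kernel module the register values would have to be sampled on each core itself. How the linked patch does that is not shown here. Running such a check before taking any lock matters in this scenario because a held 'text_mutex' would otherwise block the check along with the rest of CI.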