Message-ID: <c48643fa-213d-7391-9f5d-c75efe709c3f@linux.com>
Date: Wed, 6 Oct 2021 17:56:55 +0300
From: Alexander Popov <alex.popov@...ux.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
	Petr Mladek <pmladek@...e.com>,
	"Paul E. McKenney" <paulmck@...nel.org>,
	Jonathan Corbet <corbet@....net>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Joerg Roedel <jroedel@...e.de>,
	Maciej Rozycki <macro@...am.me.uk>,
	Muchun Song <songmuchun@...edance.com>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	Robin Murphy <robin.murphy@....com>,
	Randy Dunlap <rdunlap@...radead.org>,
	Lu Baolu <baolu.lu@...ux.intel.com>,
	Kees Cook <keescook@...omium.org>,
	Luis Chamberlain <mcgrof@...nel.org>,
	Wei Liu <wl@....org>,
	John Ogness <john.ogness@...utronix.de>,
	Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
	Alexey Kardashevskiy <aik@...abs.ru>,
	Christophe Leroy <christophe.leroy@...roup.eu>,
	Jann Horn <jannh@...gle.com>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Mark Rutland <mark.rutland@....com>,
	Andy Lutomirski <luto@...nel.org>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Will Deacon <will.deacon@....com>,
	David S Miller <davem@...emloft.net>,
	Borislav Petkov <bp@...en8.de>,
	Kernel Hardening <kernel-hardening@...ts.openwall.com>,
	linux-hardening@...r.kernel.org,
	"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	notify@...nel.org
Subject: Re: [PATCH] Introduce the pkill_on_warn boot parameter

On 05.10.2021 22:48, Eric W. Biederman wrote:
> Alexander Popov <alex.popov@...ux.com> writes:
>
>> On 02.10.2021 19:52, Linus Torvalds wrote:
>>> On Sat, Oct 2, 2021 at 4:41 AM Alexander Popov <alex.popov@...ux.com> wrote:
>>>>
>>>> And what do you think about the proposed pkill_on_warn?
>>>
>>> Honestly, I don't see the point.
>>>
>>> If you can reliably trigger the WARN_ON some way, you can probably
>>> cause more problems by fooling some other process to trigger it.
>>>
>>> And if it's unintentional, then what does the signal help?
>>>
>>> So rather than a "rationale" that makes little sense, I'd like to hear
>>> of an actual _use_ case. That's different. That's somebody actually
>>> _using_ that pkill to good effect for some particular load.
>>
>> I was thinking about a use case for you and got an insight.
>>
>> Bugs usually don't come alone. Killing the process that got WARN_ON() prevents
>> possible bad effects **after** the warning. For example, in my exploit for
>> CVE-2019-18683, the kernel warning happens **before** the memory corruption
>> (use-after-free in the V4L2 subsystem).
>> https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
>>
>> So pkill_on_warn allows the kernel to stop the process when the first signs of
>> wrong behavior are detected. In other words, proceeding with the code execution
>> from the wrong state can bring more disasters later.
>>
>>> That said, I don't much care in the end. But it sounds like a
>>> pointless option to just introduce yet another behavior to something
>>> that should never happen anyway, and where the actual
>>> honest-to-goodness reason for WARN_ON() existing is already being
>>> fulfilled (ie syzbot has been very effective at flushing things like
>>> that out).
>>
>> Yes, we slowly get rid of kernel warnings.
>> However, the syzbot dashboard still shows a lot of them.
>> Even my small syzkaller setup finds plenty of new warnings.
>> I believe fixing all of them will take some time.
>> And during that time, pkill_on_warn may be a better reaction to WARN_ON() than
>> ignoring and proceeding with the execution.
>>
>> Is that reasonable?
>
> I won't comment on the sanity of the feature but I will say that calling
> it oops_on_warn (rather than pkill_on_warn), and using the usual oops
> facilities rather than rolling oops by hand sounds like a better
> implementation.
>
> Especially as calling do_group_exit(SIGKILL) from a random location is
> not a clean way to kill a process. Strictly speaking it is not even
> killing the process.
>
> Partly this is just me seeing the introduction of a
> do_group_exit(SIGKILL) call and not liking the maintenance that will be
> needed. I am still sorting out the problems with other randomly placed
> calls to do_group_exit(SIGKILL) and interactions with ptrace and
> PTRACE_EVENT_EXIT in particular.
>
> Which is a long-winded way of saying if I can predictably trigger a
> warning that calls do_group_exit(SIGKILL), on some architectures I can
> use ptrace and can convert that warning into a way to manipulate the
> kernel stack to have the contents of my choice.
>
> If anyone goes forward with this please use the existing oops
> infrastructure so the ptrace interactions and anything else that comes
> up only needs to be fixed once.

Eric, thanks a lot.

I will study the oops infrastructure more deeply.
I will do more experiments and come back with version 2.

Currently, I think I will keep the pkill_on_warn option name because I want to
avoid kernel crashes.

Thanks to everyone who gave feedback on this patch!

Best regards,
Alexander