Message-ID: <ZpJ4imxZbVpeHijv@remnant.pseudorandom.co.uk>
Date: Sat, 13 Jul 2024 13:52:26 +0100
From: Simon McVittie <smcv@...ian.org>
To: oss-security@...ts.openwall.com
Subject: Re: backtrace_symbols() misuse by Ceph and its supposedly-safe use

On Fri, 12 Jul 2024 at 17:37:59 +0800, Alexander Patrakov wrote:
> Ceph daemons, however, have a signal handler that catches SIGABRT and
> SIGSEGV and tries to format and log a backtrace.
...
> What would be a good solution (as in: something that does not convert
> crashes into deadlocks) here? I understand that, after memory
> corruption, we are already in the UB territory, but is there anything
> better possible than what is implemented?

Let it crash, and have a kernel core-dump collection hook collect it and
do post-mortem analysis? systemd-coredump and corekeeper are the
implementations of this that I've used myself, but I'm sure there are
plenty more available. This has the additional benefit that it works
for every daemon your system might be relying on, not just Ceph itself
(I don't know how self-contained Ceph is).

The other way to do this is to go to heroic efforts to avoid heap
allocations, like Google's Breakpad does:
https://chromium.googlesource.com/breakpad/breakpad/+/HEAD/docs/client_design.md#exception-basics
This is necessary because Breakpad is typically used by leaf
applications (Chrome, games, etc.) that want to be able to report
crashes to their vendor, independent of how the underlying OS is set up.
Of course, by the time you're in UB territory, literally anything could
be happening (for example memory corruption could conceivably have
overwritten the stack of Breakpad's crash-handler thread, if you're
spectacularly unlucky) but this is more about "pragmatic compromises
that usually work" than being 100% correct.

But if you control the machine at OS level (as you typically would for
a server) it seems more reliable to let the daemon crash and dump core,
and let a trusted OS-level component that is not already in an undefined
state process the core dump. This seems like it applies extra-strongly
if you suspect that the crash might be caused by a malicious actor who
is manipulating the memory corruption to their benefit, rather than an
accident.

    smcv
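
A minimal sketch of the "let it crash" approach described above, assuming
a POSIX system. It is not Ceph's or Breakpad's code, and the names
format_uint, crash_handler and install_crash_handlers are hypothetical.
The handler avoids backtrace_symbols() (which allocates its result with
malloc) and restricts itself to calls that POSIX lists as
async-signal-safe, then restores the default disposition and re-raises
the signal so that the kernel, and whatever core-dump hook is
configured, sees the crash.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: format an unsigned value as decimal into buf
 * without the printf family, which is not async-signal-safe.
 * Returns the number of bytes written; no NUL terminator is added. */
static size_t format_uint(char *buf, size_t cap, unsigned int value)
{
    char tmp[16];   /* assumes unsigned int needs at most 16 digits */
    size_t n = 0;

    do {
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0 && n < sizeof(tmp));

    size_t len = n < cap ? n : cap;
    for (size_t i = 0; i < len; i++)
        buf[i] = tmp[n - 1 - i];    /* digits were produced in reverse */
    return len;
}

/* write(2) is async-signal-safe; a failed write is deliberately ignored
 * because there is nothing useful to do about it while crashing. */
static void safe_write(const char *data, size_t len)
{
    ssize_t ignored = write(STDERR_FILENO, data, len);
    (void)ignored;
}

/* Hypothetical handler: log one fixed-format line, then hand the crash
 * back to the kernel. No heap allocation, no locks, no stdio. */
static void crash_handler(int signo)
{
    static const char prefix[] = "fatal signal ";
    char num[16];
    size_t len = format_uint(num, sizeof(num), (unsigned int)signo);

    safe_write(prefix, sizeof(prefix) - 1);
    safe_write(num, len);
    safe_write("\n", 1);

    /* Restore the default action and re-raise, so the process dies from
     * the original signal and can dump core. */
    signal(signo, SIG_DFL);
    raise(signo);
}

/* Installs the handler for SIGSEGV and SIGABRT.
 * Returns 0 on success, -1 if sigaction() fails. */
int install_crash_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = crash_handler;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGABRT, &sa, NULL) != 0)
        return -1;
    return 0;
}
```

A daemon would call install_crash_handlers() once, early in main().
Whether a core dump is actually kept afterwards depends on system
configuration (for example the core size limit and kernel.core_pattern),
which this sketch does not touch. Installing no handler at all gives the
same core dump, minus the one-line log message.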