musl - Use of -fexceptions and __attribute__((cleanup)) in a multithreaded application

Message-ID: <2e3f58a747d287ed8ef1d179e1bfbed386e9813b.camel@suse.com>
Date: Fri, 26 Jun 2026 11:23:21 +0200
From: Martin Wilck <mwilck@...e.com>
To: musl@...ts.openwall.com
Cc: Benjamin Marzinski <bmarzins@...hat.com>, Martin Wilck
	 <martin.wilck@...e.com>
Subject: Use of -fexceptions and __attribute__((cleanup)) in a multithreaded
 application

Hello musl experts,

(please cc me on replies, I'm not subscribed to the musl mailing list)

we have encountered a problem in the unit tests of the multipath-tools
package which occurs only with musl libc.

TL;DR: it appears that a cleanup routine declared with
__attribute__((cleanup)), using the compiler's -fexceptions switch, is
not reliably called before exiting a thread.

Long story:

multipath-tools includes library routines for so-called "runners" which
create threads that call a one-shot function. By design, it is expected
that this function may hang forever in uninterruptible sleep, as this
can easily occur when you test IO paths for availability (which is the
purpose of these threads). If the one-shot function hangs for a
configurable amount of time, the threads are cancelled and discarded.
Detached threads are used, because threads may hang indefinitely, and
having to join them would prevent multipathd from shutting down
properly.

multipath-tools includes a unit test for the "runner" code. The worker
function in this test simply sleeps while ignoring cancellation
signals, simulating a hanging IO. One of the tests focusses on
maximizing the likelihood of completion/cancellation races by sleeping
for the same amount of time that's configured as timeout for the
threads. Like in a realistic server with lots of SCSI devices, the
number of parallel runners is on the order of 100 to 1000.
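
A worker that ignores cancellation while it "hangs" could look roughly like the following. This is a minimal sketch, not the code from tests/runner.c: all names are hypothetical, and the hang is modelled by blocking on a semaphore instead of sleeping so that the outcome does not depend on timing.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>

static atomic_int worker_progress;	/* hypothetical progress marker */
static sem_t worker_ready, cancel_sent;

static void *hanging_worker(void *unused)
{
	int old;

	(void)unused;
	pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old);
	sem_post(&worker_ready);
	/* Stand-in for the hanging IO. sem_wait() is a cancellation point,
	 * but a request is not acted on while cancellation is disabled. */
	while (sem_wait(&cancel_sent) != 0)
		;	/* retry if interrupted by a signal */
	atomic_store(&worker_progress, 1);
	pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &old);
	pthread_testcancel();	/* the pending request takes effect here */
	atomic_store(&worker_progress, 2);	/* reached only if not cancelled */
	return NULL;
}

/* Returns the progress marker if the thread ended by cancellation, else -1. */
static int cancel_hanging_worker(void)
{
	pthread_t t;
	void *res = NULL;
	int rc;

	atomic_store(&worker_progress, 0);
	sem_init(&worker_ready, 0, 0);
	sem_init(&cancel_sent, 0, 0);
	rc = pthread_create(&t, NULL, hanging_worker, NULL);
	assert(rc == 0);
	sem_wait(&worker_ready);
	pthread_cancel(t);	/* request arrives while the worker ignores it */
	sem_post(&cancel_sent);
	pthread_join(t, &res);
	sem_destroy(&worker_ready);
	sem_destroy(&cancel_sent);
	return res == PTHREAD_CANCELED ? atomic_load(&worker_progress) : -1;
}
```

In this sketch the worker always finishes its "IO" before the cancellation is honoured, which is the situation the racy test tries to provoke.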

You can see the code, in the form that was causing issues with musl, in
[1]. The code uses __attribute__((cleanup)) for entering a cleanup
routine when the thread terminates. The cleanup function's main purpose
is to update the thread status and to decrement the refcount of the
runner_context resource, possibly freeing it. The idea was to execute
this code always, both in the success case (worker function completes
before timeout) and when the thread was cancelled before the worker
function returned. Therefore the code is compiled with the compiler
flag "-fexceptions". The assumption is that the cancellation signal is
treated as an "exception", and that thus the cleanup code is executed
both when the thread is cancelled and when it completes successfully.
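
The pattern described above can be sketched as follows. This is an illustration under assumptions, not the code in [1]: the counter stands in for the status update and refcount drop, every name is hypothetical, and the file is assumed to be built with something like "cc -fexceptions -pthread".

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <unistd.h>

static atomic_int attr_cleanup_calls;	/* stands in for status/refcount work */
static sem_t guard_in_scope;

static void count_attr_cleanup(int *unused)
{
	(void)unused;
	atomic_fetch_add(&attr_cleanup_calls, 1);
}

static void *attr_thread_fn(void *arg)
{
	/* count_attr_cleanup(&guard) runs when guard leaves scope. */
	int guard __attribute__((cleanup(count_attr_cleanup))) = 0;

	sem_post(&guard_in_scope);
	if (arg)
		sleep(10);	/* cancellation point */
	return NULL;
}

/* Returns how often the cleanup function ran for one thread. */
static int run_attr_cleanup(int cancel)
{
	pthread_t t;
	int rc;

	atomic_store(&attr_cleanup_calls, 0);
	sem_init(&guard_in_scope, 0, 0);
	rc = pthread_create(&t, NULL, attr_thread_fn,
			    cancel ? &guard_in_scope : NULL);
	assert(rc == 0);
	sem_wait(&guard_in_scope);	/* the guard variable now exists */
	if (cancel)
		pthread_cancel(t);
	pthread_join(t, NULL);
	sem_destroy(&guard_in_scope);
	return atomic_load(&attr_cleanup_calls);
}
```

run_attr_cleanup(0) covers the normal return path, where the cleanup function runs once. What run_attr_cleanup(1) returns depends on whether the libc implements cancellation by unwinding the stack, which is exactly the assumption in question here.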

This works with GNU libc; at least, we have never seen any issues with
it on the many different distributions and architectures we tested this
code on.

But with musl libc, we encountered highly sporadic segmentation
violations. The stack always indicated an invalid memory access to a
pthread_t object in pthread_cancel->pthread_kill(). Obviously the
thread had already exited when we attempted to cancel it.
pthread_cancel() is called from our function cancel_runner() [2]. As
you can see, we use uatomic_cmpxchg() logic to make sure that the
thread does not exit before pthread_cancel() is called. cancel_runner()
will only cancel the thread when it sees it in RUNNER_RUNNING state. In
this case the thread, when it enters cleanup_context, will see the
state RUNNER_CANCELLED, and will wait for the actual cancellation to
happen.
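
The handshake described above can be sketched in isolation like this. It is a simplification under assumptions, not the code in [2]: C11 atomic_compare_exchange_strong() stands in for uatomic_cmpxchg(), RUNNER_DONE and both function names are hypothetical, and the real code has more states and a refcount.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum runner_state { RUNNER_RUNNING, RUNNER_DONE, RUNNER_CANCELLED };

/* Canceller side: true means the caller won the race and is the one
 * that may (and must) call pthread_cancel() on the thread. */
static bool claim_for_cancel(_Atomic int *state)
{
	int expected = RUNNER_RUNNING;

	return atomic_compare_exchange_strong(state, &expected,
					      RUNNER_CANCELLED);
}

/* Thread side, called from its cleanup code: true means the thread may
 * exit right away. False means a canceller got there first, so the
 * thread has to stay alive until the cancellation actually arrives. */
static bool claim_for_exit(_Atomic int *state)
{
	int expected = RUNNER_RUNNING;

	if (atomic_compare_exchange_strong(state, &expected, RUNNER_DONE))
		return true;
	/* In this sketch the only other writer is the canceller. */
	assert(expected == RUNNER_CANCELLED);
	return false;
}
```

Whichever side performs its compare-and-swap first wins, and the loser sees the winner's state. The scheme only protects pthread_cancel() if every exiting thread actually passes through claim_for_exit(), i.e. through its cleanup code.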

The only explanation I have for this behavior is that sometimes (not
always, otherwise we would have seen this crash more often, IMO) a
thread exits without going through the cleanup_context() code.

If the error occurs, strace shows something like this:

152235 1782461172.637696 tkill(152630, SIGRT_1 <unfinished ...>
152627 1782461172.637714 +++ exited with 0 +++
152235 1782461172.637733 <... tkill resumed>) = 0
152630 1782461172.637744 --- SIGRT_1 {si_signo=SIGRT_1, si_code=SI_TKILL, si_pid=152235, si_uid=0} ---
152624 1782461172.637820 exit(0)        = ?
152624 1782461172.637861 +++ exited with 0 +++
152629 1782461172.637942 exit(0 <unfinished ...>
152235 1782461172.637951 tkill(152632, SIGRT_1 <unfinished ...>
152629 1782461172.637972 <... exit resumed>) = ?
152235 1782461172.637984 <... tkill resumed>) = 0
152629 1782461172.638010 +++ exited with 0 +++
152632 1782461172.638019 --- SIGRT_1 {si_signo=SIGRT_1, si_code=SI_TKILL, si_pid=152235, si_uid=0} ---
152631 1782461172.638104 exit(0)        = ?
152631 1782461172.638138 +++ exited with 0 +++
152630 1782461172.638210 exit(0)        = ?
152630 1782461172.638242 +++ exited with 0 +++
152633 1782461172.638269 exit(0)        = ?
152633 1782461172.638291 +++ exited with 0 +++
152632 1782461172.638382 exit(0)        = ?
152632 1782461172.638432 +++ exited with 0 +++
152235 1782461172.638468 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x3ffb6073000} ---
152634 1782461172.638480 exit(0)        = ?
152634 1782461172.638493 +++ exited with 0 +++
152635 1782461172.638922 +++ killed by SIGSEGV (core dumped) +++
152235 1782461172.639030 +++ killed by SIGSEGV (core dumped) +++

The main thread (152235) cancels one thread after the other, but then
apparently tries to cancel one of the threads that have already exited.

Following a suggestion of Ben Marzinski (on cc), I changed the
__attribute__((cleanup)) to a pthread_cleanup_push() /
pthread_cleanup_pop() pair [3], and the problem on musl is gone. The
multipath-tools code uses pthread_cleanup_push() quite a lot, but
relatively recently we've started using __attribute__((cleanup)) with 
-fexceptions instead, which allows writing more compact and easily
readable code and avoids some awkwardness with pthreads in some old
library implementations. It appears that this doesn't work reliably
with musl, though.
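
For comparison, the pthread_cleanup_push() / pthread_cleanup_pop() form of the same idea might look like the sketch below. It is not the code in [3]; the counter and all names are hypothetical. POSIX specifies that pushed handlers run when the thread is cancelled, so this form does not depend on -fexceptions.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <unistd.h>

static atomic_int handler_calls;	/* stands in for status/refcount work */
static sem_t handler_armed;

static void count_handler(void *unused)
{
	(void)unused;
	atomic_fetch_add(&handler_calls, 1);
}

static void *push_thread_fn(void *arg)
{
	pthread_cleanup_push(count_handler, NULL);
	sem_post(&handler_armed);	/* the handler is installed now */
	if (arg)
		sleep(10);		/* cancellation point */
	pthread_cleanup_pop(1);		/* nonzero: run it on normal exit too */
	return NULL;
}

/* Returns how often the handler ran for one thread. */
static int run_push_pop(int cancel)
{
	pthread_t t;
	int rc;

	atomic_store(&handler_calls, 0);
	sem_init(&handler_armed, 0, 0);
	rc = pthread_create(&t, NULL, push_thread_fn,
			    cancel ? &handler_armed : NULL);
	assert(rc == 0);
	sem_wait(&handler_armed);
	if (cancel)
		pthread_cancel(t);
	pthread_join(t, NULL);
	sem_destroy(&handler_armed);
	return atomic_load(&handler_calls);
}
```

Here the handler runs once in both cases, run_push_pop(0) and run_push_pop(1). The cost is that push and pop must appear as a pair in the same lexical scope, which is part of what makes this form less compact.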

I am aware that our relying on -fexceptions is not guaranteed behavior
in C [4]. Therefore, I am not sure if this is actually a bug. Anyway,
musl behaves differently from GNU libc in this area, and this
difference has led to a subtle and hard-to-debug failure, so I want to
make you aware of it, and I'd like to know your opinion on the matter.

For testing, I was using containers with current Alpine Linux with
musl-1.2.5-r23 and gcc 15.2.0-r2. In our GitHub CI, these containers
run on Ubuntu 24.04. In my own environment, they run on various
flavours of SUSE and openSUSE.

My steps to reproduce the problem:

git clone https://github.com/openSUSE/multipath-tools

# The "tip" branch is the current "release candidate" code, which 
# contains the fix using pthread_cleanup_push().
# The "musl" branch has this fix reverted to demonstrate the problem.
git switch -c musl origin/musl

# multipath-build-alpine is simply an Alpine container with the build 
# dependencies of multipath-tools preinstalled
docker run -it --rm -v "$PWD":/build --entrypoint=/bin/sh \
    ghcr.io/mwilck/multipath-build-alpine:latest

# In the container:
make test-progs
cd tests

# tests/runner-test is the unit test program for the "runner" code.
# The sources are in tests/runner.c
# runner-test.sh is a driver script that runs the test with various
# parameter combinations.
./runner-test.sh

# To run the failing test more specifically:
export LD_LIBRARY_PATH=.:../libmpathutil:../libmpathcmd
./runner-test -N 100 -p 1 -t 3000 -n 0 -s 1 -i -r 1000 

The repeat count (-r option) of "runner-test" is set to a large number
here. I typically encounter the issue after a few hundred iterations.

The options mean:

 -N 100: 100 runner threads are started in parallel
 -p 1: main thread polls for timeouts every millisecond
 -t 3000: timeout (and sleep interval) is 3s
 -n 0: zero "noise" modifying the time the threads will sleep,
       iow all threads will sleep for 3000ms
 -i: pthread_cancel() will be ignored while the thread worker function
     is executed
 -r 1000: repeat this test 1000 times.

Regards and thanks
Martin

[1] https://github.com/openSUSE/multipath-tools/blob/798a64aace9408542549f40c0178ec3d6fb95427/libmpathutil/runner.c#L85
[2] https://github.com/openSUSE/multipath-tools/blob/798a64aace9408542549f40c0178ec3d6fb95427/libmpathutil/runner.c#L108
[3] https://github.com/openSUSE/multipath-tools/blob/d20bbb60030fd2c4d01ba372580c673212cc15a5/libmpathutil/runner.c#L87
[4] https://stackoverflow.com/questions/55818317/is-relying-on-gccs-llvms-fexceptions-technically-undefined-behavior