Message-ID: <20241004162542.GI10433@brightrain.aerifal.cx>
Date: Fri, 4 Oct 2024 12:25:42 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Spin logic flaw in mutexes, etc.

I realized while reviewing how we opportunistically spin trying to
obtain locks that there's a problem with the logic: it was intended
not to spin if there are already waiters (both a fairness
consideration and a wasted-cycles one), but it can only see waiters
that have already entered the futex-wait phase. Waiters which are
still spinning are invisible. This leads to a situation where a
large number of threads arriving at the same time to hammer a lock
will all spin.
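
For reference, the pattern in question looks roughly like this (a
simplified sketch of __wait; the real code also handles the
private-futex flag and a null waiters pointer):

void __wait(volatile int *addr, volatile int *waiters, int val, int priv)
{
        int spins = 100;
        /* Spinners are invisible: *waiters is only incremented
         * below, on entry to the futex-wait phase, so threads in
         * this loop cannot see each other and all keep spinning. */
        while (spins-- && !*waiters) {
                if (*addr != val) return;
                a_spin();
        }
        a_inc(waiters);
        while (*addr == val) __futexwait(addr, val, priv);
        a_dec(waiters);
}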

Just incrementing the waiter count before the spin is an obvious
remedy, but by itself that would lead to the unlocking thread
performing a useless and costly futex wake.

I think we could redefine the waiter count to be adjusted by 1, so
that a value of 1 would mean there's a spinning waiter, and N>1 would
mean N-1 waiters potentially in futex wait that require a wake.
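
On the unlock side, that encoding would let the unlocker skip the
wake when the only waiter is still spinning. A hypothetical sketch
(the lock layout and names here are illustrative, not actual musl
fields):

struct lk { volatile int lock, waiters; };

/* waiters == 1: only a spinning waiter, which will observe the
 * release on its own; waiters > 1: N-1 threads may be in futex
 * wait, so a wake is needed. */
static void unlock_sketch(struct lk *l)
{
        a_store(&l->lock, 0);
        if (l->waiters > 1) __wake(&l->lock, 1, 1);
}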

With this approach, I think it would work to have the caller remember
that it performed an additional waiters increment that needs to be
reverted, or just do logic for transitioning 2->0 rather than 2->1. I
thought this could be put in __wait, but I believe all the important
callers actually use __timedwait, which does not integrate the
waiters increment/decrement logic.
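
Concretely, the acquiring side might look something like the
following, using the same illustrative lock layout as above.
Whether the extra increment is reverted separately (as here) or
folded together with the futex-phase decrement into a single 2->0
transition is the bookkeeping question:

static void lock_sketch(struct lk *l)
{
        if (!a_cas(&l->lock, 0, 1)) return; /* trylock fast path */
        a_inc(&l->waiters); /* extra increment: visible while spinning */
        int spins = 100;
        while (spins-- && l->lock && l->waiters < 2) a_spin();
        while (a_cas(&l->lock, 0, 1)) {
                a_inc(&l->waiters); /* existing futex-phase accounting */
                __futexwait(&l->lock, 1, 1);
                a_dec(&l->waiters);
        }
        a_dec(&l->waiters); /* revert the extra pre-spin increment */
}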

There might be other code paths that exit early where the waiter
accounting needs to be reverted as well, though. For example, I see
one on line 80 of pthread_mutex_timedlock.c, but I think having this
check is erroneous anyway, since the earlier call to
pthread_mutex_trylock would necessarily have caught this condition
in the absence of state-corrupting UB.

I'm not sure if we should try to avoid ever returning the waiters
count to 0 between futex waits, which can happen now if there's only
one waiter. Returning the count momentarily to zero allows a newly
arriving waiter to spin (stealing the opportunity to take the lock).
However, stealing the lock is always possible before spinning anyway
due to the trylock, and I don't think any complete mitigation for
this is possible.

The affected functions seem to be:

- pthread_barrier_wait
- pthread_mutex_timedlock
- pthread_rwlock_timedrdlock
- pthread_rwlock_timedwrlock
- sem_timedwait

and of course their callers. A similar issue applies to the internal
__lock, but since it spins far fewer times and is called only in
more controlled contexts, I'm not sure it matters. The same
mitigations can probably be applied to it if needed.

Rich
