|
Message-ID: <20140812160838.GA7589@brightrain.aerifal.cx> Date: Tue, 12 Aug 2014 12:08:39 -0400 From: Rich Felker <dalias@...c.org> To: musl@...ts.openwall.com Subject: Potential lock design with explicit hand-off Here are some ideas I just want to throw out for discussion for a possible alternate lock design which could be better or worse than what we have now. It's based on the thread with Jens where he raised a concern about futex wakes possibly being sent on addresses after they're no longer in use by the synchronization object. I already have a "simple" fix for that using FUTEX_WAKE_OP which we could adopt, but the source of the issue is that handing off a synchronization to another thread by marking it 'available' then sending a futex wake results in a "free for all" where any thread (e.g. even a newly arriving waiter for a mutex) can make the next action. What if, instead, we could make hand-off explicit and controlled? Here's the idea for how that's possible (for mutexes): In addition to "unlocked" state, add a state of "unowned, available to next waiter". The likely value for this state would be 0x80000000 (contention bit but no owner). A mutex in this state could not be acquired by trylock or a new lock attempt; it could only be acquired by a thread which has just returned from FUTEX_WAIT with a return value of 0. At unlock time, the unlocking thread checks for waiters before unlocking. If there seem to be no waiters, it makes an atomic cas from the "owned, uncontended" state to the "unlocked" state (0). This can fail if a new waiter arrives, in which case the other case is tried. If there are waiters, the state is set to "unowned, available to next waiter" and a FUTEX_WAKE call is made. The return value is checked. It should be 1, indicating that a thread was woken, in which case that thread's FUTEX_WAIT returns 0 and it's able to acquire the mutex. FUTEX_WAKE could return 0 however, e.g. if the waiter is stuck in a signal handler, or if it timed out (timedlock), or on certain race conditions. In this case, the unlocking thread has to transition the state atomically to "owned, uncontented" and start the process over (checking for waiters). If no further changes from other threads occur, it will end up changing the state to "unlocked" (and no FUTEX_WAKE is needed). Pros: - Provides guaranteed lock acquisition order. This is good from a standpoint of avoiding starvation. - Achieves the result of avoiding spurious futex wakes while using only the core futex API (wait and wake). Cons: - Provides guaranteed lock acquisition order. This is bad from a standpoint of forcing a lot more context switches when a thread repeatedly locks and unlocks a resource that other threads might also be waiting to lock. - Potentially breaks if it gets spurious futex wakes on the futex address, either due to bugs in the implementation or third-party code using futexes in the same process. And I think the breakage can lead to serious access-after-free errors, including writes. - Depends on detailed futex semantics and uses FUTEX_WAKE commands to convey actual state rather than simply conveying the need to re-check state. - Greatly complicates unlock operation with a retry loop involving syscalls. Based on the above-listed pros/cons, I think this design is a net loser and not a good candidate for inclusion in musl. But as I said I want to have it out as a published idea for the sake of discussion and comparison and understanding the trade-offs involved in implementing synchronization objects. And it may be an interesting customization that could be implemented for certain advanced realtime applications where out-of-order lock-stealing is a concern, although priority inheritance mutexes probably solve the problem already and solve it better. Rich
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.