|
Message-ID: <583865e860854d059f50d23a9acd7a8f@ispras.ru> Date: Wed, 14 Dec 2022 13:29:54 +0300 From: Alexey Izbyshev <izbyshev@...ras.ru> To: musl@...ts.openwall.com Subject: Re: sem_post() can miss waiters On 2022-12-14 04:48, Rich Felker wrote: > On Tue, Nov 22, 2022 at 08:41:44AM +0300, Alexey Izbyshev wrote: >> On 2022-11-22 03:09, Rich Felker wrote: >> >If I understand correctly, the cost of the first option is generally >> >an extra "last" broadcast wake that shouldn't be needed. Is that >> >right? >> > >> >For example, if sem starts with count 0 and thread A calls wait, then >> >thread B calls post twice, both posts make a syscall even though only >> >the first one should have. >> > >> Yes, exactly. >> >> >What if you instead perform the broadcast wake when the observed >> >waiters count is <=1 rather than ==0? This should have no cost in the >> >common case, but in the race case, I think it forces any hiding >> >(just-arrived) extra waiters to wake, fail, and re-publish their >> >waiting status to the waiters bit. >> > >> Indeed, I think this solves the overhead issue quite nicely, thanks. >> So sem_post wake up logic would basically boil down to this: >> >> * WAITERS_BIT is set and waiters > 1: don't reset WAITERS_BIT since >> it's likely that some waiters will remain (and it's always fine to >> err on the side of preserving the WAITERS_BIT); wake a single >> waiter. >> >> * WAITERS_BIT is set and waiters <= 1: reset WAITERS_BIT and wake >> all waiters. In non-racy cases only a single waiter will be woken. >> >> * WAITERS_BIT is unset: don't wake anybody. Even if there are some >> waiters, another sem_post (that reset the WAITERS_BIT) is >> responsible for waking them. >> >> In code: >> >> int val, new, waiters, priv = sem->__val[2]; >> do { >> val = sem->__val[0]; >> waiters = sem->__val[1]; >> if ((val & SEM_VALUE_MAX) == SEM_VALUE_MAX) { >> errno = EOVERFLOW; >> return -1; >> } >> new = val + 1; >> if (waiters <= 1) >> new &= ~0x80000000; >> } while (a_cas(sem->__val, val, new) != val); >> if (val<0) __wake(sem->__val, waiters <= 1 ? -1 : 1, priv); > > Yes, this looks good to me. How is the attached patch? > The patch looks good to me. >> >I think there's another really stupid solution too that I would even >> >consider, partly motivated by the fact that, with long-lived >> >process-shared semaphores, the waiters count can become stale if >> >waiters are killed. (Note: semaphores aren't required to be robust >> >against this, but it's ugly that they're not.) This solution is just >> >to keep the current code, but drop the waiters count entirely, and use >> >broadcast wakes for all wakes. Then, any still-live waiters are >> >*always* responsible for re-asserting their waiting status when all >> >but one fail to acquire the semaphore after a wake. Of course this is >> >a thundering herd, which is arguably something we should not want, but >> >we accept it for correctness in several other places like condvars... >> > >> I'm not sure that the kind of "partial robustness" that could be >> achieved by always waking all waiters is worth punishing normal >> cases. Also, I don't think waiters count being stale is an issue per >> se because it can be wrong only in one way (greater than the real >> count of waiters), assuming you don't mean overflow (but overflow >> could be handled by simply pinning it at INT_MAX permanently). But >> many other issues are possible if we allow killing processes at >> arbitrary points, including sem_post not sending a wake up >> notification after updating the semaphore value, or sem_wait >> consuming a notification but not updating the value. Granted, some >> of these issues may "self-heal" on a future sem_wait/sem_post (in >> the sense that the semaphore will leave a forbidden state), but they >> might still interfere with the program shutdown logic (e.g. if the >> program expects to claim the semaphore N times at the end without >> being aware that some consumers are dead), and, of course, with any >> logic outside of semaphore manipulation. So it seems to me that >> either the program already watches for death of processes it cares >> about, or it's not even clear that allowing further progress (due to >> a broadcast) is always more desirable than blocking. > > Indeed, this seems like it's just a bad direction to go for the > reasons you described. One thing I'd like to note here though, but not > act on at this time: > > When we were designing the semaphore long ago, there was a vague > proposal to make the post operation responsible for removing waiter > counts, rather than the waiter. It was thrown out because it didn't > seem to work. But if we ever did have a reason to want to do this, I > think it's possible now, since it's essentially just a "countdown to > broadcast wake" with no constraint that it be an accurate count, and > the new waiters bit is what really controls whether wakes happen. > Indeed, now the waiters count is just a hint used for resetting the waiters bit. However, its increments/decrements must still be balanced for it to remain useful, and I don't see an easy way to achieve that if sem_post is responsible for decrementing it. One further note: after this bug report I'd been pointed to a semaphore redesign proposal from 2015 (starts in https://www.openwall.com/lists/musl/2015/02/27/1 after some off-list discussion, ends in April). In that design, the waiters count is stored in the semaphore word, and sem_post is indeed responsible for adjusting it. However, another counter ("wake count") is still needed to make the semaphore work. Thanks, Alexey
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.