Message-ID: <20140813003011.GH12888@brightrain.aerifal.cx>
Date: Tue, 12 Aug 2014 20:30:11 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: bug in pthread_cond_broadcast

On Wed, Aug 13, 2014 at 12:50:19AM +0200, Jens Gustedt wrote:
> > I'd like to find a fix that
> > would be acceptable in the 1.0.x branch and make that fix before
> > possibly re-doing the cond var implementation (a fix which wouldn't be
> > suitable for backporting).
>
> Some thoughts:
>
> Basically, in "unwait" there shouldn't be any reference to c-> . No
> pending thread inside timedwait should ever have to access the
> pthread_cond_t, again, it might already be heavily used by other threads.

As far as I can see, there must be: since "unwait" potentially
releases the association of the mutex with the cond var, "unwait" and
broadcast need to mutually exclude one another, so that broadcast can
know whether there are zero waiters (in which case the mutex can
legally be destroyed by the last waiter, and broadcast cannot access
it) or at least one waiter that cannot re-acquire the mutex until the
broadcast is finished.

The only way I can see around this "must" is to do away with requeue
entirely and have broadcast wake all waiters, never inspecting the
mutex at all. This is certainly a lot simpler (it's what we do for
process-shared cond vars anyway) but performance is much worse.

> The signalling or broadcasting thread (waker) should do most of the
> bookkeeping on the waiters counts. This might be done by
>
>  - lock _c_lock
>
>  - if there are no waiters, unlock _c_lock and quit
>
>  - requeue the wanted number of threads (1 or everybody) from the cnd
>    to the mtx. requeue tells us how many threads have been requeued,
>    and this lets us deduce the number of threads that have been woken
>    up.

If you requeue here, where does any wake happen?

>  - verify that all wanted waiters are in, otherwise repeat the requeue
>    operation. (this should be a rare event)

This step is not possible. One or more waiters could be in signal
handlers which interrupted the wait, in which case the futex wait will
not resume until the signal handler returns. Such a retry loop could
run forever (e.g. if the signal handler is waiting for an event that
will only be performed by the [cond-var-]signaling thread after the
operation finishes).

>  - do the bookkeeping: update the cond-waiters count and add the right
>    amount to the mtx-waiters
>
>  - unlock _c_lock
>
> On the waiter side, you'd have to distinguish a successful wakeup by a
> waker from a spurious wakeup. Only for the latter the waiter has to do
> the bookkeeping. This can only happen as long as the waker is in the
> "requeue" loop.

I don't understand what you mean.

> The only disadvantage that I see with such a procedure is that the
> waker is holding _c_lock when going into the futex call.

This is probably a small issue compared to everything else.

Rich
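
Editorial illustration, not part of the original message: the simpler
alternative mentioned above (no requeue, broadcast wakes every waiter
and never looks at the mutex) might be sketched roughly as below. This
is not musl's code. The names seq_cond, seq_cond_wait and
seq_cond_broadcast are hypothetical, and the sketch assumes Linux,
private (non-process-shared) futexes, and that atomic_int has the same
size and representation as the 32-bit int the futex syscall expects.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <limits.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical type, not musl's pthread_cond_t: a single futex word. */
struct seq_cond {
    atomic_int seq;   /* incremented once per broadcast */
};

/* Thin wrapper around the raw Linux futex syscall. */
static long futex(atomic_int *addr, int op, int val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

/* Caller must hold m. May return spuriously (for example after a
 * signal), so callers re-check their predicate in a loop. */
void seq_cond_wait(struct seq_cond *c, pthread_mutex_t *m)
{
    assert(c && m);
    /* Sample the sequence number while still holding the mutex. */
    int old = atomic_load(&c->seq);
    pthread_mutex_unlock(m);
    /* Sleeps only if seq still equals old; a broadcast that ran after
     * the unlock makes this return immediately. */
    futex(&c->seq, FUTEX_WAIT_PRIVATE, old);
    pthread_mutex_lock(m);
}

/* Wakes every waiter; the mutex is never inspected here. Returns the
 * number of waiters the kernel woke, or -1 on error. */
long seq_cond_broadcast(struct seq_cond *c)
{
    assert(c);
    atomic_fetch_add(&c->seq, 1);
    return futex(&c->seq, FUTEX_WAKE_PRIVATE, INT_MAX);
}
```

Every woken waiter then contends for the mutex at the same time, which
is the performance cost the message refers to. The sketch also ignores
wraparound of the sequence counter.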