Message-ID: <20140813041109.GI12888@brightrain.aerifal.cx>
Date: Wed, 13 Aug 2014 00:11:09 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: bug in pthread_cond_broadcast

On Tue, Aug 12, 2014 at 07:33:10PM -0400, Rich Felker wrote:
> One potential solution I have in mind is to get rid of this complex
> waiter accounting by:
>
> 1. Having pthread_cond_broadcast set the new-waiter state on the mutex
> when requeuing, so that the next unlock forces a futex wake that
> otherwise might not happen.
>
> 2. Having pthread_cond_timedwait set the new-waiter state on the mutex
> after relocking it, either unconditionally (this would be rather
> expensive) or on some condition. One possible condition would be to
> keep a counter in the condvar object for the number of waiters that
> have been requeued, incrementing it by the number requeued at
> broadcast time and decrementing it on each wake. However the latter
> requires accessing the cond var memory in the return path for wait,
> and I don't see any good way around this. Maybe there's a way to
> use memory on the waiters' stacks?

On further consideration, I don't think this works. If a thread other
than one of the cv waiters happened to get the mutex first, it would
fail to set the new-waiter state again at unlock time, and the waiters
could get stuck never waking up.

So I think it's really necessary to move the waiter count to the
mutex. One way to do this with no synchronization cost at signal time
would be to have waiters increment the mutex waiter count before
waiting on the cv futex, but of course this could yield a lot of
spurious futex wake syscalls for the mutex if other threads are
locking and unlocking the mutex before the signal occurs.

I think the other way you proposed is in some ways ideal, but also
possibly unachievable. While the broadcasting thread can know how many
threads it requeued, the requeued threads seem to have no way of
knowing that they were requeued after the futex wait returns. Even
after they were successfully requeued, the futex wait could return
with a timeout or EINTR or similar, in which case there seems to be no
way for the waiter to know whether it needs to decrement the mutex
waiter count.

I don't see any solution to this problem...

Rich
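
To make the spurious-wake cost of the "increment the mutex waiter count
before waiting on the cv futex" approach concrete, here is a toy,
single-threaded model in C. It is only a sketch under stated
assumptions, not musl's code: the names model_mutex, model_unlock,
model_cond_wait_begin and count_spurious_wakes are hypothetical, no
real futex calls are made, and a counter stands in for each wake
syscall that an unlock would issue.

```c
/* Toy model (hypothetical names, not musl's implementation). */
struct model_mutex {
    int locked;        /* 1 while held */
    int waiters;       /* waiter count stored on the mutex */
    int wake_syscalls; /* stand-in for futex wake syscalls issued */
};

static void model_lock(struct model_mutex *m)
{
    m->locked = 1;
}

static void model_unlock(struct model_mutex *m)
{
    m->locked = 0;
    /* An unlock must issue a wake whenever the count is nonzero,
     * even if nobody is actually sleeping on the mutex futex. */
    if (m->waiters > 0)
        m->wake_syscalls++;
}

/* A cv waiter pre-registers on the mutex before sleeping on the cv. */
static void model_cond_wait_begin(struct model_mutex *m)
{
    m->waiters++;
}

/* Count the wakes issued by unrelated lock/unlock cycles that happen
 * before any signal, while cv_waiters threads sleep on the cv. */
int count_spurious_wakes(int cv_waiters, int lock_cycles)
{
    struct model_mutex m = { 0, 0, 0 };
    for (int i = 0; i < cv_waiters; i++)
        model_cond_wait_begin(&m);
    for (int i = 0; i < lock_cycles; i++) {
        model_lock(&m);
        model_unlock(&m);
    }
    return m.wake_syscalls;
}
```

In this model every unlock before the signal pays for one wake as soon
as at least one cv waiter has registered, which is the cost described
above; with no cv waiters, no wakes are issued.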