Message-ID: <20140812171941.GA12888@brightrain.aerifal.cx>
Date: Tue, 12 Aug 2014 13:19:41 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: bug in pthread_cond_broadcast

On Tue, Aug 12, 2014 at 06:50:34PM +0200, Szabolcs Nagy wrote:
> >       trace("thread %u is last, signalling main, %s\n", *number, errorstring(ret));
> >     }
> >     while (i == phase) {
> >       tell("thread %u in phase %u (%u), waiting\n", *number, i, phase);
> >       int ret = condition_wait(&cond_client, &mut[i]);
> >       trace("thread %u in phase %u (%u), finished, %s\n", *number, i, phase, errorstring(ret));
> 
> the last client thread will wait here unlocking mut[0] so
> the main thread can continue
> 
> the main thread broadcast wakes all clients while holding both
> mut[0] and mut[1] then unlocks mut[0] and starts waiting on
> cond_main using mut[1]
> 
> the awakened clients will go into the next phase locking mut[1]
> and waiting on cond_client using mut[1]
> 
> however there might still be clients waiting on cond_client
> using mut[0] (e.g. the broadcast is not yet finished)

A waiter cannot assume broadcast was finished (or that it was even
performed) just because it's returned from the wait. Waits are always
subject to spurious wakes, and a spurious wake is indistinguishable
from a non-spurious one without additional synchronization and
checking of the predicate. So, while I still haven't read the test
case in detail, I'm suspicious that it might actually be invalid...
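
To make the point about predicates concrete, here is a minimal sketch
of the usual pattern, in which a waiter trusts only a flag protected
by the mutex and never the mere fact that the wait returned. The names
(struct gate, gate_wait, gate_open, gate_demo) are hypothetical and
are not taken from the test case under discussion.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical illustration; none of these names come from the test case. */
struct gate {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            open;   /* the predicate, protected by lock */
};

/* Block until the gate is open. Returning from pthread_cond_wait proves
 * nothing by itself, so the predicate is rechecked after every wake. */
static void gate_wait(struct gate *g)
{
    pthread_mutex_lock(&g->lock);
    while (!g->open)
        pthread_cond_wait(&g->cond, &g->lock);
    pthread_mutex_unlock(&g->lock);
}

/* Set the predicate under the lock, then wake every waiter. */
static void gate_open(struct gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->open = true;
    pthread_mutex_unlock(&g->lock);
    pthread_cond_broadcast(&g->cond);
}

static void *waiter(void *arg)
{
    gate_wait(arg);
    return NULL;
}

/* Start a few waiters, open the gate, join them. Returns 0 on success. */
static int gate_demo(void)
{
    enum { NTHREADS = 4 };
    struct gate g = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false };
    pthread_t t[NTHREADS];
    int i, rc;

    for (i = 0; i < NTHREADS; i++) {
        rc = pthread_create(&t[i], NULL, waiter, &g);
        assert(rc == 0);
    }
    gate_open(&g);
    for (i = 0; i < NTHREADS; i++) {
        rc = pthread_join(t[i], NULL);
        assert(rc == 0);
    }
    return g.open ? 0 : -1;
}
```

A waiter that starts late simply finds the flag already set and never
blocks, and a waiter that wakes spuriously goes back to sleep, so
neither case needs to know whether the broadcast has happened yet.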

> i see logs where one thread is already in phase 1 (using mut[1])
> while another is not yet out of condition_wait (using mut[0]):
> 
> pthread_cond_smasher.c:120: thread 3 in phase 1 (1), waiting
> pthread_cond_smasher.c:122: thread 6 in phase 0 (1), finished, No error information
> 
> "When a thread waits on a condition variable, having specified a particular
>  mutex to either the pthread_cond_timedwait() or the pthread_cond_wait()
>  operation, a dynamic binding is formed between that mutex and condition
>  variable that remains in effect as long as at least one thread is blocked
>  on the condition variable. During this time, the effect of an attempt by
>  any thread to wait on that condition variable using a different mutex is
>  undefined. "
> 
> so are all clients considered unblocked after a broadcast?

Once broadcast returns (as observed by the thread which called
broadcast, or any thread that synchronizes with this thread after
broadcast returns), there are no waiters and it's valid to use a new
mutex with the cond var (or destroy it if it won't be used again).
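
As a rough sketch of the "destroy it" case, assuming the waiters use a
predicate loop and do not touch the cond var again after leaving it:
the broadcasting thread destroys the cond var as soon as broadcast has
returned, without first waiting for the clients to finish. The names
(client, release_and_destroy, released) are made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical illustration only. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond;
static bool            released;   /* predicate, protected by lock */

static void *client(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!released)
        pthread_cond_wait(&cond, &lock);
    pthread_mutex_unlock(&lock);
    /* The client must not use cond past this point: it may be destroyed. */
    return NULL;
}

/* Returns the result of pthread_cond_destroy (0 on success). */
static int release_and_destroy(void)
{
    enum { NTHREADS = 4 };
    pthread_t t[NTHREADS];
    int i, rc, destroy_rc;

    released = false;
    rc = pthread_cond_init(&cond, NULL);
    assert(rc == 0);
    for (i = 0; i < NTHREADS; i++) {
        rc = pthread_create(&t[i], NULL, client, NULL);
        assert(rc == 0);
    }

    pthread_mutex_lock(&lock);
    released = true;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&lock);

    /* Broadcast has returned in this thread, so no thread is blocked on
     * cond any more, and a late client will see released == true and
     * never wait. Destroying the cond var here is therefore allowed. */
    destroy_rc = pthread_cond_destroy(&cond);

    for (i = 0; i < NTHREADS; i++) {
        rc = pthread_join(t[i], NULL);
        assert(rc == 0);
    }
    return destroy_rc;
}
```

The same reasoning would cover waiting with a different mutex, as long
as any thread doing so has synchronized with the broadcasting thread
after broadcast returned.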

Rich
