Message-ID: <20150519231510.GQ17573@brightrain.aerifal.cx>
Date: Tue, 19 May 2015 19:15:10 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: trouble spots for atomic access

On Wed, May 20, 2015 at 12:47:44AM +0200, Jens Gustedt wrote:
> > >  - pthread_once_t should always be volatile
> > >  - pthread_spinlock_t should always be volatile
> > 
> > These are C++ ABI changes. I'd like to make the change but I'm
> > concerned it might break things.
> 
> Both are broken as they are now, if you run into a compiler that
> "knows" that the object itself isn't volatile-qualified, and on that
> excuse takes the liberty of optimizing out loads. For both types there
> is one point where a cast to (volatile int *) is followed by a
> load, which might not do what we want.
> 
> (For pthread_once_t, this is on line 43 in pthread_once.c)
> 
> I think the safest would be to make the data types volatile. If that
> is not possible, do the load with some new function "a_load_rel" that
> is guaranteed to be volatile, atomic and with memory_order relaxed.

This would require lots of changes and it would be easy to overlook
some. The main reason is that lots of the implementations of a_*
functions perform loads via C code inside their CAS loops and need the
load to be volatile. They'd all need to be changed to use a_load_rel,
and a_load_rel would in turn need to be added for all archs.

I have a slightly different approach: destroy the compiler's ability
to know the identity of the object being accessed by passing it
through:

volatile int *make_volatile(volatile int *p)
{
	/* Empty asm with p as both input and output: the compiler must
	 * treat the returned pointer as opaque, unrelated to any object
	 * whose accesses it could otherwise analyze away. */
	__asm__ ( "" : "=r"(p) : "0"(p) );
	return p;
}

At this point, when we use the returned pointer, we are accessing an
object at an address returned by asm, on which no analysis is possible,
and the address of the original object has leaked into the asm, so its
lifetime cannot end earlier than it would on the abstract machine.

> > >  - pthread_barrier needs atomic increment
> > 
> > Where?
> 
> I think all uses of count in pthread_barrier_wait should be
> atomic. The increments in lines 15 and 93 should be atomic_fetch_add.
> 
> Probably most archs do a ++ on a volatile as one memory operation, but
> nothing enforces that.

At line 93, the modification to inst->count happens with a lock held.
There is no concurrent access.

At line 15, I think there may be an issue; I need to look closer. But
it's not concurrent writes that would need the ++ to be atomic as an
RMW operation; those are excluded by a lock. What concerns me is the
possibility of another thread reading concurrently with the writes.

> > I don't know a really good solution to this. The vast majority of uses
> > of tid are self->tid, from which perspective tid is not volatile or
> > even mutable. We don't want to pessimize them with volatile access.
> > Probably the best approach is to split it out into separate tid and
> > exit_futex fields.
> 
> Yes, this would probably be safer and cleaner.

I'm not sure if it's easy to do, though. There seem to be race
conditions, since ->tid is accessed from other threads for things like
pthread_kill and pthread_cancel. If we set up ->tid from start(), it
might not be seen by a pthread_kill or pthread_cancel made by the
caller of pthread_create after pthread_create returns. If we set up
->tid from pthread_create, it would not be seen by the new thread
immediately when using self->tid. If we set up ->tid from both places,
then we have a data race (concurrent writes) unless we make it atomic
(thus volatile), and then we're back to the original problem. (Note
that we don't have this race now because ->tid is set in the kernel
before clone returns in either the parent or the child.)

I think the above make_volatile trick would solve the problem without
adding a new field.

Alternatively, instead of ->tid and ->exit_futex, we could have ->tid
(used by self) and ->volatile_tid (used by other threads acting on the
target).

Rich
