musl - Re: Draft outline of thread-list design

Message-ID: <20190212202355.GP23599@brightrain.aerifal.cx>
Date: Tue, 12 Feb 2019 15:23:55 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Draft outline of thread-list design

On Tue, Feb 12, 2019 at 01:26:25PM -0500, Rich Felker wrote:
> Here's a draft of the thread-list design, proposed previously as a
> better way to do dynamic TLS installation, and now as a solution to
> the problem of __synccall's use of /proc/self/task being (apparently
> hopelessly) broken:
> 
> 
> 
> Goal of simplicity and correctness, not micro-optimizing.
> 
> List lock is fully AS-safe. Taking lock requires signals be blocked.
> Could be an rwlock, where only thread creation and exit require the
> write lock, but this is not necessary for correctness, only as a
> possible optimization if other operations with high concurrency
> needing access would benefit.
> 
> 
> pthread_create:
> 
> Take lock, create new thread, on success add to list, unlock. New
> thread has new responsibility of unblocking signals, since it inherits
> a fully-blocked signal mask from the parent holding the lock. New
> thread should be created with its tid address equal to the thread list
> lock's address, so that set_tid_address never needs to be called
> later. This simplifies logic that previously had to be aware of detach
> state and adjust the exit futex address accordingly to be safe against
> clobbering freed memory.
> 
> pthread_exit:
> 
> Take lock. If this is the last thread, unlock and call exit(0).
> Otherwise, do cleanup work, set state to exiting, remove self from
> list. List will be unlocked when the kernel task exits. Unfortunately
> there can be a nontrivial (non-constant) amount of cleanup work to do
> if the thread left locks held, but since this should not happen in
> correct code, it probably doesn't matter.

It should be possible to eliminate the unbounded time the lock is held,
and many (but not all) of the serializing effects, by switching to
two lists, each with its own lock: live threads and exiting
threads. pthread_exit would start by moving the caller from the live
list to the exiting list, holding both locks. After that, the rest of
the function could run without any lock held until just before exit,
when the exiting thread list lock would need to be taken for the
thread to change its state to exited and remove itself from the list.
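
A minimal sketch of the two-list handoff described above, for
illustration only. All names here (struct th, live_list, exiting_list,
the sketch_* functions) are hypothetical and not musl internals. It
uses plain pthread mutexes as stand-ins for the AS-safe locks and
omits signal blocking. It also assumes a fixed lock order (live list
first, then exiting list) and unlocks explicitly at the end rather
than letting the kernel release the lock at task exit.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum th_state { TH_LIVE, TH_EXITING, TH_EXITED };

/* Hypothetical per-thread record, linked into exactly one list. */
struct th {
    struct th *prev, *next;
    enum th_state state;
};

/* A circular doubly linked list with a sentinel head and its own lock. */
struct th_list {
    pthread_mutex_t lock;
    struct th head;
};

struct th_list live_list = {
    PTHREAD_MUTEX_INITIALIZER,
    { &live_list.head, &live_list.head, TH_LIVE }
};
struct th_list exiting_list = {
    PTHREAD_MUTEX_INITIALIZER,
    { &exiting_list.head, &exiting_list.head, TH_EXITING }
};

/* Caller must hold l->lock. */
static void list_add(struct th_list *l, struct th *t)
{
    t->next = &l->head;
    t->prev = l->head.prev;
    t->prev->next = t;
    l->head.prev = t;
}

/* Caller must hold the lock of the list t is on. */
static void list_del(struct th *t)
{
    t->prev->next = t->next;
    t->next->prev = t->prev;
    t->prev = t->next = t;
}

size_t sketch_list_len(struct th_list *l)
{
    size_t n = 0;
    pthread_mutex_lock(&l->lock);
    for (struct th *t = l->head.next; t != &l->head; t = t->next)
        n++;
    pthread_mutex_unlock(&l->lock);
    return n;
}

/* Creation side: only the live list is involved. */
void sketch_thread_add(struct th *t)
{
    pthread_mutex_lock(&live_list.lock);
    t->state = TH_LIVE;
    list_add(&live_list, t);
    pthread_mutex_unlock(&live_list.lock);
}

/* Start of exit: move from live to exiting with both locks held. */
void sketch_exit_begin(struct th *t)
{
    pthread_mutex_lock(&live_list.lock);    /* assumed order: live first */
    pthread_mutex_lock(&exiting_list.lock);
    assert(t->state == TH_LIVE);
    list_del(t);
    t->state = TH_EXITING;
    list_add(&exiting_list, t);
    pthread_mutex_unlock(&exiting_list.lock);
    pthread_mutex_unlock(&live_list.lock);
    /* Cleanup work of unbounded length would run here, unlocked. */
}

/* Just before exit: only the exiting list lock is needed. */
void sketch_exit_finish(struct th *t)
{
    pthread_mutex_lock(&exiting_list.lock);
    assert(t->state == TH_EXITING);
    t->state = TH_EXITED;
    list_del(t);
    pthread_mutex_unlock(&exiting_list.lock);
}
```

Between sketch_exit_begin and sketch_exit_finish neither lock is held,
which is where the unbounded cleanup would go.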

With this change, pthread_create and dlopen would only need to
synchronize against the live threads list, and pthread_join would only
need to synchronize against the exiting threads list. Only
pthread_exit and __synccall would need to synchronize with both.
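
The resulting lock requirements can be written down as a small table.
This is only a sketch: the enum names, the locks_needed table and the
acquire/release helpers are hypothetical, plain pthread mutexes stand
in for the real locks, and the acquisition order (live before exiting)
is an assumption chosen so that the two operations needing both locks
cannot deadlock against each other.

```c
#include <assert.h>
#include <pthread.h>

enum { LOCK_LIVE = 1, LOCK_EXITING = 2 };

enum op {
    OP_PTHREAD_CREATE,
    OP_DLOPEN,
    OP_PTHREAD_JOIN,
    OP_PTHREAD_EXIT,
    OP_SYNCCALL,
    OP_COUNT
};

/* Which list locks each operation must synchronize against. */
const unsigned locks_needed[OP_COUNT] = {
    [OP_PTHREAD_CREATE] = LOCK_LIVE,
    [OP_DLOPEN]         = LOCK_LIVE,
    [OP_PTHREAD_JOIN]   = LOCK_EXITING,
    [OP_PTHREAD_EXIT]   = LOCK_LIVE | LOCK_EXITING,
    [OP_SYNCCALL]       = LOCK_LIVE | LOCK_EXITING,
};

pthread_mutex_t live_lock = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t exiting_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take the locks an operation needs, always live before exiting.
 * Returns the mask of locks now held. */
unsigned acquire_for(enum op op)
{
    assert(op < OP_COUNT);
    unsigned need = locks_needed[op];
    if (need & LOCK_LIVE)
        pthread_mutex_lock(&live_lock);
    if (need & LOCK_EXITING)
        pthread_mutex_lock(&exiting_lock);
    return need;
}

/* Release in the reverse order of acquisition. */
void release_for(enum op op)
{
    assert(op < OP_COUNT);
    unsigned need = locks_needed[op];
    if (need & LOCK_EXITING)
        pthread_mutex_unlock(&exiting_lock);
    if (need & LOCK_LIVE)
        pthread_mutex_unlock(&live_lock);
}
```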

I doubt this change makes sense though, at least not actually moving
the exiting thread to a new list. The expected amount of work between
the unlock and re-lock is much less than the cost of a lock cycle. We
could still use a separate exit lock to reduce the window during which
the live thread list lock has to be held, lowering the serializing
pressure on pthread_create, but I don't think there's actually any
advantage to having lower serialization pressure on create than on
exit, since they come in pairs.

Rich