musl - Transition path for removing lazy init of thread pointer

Follow @Openwall on Twitter for new release announcements and other news

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <20140324174915.GA1263@brightrain.aerifal.cx>
Date: Mon, 24 Mar 2014 13:49:15 -0400
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Transition path for removing lazy init of thread pointer

I've begun looking at removing lazy-init of the thread pointer, one of
the big post-1.0 development items. Since I'm still a bit worried
about whether we can ensure that there's always a valid thread pointer
without dropping support for some old kernels (which were not
officially supported ever, but which de-facto worked for some apps),
I've come up with the following conservative roadmap:

Phase 1: Initialize the thread pointer at startup, but do not make
anything in musl assume that the thread pointer is valid unless the
affected code was already assuming the "or" statement: either the
thread pointer is already setup, or pthread_self() will successfully
set it up. In particular, this phase will leave __errno_location using
its own static errno location for the main thread rather than always
accessing errno via the thread pointer. This is helpful for the
dynamic linker anyway, since we have code that accesses errno early,
and otherwise the dynamic linker would need to first initialize a
potentially-fake thread pointer at startup, then switch to a new one
once the initial TLS size is known. During this phase, we should also
try to make sure there is a new mechanism to ensure that
pthread_create fails (rather than making a corrupt program state) when
the kernel does not support threads; currently, that's done by
checking for failure of pthread_self(), which will no longer be the
correct way.

Phase 2: Add workarounds to setup a valid thread-pointer on older
kernels that don't directly support it. On i386 and perhaps x86_64,
this could be done with modify_ldt, which has been around since
forever. (modify_ldt can't, AFAIK, create a thread-local %fs/%gs
mapping, but it can make a valid segment for just the main thread to
use.) I'm not sure about other archs though, and whether there are any
that matter where we can't set the thread register from userspace.

Phase 3: If we can ensure to a satisfactory degree that thread pointer
setup never fails, optimize musl-internal things (like errno) to
assume he thread pointer is available. This may require special care
for the early stages of the dynamic linker. Assuming thread pointer
validity is necessary for stack-protector anyway on many archs, and it
would allow us to support building libc itself with stack-protector.

I'm mostly done with phase 1, but right now it includes an ugly mix of
some elements from phases 2 and 3 that shouldn't be touched yet, and
it's missing error checks and code for dealing with what happens when
the thread pointer setup fails (whether this is fatal should, at phase
1, depend on whether the program/libs need TLS).

Rich

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.