Message-ID: <20190330143939.GI23599@brightrain.aerifal.cx>
Date: Sat, 30 Mar 2019 10:39:39 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Does TD point to itself intentionally?

On Sat, Mar 30, 2019 at 11:38:14AM +0100, Markus Wichmann wrote:
> Hi all,
> 
> I was looking over my old C experiments and saw an old file, trying to
> use clang's address_space attribute to access something like a thread
> pointer. That made me wonder how it is implemented in musl.

I've experimented with using the equivalent in GCC to get musl to
generate %gs:offset or %fs:offset for access to fields in the thread
structure. Unfortunately you need -fasm or they silently don't work --
see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87626 for details. It
does help code generation somewhat and gave measurable performance
benefits in microbenchmarks (mainly due to reducing register
pressure), but it would require adding a separate __self() or similar
that returns the address-spaced pointer, whose value is not valid for
assignment to ordinary pointers or for passing as an argument, as the
result of __pthread_self() needs to be. Also, experiments showed that
GCC generated multiple instances of __self() on archs where the asm to
load the thread pointer was actually more expensive than caching the
result in a register. This could be partly mitigated by adding some
\n\n\n to the asm... *facepalm*

> In most architectures, the thread pointer is just stored in a register,
> and __pthread_self() will just grab it out of there. For x86_64,
> something slightly similar happens: The thread pointer is stored in
> FS.base, which is an MSR the kernel has to set for us, but we can read
> it with FS-relative addressing.
> 
> Incidentally: Is there any interest in using the "wrfsbase" instruction
> for that, where available? From a cursory first glance, it looks like
> that would mean that musl would have to do the entire CPUID dance on
> AMD64 and i386, and in the latter case the dance would be a bit longer
> since the ID bit dance would have to precede it.

No. Even a single insn to test the stored result of whether such a
feature is available (in practice it would take several and a branch)
is more expensive than loading from %fs:0. And even without a runtime
test, it should cost the same as, or possibly still more than, loading
from %fs:0.

> Back to setting the thread pointer: The relevant code is in __init_tp(),
> which is always called with the return value from __copy_tls(), which
> points to the new thread descriptor. __init_tp() will then call
> __set_thread_area() with the adjusted thread pointer, and on AMD64, this
> will just call arch_prctl(SET_FS, p). Though I don't know why that
> function has to be in assembly.
> 
> OK, got it. After this, FS.base will point directly at the TD, so we can
> just load FS.base into any register and have a thread pointer, right?
> Enter __pthread_self():
> 
> static inline struct pthread *__pthread_self()
> {
> 	struct pthread *self;
> 	__asm__ ("mov %%fs:0,%0" : "=r" (self) );
> 	return self;
> }
> 
> But that is not the same thing! This will load FS.base, and then
> dereference it and load the qword it is pointing at into a register. So
> how did this ever work? Well, the answer is back in __init_tp():
> 
> 	td->self = td;
> 
> And of course, "self" is the first member of struct pthread.
> 
> So, now the question I've been building up to: Is that intentional? Is

Yes, this is intentional. It's the documented ABI for x86[_64], and
necessary for the operation of code generated by a compiler
conforming to the ABI that takes &tlsvar via the initial-exec or
local-exec model.
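
As a rough sketch of what that means in practice (not taken from musl;
tls_counter, read_fs0() and tls_offset_from_tp() are hypothetical names,
and x86_64 Linux with the variable defined in the main executable is
assumed): the address of a thread-local variable is the thread pointer
read from %fs:0 plus a fixed offset, which is why the word at %fs:0 has
to hold the thread pointer.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) && defined(__linux__)
/* Hypothetical thread-local variable, used only for this illustration. */
static __thread int tls_counter = 42;

/* Read the qword stored at %fs:0, which the ABI requires to be the
 * thread pointer itself. */
static inline uintptr_t read_fs0(void)
{
	uintptr_t tp;
	__asm__ ("mov %%fs:0,%0" : "=r" (tp));
	return tp;
}

/* Distance from the thread pointer to the variable. On x86_64 the
 * static TLS block sits below the thread pointer, so this is negative. */
static inline intptr_t tls_offset_from_tp(void)
{
	return (intptr_t)((uintptr_t)&tls_counter - read_fs0());
}

/* Usage sketch: rebuild the variable's address from the value at %fs:0. */
static inline void check_tls_addr(void)
{
	int *p = (int *)(read_fs0() + (uintptr_t)tls_offset_from_tp());
	assert(p == &tls_counter);
	assert(tls_offset_from_tp() < 0);
}
#endif
```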

> there a reason for there to be a pointer pointing to itself, other than
> the "mov" in __pthread_self()? Could that mov not be replaced with a
> "lea" and save one useless memory access?

The effective address computed by lea would be relative to %fs or %gs,
i.e. just the offset without the segment base added. It's not useful.
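
A small sketch of the difference, assuming x86_64 Linux and GNU-style
inline asm (tp_via_mov() and tp_via_lea() are hypothetical names, and
the assembler may warn that the segment override on lea has no effect):

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) && defined(__linux__)
/* Load the qword stored at %fs:0, i.e. the self pointer. */
static inline uintptr_t tp_via_mov(void)
{
	uintptr_t p;
	__asm__ ("mov %%fs:0,%0" : "=r" (p));
	return p;
}

/* lea only computes the effective address, which is the offset within
 * the segment; the %fs base is never added, so this yields the offset
 * 0 rather than the thread pointer. */
static inline uintptr_t tp_via_lea(void)
{
	uintptr_t p;
	__asm__ ("lea %%fs:0,%0" : "=r" (p));
	return p;
}

/* Usage sketch comparing the two results. */
static inline void check_mov_vs_lea(void)
{
	assert(tp_via_mov() != 0);
	assert(tp_via_lea() == 0);
}
#endif
```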

Rich
