Message-ID: <20190330143939.GI23599@brightrain.aerifal.cx>
Date: Sat, 30 Mar 2019 10:39:39 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Does TD point to itself intentionally?

On Sat, Mar 30, 2019 at 11:38:14AM +0100, Markus Wichmann wrote:
> Hi all,
>
> I was looking over my old C experiments and saw an old file, trying to
> use clang's address_space attribute to access something like a thread
> pointer. That made me wonder how it is implemented in musl.

I've experimented with using the equivalent in GCC to get musl to
generate %gs:offset or %fs:offset for access to fields in the thread
structure. Unfortunately you need -fasm or they silently don't work --
see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87626 for details.

It does help code generation somewhat and gave measurable performance
benefits in microbenchmarks (mainly due to reducing register pressure),
but would require making a separate __self() or something that returns
the address-spaced pointer, whose value is not valid for assignment to
pointers or passing as an argument like the result of __pthread_self()
needs to be.

Also, experiments showed that GCC generated multiple instances of
__self() on archs where the asm to load the thread pointer was actually
more expensive than caching the result in a register. This could be
partly mitigated by adding some \n\n\n to the asm... *facepalm*

> In most architectures, the thread pointer is just stored in a register,
> and __pthread_self() will just grab it out of there. For x86_64,
> something slightly similar happens: The thread pointer is stored in
> FS.base, which is an MSR the kernel has to set for us, but we can read
> it with FS-relative addressing.
>
> Incidentally: Is there any interest in using the "wrfsbase" instruction
> for that, where available? From a cursory first glance, it looks like
> that would mean that musl would have to do the entire CPUID dance on
> AMD64 and i386, and in the latter case the dance would be a bit longer
> since the ID bit dance would have to precede it.

No. Even a single insn to test the stored result of whether such a
feature is available (in practice it would take several and a branch)
is more expensive than loading from %fs:0. And even without having to
make a runtime test, it should be the same cost, possibly still more
expensive, than loading from %fs:0.

> Back to setting the thread pointer: The relevant code is in __init_tp(),
> which is always called with the return value from __copy_tls(), which
> points to the new thread descriptor. __init_tp() will then call
> __set_thread_area() with the adjusted thread pointer, and on AMD64, this
> will just call arch_prctl(SET_FS, p). Though I don't know why that
> function has to be in assembly.
>
> OK, got it. After this, FS.base will point directly at the TD, so we can
> just load FS.base into any register and have a thread pointer, right?
> Enter __pthread_self():
>
> static inline struct pthread *__pthread_self()
> {
>         struct pthread *self;
>         __asm__ ("mov %%fs:0,%0" : "=r" (self) );
>         return self;
> }
>
> But that is not the same thing! This will load FS.base, and then
> dereference it and load the qword it is pointing at into a register. So
> how did this ever work? Well, the answer is back in __init_tp():
>
>         td->self = td;
>
> And of course, "self" is the first member of struct pthread.
>
> So, now the question I've been building up to: Is that intentional? Is

Yes, this is intentional. It's the documented ABI for x86[_64], and
necessary for the operation of code generated by a compiler conforming
to the ABI that takes &tlsvar via the initial-exec or local-exec model.

> there a reason for there to be a pointer pointing to itself, other than
> the "mov" in __pthread_self()? Could that mov not be replaced with a
> "lea" and save one useless memory access?

The effective address computed by lea would be relative to %fs or %gs.
It's not useful.

Rich
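
The self-pointer described in this thread can be checked from user code.
What follows is only a minimal sketch, assuming x86_64 Linux and a
GCC-compatible compiler. The names read_thread_pointer,
td_points_to_itself, tls_counter, fs_relative_read_matches and
check_thread_descriptor are made up for illustration and are not part
of musl. It shows that the word at %fs:0 points back at itself, and
that an %fs-relative offset only becomes an ordinary pointer once that
stored value is added to it.

```c
#include <assert.h>
#include <stdint.h>

#if !defined(__x86_64__)
#error "This sketch assumes x86_64, where the thread pointer lives in FS.base."
#endif

/* Hypothetical thread-local variable used only for this demonstration. */
static __thread int tls_counter;

/* Same load as the __pthread_self() quoted above: fetch the qword
 * stored at offset 0 of the %fs segment. */
static inline void *read_thread_pointer(void)
{
    void *self;
    __asm__ ("mov %%fs:0,%0" : "=r" (self));
    return self;
}

/* Returns 1 if the first word of the thread descriptor holds the
 * descriptor's own address (the effect of td->self = td). */
static int td_points_to_itself(void)
{
    void *tp = read_thread_pointer();
    return *(void **)tp == tp;
}

/* Returns 1 if reading tls_counter through an %fs-relative offset
 * yields the same value as reading it through its ordinary address. */
static int fs_relative_read_matches(void)
{
    uintptr_t tp = (uintptr_t)read_thread_pointer();

    /* Offset of the variable relative to the thread pointer. This is
     * the only kind of value that is meaningful inside the %fs segment. */
    intptr_t off = (intptr_t)((uintptr_t)&tls_counter - tp);
    int via_fs;

    tls_counter = 1234;

    /* The "memory" clobber forces the store above to be visible
     * before the segment-relative load executes. */
    __asm__ volatile ("movl %%fs:(%1),%0"
                      : "=r" (via_fs)
                      : "r" (off)
                      : "memory");

    return via_fs == tls_counter;
}

/* Convenience wrapper that asserts both properties. */
static void check_thread_descriptor(void)
{
    assert(td_points_to_itself());
    assert(fs_relative_read_matches());
}
```

Calling check_thread_descriptor() from any thread should pass on a
libc that follows the ABI described in this thread. The second check is
the reason a segment-relative lea is not enough: it could produce only
the offset, and turning that into a pointer usable by ordinary code
requires the absolute address that is stored at %fs:0.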