Message-ID: <CAKAk8dY8L5Jt2Owhe18pm3HoHg+6q5HSMcRguYUHO5RCmeMsGw@mail.gmail.com>
Date: Wed, 17 Oct 2012 01:39:49 +0200
From: boris brezillon <b.brezillon.musl@...il.com>
To: musl@...ts.openwall.com
Subject: Re: TLS (thread-local storage) support

2012/10/17 Rich Felker <dalias@...ifal.cx>:
> On Tue, Oct 16, 2012 at 11:47:52PM +0200, boris brezillon wrote:
>> 2012/10/16 boris brezillon <b.brezillon.musl@...il.com>:
>> > Hi,
>> >
>> > First I'd like to thank Rich for adding TLS support (I started to work
>> > on it a few weeks ago but never had time to finish it).
>> >
>> > 2012/10/6 Daniel Cegiełka <daniel.cegielka@...il.com>:
>> >> 2012/10/5 Rich Felker <dalias@...ifal.cx>:
>> >>> On Thu, Oct 04, 2012 at 11:29:11PM +0200, Daniel Cegiełka wrote:
>> >>>> great news! Finally able to compile Go (lang)...
>> >>>
>> >>> Did Go fail with gcc's emulated TLS in libgcc?
>> >>
>> >> I tested Go with sabotage (with fresh musl). I'll try to do it again...
>> >> gcc in sabotage was compiled without support for TLS, so I didn't
>> >> expect that it will be successful:
>> >>
>> >> https://github.com/rofl0r/sabotage/blob/master/pkg/gcc4
>> >>
>> > There's at least one thing (maybe more) missing for go support with
>> > musl : gcc 'split-stack' support (see http://blog.nella.org/?p=849 and
>> > http://gcc.gnu.org/wiki/SplitStacks).
>> >
>> > I'm also interested in split stack support in musl but for other
>> > reasons (thread and coroutine stack automatic expansion).
>> >
>> > For x86/x86_64 split stack is implemented using a field inside the
>> > pthread struct which is accessed via %gs (or %fs for x86_64) and an
>> > offset.
>> >
>> > Currently this offset is defined at 0x30 (0x70 for x86_64) by the
>> > TARGET_THREAD_SPLIT_STACK_OFFSET but only if TARGET_LIBC_PROVIDES_SSP
>> > is defined (see gcc/config/i386/gnu-user.h or
>> > gcc/config/i386/gnu-user64.h).
>> >
>> > As far as I know musl does not support stack protection, but we could
>> > at least patch gcc to define TARGET_THREAD_SPLIT_STACK_OFFSET when
>> > using musl.
>> >
>> > We also need to reserve a field in the musl pthread struct. There are
>> > currently two fields named 'unused1' and 'unused2' but I'm not sure
>> > they're really unused in every supported arch.
>> >
>> >
>> > BTW, I'd like to work on a more integrated support of split stack in MUSL :
>
> I'm not a fan of split-stack for various reasons, but I have no
> objection to adding support to make it work as long as it's an
> optional feature that does not impair non-split-stack usage.
>
>> > 1) support in dynamic linker (see the last point of
>> > http://gcc.gnu.org/wiki/SplitStacks) : check split stack notes in
>> > shared libs (and program ?)
>
> It could be done, but is it really useful? There are infinitely many
> ways you can crash a program with libraries that were not built
> correctly for use with it. Checking for one of them seems like
> gratuitous complexity with little benefit.
>
>> > 2) support in thread implementation : currently when a thread is
>> > created the stack limit is set afterward (see
>> > https://github.com/mirrors/gcc/blob/master/libgcc/generic-morestack-thread.c
>> > and https://github.com/mirrors/gcc/blob/master/libgcc/config/i386/morestack.S)
>> > and the stack size is supposed to be 16K (which is the minimum stack
>> > size). This means we may reallocate a new stack chunk even if the
>> > previous one (the first one) is not fully used.
>> > If stack limit is set by thread implementation, this can be set
>> > appropriately according to the stack size defined by the thread
>> > creator.
>
> That's perfectly reasonable to support.
>
>> > 3) more optimizations I haven't thought about yet...
>> >
>> 4) Compile musl with '-fsplit-stack' and add no_split_stack attribute
>> to appropriate functions (at least all functions called before
>> pthread_self_init because %gs or %fs register is unusable before this
>> call).
>
> This is definitely not desirable, at least not by default. It hurts
> performance, possibly a lot, and destroys async-signal-safety. Also I
> doubt it's needed. As long as split stack mode leaves at least ~8k
> when calling a new function, most if not all functions in musl should
> run fine without needing support for enlarging the stack.

I agree. This should be made optional. However, if we don't compile libc
with -fsplit-stack (that is, if it is built with -fno-split-stack), each
call to a libc function from an external function compiled with split
stack may lead to a 64K stack chunk allocation.

>
>> 5) set main thread stack limit to 0 (pthread_self_init) : the main
>> thread stack grow is handled by the kernel.
>>
>> 6) add no-split-stack note to every asm file.
>
> I'm against this, or any boilerplate clutter. If it's really needed,
> it should be possible with CFLAGS (or "ASFLAGS"), rather than
> modifying every file, and if there's no way to do it with command line
> options, that's a bug in gas.

This is not supported in gas; I already tried.

>
> With that said, why would it be needed? I don't think there are any
> asm files that use more than 32 bytes of stack...

Same reason as 4) : 64K stack chunk allocation.

>
>> 7) make split stack support optional (either by checking the
>> -fsplit-stack option in CFLAGS or with a specific option :
>> --enable-split-stack) : split stack adds overhead to every function
>> (except for those with the 'no_split_stack' attribute).
>>
>> > Do you have any concern about adding those features in musl ?
>
> Basically, the whole idea of split-stack is antithetical to the QoI
> guarantees of musl. A program using split-stack can crash at any time
> due to out-of-memory, and there is no reliable/portable way to recover
> from this condition.
> It's much like the following low-quality aspects
> of glibc and default Linux config:

Without split stack, the same program may crash because of a stack
overflow (segfault) or, worse, corrupt memory. At best, split stack
provides a way to grow a thread's stack without crashing the whole
process. At worst, it crashes the program, but it never corrupts memory.

>
> - overcommit
> - lazy allocation of libc-internal storage
> - lazy/on-demand allocation of TLS
> - dynamic loading of libgcc_s.so at runtime in pthread_cancel
> - etc.
>
> On 64-bit machines, split-stack is 100% useless. You can get the same
> behavior (crashing on OOM, but not having to know your stack size
> ahead of time) by just turning on overcommit and using huge thread
> stack sizes; the enormous 64-bit virtual address space makes it so you
> don't have to worry about running out of virtual memory.
>
> On 32-bit machines where virtual addresses are a precious resource,
> split-stack is a clever hack that essentially allows you to
> over-commit not just physical memory but virtual memory too. But it's
> inherently non-robust, and even worse than physical memory overcommit.
> At least in the latter case, the kernel can be intelligent about
> choosing an "abusive" process to kill. But if you run out of virtual
> memory, nothing can be done but terminating the whole process (you
> can't just terminate a single thread because it will leave resources
> in an inconsistent state).
>
> As such, I'm willing to add whatever inexpensive support framework is
> needed so that people who want to use split-stack can use it, but I'm
> very wary of invasive or costly changes to support a feature which I
> believe is fundamentally misguided (and, for 64-bit targets, utterly
> useless).

I understand.

>
> Rich
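
The 64-bit alternative described in the quoted text (overcommit plus huge
thread stacks instead of split stack) can be sketched with the standard
POSIX thread attributes. This is only an illustration under stated
assumptions, not code from this thread or from musl: BIG_STACK_SIZE,
use_deep_stack and run_on_big_stack are hypothetical names, 64 MiB is an
arbitrary figure, and the 4096-byte stride assumes 4 KiB pages.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical figure: reserve 64 MiB of address space for the stack.
 * With overcommit, only the pages actually touched consume memory. */
#define BIG_STACK_SIZE ((size_t)64 << 20)

/* Thread body: touch 1 MiB of stack, one byte per (assumed) 4 KiB page,
 * and report how many bytes of stack the buffer spanned. */
static void *use_deep_stack(void *arg)
{
    size_t *used = arg;
    volatile char buf[1 << 20];

    for (size_t i = 0; i < sizeof buf; i += 4096)
        buf[i] = 1;
    *used = sizeof buf;
    return NULL;
}

/* Run use_deep_stack() on a thread whose stack is stack_size bytes.
 * Returns 0 on success, or the error number from the failing call. */
static int run_on_big_stack(size_t stack_size, size_t *used)
{
    pthread_attr_t attr;
    pthread_t tid;
    int err;

    err = pthread_attr_init(&attr);
    if (err != 0)
        return err;

    err = pthread_attr_setstacksize(&attr, stack_size);
    if (err == 0)
        err = pthread_create(&tid, &attr, use_deep_stack, used);
    pthread_attr_destroy(&attr);
    if (err != 0)
        return err;

    return pthread_join(tid, NULL);
}
```

Calling run_on_big_stack(BIG_STACK_SIZE, &used) from main() and checking
for a zero return is enough to try it (link with -pthread). As the quoted
text points out, this trades the need to know the stack size in advance
for a possible crash on out-of-memory, and it relies on a large virtual
address space, so it does not carry over to 32-bit targets.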