Message-ID: <20190924232246.GS9017@brightrain.aerifal.cx>
Date: Tue, 24 Sep 2019 19:22:46 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Bug report, concurrency issue on exception with gcc 8.3.0

On Wed, Sep 18, 2019 at 02:45:51PM +0200, Max Neunhoeffer wrote:
> Hello,
> 
> thank you very much for the explanation. This gives me a temporary way
> to fix up our application until the bug has been fixed.

I'm adding the attached patch to musl-cross-make; it should fix the
issue adequately on the gcc side.

Rich


> On 19/09/18 11:21, Szabolcs Nagy wrote:
> > * Max Neunhoeffer <max@...ngodb.com> [2019-09-18 09:19:31 +0200]:
> > > thanks for the quick response and for lobbying with the gcc folks!
> > > 
> > > Did you see the second example program in the original bug report? This
> > > seems to indicate that there might be an additional problem, since when
> > > I explicitly use `pthread_cancel` (thereby circumventing the detection
> > > problem), I get a crash when the first exception is thrown.
> > 
> > pthread_cancel does not solve the detection problem.
> > 
> > reference to pthread_cancel only helps with dynamic linking.
> > in case of static linking you have to explicitly add (strong)
> > reference to symbols that libgcc_eh.a uses:
> > 
> > pthread_cancel
> > pthread_getspecific
> > pthread_key_create
> > pthread_mutex_lock
> > pthread_mutex_unlock
> > pthread_once
> > pthread_setspecific
> > 
> > where pthread_cancel is only needed to make libgcc_eh.a call the
> > thread functions (but those are all weakrefs so will just be 0
> > at runtime unless there are other strong references to them).
> > 
> > > 
> > > Do you think this is a libgcc problem, too? Should I report this to the
> > > gcc bug tracker as well?
> > > 
> > > Cheers,
> > >   Max.
> > > 
> > > On 19/09/17 10:35, Rich Felker wrote:
> > > > On Tue, Sep 17, 2019 at 10:02:27AM -0400, Rich Felker wrote:
> > > > > On Tue, Sep 17, 2019 at 03:44:22PM +0200, Max Neunhoeffer wrote:
> > > > > > Hello,
> > > > > > 
> > > > > > I am experiencing problems when linking a large multithreaded C++ application
> > > > > > statically against libmusl. I am using Alpine Linux 3.10.1 and gcc 8.3.0
> > > > > > on X86_64. That is, I am using libmusl 1.1.22-r3 (Alpine Linux versioning)
> > > > > > and gcc 8.3.0-r0.
> > > > > > 
> > > > > > Before going into details, here is an overview:
> > > > > > 
> > > > > > 1. libgcc does not detect correctly that the application is multithreaded,
> > > > > >    since `pthread_cancel` is not linked into the executable.
> > > > > >    As a consequence, the lazy initialization of data structures for stack
> > > > > >    unwinding (FDE tables) is executed without protection of a mutex.
> > > > > >    Therefore, if the very first exception in the program happens to be
> > > > > >    thrown in two threads concurrently, the data structures can be corrupted,
> > > > > >    resulting in a busy loop after `main()` is finished.
> > > > > > 2. If I make sure that I explicitly link in `pthread_cancel` this problem
> > > > > >    is (almost certainly) gone, however, in certain scenarios this leads
> > > > > >    to a crash when the first exception is thrown.
> > > > > > 
> > > > > > I had first reported this problem to gcc as a bug against libgcc, but the
> > > > > > gcc team denies responsibility, see 
> > > > > > [this bug report](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91737).
> > > > > 
> > > > > This is a gcc bug and needs to be fixed in libgcc.
> > > > 
> > > > I've updated the gcc tracker with more info, but I seem to lack the
> > > > ability to reopen the bug myself.
> > > > 
> > > > To add some more context, using weak references to determine if a
> > > > library is linked is a dynamic-linking-centric hack and is not
> > > > compatible with static linking. GCC has historically done this for
> > > > glibc and other systems where libpthread was a separate library to
> > > > avoid pulling in a dependency on it, but it's always been broken on
> > > > glibc with static linking too. Various distros worked around this with
> > > > horrible hacks as described in Andrew Pinski's reply to your bug
> > > > report, using binutils tricks to move the whole libpthread.a into a
> > > > single .o file so that if any of it gets linked it all gets linked.
> > > > It's possible upstream glibc adopted this at some point; I'm not sure.
> > > > But they're in the process of moving the mutex functions to libc
> > > > instead of libpthread (and maybe even getting rid of libpthread like
> > > > musl does), so GCC's hacks here won't even provide any benefit with
> > > > future glibc versions.
> > > > 
> > > > In any case, this kind of pushback against fixes for clear bugs used
> > > > to be expected, but things have gotten a lot better with musl being
> > > > more mainstream nowadays. I think the issue will get resolved quickly
> > > > once a few more GCC developers look at it. It was actually just
> > > > reopened while I was writing this email.
> > > > 
> > > > Rich
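
The weak-reference detection discussed in this thread, and the
strong-reference workaround Szabolcs describes, can be sketched in C
roughly as follows. This is only an illustration loosely modelled on the
mechanism described above, not the actual libgcc source. The names
weak_pthread_cancel, threads_detected, generic_fn and force_pthread_refs
are hypothetical. It assumes GCC or a compatible compiler that supports
the weakref attribute.

```c
#include <pthread.h>

/* Weak alias for pthread_cancel (hypothetical name). A reference made only
 * through this alias does not require a definition, so a static linker will
 * not extract pthread_cancel from libc.a on its account. */
static __typeof(pthread_cancel) weak_pthread_cancel
    __attribute__((__weakref__("pthread_cancel")));

/* Hypothetical helper: returns 1 if pthread_cancel ended up in the link,
 * 0 if the weak reference resolved to a null address. */
int threads_detected(void)
{
    return weak_pthread_cancel != 0;
}

typedef void (*generic_fn)(void);

/* Workaround sketch: strong references to the symbols listed in the thread,
 * so that a static link pulls them in and the weak references resolve to
 * real addresses. Without this table, and with no other strong reference
 * to pthread_cancel in the program, the check above may return 0 in a
 * static link even though the program uses threads. */
generic_fn const force_pthread_refs[] = {
    (generic_fn)pthread_cancel,
    (generic_fn)pthread_getspecific,
    (generic_fn)pthread_key_create,
    (generic_fn)pthread_mutex_lock,
    (generic_fn)pthread_mutex_unlock,
    (generic_fn)pthread_once,
    (generic_fn)pthread_setspecific,
};
```

With the table present, the direct references to pthread_cancel are
strong, so threads_detected() is expected to return 1. On systems where
the thread functions live in a separate libpthread, the link needs
-pthread.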

View attachment "0001-fix-gthr-weak-refs-for-libgcc.patch" of type "text/plain" (1677 bytes)
