Message-ID: <20190917140227.GW9017@brightrain.aerifal.cx>
Date: Tue, 17 Sep 2019 10:02:27 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Bug report, concurrency issue on exception with gcc 8.3.0

On Tue, Sep 17, 2019 at 03:44:22PM +0200, Max Neunhoeffer wrote:
> Hello,
>
> I am experiencing problems when linking a large multithreaded C++ application
> statically against libmusl. I am using Alpine Linux 3.10.1 and gcc 8.3.0
> on X86_64. That is, I am using libmusl 1.1.22-r3 (Alpine Linux versioning)
> and gcc 8.3.0-r0.
>
> Before going into details, here is an overview:
>
> 1. libgcc does not detect correctly that the application is multithreaded,
>    since `pthread_cancel` is not linked into the executable.
>    As a consequence, the lazy initialization of data structures for stack
>    unwinding (FDE tables) is executed without protection of a mutex.
>    Therefore, if the very first exception in the program happens to be
>    thrown in two threads concurrently, the data structures can be corrupted,
>    resulting in a busy loop after `main()` is finished.
> 2. If I make sure that I explicitly link in `pthread_cancel` this problem
>    is (almost certainly) gone, however, in certain scenarios this leads
>    to a crash when the first exception is thrown.
>
> I had first reported this problem to gcc as a bug against libgcc, but the
> gcc team denies responsibility, see
> [this bug report](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91737).

This is a gcc bug and needs to be fixed in libgcc.

Rich

> I have produced small sample programs to exhibit the problems, see below for
> a more detailed analysis as to what happens.
>
> For case 1:
>
> ------------------------ snip exceptioncollision.cpp ----------------------
> #include <thread>
> #include <atomic>
> #include <chrono>
>
> std::atomic<int> letsgo{0};
>
> void waiter() {
>   size_t count = 0;
>   while (letsgo == 0) {
>     ++count;
>   }
>   try {
>     throw 42;
>   } catch (int const& s) {
>   }
> }
>
> int main(int, char*[]) {
> #ifdef REPAIR
>   try { throw 42; } catch (int const& i) {}
> #endif
>   std::thread t1(waiter);
>   std::thread t2(waiter);
>   std::this_thread::sleep_for(std::chrono::milliseconds(10));
>   letsgo = 1;
>   t1.join();
>   t2.join();
>   return 0;
> }
> ------------------------ snip exceptioncollision.cpp ----------------------
>
> Use Alpine Linux 3.10.1, for example in a Docker container, and compile
> as follows:
>
>     g++ exceptioncollision.cpp -o exceptioncollision -O0 -Wall -std=c++14 -lpthread -static
>
> Then execute the static executable multiple times:
>
>     while true ; do ./exceptioncollision ; date ; done
>
> after a few tries it will freeze.
>
>
> For case 2:
>
> ----------------------------------- snip exceptionbang.cpp ---------------
> #include <pthread.h>
> //#include <iostream>
>
> #ifdef REPAIR
> void* g(void *p) {
>   return p;
> }
>
> void f() {
>   pthread_t t;
>   pthread_create(&t, nullptr, g, nullptr);
>   pthread_cancel(t);
>   pthread_join(t, nullptr);
> }
> #endif
>
> int main(int argc, char*[]) {
> #ifdef REPAIR
>   if (argc == -1) { f(); }
> #endif
>   //std::cout << "Hello world!" << std::endl;
>   try { throw 42; } catch(int const& i) {};
>   return 0;
> }
> ----------------------------------- snip exceptionbang.cpp ---------------
>
> Use Alpine Linux 3.10.1, for example in a Docker container, and compile
> as follows:
>
>     g++ exceptionbang.cpp -o exceptionbang -Wall -Wextra -O0 -g -std=c++14 -static -DREPAIR=1
>
> Execute `./exceptionbang` and it will create a segmentation violation.
>
> Curiously, if you uncomment the line
>
>     //#include <iostream>
>
> then more of static initialization code seems to be compiled in and
> all is well.
>
> More detailed analysis of what is happening:
>
> Let's look at case 1 first:
>
> libgcc insists that it is a good idea to check for the presence of
> `pthread_cancel` to detect if the application is multi-threaded. Therefore,
> in my case, since I do not explicitly use `pthread_cancel` and am
> linking statically, the libgcc runtime thinks that the program is
> single-threaded (since `pthread_cancel` is in its own compilation
> unit). As a consequence the mutex
> [here](https://github.com/gcc-mirror/gcc/blob/4ac50a4913ed81cc83a8baf865e49a2c62a5fe5d/libgcc/unwind-dw2-fde.c#L1045) is not actually used.
>
> Therefore some code in `libgcc`, which is executed when an exception is
> first thrown in the life of the process ([see here](https://github.com/gcc-mirror/gcc/blob/4ac50a4913ed81cc83a8baf865e49a2c62a5fe5d/libgcc/unwind-dw2-fde.c#L1072))
> is not thread-safe and ruins the data structure `seen_objects` rendering
> a singly linked list circular.
>
> This in the end leads to a busy loop [here](https://github.com/gcc-mirror/gcc/blob/4ac50a4913ed81cc83a8baf865e49a2c62a5fe5d/libgcc/unwind-dw2-fde.c#L221).
>
>
> Now let's look at case 2:
>
> I tried to "fix" this by using `pthread_cancel` explicitly. This is how
> I arrived at the second example program `exceptionbang.cpp`. Here, the
> detection succeeds in recognizing a multi-threaded program. However,
> it crashes when the first exception is thrown. I do not understand the
> details, but it seems that the libgcc runtime code stumbles over some
> data structures which are not properly initialized. When including the
> header `iostream`, some more code is compiled in which initializes the
> structures and all is well.
>
>
> Please let me know if you need any more information and please Cc me in
> communication about this issue.
>
> Cheers,
> Max.
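
As an illustration of the case 1 analysis quoted above, here is a minimal sketch of what the skipped mutex is there to protect. It is not libgcc code: `Node`, `SeenList`, and `run_demo` are hypothetical names, and the list is only a stand-in for the `seen_objects` structure mentioned in the report. The sketch takes the lock unconditionally, which is the behaviour the report says is lost when the `pthread_cancel` check concludes that the program is single-threaded.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for one entry of a list such as `seen_objects`.
struct Node {
    int value;
    Node* next;
};

// Singly linked list whose head insertion always takes the mutex.
class SeenList {
public:
    ~SeenList() {
        while (head_ != nullptr) {
            Node* n = head_;
            head_ = n->next;
            delete n;
        }
    }

    // Insert at the head. Without the lock, two threads could read the
    // same head_ value and one of the insertions would be lost.
    void push(int value) {
        std::lock_guard<std::mutex> lock(mu_);
        head_ = new Node{value, head_};
    }

    // Walk the list and count its nodes. The walk only terminates if the
    // links are intact (that is, the list ends in a null pointer).
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mu_);
        std::size_t n = 0;
        for (const Node* p = head_; p != nullptr; p = p->next) ++n;
        return n;
    }

private:
    mutable std::mutex mu_;
    Node* head_ = nullptr;
};

// Start `threads` workers that each push `per_thread` nodes concurrently,
// then return the number of nodes found in the list.
std::size_t run_demo(int threads, int per_thread) {
    assert(threads > 0 && per_thread >= 0);
    SeenList list;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&list, per_thread] {
            for (int i = 0; i < per_thread; ++i) list.push(i);
        });
    }
    for (auto& w : workers) w.join();
    return list.size();
}
```

Calling `run_demo(4, 1000)` from a `main()` built with `-pthread` should report every insertion, because each update of the head pointer happens under the lock. Removing the `std::lock_guard` from `push` turns this into a data race, which is the class of problem the report describes for the first concurrent throw.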