Message-ID: <20181110234259.GH5150@brightrain.aerifal.cx>
Date: Sat, 10 Nov 2018 18:42:59 -0500
From: Rich Felker <dalias@...c.org>
To: Sebastian Kemper <sebastian_ml@....net>
Cc: musl@...ts.openwall.com
Subject: Re: SIGSEGV related to threads since 1.1.20?

On Sun, Nov 11, 2018 at 12:31:45AM +0100, Sebastian Kemper wrote:
> Hello all,
>
> I've got an issue with mariadb segfaulting. And apparently it has to do
> with the switch from musl 1.1.19 to 1.1.20.
>
> First off, I'm not a programmer, so the info below might be warped a
> bit.
>
> I maintain the mariadb package on OpenWrt. There was a report on the
> issues tracker about a segfault:
> https://github.com/openwrt/packages/issues/7230
>
> I installed a current openwrt snapshot today, then installed
> mariadb-server. Afterwards I ran
>
> mysql_install_db --force --basedir=/usr
>
> to init the database. And then there was a segfault:
>
> Sat Nov 10 23:41:08 2018 kern.info kernel: [17053.144829] do_page_fault(): sending SIGSEGV to mysqld for invalid write access to 00000000
> Sat Nov 10 23:41:08 2018 kern.info kernel: [17053.144839] epc = 77fc2058 in libc.so[77f4a000+93000]
> Sat Nov 10 23:41:08 2018 kern.info kernel: [17053.144863] ra = 77fc1fa0 in libc.so[77f4a000+93000]
>
> The messages look the same as in the report. Although the reporter used
> a different way to get to this result (he attempted to connect to the
> running server, whereas I tried to create a DB).
>
> This is on an old dlink router (mips_24kc, ar71xx). The reporter used
> something else (mips32r2, mir3g).
>
> I went and compiled mariadb with debug symbols and installed the
> unstripped binaries. Then I ran gdbserver on the mips device and
> connected to it from my laptop. When I ran the commands in gdb I got
> this output:
>
> (gdb) c
> Continuing.
>
> Thread 2 "mysqld" received signal SIGSEGV, Segmentation fault.
> __pthread_timedjoin_np (t=0x6bdced60, res=0x0, at=0x0) at src/thread/pthread_join.c:15
> 15 if (state >= DT_DETACHED) a_crash();
> (gdb) bt
> #0 __pthread_timedjoin_np (t=0x6bdced60, res=0x0, at=0x0) at src/thread/pthread_join.c:15
> #1 0x006bf754 in handle_bootstrap_impl (thd=<optimized out>) at /home/sk/tmp/openwrt/build_dir/target-mips_24kc_musl/mariadb-10.2.17/sql/sql_parse.cc:950
> #2 0x006bfd58 in do_handle_bootstrap (thd=<optimized out>) at /home/sk/tmp/openwrt/build_dir/target-mips_24kc_musl/mariadb-10.2.17/sql/sql_parse.cc:1094
> #3 0x006bfdfc in handle_bootstrap (arg=0x1dc7448) at /home/sk/tmp/openwrt/build_dir/target-mips_24kc_musl/mariadb-10.2.17/sql/sql_parse.cc:1077
> #4 0x77fd10fc in start (p=0x77fd10fc <start+100>) at src/thread/pthread_create.c:147
> #5 0x77f6702c in __clone () at src/thread/mips/clone.s:32
> Backtrace stopped: frame did not save the PC
>
> So apparently __pthread_timedjoin_np gets some NULL input and then the
> program segfaults. I reran this with a breakpoint on the function and it
> got called before the segfault and in these calls the args were not
> NULL.

This is an intentional trap for undefined behavior when the caller
attempts to join a detached thread or detach a thread that was not
joinable (already detached or already being joined by another
thread).

In the case of mariadb, it was reported as:

https://jira.mariadb.org/browse/MDEV-17200

and the corresponding Alpine Linux bug:

https://bugs.alpinelinux.org/issues/9407

The patch is available in Alpine Linux's aports repo:

https://git.alpinelinux.org/cgit/aports/tree/main/mariadb/fix-pthread-detach.patch

Rich
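
For illustration, the two thread lifecycles that stay clear of this trap
can be sketched in C as below. This is a minimal sketch under stated
assumptions, not mariadb's code and not the Alpine patch: worker,
run_joinable and run_detached are hypothetical names chosen for the
example, and only standard POSIX calls are used. A joinable thread is
joined exactly once and never detached; a detached thread is never
joined and never detached a second time.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical worker: hands its argument back as the thread result. */
static void *worker(void *arg)
{
    return arg;
}

/* Joinable lifecycle: create with default attributes, then join once.
 * After pthread_join() returns, the pthread_t must not be joined or
 * detached again. Returns 0 on success. */
int run_joinable(void)
{
    pthread_t t;
    void *res = NULL;
    int value = 42;

    int err = pthread_create(&t, NULL, worker, &value);
    if (err)
        return err;

    err = pthread_join(t, &res);
    if (err)
        return err;

    return res == &value ? 0 : -1;
}

/* Detached lifecycle: request the detached state at creation time.
 * The resulting thread must never be passed to pthread_join() or
 * pthread_detach(); doing so is the undefined behavior described
 * above. Returns 0 on success. */
int run_detached(void)
{
    pthread_attr_t attr;
    pthread_t t;

    int err = pthread_attr_init(&attr);
    if (err)
        return err;

    err = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (!err)
        err = pthread_create(&t, &attr, worker, NULL);

    pthread_attr_destroy(&attr);
    return err;
}
```

Both functions return 0 when the thread was created and handled
correctly. Code that needs a thread's result should use the first
pattern; code that never waits for the thread should use the second
and drop any later join.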