Message-ID: <20181110233145.GA9199@darth.lan>
Date: Sun, 11 Nov 2018 00:31:45 +0100
From: Sebastian Kemper <sebastian_ml@....net>
To: musl@...ts.openwall.com
Subject: SIGSEGV related to threads since 1.1.20?

Hello all,

I've got an issue with mariadb segfaulting. And apparently it has to do
with the switch from musl 1.1.19 to 1.1.20.

First off, I'm not a programmer, so the info below might be warped a
bit.

I maintain the mariadb package on OpenWrt. There was a report on the
issues tracker about a segfault:

https://github.com/openwrt/packages/issues/7230

I installed a current openwrt snapshot today, then installed
mariadb-server. Afterwards I ran

  mysql_install_db --force --basedir=/usr

to init the database. And then there was a segfault:

  Sat Nov 10 23:41:08 2018 kern.info kernel: [17053.144829] do_page_fault(): sending SIGSEGV to mysqld for invalid write access to 00000000
  Sat Nov 10 23:41:08 2018 kern.info kernel: [17053.144839] epc = 77fc2058 in libc.so[77f4a000+93000]
  Sat Nov 10 23:41:08 2018 kern.info kernel: [17053.144863] ra = 77fc1fa0 in libc.so[77f4a000+93000]

The messages look the same as in the report. Although the reporter used
a different way to get to this result (he attempted to connect to the
running server, whereas I tried to create a DB).

This is on an old dlink router (mips_24kc, ar71xx). The reporter used
something else (mips32r2, mir3g).

I went and compiled mariadb with debug symbols and installed the
unstripped binaries. Then I ran gdbserver on the mips device and
connected to it from my laptop. When I ran the commands in gdb I got
this output:

  (gdb) c
  Continuing.

  Thread 2 "mysqld" received signal SIGSEGV, Segmentation fault.
  __pthread_timedjoin_np (t=0x6bdced60, res=0x0, at=0x0) at src/thread/pthread_join.c:15
  15      if (state >= DT_DETACHED) a_crash();
  (gdb) bt
  #0  __pthread_timedjoin_np (t=0x6bdced60, res=0x0, at=0x0) at src/thread/pthread_join.c:15
  #1  0x006bf754 in handle_bootstrap_impl (thd=<optimized out>) at /home/sk/tmp/openwrt/build_dir/target-mips_24kc_musl/mariadb-10.2.17/sql/sql_parse.cc:950
  #2  0x006bfd58 in do_handle_bootstrap (thd=<optimized out>) at /home/sk/tmp/openwrt/build_dir/target-mips_24kc_musl/mariadb-10.2.17/sql/sql_parse.cc:1094
  #3  0x006bfdfc in handle_bootstrap (arg=0x1dc7448) at /home/sk/tmp/openwrt/build_dir/target-mips_24kc_musl/mariadb-10.2.17/sql/sql_parse.cc:1077
  #4  0x77fd10fc in start (p=0x77fd10fc <start+100>) at src/thread/pthread_create.c:147
  #5  0x77f6702c in __clone () at src/thread/mips/clone.s:32
  Backtrace stopped: frame did not save the PC

So apparently __pthread_timedjoin_np gets some NULL input and then the
program segfaults. I reran this with a breakpoint on the function and it
got called before the segfault and in these calls the args were not
NULL.

Anyway. I checked on openwrt's github what happened to musl in the past
months. And on Sep 21 musl was upgraded from 1.1.19 to 1.1.20. So I
reverted this commit and compiled 1.1.19. I then just downgraded musl on
the router (on-the-fly). That caused some programs like dropbear to stop
working properly due to missing symbols. OK, expected. But when I ran

  mysql_install_db --force --basedir=/usr

it completed without errors. And once I upgraded to musl 1.1.20 I got
the segfault again.

I was hoping that maybe you could take a look at this :)

Kind regards,
Seb
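
A note on the line gdb stopped at: `if (state >= DT_DETACHED) a_crash();` reads like a check on the detach state of the thread being joined, so one possible reading of the trace (an assumption, not something the trace confirms) is that the join was issued on a detached thread rather than with bad pointer arguments. POSIX leaves the behavior of pthread_join() undefined when the target thread is not joinable. The minimal sketch below only illustrates that distinction with plain POSIX calls; `worker`, `join_joinable_thread` and `start_detached_thread` are hypothetical names made up for this example and are not taken from mariadb or musl.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical worker for illustration only: hands its argument back. */
static void *worker(void *arg)
{
    return arg;
}

/* Start a joinable thread (the default) and join it exactly once.
 * Returns 0 on success, a pthread error code on failure, or -1 if the
 * value handed back by the thread is not the one passed in. */
int join_joinable_thread(void)
{
    pthread_t t;
    void *res = NULL;
    int value = 42;

    int rc = pthread_create(&t, NULL, worker, &value);
    if (rc != 0)
        return rc;

    rc = pthread_join(t, &res);   /* valid: t is joinable */
    if (rc != 0)
        return rc;

    return res == &value ? 0 : -1;
}

/* Start a thread that is detached from the beginning.
 * It is deliberately never joined: calling pthread_join() on it
 * would be undefined behavior under POSIX.
 * Returns 0 on success or a pthread error code on failure. */
int start_detached_thread(void)
{
    pthread_attr_t attr;
    pthread_t t;

    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (rc == 0)
        rc = pthread_create(&t, &attr, worker, NULL);

    pthread_attr_destroy(&attr);
    return rc;
}
```

Both helpers are expected to return 0 when called from a program linked with thread support (for example built with `-pthread`). Whether mariadb's bootstrap code actually joins a detached thread would need to be checked at sql_parse.cc:950.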