Message-Id: <516E797A-8CD1-461D-8CC6-025BE9CBAD06@plan44.ch>
Date: Fri, 13 Sep 2024 13:30:00 +0200
From: Lukas Zeller <luz@...n44.ch>
To: musl@...ts.openwall.com
Subject: SIGSEGV/stack overflow in pthread_create - race condition?

Hello list,

I hope this is the right place to post the following.

Using OpenWrt 22.03 with musl 1.2.3, *some* times, on *some* RPi devices (the faster, the more likely) I get the following:

> Thread 2 "debugtarget" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 4993.5022]
> 0xb6ec42f0 in pk_parse_kite_request () from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> (gdb) bt
> #0 0xb6ec42f0 in pk_parse_kite_request ()
>    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> #1 0xb6ec457c in pk_parse_pagekite_response ()
>    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> #2 0xb6ec4b1c in pk_connect_ai ()
>    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> #3 0xb6ec8494 in pkm_reconnect_all ()
>    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> #4 0xb6ec79d4 in pkb_check_tunnels ()
>    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> #5 0xb6ec7b94 in pkb_run_blocker ()
>    from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> #6 0xb6fd0af4 in start (p=0xb6adfd68) at src/thread/pthread_create.c:203
> #7 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #8 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #9 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #10 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #11 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #12 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #13 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #14 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #15 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #16 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #17 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #18 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #19 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #20 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #21 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #22 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #23 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #24 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #25 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> #26 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> [... thousands of iterations ...]

Searching the internet, I found that this is not specific to my setup, OpenWrt or libpagekite, but happens in different, otherwise completely unrelated setups, such as https://github.com/mikebrady/shairport-sync/issues/388 or https://github.com/void-linux/void-packages/issues/980.

I could not spot any conclusive findings. In the second example, apparently they just made the stack bigger to "solve" it, which indicates that maybe the race can come to a benign end eventually and unwind the stack before it explodes.

As I am aware that musl 1.2.3 is not the current version, I applied the changes in pthread_create() between 1.2.3 and current master, which is only one commit, "d64148a - fix potential unsynchronized access to killlock state at thread exit". Applying this did not make any difference.

Any ideas how to start digging deeper here? I guess I'm out of my depth here, being familiar with neither musl internals nor pagekitec's (to hack a workaround).

Thanks in advance!

Lukas

--
Lukas Zeller, plan44.ch
luz@...n44.ch - https://plan44.ch
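
For reference, the "make the stack bigger" workaround mentioned above can be sketched in C as follows. This is only an illustration under stated assumptions: the helper name run_worker_with_stack, the worker function, and the buffer and stack sizes are all hypothetical and are not taken from libpagekite or musl. The sketch uses the standard POSIX call pthread_attr_setstacksize() to request an explicit stack size instead of relying on the libc default, which on musl is considerably smaller than on glibc.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical size: a large automatic buffer like one a library might use. */
#define BUF_SIZE (256 * 1024)

/* Hypothetical worker: touches a large stack buffer, one byte per 4 KiB page,
 * so the whole buffer is really mapped and used. */
static void *worker(void *arg)
{
    volatile char buf[BUF_SIZE];
    for (size_t i = 0; i < BUF_SIZE; i += 4096)
        buf[i] = 1;
    *(size_t *)arg = sizeof buf; /* report how much stack the buffer needed */
    return NULL;
}

/* Hypothetical helper: start the worker with an explicit stack size and wait
 * for it. Returns 0 on success, or the pthread error code. */
int run_worker_with_stack(size_t stack_size, size_t *touched)
{
    pthread_attr_t attr;
    pthread_t thread;

    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* Request the stack size explicitly rather than using the libc default. */
    rc = pthread_attr_setstacksize(&attr, stack_size);
    if (rc == 0)
        rc = pthread_create(&thread, &attr, worker, touched);

    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;

    return pthread_join(thread, NULL);
}
```

Calling run_worker_with_stack(512 * 1024, &n) leaves room for the 256 KiB buffer; the same worker started with default attributes may overflow its stack on a libc with a small default. This only applies when the code that creates the thread can be changed, and it does not by itself explain the repeated __clone frames in the backtrace.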