Message-ID: <20240913152522.GA10433@brightrain.aerifal.cx>
Date: Fri, 13 Sep 2024 11:25:23 -0400
From: Rich Felker <dalias@...c.org>
To: Lukas Zeller <luz@...n44.ch>
Cc: musl@...ts.openwall.com
Subject: Re: SIGSEGV/stack overflow in pthread_create - race condition?

On Fri, Sep 13, 2024 at 01:30:00PM +0200, Lukas Zeller wrote:
> Hello list,
>
> I hope this is the right place to post the following.
>
> Using OpenWrt 22.03 with musl 1.2.3, *some* times, on *some* RPi devices (the faster, the more likely) I get the following:
>
> > Thread 2 "debugtarget" received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 4993.5022]
> > 0xb6ec42f0 in pk_parse_kite_request () from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > (gdb) bt
> > #0 0xb6ec42f0 in pk_parse_kite_request ()
> > from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > #1 0xb6ec457c in pk_parse_pagekite_response ()
> > from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > #2 0xb6ec4b1c in pk_connect_ai ()
> > from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > #3 0xb6ec8494 in pkm_reconnect_all ()
> > from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > #4 0xb6ec79d4 in pkb_check_tunnels ()
> > from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > #5 0xb6ec7b94 in pkb_run_blocker ()
> > from /Volumes/CaseSens/openwrt-2/scripts/../staging_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/root-bcm27xx/usr/lib/libpagekite.so.1
> > #6 0xb6fd0af4 in start (p=0xb6adfd68) at src/thread/pthread_create.c:203
> > #7 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #8 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #9 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #10 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #11 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #12 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #13 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #14 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #15 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #16 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #17 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #18 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #19 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #20 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #21 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #22 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #23 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #24 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #25 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > #26 0xb6fcf22c in __clone () at src/thread/arm/clone.s:23
> > [... thousands of iterations ...]
>
> Searching the internet i found that this is not specific to my
> setup, OpenWrt or libpagekite, but happens in different, otherwise
> completely unrelated setups, such as
> https://github.com/mikebrady/shairport-sync/issues/388 or
> https://github.com/void-linux/void-packages/issues/980.
>
> I could not spot any conclusive findings - in the second example,
> apparently they just made the stack bigger to "solve" it, which
> indicates that maybe the race can come to a benign end eventually
> and unwind the stack before it explodes.

Why do you expect this is a race condition?
The backtrace is not sufficient to show it, but my default assumption
would be that this is just a stack overflow in the application code,
i.e. allocating too much on the stack (in local variables with
automatic storage).

You can increase the default stack size at link time with
-Wl,-z,stack-size=N where N is the size you want (default 128k so
increase from there), or make the program explicitly request the
amount of space it needs with pthread attribute functions.

Rich