Message-Id: <D05A2383-DE06-4385-BB98-199DF5BFBA3C@plan44.ch>
Date: Sat, 14 Sep 2024 13:27:40 +0200
From: Lukas Zeller <luz@...n44.ch>
To: Rich Felker <dalias@...c.org>,
 musl@...ts.openwall.com
Subject: Re: SIGSEGV/stack overflow in pthread_create - race condition?

> On 13 Sep 2024, at 21:54, Rich Felker <dalias@...c.org> wrote:
> 
> [...]
> Can you dump the disassembly (disasm command) at the point of crash?
> That will show what's attempting to be accessed and what "type of
> segfault" it is.

I will, as soon as I have access to that RPi again (next week), and will post the results.

> If there's an unpredictable crash, indeed it seems less likely to be
> stack overflow. I don't see any way it could be a race condition on
> musl's side but it could be one in the application/libpagekite code,
> or it could be any sort of hardware or kernel fault.

A hardware fault is unlikely: I see the same behaviour on at least two separate RPi 3 boards. One of them is in the field, which is how we ran into the problem in the first place. It was supposed to be remotely reachable via pagekite but wasn't, and we then found that the pagekite daemon crashes every time shortly after start. I then tried the same locally and got the same behaviour.

I'll investigate further and let you know. Thanks for the help so far!

>> 0xb6adfd60 - 0xb6adbbd0 = 0x4190 = 16784
>> 
>> So that child thread has put only 16k on the stack between starting and when it crashes.
> 
> That is only going to work if gdb correctly recovered the frame state
> for __clone.

I can imagine gdb starts to get confused once it traces back as far as __clone. However, it's a backtrace, so the sp at the crash is real, and unwinding starts from there. I would be very surprised if the first few stack frames displayed were wrong, in particular the sp itself.

After all, isn't the sp the backbone of a backtrace? The unwinder walks back from the current sp and tries to make sense (frames) of the data it finds on the way. Tracing the pc or other registers might derail, but there's no room for interpretation in the value of sp itself from frame to frame, or is there?
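
For reference, the stack-usage figure quoted above can be re-checked with a few lines of Python. This is only a sketch: the names are made up for illustration, and it assumes the higher address is the sp gdb reported for the thread's outermost frame and the lower one is the sp at the crash, on a stack that grows downwards.

```python
# Addresses copied from the quoted calculation above.
# Assumption: the first is the sp gdb shows for the outermost frame of the
# child thread, the second is the sp at the faulting instruction.
SP_AT_THREAD_START = 0xB6ADFD60
SP_AT_CRASH = 0xB6ADBBD0


def stack_bytes_used(sp_start: int, sp_now: int) -> int:
    """Bytes consumed on a downward-growing stack between two sp values."""
    if sp_now > sp_start:
        raise ValueError("sp_now is above sp_start; not a downward-growing stack?")
    return sp_start - sp_now


used = stack_bytes_used(SP_AT_THREAD_START, SP_AT_CRASH)
print(hex(used), used)  # 0x4190 16784
```

The result is only as trustworthy as the outermost sp that gdb recovered, which is exactly the caveat raised above.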

Lukas




