Message-Id: <202402170323.WAA04412@Stone.Rodents-Montreal.ORG>
Date: Fri, 16 Feb 2024 22:23:04 -0500 (EST)
From: Mouse <mouse@...ents-Montreal.ORG>
To: toybox@...ts.landley.net, musl@...ts.openwall.com
Subject: Re: [Toybox] Not sure how to debug this one.

> While grinding away at release prep, I hit a WEIRD one. The
> qemu-system-sh4 target got broken [...by...] the commit that changed
> the stdout buffering type.

> The actual _problem_ is that sigsetjmp() is faulting [...]

[...]

> While debugging I made the problem GO AWAY more than once by sticking
> printfs() and similar into the code, [...]

This smells to me like depending on uninitialized stack trash.

> Not siglongjmp, _sigsetjmp_. Which means it's failing somewhere in:
>
> https://git.musl-libc.org/cgit/musl/tree/src/signal/sh/sigsetjmp.s

> And I dunno how to stick a printf into superh assembly code.

The simple way to figure that out is to compile something that uses
printf and look at the assembly, either by using -save-temps or
equivalent or by disassembling the binary. But, given what sigsetjmp
is, sticking a printf in there is likely to be more difficult than
usual.

I know a little about Super-H from some Dreamcast hackery I did a while
back. I had a look at the .s file you cite - thank you, musl-libc.org,
for resisting the stampede to try to ram HTTPS down everyone's
throat[%]! - and, while I can read it, there is too much I don't know
to really claim to understand it. I can convert the assembly into
English, certainly, but I don't know how much that would help
(especially since it's the machine language, not assembly language, I
know; the SH assembler I've used is my own, with its own syntax, so I'm
having to guess at the meaning of some parts).

[%] Having HTTP support meant I could just look at the http: version
    instead of needing to wait until I could use a work machine.
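As a rough illustration of the "compile something that uses printf and
look at the assembly" suggestion, here is a minimal sketch. The function
name report() and the file name are made up for the example. Feeding it
to the SH4 cross compiler with -S (or -save-temps) should leave a .s
file showing how the arguments get set up and how printf is called on
that target. This only shows the calling convention; it does not make it
any safer to call printf from inside sigsetjmp itself.

```c
/* printf_probe.c - hypothetical file name, for illustration only.
 *
 * Build to assembly instead of an object file, for example:
 *     cc -S printf_probe.c              (writes printf_probe.s)
 *     cc -save-temps -c printf_probe.c  (keeps printf_probe.s around)
 * Substitute whatever your SH4 cross compiler is called for "cc".
 */
#include <stdio.h>

/* Print "tag=value" and return printf's result (characters written). */
int report(const char *tag, int value)
{
    return printf("%s=%d\n", tag, value);
}
```

The interesting part of the generated .s is the few instructions just
before the call: which registers hold the format string and the
arguments, and how the return address is handled.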
> (The problem with trying to configure the kernel to produce core
> dumps and compare against the readelf -d output is it's running as
> PID 1. [...])

Why is that a problem? I don't see any statement of what kernel you're
running under, but I can think of two plausible reasons offhand: (1)
the kernel refuses to coredump PID 1 under any circumstances or (2)
there's no writable filesystem to take a coredump on at that point.

To address (1), I'd just build a kernel with that test diked out.

To address (2), I'd normally netboot. If that's not feasible for some
reason, I'd probably hack on the kernel to remount / read-write before
starting userland.

Of course, you said qemu-something, so you are presumably running under
emulation. In principle, you could figure this out from emulator
traces, but that is likely to be both extremely difficult and extremely
tedious.

But - you said memset-to-zero on the struct ran but didn't stop it from
failing. I'd try memset to various other values, to see if you can
find one that makes it stop crashing. If so, maybe then do two runs,
one with it memset to one value and one with it set to another, take
instruction traces, and see where they differ...?

> It would be really nice if somebody who understood the assembly could
> spot something...

Well, as I said, I can read it, mostly, but I don't know enough of the
context to know whether it's right or not.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse@...ents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
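A minimal sketch of the memset-with-other-values experiment suggested in
the message, assuming a bare sigjmp_buf rather than the actual toybox
struct (which is not shown in this thread). The function name
sigsetjmp_survives_fill() is hypothetical. It fills the buffer with a
chosen byte, then does one sigsetjmp/siglongjmp round trip. If
sigsetjmp faults for some fill value, the process dies before the
function returns, so calling it with several different bytes shows
which, if any, make the crash come or go.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Fill the jump buffer with `fill`, then save and restore through it.
 * Returns 1 once the siglongjmp has landed back here; a fault inside
 * sigsetjmp would kill the process instead of returning. */
int sigsetjmp_survives_fill(unsigned char fill)
{
    sigjmp_buf env;
    volatile int jumped = 0;    /* volatile: must survive the jump */

    memset(&env, fill, sizeof env);

    if (sigsetjmp(env, 1) == 0) {
        jumped = 1;
        siglongjmp(env, 1);     /* does not return */
    }
    return jumped;
}
```

Calling this with, say, 0x00, 0xFF and 0xAA from the failing program's
startup path gives a quick yes/no per pattern. Two patterns that behave
differently are the pair to take instruction traces of.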