Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Sat, 17 Feb 2024 20:34:28 -0500
From: Rich Felker <dalias@...c.org>
To: Valery Ushakov <uwe@...err.spb.ru>
Cc: musl@...ts.openwall.com, Rob Landley <rob@...dley.net>,
	toybox <toybox@...ts.landley.net>
Subject: Re: Re: Not sure how to debug this one.

On Sun, Feb 18, 2024 at 12:45:35AM +0300, Valery Ushakov wrote:
> On Fri, Feb 16, 2024 at 19:48:27 -0600, Rob Landley wrote:
> 
> > https://git.musl-libc.org/cgit/musl/tree/src/signal/sh/sigsetjmp.s
> 
> I haven't touched superh asm in a while and the code has zero
> comments (*ugh*), but I *think* sigsetjmp clobbers caller's r8.
> 
> r8 is callee saved.  sigsetjmp wants to use it to save its env
> argument across the call to __setjmp.  So it saves caller's r8 and
> uses r8 to save its env b/c __setjmp it's about to call will clobber
> it.  Then __setjmp saves this r8 = env in the jump buf, not caller's
> r8.  The instruction in the delay slot of the tail call to
> __sigsetjmp_tail vaguely looks like it might have been intended to
> patch it, but it loads r8, not stores it. I'm not sure why it would
> want to load r8 at that point.
> 
> Sorry, I only skimmed through the code and as I said, there're no
> comments (which for asm code is borderline criminal, IMHO :) so I
> might be completely misinterpreting what this code does...

I think you've stumbled across the bug even if you don't realize it.
:-)

The idea of the sigsetjmp implementation is explained in commit
583e55122e767b1586286a0d9c35e2a4027998ab. As you've noted, comments
would be nice, but I'm not sure where they belong since it's the same
logic on every arch.

sigsetjmp wants to save a position in itself for longjmp to return to,
but because sigsetjmp will be returning, it can't save any state on
the stack; it doesn't have a stack frame. The only storage it has
available is call-saved registers, which longjmp will necessarily
restore for it. So, it picks one of its caller's call-saved registers
(r8) and stores that inside the sigjmp_buf, then uses that register to
preserve the pointer to the sigjmp_buf to access after setjmp returns
(either the first time or the second time).

The only problem with the sh implementation is that, due to limited
displacement range for load/store instructions, we have to add 60 to
the sigjmp_buf base address to get the base register for accessing the
saved return address and r8. This takes place in a temp register r6.
However, the last line of this code path, the delay slot after the
tail call to __sigsetjmp_tail, erroneously uses r4 (the base
sigjmp_buf pointer) as the base for restoring r8:

	 mov.l @(4+8,r4), r8

This corrupts the caller's saved register file. If I'm not mistaken,
it overwrites the caller's r8 with the caller's r11.

Presumably Rob didn't actually hit a crash in sigsetjmp, but just
inferred the crash was there via printf debugging. The actual crash
must be in the caller once it gets back control with the wrong value
in r8.

This has been an area I've wanted to have testing for for a long time,
but I don't have a really good idea for how to implement the test. We
would want the compiler to generate code that puts lots of
intermediates in as many call-saved registers as possible, calls
setjmp, then calls another function that *also* puts pressure on the
register allocation and then calls longjmp. If it's okay for the test
to be arch-specific, this could just work via __asm__ register
constraints and a call to setjmp from asm, but I'm not sure if that
would be a good way to do things in libc-test...

In any case, thanks to everyone who participated in this. This is a
rather bad bug and a nice one to get fixed up in the release window!

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.