Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170807031527.GM1627@brightrain.aerifal.cx>
Date: Sun, 6 Aug 2017 23:15:27 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] ppc64: fix setjmp/longjmp handling of TOC pointer

On Fri, Aug 04, 2017 at 12:12:32AM -0500, Bobby Bingham wrote:
> The TOC pointer is constant within a single dso, but needs to be saved
> and restored around cross-dso calls.  The PLT stub saves it to the
> caller's stack frame, and the linker adds code to the caller to restore
> it.
> 
> With a local call, as within a single dso or with static linking, this
> doesn't happen and the TOC pointer is always in r2.  Therefore,
> setjmp/longjmp need to save/restore the TOC pointer from/to different
> locations depending on whether the call to setjmp was a local or non-local
> call.
> 
> It is always safe for longjmp to restore to both r2 and the caller's stack.
> If the call to setjmp was local, and only r2 matters and the stack location
> will be ignored, but is required by the ABI to be reserved for the TOC
> pointer.  If the call was non-local, then only the stack location matters,
> and whatever is restored into r2 will be clobbered anyway when the caller
> reloads r2 from the stack.

This all sounds correct, and the code looks pretty much like what I
expected it to. Thanks for explaining it well here in the text for the
commit message.

> A little extra care is required for sigsetjmp, because it uses setjmp
> internally.  After the second return from this setjmp call, r2 will contain
> the caller's TOC pointer instead of libc's TOC pointer.  We need to save
> and restore the correct libc pointer before we can tail call to
> __sigsetjmp_tail.

This looks ok too; see comments below.

> diff --git a/src/setjmp/powerpc64/longjmp.s b/src/setjmp/powerpc64/longjmp.s
> index 7f241c2d..81d45ff6 100644
> --- a/src/setjmp/powerpc64/longjmp.s
> +++ b/src/setjmp/powerpc64/longjmp.s
> @@ -10,10 +10,14 @@ longjmp:
>  	# 1) restore cr
>  	ld   0,  1*8(3)
>  	mtcr 0
> -	# 2) restore r1-r2 (SP and TOC)
> +	# 2) restore SP
>  	ld   1,  2*8(3)
> +	# 3) restore TOC into both r2 and the caller's stack.
> +	#    Which location is required depends on whether setjmp was called
> +	#    locally or non-locally, but it's always safe to restore to both.
>  	ld   2,  3*8(3)
> -	# 3) restore r14-r31
> +	std  2,   24(1)
> +	# 4) restore r14-r31
>  	ld  14,  4*8(3)
>  	ld  15,  5*8(3)
>  	ld  16,  6*8(3)
> @@ -32,7 +36,7 @@ longjmp:
>  	ld  29, 19*8(3)
>  	ld  30, 20*8(3)
>  	ld  31, 21*8(3)
> -	# 4) restore floating point registers f14-f31
> +	# 5) restore floating point registers f14-f31

Not a problem to address in this commit, but I think the body of the
patch is an argument that numbered steps are a bad idea in comments. I
might have just split 2 into 2a and 2b to avoid this but you surely
don't need to resubmit just for that; I'm mainly thinking from a
standpoint of avoiding this kind of thing in the future.

> diff --git a/src/signal/powerpc64/sigsetjmp.s b/src/signal/powerpc64/sigsetjmp.s
> index 52ac1d03..410c2831 100644
> --- a/src/signal/powerpc64/sigsetjmp.s
> +++ b/src/signal/powerpc64/sigsetjmp.s
> @@ -2,29 +2,36 @@
>  	.global __sigsetjmp
>  	.type sigsetjmp,%function
>  	.type __sigsetjmp,%function
> -	.hidden ___setjmp
> +	.hidden __setjmp_toc
>  sigsetjmp:
>  __sigsetjmp:
>  	addis 2, 12, .TOC.-__sigsetjmp@ha
>  	addi  2,  2, .TOC.-__sigsetjmp@l
> +	ld    5, 24(1)   # load from the TOC slot in the caller's stack frame
> +	b     1f
> +
>  	.localentry sigsetjmp,.-sigsetjmp
>  	.localentry __sigsetjmp,.-__sigsetjmp
> +	mr    5,  2
>  
> +1:
>  	cmpwi cr7, 4, 0
> -	beq-  cr7, ___setjmp
> +	beq-  cr7, __setjmp_toc
>  
> -	mflr  5
> -	std   5, 512(3)
> -	std  16, 512+8+8(3)
> +	mflr  6
> +	std   6, 512(3)
> +	std   2, 512+16(3)
> +	std  16, 512+24(3)
>  	mr   16, 3
>  
> -	bl ___setjmp
> +	bl __setjmp_toc
>  
>  	mr   4,  3
>  	mr   3, 16
>  	ld   5, 512(3)
>  	mtlr 5
> -	ld  16, 512+8+8(3)
> +	ld   2, 512+16(3)
> +	ld  16, 512+24(3)
>  
>  .hidden __sigsetjmp_tail
>  	b __sigsetjmp_tail
> -- 
> 2.13.3

Am I correct that you're saving the TOC value alongside where
sigsetjmp was already saving the original r16 value, 8 bytes past the
start of the sigset_t? This looks ok; there's plenty of extra space
there. Alternatively I think you could just recompute it the same way
the nonlocal prologue does, but I don't see any reason to prefer that
unless we were running out of space in the jmp_buf.

Anyway, as long as my understanding is correct, I see no reason not to
commit this patch as-is. It's well-documented and seems to solve the
problem fully. Thanks.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.