Message-ID: <20210605171216.GA7558@brightrain.aerifal.cx>
Date: Sat, 5 Jun 2021 13:12:26 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Adding PowerPC SPE support

When the soft-float ABI for PowerPC was added in 2016 (commit
5a92dd95c77cee81755f1a441ae0b71e3ae2bcdb, mail thread "[PATCH v3] Add
PowerPC soft-float support") with Freescale cpus having the
alternative SPE FPU as the main use case, I noted that we could
probably support hard float on them, but that it would involve
determining some difficult ABI constraints. I'm now revisiting adding
this support.

The Power-Arch-32 ABI supplement
https://ftp.rtems.org/pub/rtems/people/sebh/Power-Arch-32-bit-ABI-supp-1.0-Embedded.pdf
defines the ABI profiles, and indeed ATR-SPE is built on
ATR-SOFT-FLOAT. But as I noted as a concern in my emails back in 2016,
setjmp/longjmp compatibility is problematic for the same reason
it's problematic on ARM, where optional float-related parts of the
register file are "call-saved if present". This will require testing
__hwcap. The SPEFSCR (control register) is probably not relevant if
we're doing a soft-float compatible ABI (which would lack fenv, just
like on non-"hf" ARM using "softfp" mode for hard float with standard
ARM EABI), but SPE has an additional hidden upper 32 bits for each
GPR, and the upper 32 bits are defined by the ABI spec as call-saved
if and only if the lower 32 bits were already call-saved registers:
"The volatility of all 64-bit registers is the same for the upper and
lower word."

Because these are not clobbered by instructions that just operate on
the normal low 32 bits, they don't present a problem for normal calls.
But setjmp needs to preserve the upper 32 bits too in case longjmp is
called from a context where the caller has modified them.

I just checked and uclibc actually has this wrong: it just saves the
entire 64 bits of r14-r31 in the floating point store area. However,
r1, r2, and r13 are also call-saved ("nonvolatile" in the language of
the ABI spec) and thus, strictly speaking, need to have their upper
halves saved. It's kinda doubtful that this will ever matter (I don't
think there are ABI-conforming ways to use the upper bits of the stack
pointer or thread pointer, but there might be conforming ways to use
r13) but we should probably do it right anyway.
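
To make the register accounting above concrete, here is a small sketch
in C. It is not code from musl or uclibc; the function names are made
up for illustration, and the register set is just the one listed in
this mail (r1, r2, r13, r14-r31) combined with the quoted rule that
the upper word has the same volatility as the lower word.

```c
#include <stdbool.h>

/* GPRs this mail lists as call-saved ("nonvolatile"):
 * r1 (stack pointer), r2 (thread pointer), r13, and r14-r31. */
bool gpr_is_nonvolatile(int r)
{
    return r == 1 || r == 2 || r == 13 || (r >= 14 && r <= 31);
}

/* Quoted ABI rule: the upper word of a 64-bit register has the same
 * volatility as the lower word, so it needs saving exactly when the
 * lower word does. */
bool upper_half_needs_saving(int r)
{
    return gpr_is_nonvolatile(r);
}

/* Number of upper halves a strictly conforming setjmp would save. */
int count_upper_halves_strict(void)
{
    int n = 0;
    for (int r = 0; r < 32; r++)
        if (upper_half_needs_saving(r))
            n++;
    return n;
}

/* Number of upper halves covered when only r14-r31 are saved as
 * full 64-bit values. */
int count_upper_halves_r14_r31(void)
{
    int n = 0;
    for (int r = 14; r <= 31; r++)
        if (upper_half_needs_saving(r))
            n++;
    return n;
}
```

The strict count is 21 and the r14-r31 count is 18; the difference is
the upper halves of r1, r2, and r13.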

My plan at this point is to add the optional, hwcap-based saving to
sj/lj, and enable support for SPE hard float (otherwise just removing
the configure check to ban it, and fixing a few #ifdefs). If support
for fenv is desired later, I think it would have to be added as a new
ABI unless we can also add soft float fenv support.
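
A rough model of the hwcap-gated saving, in portable C, might look
like the following. The real thing would be assembly in setjmp and
longjmp testing __hwcap; everything here is hypothetical: the struct
layout, the function names, and the representation of registers as
uint64_t values. The HWCAP_SPE value is assumed to be the same bit
Linux reports as PPC_FEATURE_HAS_SPE.

```c
#include <stdint.h>

/* Assumption: same bit as Linux's PPC_FEATURE_HAS_SPE in AT_HWCAP. */
#define HWCAP_SPE 0x00800000UL

/* r1, r2, r13, r14-r31 */
#define NSAVED 21

/* Hypothetical save area, for this model only. */
struct sjlj_model {
    uint32_t lo[NSAVED];  /* always saved */
    uint32_t hi[NSAVED];  /* only meaningful when SPE is present */
};

/* Save the low words unconditionally; save the high words only if
 * the hardware has them. */
void model_setjmp(struct sjlj_model *jb, const uint64_t regs[NSAVED],
                  unsigned long hwcap)
{
    for (int i = 0; i < NSAVED; i++) {
        jb->lo[i] = (uint32_t)regs[i];
        if (hwcap & HWCAP_SPE)
            jb->hi[i] = (uint32_t)(regs[i] >> 32);
    }
}

/* Restore the low words unconditionally; touch the high words only
 * if the hardware has them. */
void model_longjmp(const struct sjlj_model *jb, uint64_t regs[NSAVED],
                   unsigned long hwcap)
{
    for (int i = 0; i < NSAVED; i++) {
        uint64_t hi = regs[i] & 0xffffffff00000000ULL;
        if (hwcap & HWCAP_SPE)
            hi = (uint64_t)jb->hi[i] << 32;
        regs[i] = hi | jb->lo[i];
    }
}
```

The point of the model is only that the same libc binary takes both
paths: without the SPE bit it behaves exactly like the existing
soft-float sj/lj, and with it the upper halves round-trip as well.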

I'm also going to work on some libc-test additions to try to catch
missing sj/lj save of float state, to validate the addition and make
sure we can catch this type of thing on future archs.
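
One possible shape for such a test, as a sketch only (this is not an
existing libc-test case, and all names are made up): a caller keeps
several doubles live across a call, the callee does setjmp, and a
deeper function clobbers float state and leaves via longjmp without
running its epilogue. If longjmp fails to restore call-saved float
state, the caller's values can come back wrong. The noinline
attribute is a GCC/Clang extension, and nothing here forces the
compiler to actually keep the values in call-saved registers, so a
pass does not prove correctness.

```c
#include <setjmp.h>

static jmp_buf env;
static volatile double seed = 1.0;  /* volatile: defeat constant folding */
static volatile double sink;

static void __attribute__((noinline)) touch(void)
{
    sink = seed;
}

/* Keeps several doubles live across a call, which encourages the
 * compiler to put them in call-saved registers, then leaves via
 * longjmp so no epilogue restores those registers. */
static void __attribute__((noinline)) inner(void)
{
    double x0 = seed * 3.0, x1 = seed * 5.0;
    double x2 = seed * 7.0, x3 = seed * 11.0;
    touch();
    sink = x0 + x1 + x2 + x3;
    longjmp(env, 1);
}

static void __attribute__((noinline)) middle(void)
{
    if (setjmp(env) == 0)
        inner();
}

/* Returns 1 if the caller's live doubles survived the longjmp. */
int sjlj_preserves_caller_doubles(void)
{
    double a = seed + 1.0, b = seed + 2.0;
    double c = seed + 3.0, d = seed + 4.0;
    middle();  /* a..d are live across this call */
    return a == 2.0 && b == 3.0 && c == 4.0 && d == 5.0;
}
```

On an SPE target with a soft-float-compatible ABI the interesting
state would be the upper halves of the GPRs rather than separate FP
registers, so a real test would likely need arch-specific variants.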

If any of the above seems erroneous or like I'm missing something
helpful, please comment.

Rich
