Message-ID: <b6e3f3b8-8659-2c47-80db-a83606636db5@gmail.com>
Date: Fri, 1 Sep 2017 00:26:05 +0200
From: Jörg Mische <joerg.mische@...il.com>
To: musl@...ts.openwall.com
Subject: simplification of __aeabi_read_tp

Hi,

I am trying to adapt the ARM assembler parts to ARMv6-M (the thumb2
subset of the Cortex-M0) without breaking ARMv4T compatibility. One
issue is the function __aeabi_read_tp(), which may not clobber any
registers except the return value in r0. Register saving code that
avoids "pop {lr}" (which is not supported by ARMv6-M) and "pop {pc}"
(which is not supported by ARMv4T) is very ugly, therefore I took a
closer look at its internals and discovered the following:

__aeabi_read_tp() calls __aeabi_read_tp_c() which inlines the function
__pthread_self(). With ARMv7 and above, __pthread_self() simply reads
the coprocessor register c13 without clobbering any registers. Below
ARMv7, the function pointer __a_gettp_ptr is called. __a_gettp_ptr
either points to __a_gettp_cp15() (a routine that reads c13) or to the
kuser_get_tls function provided by the kernel.

The interesting point is that neither __a_gettp_cp15() (only one
instruction and a return) nor kuser_get_tls (according to the kernel
spec) clobber any registers. The only reason for saving the registers
is the indirection via the C-function __aeabi_read_tp_c(), where the
compiler is allowed to clobber r0-r3.

Since inline functions cannot be called from assembler and any C code
must be avoided, I rewrote the code of __pthread_self() directly in
assembler in __aeabi_read_tp.S.

With these modifications the binary code is not only faster, it also
works on the Cortex-M0 processor.
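
For illustration only, the dispatch described above can be modelled in
portable C. This is a minimal sketch and not part of the patch: every
name prefixed with sketch_ is hypothetical, and a plain variable stands
in for the c13 thread pointer register, so the code shows only the
function pointer indirection, not the register-preservation property
that the assembler version relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the c13 thread pointer register. */
static uintptr_t sketch_tls_register;

/* Models __a_gettp_cp15(): read the register directly. */
uintptr_t sketch_gettp_cp15(void)
{
	return sketch_tls_register;
}

/* Models the kernel-provided kuser_get_tls helper. In this sketch it
 * returns the same value, just through a different routine. */
uintptr_t sketch_gettp_kuser(void)
{
	return sketch_tls_register;
}

/* Models __a_gettp_ptr: chosen once, then called on every lookup. */
static uintptr_t (*sketch_gettp_ptr)(void) = sketch_gettp_kuser;

/* Assumed startup-time selection; the real selection logic lives
 * elsewhere in the library and is not shown in this mail. */
void sketch_select_gettp(int cpu_has_tls_register)
{
	sketch_gettp_ptr = cpu_has_tls_register
		? sketch_gettp_cp15
		: sketch_gettp_kuser;
}

/* Models the pre-ARMv7 path: an indirect call through the pointer. */
uintptr_t sketch_read_tp(void)
{
	assert(sketch_gettp_ptr != NULL);
	return sketch_gettp_ptr();
}
```

Whichever routine the pointer selects, the caller sees the same thread
pointer value; the difference between the two paths is only how the
value is obtained.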

Best regards,
Jörg

---
 src/thread/arm/__aeabi_read_tp.S   | 22 ++++++++++++++++++++++
 src/thread/arm/__aeabi_read_tp.s   |  8 --------
 src/thread/arm/__aeabi_read_tp_c.c |  8 --------
 3 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/src/thread/arm/__aeabi_read_tp.S b/src/thread/arm/__aeabi_read_tp.S
new file mode 100644
index 0000000..897b4f8
--- /dev/null
+++ b/src/thread/arm/__aeabi_read_tp.S
@@ -0,0 +1,22 @@
+.syntax unified
+.global __a_gettp_ptr
+.hidden __a_gettp_ptr
+.global __aeabi_read_tp
+.type __aeabi_read_tp,%function
+__aeabi_read_tp:
+
+#if ((__ARM_ARCH_6K__ || __ARM_ARCH_6ZK__) && !__thumb__) || __ARM_ARCH_7A__ || __ARM_ARCH_7R__ || __ARM_ARCH >= 7
+
+	mrc p15,0,r0,c13,c0,3
+	bx lr
+
+#else
+
+	ldr r0,2f
+	add r0,r0,pc
+	ldr r0,[r0]
+1:	bx r0
+	.align 2
+2:	.word __a_gettp_ptr-1b
+
+#endif
diff --git a/src/thread/arm/__aeabi_read_tp.s b/src/thread/arm/__aeabi_read_tp.s
deleted file mode 100644
index 9d0cd31..0000000
--- a/src/thread/arm/__aeabi_read_tp.s
+++ /dev/null
@@ -1,8 +0,0 @@
-.syntax unified
-.global __aeabi_read_tp
-.type __aeabi_read_tp,%function
-__aeabi_read_tp:
-	push {r1,r2,r3,lr}
-	bl __aeabi_read_tp_c
-	pop {r1,r2,r3,lr}
-	bx lr
diff --git a/src/thread/arm/__aeabi_read_tp_c.c b/src/thread/arm/__aeabi_read_tp_c.c
deleted file mode 100644
index 654bdc5..0000000
--- a/src/thread/arm/__aeabi_read_tp_c.c
+++ /dev/null
@@ -1,8 +0,0 @@
-#include "pthread_impl.h"
-#include <stdint.h>
-
-__attribute__((__visibility__("hidden")))
-void *__aeabi_read_tp_c(void)
-{
-	return (void *)((uintptr_t)__pthread_self()-8+sizeof(struct pthread));
-}