Message-ID: <b6e3f3b8-8659-2c47-80db-a83606636db5@gmail.com>
Date: Fri, 1 Sep 2017 00:26:05 +0200
From: Jörg Mische <joerg.mische@...il.com>
To: musl@...ts.openwall.com
Subject: simplification of __aeabi_read_tp
Hi,
I am trying to adapt the ARM assembler parts to ARMv6-M (the Thumb
subset implemented by the Cortex-M0) without breaking ARMv4T
compatibility. One issue is the function __aeabi_read_tp(), which must
not clobber any register other than r0, which holds the return value.
Register-saving code that avoids both "pop {lr}" (which is not supported
by ARMv6-M) and "pop {pc}" (which is not supported by ARMv4T) is very
ugly, so I took a closer look at the internals and found the following:
__aeabi_read_tp() calls __aeabi_read_tp_c() which inlines the function
__pthread_self(). With ARMv7 and above, __pthread_self() simply reads
the coprocessor register c13 without clobbering any registers. Below
ARMv7, the function pointer __a_gettp_ptr is called. __a_gettp_ptr
either points to __a_gettp_cp15() (a routine that reads c13) or to the
kuser_get_tls function provided by the kernel.
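
The dispatch described above can be modelled in portable C. This is
only a sketch for illustration: every model_* name, the fake register
value and the has_tls_reg flag are hypothetical, and the real routines
read c13 or call the kernel helper rather than returning a variable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the contents of coprocessor register c13. */
static uintptr_t model_tls_register = 0x1000;

/* Models __a_gettp_cp15(): one "mrc p15,0,r0,c13,c0,3" and a return. */
uintptr_t model_gettp_cp15(void)
{
	return model_tls_register;
}

/* Models the kernel-provided kuser_get_tls helper. */
uintptr_t model_kuser_get_tls(void)
{
	return model_tls_register;
}

/* Models __a_gettp_ptr: set once, then called on every lookup. */
static uintptr_t (*model_gettp_ptr)(void) = model_kuser_get_tls;

/* Models the startup choice; has_tls_reg is a hypothetical flag. */
void model_select_gettp(int has_tls_reg)
{
	model_gettp_ptr = has_tls_reg ? model_gettp_cp15 : model_kuser_get_tls;
	assert(model_gettp_ptr != 0);
}

/* Models the pre-ARMv7 path of __pthread_self(): one indirect call. */
uintptr_t model_read_tp(void)
{
	return model_gettp_ptr();
}
```

Whichever target is selected, the caller sees the same value; the only
thing that differs is how the thread pointer is obtained.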
The interesting point is that neither __a_gettp_cp15() (only one
instruction and a return) nor kuser_get_tls (according to the kernel
spec) clobbers any registers. The only reason for saving the registers
is the indirection via the C function __aeabi_read_tp_c(), where the
compiler is allowed to clobber r0-r3.
Since inline functions cannot be called from assembler and any C code
must be avoided, I reimplemented the logic of __pthread_self() directly
in assembler in __aeabi_read_tp.S. With these modifications the binary
code is not only faster, it also works on the Cortex-M0 processor.
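
The fallback path in the patch below finds __a_gettp_ptr through a
pc-relative offset stored at label 2. The arithmetic can be checked
with a small C model. It is a sketch under stated assumptions: the
function names and addresses are made up, and it assumes 4-byte
encodings in ARM state and 2-byte encodings in Thumb state, where
reading pc yields the current instruction address plus 8 or plus 4.

```c
#include <assert.h>
#include <stdint.h>

/* Address the fallback sequence ends up loading from (hypothetical model). */
uint32_t model_loaded_address(uint32_t add_addr, uint32_t ptr_addr, int thumb)
{
	uint32_t insn_size = thumb ? 2 : 4; /* assumed encoding width */
	uint32_t pc_bias   = thumb ? 4 : 8; /* how far ahead pc reads */

	/* Label 1 ("bx r0") sits two instructions after the add. */
	uint32_t label1 = add_addr + 2 * insn_size;

	/* Link-time constant stored at label 2: __a_gettp_ptr - 1b. */
	uint32_t stored_offset = ptr_addr - label1;

	/* "add r0,r0,pc": the offset plus the pc value seen by the add. */
	return stored_offset + (add_addr + pc_bias);
}

/* The result is the pointer's address in both instruction sets. */
void model_self_check(void)
{
	assert(model_loaded_address(0x8004, 0x20000, 0) == 0x20000);
	assert(model_loaded_address(0x8002, 0x20000, 1) == 0x20000);
}
```

In both states the pc value read by the add equals the address of
label 1, so the same source assembles correctly for ARM and Thumb.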
Best regards,
Jörg
---
src/thread/arm/__aeabi_read_tp.S | 22 ++++++++++++++++++++++
src/thread/arm/__aeabi_read_tp.s | 8 --------
src/thread/arm/__aeabi_read_tp_c.c | 8 --------
3 files changed, 22 insertions(+), 16 deletions(-)
diff --git a/src/thread/arm/__aeabi_read_tp.S b/src/thread/arm/__aeabi_read_tp.S
new file mode 100644
index 0000000..897b4f8
--- /dev/null
+++ b/src/thread/arm/__aeabi_read_tp.S
@@ -0,0 +1,22 @@
+.syntax unified
+.global __a_gettp_ptr
+.hidden __a_gettp_ptr
+.global __aeabi_read_tp
+.type __aeabi_read_tp,%function
+__aeabi_read_tp:
+
+#if ((__ARM_ARCH_6K__ || __ARM_ARCH_6ZK__) && !__thumb__) || __ARM_ARCH_7A__ || __ARM_ARCH_7R__ || __ARM_ARCH >= 7
+
+ mrc p15,0,r0,c13,c0,3
+ bx lr
+
+#else
+
+ ldr r0,2f
+ add r0,r0,pc
+ ldr r0,[r0]
+1: bx r0
+ .align 2
+2: .word __a_gettp_ptr-1b
+
+#endif
diff --git a/src/thread/arm/__aeabi_read_tp.s b/src/thread/arm/__aeabi_read_tp.s
deleted file mode 100644
index 9d0cd31..0000000
--- a/src/thread/arm/__aeabi_read_tp.s
+++ /dev/null
@@ -1,8 +0,0 @@
-.syntax unified
-.global __aeabi_read_tp
-.type __aeabi_read_tp,%function
-__aeabi_read_tp:
- push {r1,r2,r3,lr}
- bl __aeabi_read_tp_c
- pop {r1,r2,r3,lr}
- bx lr
diff --git a/src/thread/arm/__aeabi_read_tp_c.c b/src/thread/arm/__aeabi_read_tp_c.c
deleted file mode 100644
index 654bdc5..0000000
--- a/src/thread/arm/__aeabi_read_tp_c.c
+++ /dev/null
@@ -1,8 +0,0 @@
-#include "pthread_impl.h"
-#include <stdint.h>
-
-__attribute__((__visibility__("hidden")))
-void *__aeabi_read_tp_c(void)
-{
- return (void *)((uintptr_t)__pthread_self()-8+sizeof(struct pthread));
-}