Message-ID: <b6e3f3b8-8659-2c47-80db-a83606636db5@gmail.com>
Date: Fri, 1 Sep 2017 00:26:05 +0200
From: Jörg Mische <joerg.mische@...il.com>
To: musl@...ts.openwall.com
Subject: simplification of __aeabi_read_tp

Hi,

I am trying to adapt the ARM assembler parts to ARMv6-M (the Thumb-only 
architecture of the Cortex-M0) without breaking ARMv4T compatibility. One 
issue is __aeabi_read_tp(), which must not clobber any register except 
r0, the return value.

Register-saving code that avoids both "pop {lr}" (not supported by 
ARMv6-M) and "pop {pc}" (not supported by ARMv4T) is very ugly, so I 
took a closer look at the internals of __aeabi_read_tp() and found the 
following:

__aeabi_read_tp() calls __aeabi_read_tp_c() which inlines the function 
__pthread_self(). With ARMv7 and above, __pthread_self() simply reads 
the coprocessor register c13 without clobbering any registers. Below 
ARMv7, the function pointer __a_gettp_ptr is called. __a_gettp_ptr 
either points to __a_gettp_cp15() (a routine that reads c13) or to the 
kuser_get_tls function provided by the kernel.
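
For illustration, the pre-ARMv7 dispatch described above can be modelled 
in portable C roughly as follows. This is only a sketch: every mock_ 
name is hypothetical, the thread pointer value is made up, and the 
selector function is an assumption, since how the pointer gets set is 
not part of this mail.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical value standing in for the thread pointer held in c13. */
uintptr_t mock_tp = 0x20001000u;

/* Stand-in for __a_gettp_cp15(): on hardware, one mrc plus a return. */
uintptr_t mock_gettp_cp15(void) { return mock_tp; }

/* Stand-in for the kernel-provided kuser_get_tls helper. */
uintptr_t mock_kuser_get_tls(void) { return mock_tp; }

/* Models __a_gettp_ptr: points at whichever routine is in use. */
uintptr_t (*mock_gettp_ptr)(void) = mock_gettp_cp15;

/* Hypothetical selector; the real selection logic is not shown here. */
void mock_select_gettp(int have_tls_register)
{
	mock_gettp_ptr = have_tls_register ? mock_gettp_cp15
	                                   : mock_kuser_get_tls;
}

/* Pre-ARMv7 path: a single indirect call through the pointer. */
uintptr_t mock_read_tp(void)
{
	assert(mock_gettp_ptr != 0); /* the pointer must be initialised */
	return mock_gettp_ptr();
}
```

Whichever routine the pointer selects, the caller gets the same value 
back, so the indirection only decides how the thread pointer is read.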

The interesting point is that neither __a_gettp_cp15() (only one 
instruction and a return) nor kuser_get_tls (according to the kernel 
spec) clobbers any register other than r0. The only reason for saving 
the registers is the indirection via the C function 
__aeabi_read_tp_c(), where the compiler is allowed to clobber r0-r3.

Since inline functions cannot be called from assembler and any C code 
must be avoided, I rewrote the code of __pthread_self() directly in 
assembler in __aeabi_read_tp.S. With these modifications the binary code 
is not only faster, it also works on the Cortex-M0 processor.
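
The assembler version can return the raw thread pointer because the 
offset arithmetic in the removed __aeabi_read_tp_c() cancels out. The 
sketch below checks this on a host machine. It assumes that the inline 
__pthread_self() on ARM returns the thread pointer plus 8 minus 
sizeof(struct pthread); that definition is not part of this patch, and 
struct mock_pthread is only a placeholder for the real layout.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder layout; the real struct pthread is in pthread_impl.h. */
struct mock_pthread { void *self; char pad[120]; };

/* Assumed shape of the inline __pthread_self() on ARM: the raw thread
 * pointer adjusted back to the start of the thread structure. */
uintptr_t mock_pthread_self(uintptr_t tp)
{
	return tp + 8 - sizeof(struct mock_pthread);
}

/* Same expression as the removed __aeabi_read_tp_c(). */
uintptr_t mock_aeabi_read_tp_c(uintptr_t tp)
{
	return mock_pthread_self(tp) - 8 + sizeof(struct mock_pthread);
}

/* The two adjustments cancel, leaving the raw thread pointer. */
void mock_check_round_trip(uintptr_t tp)
{
	assert(mock_aeabi_read_tp_c(tp) == tp);
}
```

Under that assumption the result does not depend on the size of the 
structure, which is why the new file needs no knowledge of struct 
pthread.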

Best regards,
Jörg

---
  src/thread/arm/__aeabi_read_tp.S   | 22 ++++++++++++++++++++++
  src/thread/arm/__aeabi_read_tp.s   |  8 --------
  src/thread/arm/__aeabi_read_tp_c.c |  8 --------
  3 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/src/thread/arm/__aeabi_read_tp.S b/src/thread/arm/__aeabi_read_tp.S
new file mode 100644
index 0000000..897b4f8
--- /dev/null
+++ b/src/thread/arm/__aeabi_read_tp.S
@@ -0,0 +1,22 @@
+.syntax unified
+.global __a_gettp_ptr
+.hidden __a_gettp_ptr
+.global __aeabi_read_tp
+.type __aeabi_read_tp,%function
+__aeabi_read_tp:
+
+#if ((__ARM_ARCH_6K__ || __ARM_ARCH_6ZK__) && !__thumb__) || __ARM_ARCH_7A__ || __ARM_ARCH_7R__ || __ARM_ARCH >= 7
+
+	mrc p15,0,r0,c13,c0,3
+	bx lr
+
+#else
+
+	ldr r0,2f
+	add r0,r0,pc
+	ldr r0,[r0]
+1:	bx r0
+	.align 2
+2:	.word __a_gettp_ptr-1b
+
+#endif
diff --git a/src/thread/arm/__aeabi_read_tp.s b/src/thread/arm/__aeabi_read_tp.s
deleted file mode 100644
index 9d0cd31..0000000
--- a/src/thread/arm/__aeabi_read_tp.s
+++ /dev/null
@@ -1,8 +0,0 @@
-.syntax unified
-.global __aeabi_read_tp
-.type __aeabi_read_tp,%function
-__aeabi_read_tp:
-	push {r1,r2,r3,lr}
-	bl __aeabi_read_tp_c
-	pop {r1,r2,r3,lr}
-	bx lr
diff --git a/src/thread/arm/__aeabi_read_tp_c.c b/src/thread/arm/__aeabi_read_tp_c.c
deleted file mode 100644
index 654bdc5..0000000
--- a/src/thread/arm/__aeabi_read_tp_c.c
+++ /dev/null
@@ -1,8 +0,0 @@
-#include "pthread_impl.h"
-#include <stdint.h>
-
-__attribute__((__visibility__("hidden")))
-void *__aeabi_read_tp_c(void)
-{
-	return (void *)((uintptr_t)__pthread_self()-8+sizeof(struct pthread));
-}
