Message-ID: <1439621420.9803.20.camel@dysnomia.u-strasbg.fr>
Date: Sat, 15 Aug 2015 08:51:41 +0200
From: Jens Gustedt <Jens.Gustedt@...ia.fr>
To: musl@...ts.openwall.com
Subject: [PATCH] replace an mfence instruction with an xchg instruction

According to the wisdom of the Internet, e.g.

https://peeterjoot.wordpress.com/2009/12/04/intel-memory-ordering-fence-instructions-and-atomic-operations/

an mfence instruction is about 3 times slower than an xchg instruction.

Here we had not only the mfence but also the mov instruction that was to
be protected by the fence. Replace all of that with a single native
atomic instruction that gives all the ordering guarantees we need: an
xchg with a memory operand is implicitly locked, so it performs the
store and acts as a full barrier in one instruction.
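
For reference (not part of the patch): the new a_store has the same
semantics as a sequentially consistent store in C11 atomics. A minimal
sketch; the a_store_c11 name is just for illustration:

#include <stdatomic.h>

/* Same semantics as the new a_store: an atomic store that is also a
 * full barrier. On x86_64, compilers implement a seq_cst store either
 * as an implicitly locked "xchg" (as in this patch) or as "mov; mfence"
 * (the variant this patch removes). */
static inline void a_store_c11(_Atomic int *p, int v)
{
	atomic_store_explicit(p, v, memory_order_seq_cst);
}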

This a_store function is performance-critical for the __lock
primitive. In my benchmarks testing my stdatomic implementation I see a
substantial performance increase (more than 10%), simply because malloc
does better with it.
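
For anyone who wants to reproduce the difference in isolation, here is a
minimal, self-contained timing sketch (x86_64 only; this is not the
benchmark quoted above, and the function names and iteration count are
arbitrary):

#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

static volatile int slot;

/* old variant: plain store followed by a full fence */
static inline void a_store_mfence(volatile int *p, int x)
{
	__asm__( "mov %1, %0 ; mfence" : "=m"(*p) : "r"(x) : "memory" );
}

/* new variant: implicitly locked xchg, store and full barrier in one */
static inline void a_store_xchg(volatile int *x, int v)
{
	__asm__( "xchg %0, %1" : "=r"(v), "=m"(*x) : "0"(v) : "memory" );
}

static double bench(void (*store)(volatile int *, int), long n)
{
	struct timespec t0, t1;
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (long i = 0; i < n; i++) store(&slot, (int)i);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return t1.tv_sec - t0.tv_sec + 1e-9*(t1.tv_nsec - t0.tv_nsec);
}

int main(void)
{
	long n = 100000000;
	printf("mov+mfence: %.3fs\n", bench(a_store_mfence, n));
	printf("xchg:       %.3fs\n", bench(a_store_xchg, n));
	return 0;
}
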
---
 arch/x32/atomic.h    | 4 ++--
 arch/x86_64/atomic.h | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x32/atomic.h b/arch/x32/atomic.h
index 2ab1f7a..3a2f391 100644
--- a/arch/x32/atomic.h
+++ b/arch/x32/atomic.h
@@ -81,9 +81,9 @@ static inline void a_dec(volatile int *x)
 	__asm__( "lock ; decl %0" : "=m"(*x) : "m"(*x) : "memory" );
 }
 
-static inline void a_store(volatile int *p, int x)
+static inline void a_store(volatile int *x, int v)
 {
-	__asm__( "mov %1, %0 ; mfence" : "=m"(*p) : "r"(x) : "memory" );
+	__asm__( "xchg %0, %1" : "=r"(v), "=m"(*x) : "0"(v) : "memory" );
 }
 
 static inline void a_spin()
diff --git a/arch/x86_64/atomic.h b/arch/x86_64/atomic.h
index 2ab1f7a..3a2f391 100644
--- a/arch/x86_64/atomic.h
+++ b/arch/x86_64/atomic.h
@@ -81,9 +81,9 @@ static inline void a_dec(volatile int *x)
 	__asm__( "lock ; decl %0" : "=m"(*x) : "m"(*x) : "memory" );
 }
 
-static inline void a_store(volatile int *p, int x)
+static inline void a_store(volatile int *x, int v)
 {
-	__asm__( "mov %1, %0 ; mfence" : "=m"(*p) : "r"(x) : "memory" );
+	__asm__( "xchg %0, %1" : "=r"(v), "=m"(*x) : "0"(v) : "memory" );
 }
 
 static inline void a_spin()
-- 
2.1.4
