Message-ID: <20150815201755.GL31018@brightrain.aerifal.cx>
Date: Sat, 15 Aug 2015 16:17:55 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] replace a mfence instruction by an xchg
 instruction

On Sat, Aug 15, 2015 at 08:51:41AM +0200, Jens Gustedt wrote:
> according to the wisdom of the Internet, e.g.
> 
> https://peeterjoot.wordpress.com/2009/12/04/intel-memory-ordering-fence-instructions-and-atomic-operations/
> 
> an mfence instruction is about 3 times slower than an xchg instruction.

Ugh, then why does this instruction even exist if it does less and
does it slower?

> Here we not only had mfence but also the mov instruction that was to be
> protected by the fence. Replace all that by a native atomic instruction
> that gives all the ordering guarantees that we need.
> 
> This a_store function is performance critical for the __lock
> primitive. In my benchmarks to test my stdatomic implementation I have a
> substantial performance increase (more than 10%), just because malloc
> does better with it.

Is there a reason you're not using the same approach as on i386? It
was faster than xchg for me, and in principle it "should be faster".
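
For readers following along, here is a rough sketch of the three store
sequences under discussion, written as GCC/Clang inline asm for x86_64.
The function names are made up for illustration and this is not
necessarily the exact code in the tree or in the patch; the lock-or
variant is my reading of what "the same approach as on i386" refers to,
adapted to use %rsp. On other targets the sketch falls back to the
compiler's __atomic builtins, which is also an assumption.

```c
#include <assert.h>

/* Hypothetical names. Each variant stores x to *p and acts as a full barrier. */
#if defined(__x86_64__)

/* Plain store followed by mfence: the sequence the patch proposes to replace. */
static inline void store_mfence(volatile int *p, int x)
{
	__asm__ __volatile__("mov %1, %0 ; mfence"
		: "=m"(*p) : "r"(x) : "memory");
}

/* xchg with a memory operand is implicitly locked, so it stores and fences at once. */
static inline void store_xchg(volatile int *p, int x)
{
	__asm__ __volatile__("xchg %0, %1"
		: "=r"(x), "=m"(*p) : "0"(x) : "memory");
}

/* Plain store, then a locked read-modify-write that leaves the stack top unchanged. */
static inline void store_lock_or(volatile int *p, int x)
{
	__asm__ __volatile__("mov %1, %0 ; lock orl $0,(%%rsp)"
		: "=m"(*p) : "r"(x) : "memory");
}

#else

/* Fallback for other targets (assumption): sequentially consistent builtin store. */
static inline void store_mfence(volatile int *p, int x)
{
	__atomic_store_n(p, x, __ATOMIC_SEQ_CST);
}
static inline void store_xchg(volatile int *p, int x)
{
	(void)__atomic_exchange_n(p, x, __ATOMIC_SEQ_CST);
}
static inline void store_lock_or(volatile int *p, int x)
{
	__atomic_store_n(p, x, __ATOMIC_SEQ_CST);
}

#endif

/* Single-threaded sanity check: every variant must at least perform the store. */
static int store_variants_selfcheck(void)
{
	volatile int v = 0;
	store_mfence(&v, 1);
	assert(v == 1);
	store_xchg(&v, 2);
	assert(v == 2);
	store_lock_or(&v, 3);
	assert(v == 3);
	return v;
}
```

The self-check only confirms that each variant writes the value; it says
nothing about ordering or speed. Comparing them would need a contended
benchmark such as the malloc-heavy one mentioned above.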

Rich
