Message-ID: <1439672500.9803.26.camel@inria.fr>
Date: Sat, 15 Aug 2015 23:01:40 +0200
From: Jens Gustedt <jens.gustedt@...ia.fr>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] replace a mfence instruction by an xchg instruction
On Saturday, 15.08.2015 at 16:17 -0400, Rich Felker wrote:
> On Sat, Aug 15, 2015 at 08:51:41AM +0200, Jens Gustedt wrote:
> > according to the wisdom of the Internet, e.g
> >
> > https://peeterjoot.wordpress.com/2009/12/04/intel-memory-ordering-fence-instructions-and-atomic-operations/
> >
> > a mfence instruction is about 3 times slower than an xchg instruction.
>
> Uhg, then why does this instruction even exist if it does less and
> does it slower?
Because they do different things ?)
mfence synchronizes all memory; xchg, at least at first glance,
only one word.
But I have also read that the relative performance of these instructions
depends a lot on the actual die you are dealing with.
> > Here we not only had mfence but also the mov instruction that was to be
> > protected by the fence. Replace all that by a native atomic instruction
> > that gives all the ordering guarantees that we need.
> >
> > This a_store function is performance critical for the __lock
> > primitive. In my benchmarks to test my stdatomic implementation I have a
> > substantial performance increase (more than 10%), just because malloc
> > does better with it.
>
> Is there a reason you're not using the same approach as on i386? It
> was faster than xchg for me, and in principle it "should be faster".
I discovered your approach for i386 after I experimented with "xchg"
for x86_64. I guess the "lock orl" instruction is a replacement for
"mfence" because that one is not implemented for all variants of i386?
It looks weird to me that a "mov" followed by a read-modify-write
operation on some random address (here the stack pointer) should be
faster than a single read-modify-write operation on exactly the
address you want to deal with.
I trust you that it does, but seen from outside this arch stuff
resembles more voodoo than anything else.
I'll experiment a bit with "mov" and your approach and see what I get.
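For reference, the three store variants discussed in this thread can be
sketched roughly as below. This is only an illustration, not the actual
musl source or the patch itself: the function names a_store_mfence,
a_store_xchg and a_store_lock_orl are hypothetical, the constraints
assume GCC-style inline asm on x86_64, and the fallback branch for other
architectures (a sequentially consistent __atomic_store_n) is an
assumption added so the sketch compiles everywhere.

```c
/* Sketch only: hypothetical names, GCC/Clang inline asm assumed. */

/* Variant 1: plain store followed by a full fence (what the patch replaces). */
static inline void a_store_mfence(volatile int *p, int x)
{
#if defined(__x86_64__)
	__asm__ __volatile__("mov %1, %0 ; mfence"
		: "=m"(*p) : "r"(x) : "memory");
#else
	__atomic_store_n(p, x, __ATOMIC_SEQ_CST); /* assumed portable stand-in */
#endif
}

/* Variant 2: a single xchg; with a memory operand it is implicitly locked. */
static inline void a_store_xchg(volatile int *p, int x)
{
#if defined(__x86_64__)
	__asm__ __volatile__("xchg %0, %1"
		: "+r"(x), "+m"(*p) : : "memory");
#else
	__atomic_store_n(p, x, __ATOMIC_SEQ_CST); /* assumed portable stand-in */
#endif
}

/* Variant 3: plain store, then a locked no-op RMW on the stack
 * (the i386-style approach, with %rsp in place of %esp). */
static inline void a_store_lock_orl(volatile int *p, int x)
{
#if defined(__x86_64__)
	__asm__ __volatile__("mov %1, %0 ; lock ; orl $0,(%%rsp)"
		: "=m"(*p) : "r"(x) : "memory", "cc");
#else
	__atomic_store_n(p, x, __ATOMIC_SEQ_CST); /* assumed portable stand-in */
#endif
}
```

All three leave the same value in *p; they differ only in which
instruction provides the ordering, so any performance comparison has to
come from benchmarks on the hardware in question.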
Thanks
Jens
--
:: INRIA Nancy Grand Est ::: Camus ::::::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536 ::
:: :::::::::::::::::::::: gsm France : +33 651400183 ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::