Message-ID: <1439740221.9803.33.camel@inria.fr>
Date: Sun, 16 Aug 2015 17:50:21 +0200
From: Jens Gustedt <jens.gustedt@...ia.fr>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] replace a mfence instruction by an xchg instruction

On Sunday, 16.08.2015 at 11:16 -0400, Rich Felker wrote:
> On Sun, Aug 16, 2015 at 02:42:33PM +0200, Jens Gustedt wrote:
> > Hello,
> >
> > On Saturday, 15.08.2015 at 19:28 -0400, Rich Felker wrote:
> > > On Sat, Aug 15, 2015 at 11:01:40PM +0200, Jens Gustedt wrote:
> > > > On Saturday, 15.08.2015 at 16:17 -0400, Rich Felker wrote:
> > > > > On Sat, Aug 15, 2015 at 08:51:41AM +0200, Jens Gustedt wrote:
> > > > > > according to the wisdom of the Internet, e.g.
> > > > > >
> > > > > > https://peeterjoot.wordpress.com/2009/12/04/intel-memory-ordering-fence-instructions-and-atomic-operations/
> > > > > >
> > > > > > a mfence instruction is about 3 times slower than an xchg instruction.
> > > > >
> > > > > Ugh, then why does this instruction even exist if it does less and
> > > > > does it slower?
> > > >
> > > > Because they do different things ?)
> > > >
> > > > mfence is to synchronize all memory, xchg, at least at a first glance,
> > > > only one word.
> > >
> > > No, any lock-prefixed instruction, or xchg which has a builtin lock,
> > > fully orders all memory accesses. Essentially it contains a builtin
> > > mfence.
> >
> > Hm, I think mfence does a bit more than that. The three fence
> > instructions were introduced when they invented the asynchronous
> > ("non-temporal") move instructions that came with sse.
> >
> > I don't think that "lock" instructions synchronize with these
> > asynchronous moves, so the two (lock instructions and fences) are just
> > different types of animals. And this answers perhaps your question
> > up-thread, why there is actually something like mfence.
>
> The relevant text seems to be the Intel manual, Vol 3A, 8.2.2 Memory
> Ordering in P6 and More Recent Processor Families:
>
> ----------------------------------------------------------------------
> Reads are not reordered with other reads.
>
> Writes are not reordered with older reads.
>
> Writes to memory are not reordered with other writes, with the
> following exceptions:
> — writes executed with the CLFLUSH instruction;
> — streaming stores (writes) executed with the non-temporal move
> instructions (MOVNTI, MOVNTQ, MOVNTDQ, MOVNTPS, and MOVNTPD); and
> — string operations (see Section 8.2.4.1).
>
> Reads may be reordered with older writes to different locations but
> not with older writes to the same location.
>
> Reads or writes cannot be reordered with I/O instructions, locked
> instructions, or serializing instructions.
>
> Reads cannot pass earlier LFENCE and MFENCE instructions.
>
> Writes cannot pass earlier LFENCE, SFENCE, and MFENCE instructions.
>
> LFENCE instructions cannot pass earlier reads.
>
> SFENCE instructions cannot pass earlier writes.
>
> MFENCE instructions cannot pass earlier reads or writes
> ----------------------------------------------------------------------
>
> See page 330, http://www.intel.com/Assets/en_US/PDF/manual/253668.pdf
>
> So mfence seems to be weaker than lock-prefixed instructions in terms
> of the ordering it imposes (lock-prefixed instructions forbid
> reordering and also have a total ordering across all cores).

Yes, page 8-26 says that the fences are definitely not serializing
instructions.

(But what I tried to show in my previous mail still holds: the
instruction latency itself plays a big part in the efficiency of these
instructions.)

I read all of that as:

- mfence can be used to achieve acq_rel ordering
- none of the fences can be used to achieve seq_cst ordering

Wasn't the idea that all atomic.h functions implement sequential
consistency?
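
To make the two alternatives under discussion concrete, here is a
minimal sketch in portable C11 (<stdatomic.h>) rather than musl's
inline assembly. The helper names store_with_fence, store_with_xchg
and load_value are hypothetical and are not part of musl's atomic.h.
How a compiler lowers each variant on x86 (mfence, xchg, or some other
locked instruction) depends on the compiler and its version, so treat
the comments as typical behaviour, not a guarantee.

```c
#include <stdatomic.h>

/* Hypothetical helpers for illustration only; not musl's atomic.h API. */

/* Variant 1: a plain store followed by a full fence.
 * On x86 the fence is often emitted as mfence, although some
 * compilers use a locked instruction instead. */
static inline void store_with_fence(atomic_int *p, int v)
{
    atomic_store_explicit(p, v, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
}

/* Variant 2: an atomic exchange whose old value is discarded.
 * On x86 this is typically a single xchg with a memory operand,
 * which carries an implicit lock prefix. */
static inline void store_with_xchg(atomic_int *p, int v)
{
    (void)atomic_exchange_explicit(p, v, memory_order_seq_cst);
}

/* Sequentially consistent load, used to observe the stored value. */
static inline int load_value(atomic_int *p)
{
    return atomic_load_explicit(p, memory_order_seq_cst);
}
```

Both variants store the same value; they differ only in which
instruction provides the ordering, and that is where the latency
difference comes from.
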
Jens
--
:: INRIA Nancy Grand Est ::: Camus ::::::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536 ::
:: :::::::::::::::::::::: gsm France : +33 651400183 ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::