Message-ID: <20150816151616.GO31018@brightrain.aerifal.cx>
Date: Sun, 16 Aug 2015 11:16:17 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] replace a mfence instruction by an xchg instruction

On Sun, Aug 16, 2015 at 02:42:33PM +0200, Jens Gustedt wrote:
> Hello,
>
> On Saturday, 15.08.2015, at 19:28 -0400, Rich Felker wrote:
> > On Sat, Aug 15, 2015 at 11:01:40PM +0200, Jens Gustedt wrote:
> > > On Saturday, 15.08.2015, at 16:17 -0400, Rich Felker wrote:
> > > > On Sat, Aug 15, 2015 at 08:51:41AM +0200, Jens Gustedt wrote:
> > > > > according to the wisdom of the Internet, e.g
> > > > >
> > > > > https://peeterjoot.wordpress.com/2009/12/04/intel-memory-ordering-fence-instructions-and-atomic-operations/
> > > > >
> > > > > a mfence instruction is about 3 times slower than an xchg instruction.
> > > >
> > > > Uhg, then why does this instruction even exist if it does less and
> > > > does it slower?
> > >
> > > Because they do different things ?)
> > >
> > > mfence is to synchronize all memory, xchg, at least at a first glance,
> > > only one word.
> >
> > No, any lock-prefixed instruction, or xchg which has a builtin lock,
> > fully orders all memory accesses. Essentially it contains a builtin
> > mfence.
>
> Hm, I think mfence does a bit more than that. The three fence
> instructions were introduced when they invented the asynchronous
> ("non-temporal") move instructions that came with sse.
>
> I don't think that "lock" instructions synchronize with these
> asynchronous moves, so the two (lock instructions and fences) are just
> different types of animals. And this answers perhaps your question
> up-thread, why there is actually something like mfence.

The relevant text seems to be the Intel manual, Vol 3A, 8.2.2 Memory
Ordering in P6 and More Recent Processor Families:

----------------------------------------------------------------------
Reads are not reordered with other reads.

Writes are not reordered with older reads.

Writes to memory are not reordered with other writes, with the
following exceptions:
 - writes executed with the CLFLUSH instruction;
 - streaming stores (writes) executed with the non-temporal move
   instructions (MOVNTI, MOVNTQ, MOVNTDQ, MOVNTPS, and MOVNTPD); and
 - string operations (see Section 8.2.4.1).

Reads may be reordered with older writes to different locations but
not with older writes to the same location.

Reads or writes cannot be reordered with I/O instructions, locked
instructions, or serializing instructions.

Reads cannot pass earlier LFENCE and MFENCE instructions.

Writes cannot pass earlier LFENCE, SFENCE, and MFENCE instructions.

LFENCE instructions cannot pass earlier reads.

SFENCE instructions cannot pass earlier writes.

MFENCE instructions cannot pass earlier reads or writes.
----------------------------------------------------------------------

See page 330, http://www.intel.com/Assets/en_US/PDF/manual/253668.pdf

So mfence seems to be weaker than lock-prefixed instructions in terms
of the ordering it imposes (lock-prefixed instructions forbid
reordering and also have a total ordering across all cores).

Rich
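
For readers following the thread, the two alternatives under discussion can be sketched roughly as below. This is only an illustration under stated assumptions, not the patch itself and not musl's actual code: the names store_mfence, store_xchg and check_stores are hypothetical, the inline asm assumes GCC or Clang on x86_64, and the non-x86 fallback uses the compilers' __atomic builtins purely as a stand-in. It shows a plain store followed by mfence next to a single xchg with a memory operand, which carries an implicit lock. It does not settle how either variant interacts with non-temporal stores.

```c
#include <assert.h>

/* Hypothetical helper names for illustration; not taken from musl. */

/* Variant 1: plain store followed by an explicit MFENCE. */
static inline void store_mfence(volatile int *p, int v)
{
#if defined(__x86_64__)
    __asm__ __volatile__("movl %1, %0 ; mfence"
                         : "=m"(*p) : "r"(v) : "memory");
#else
    /* Assumption: GCC/Clang builtins as a stand-in on other targets. */
    __atomic_store_n(p, v, __ATOMIC_SEQ_CST);
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/* Variant 2: a single XCHG with a memory operand. XCHG is implicitly
 * locked, so no separate fence instruction is issued. The old value
 * ends up in v and is simply discarded. */
static inline void store_xchg(volatile int *p, int v)
{
#if defined(__x86_64__)
    __asm__ __volatile__("xchgl %0, %1"
                         : "+r"(v), "+m"(*p) : : "memory");
#else
    (void)__atomic_exchange_n(p, v, __ATOMIC_SEQ_CST);
#endif
}

/* Single-threaded sanity check: both variants must store the value.
 * This says nothing about ordering or speed. */
static inline void check_stores(void)
{
    volatile int x = 0;
    store_mfence(&x, 1);
    assert(x == 1);
    store_xchg(&x, 2);
    assert(x == 2);
}
```

The "memory" clobber in both asm statements only stops the compiler from moving memory accesses across the statement; the hardware ordering comes from mfence or from the locked xchg. Any timing comparison between the two would need to be measured on the target CPU.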