Message-ID: <1439741801.9803.35.camel@inria.fr>
Date: Sun, 16 Aug 2015 18:16:41 +0200
From: Jens Gustedt <jens.gustedt@...ia.fr>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] replace a mfence instruction by an xchg instruction
On Sunday, 16.08.2015 at 11:58 -0400, Rich Felker wrote:
> On Sun, Aug 16, 2015 at 05:50:21PM +0200, Jens Gustedt wrote:
> > > See page 330, http://www.intel.com/Assets/en_US/PDF/manual/253668.pdf
> > >
> > > So mfence seems to be weaker than lock-prefixed instructions in terms
> > > of the ordering it imposes (lock-prefixed instructions forbid
> > > reordering and also have a total ordering across all cores).
> >
> > Yes, it says so on page 8-26 that the fences are definitely not
> > serializing instructions.
> >
> > (But what I tried to show in my previous mail still holds, the
> > instruction latency itself plays a big part in the efficiency of these
> > instructions.)
>
> I wasn't trying to contradict anything you've said, just expressing
> the absurdity of mfence being slower than lock-prefixed instructions,
> since it's a strictly-weaker operation.
Yes, I got that :)
One argument that we have neglected so far is the impact on other
threads/cores. Even if an mfence instruction is more expensive for the
thread that issues it, it imposes fewer constraints on other threads.
Maybe overall this could be a win?
> > I read all of that as:
> >
> > - mfence can be used to achieve acq_rel ordering
> > - none of the fences can be used to achieve seq_cst ordering
>
> By this you mean that only lock-prefixed instructions impose a total
> order across all cores?
Plus the very expensive, fully serializing instructions that are
listed in the manual.
> > Wasn't the idea that all atomic.h functions implement sequential
> > consistency?
>
> Yes, that's the intent, but I don't want to introduce 'major'
> performance regressions fixing 'minor' failures to be seq_cst if
> there's no observable misbehavior in the code using them.
Misbehavior here is really hard to track down. In particular, a change
in an application's behavior that occurs only because seq_cst is not
guaranteed is probably quite difficult to observe.
> Still it
> would be nice to know whether such failures still exist, and if so
> where, so we can eventually clean this up.
Replacing "mfence" by "lock ; orl $0,(%%rsp)" would put us on the safe
side without compromising performance :)
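
To make the suggestion concrete, here is a minimal sketch of such a
barrier as GCC/Clang inline assembly. The names full_barrier and
store_fence_load are hypothetical and chosen only for illustration;
this is not musl's actual a_barrier() definition. The non-x86_64
branch is an assumption added so that the sketch builds elsewhere.

```c
#include <assert.h>

/* Hypothetical helper name, for illustration only. */
static inline void full_barrier(void)
{
#if defined(__x86_64__)
    /* Locked read-modify-write on the word at the top of the stack.
       OR-ing with 0 leaves the stored value unchanged; the lock
       prefix is what provides the ordering. The "memory" clobber
       also stops the compiler from reordering accesses across it,
       and "cc" records that orl modifies the flags. */
    __asm__ __volatile__("lock ; orl $0,(%%rsp)" : : : "memory", "cc");
#else
    /* Assumed fallback: the GCC/Clang full-barrier builtin. */
    __sync_synchronize();
#endif
}

/* Store, fence, load: checks that the barrier leaves program state
   untouched. It does not demonstrate cross-thread ordering. */
static int store_fence_load(int v)
{
    volatile int slot = 0;
    slot = v;
    full_barrier();
    int seen = slot;
    assert(seen == v);
    return seen;
}
```

The single-threaded check only shows that the locked orl is harmless
to the stack contents; whether it is faster than mfence would have to
be measured on the machines we care about.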
Jens
--
:: INRIA Nancy Grand Est ::: Camus ::::::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536 ::
:: :::::::::::::::::::::: gsm France : +33 651400183 ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::