Message-ID: <1438095033.19958.7.camel@inria.fr>
Date: Tue, 28 Jul 2015 16:50:33 +0200
From: Jens Gustedt <jens.gustedt@...ia.fr>
To: musl@...ts.openwall.com
Subject: Re: What's left for 1.1.11 release?

On Tuesday, 28.07.2015, at 10:18 -0400, Rich Felker wrote:
> On Tue, Jul 28, 2015 at 04:09:38PM +0200, Jens Gustedt wrote:
> > Hello,
> > 
> > On Monday, 27.07.2015, at 23:40 -0400, Rich Felker wrote:
> > > In principle the a_store issue affects all libc-internal __lock/LOCK
> > > uses,
> > 
> > so this worries me since I assumed that UNLOCK had release consistency
> > for the __atomic implementation.
> 
> It does. The problem is that it lacks acquire consistency, which we
> need in order to know whether to wake.

Ah, I think we are talking about different things here. I want release
consistency for the lock operation, in the sense that all threads
waiting for the lock are guaranteed to eventually learn that it has
been released. So you are telling me that the current version doesn't
guarantee this?

The operation for which you need acquire consistency is in fact the
load of l[1]. The current approach is ambiguous about which object is
the atomic one: is it l[0], is it l[1], or is it the pair of them?
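
To make the ordering question concrete, here is a minimal sketch in C11
atomics. It is not musl's actual code: the function name
unlock_needs_wake and the layout (l[0] as lock word, l[1] as waiter
count) are assumptions for illustration. It shows a release store to
l[0] followed by a load of l[1], with a full fence in between standing
in for the explicit barrier discussed in this thread; without such a
fence the store and the later load may be reordered.

```c
#include <stdatomic.h>

/* Hypothetical two-word lock: l[0] = lock word, l[1] = waiter count. */
static int unlock_needs_wake(atomic_int l[2])
{
    /* Release the lock: earlier writes become visible to the next owner. */
    atomic_store_explicit(&l[0], 0, memory_order_release);

    /* Full barrier: keeps the load below from being satisfied before
     * the store above is globally visible (store-load ordering). */
    atomic_thread_fence(memory_order_seq_cst);

    /* Read the waiter count to decide whether a wake is needed. */
    return atomic_load_explicit(&l[1], memory_order_acquire) != 0;
}
```

The sketch only covers the unlocking side; it assumes that waiters
update l[1] and re-check l[0] with equally strong ordering.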

> > > and stdio locks too, but it's only been observed in malloc.
> > > Since there don't seem to be any performance-relevant uses of a_store
> > > that don't actually need the proper barrier, I think we have to just
> > > put an explicit barrier (lock orl $0,(%esp) or mfence) after the store
> > > and live with the loss of performance.
> > 
> > How about using an xchg instruction? This would perhaps "waste" a
> > register, but that sort of optimization should not be critical in the
> > vicinity of code that needs memory synchronization, anyhow.
> 
> How is this better? My intent was to avoid incurring a read on the
> cache line that's being written and instead achieve the
> synchronization by poking at a cache line (the stack) that should not
> be shared.

In fact, I think you need a read on the cache line, here, don't you?
You want to know the real value of l[1], no?

To be safe, I think this needs a full cmpxchg on the pair (l[0],
l[1]), otherwise you can't know if the waiter count l[1] corresponds
to the value just before the release of the lock.
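
As a rough sketch of what I mean, under the assumption that the two
words are packed into one 64-bit atomic object (the names pair_lock,
pack and pair_unlock are made up for this example and are not part of
musl):

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical packing: low 32 bits = lock word (l[0]),
 * high 32 bits = waiter count (l[1]). */
typedef _Atomic uint64_t pair_lock;

static inline uint64_t pack(uint32_t held, uint32_t waiters)
{
    return ((uint64_t)waiters << 32) | held;
}

/* Release the lock and return the waiter count that was current at
 * the very moment of the release. */
static uint32_t pair_unlock(pair_lock *l)
{
    uint64_t old = atomic_load_explicit(l, memory_order_relaxed);
    uint64_t new;
    do {
        /* Clear the lock word, keep the waiter count unchanged. */
        new = old & ~(uint64_t)0xffffffff;
    } while (!atomic_compare_exchange_weak_explicit(
                 l, &old, new,
                 memory_order_acq_rel, memory_order_relaxed));
    return (uint32_t)(old >> 32);
}
```

Because the cmpxchg covers both halves, the returned count is by
construction the one that was in place just before the release. The
cost is a read-modify-write on the shared cache line, and on 32-bit
targets this would need a 64-bit compare-and-swap.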


Jens

-- 
:: INRIA Nancy Grand Est ::: Camus ::::::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536   ::
:: :::::::::::::::::::::: gsm France : +33 651400183   ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::



