Message-ID: <20150520063631.GT17573@brightrain.aerifal.cx>
Date: Wed, 20 May 2015 02:36:32 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Refactoring atomics as llsc?

On Wed, May 20, 2015 at 08:33:23AM +0300, Timo Teras wrote:
> On Wed, 20 May 2015 01:11:08 -0400
> Rich Felker <dalias@...c.org> wrote:
> 
> > Of course the big outlier is x86, which is not llsc based but has
> > actual atomic primitives at the instruction level. If we defined the
> > sc() primitive to take 3 args instead of 2 (address, old value from
> > ll, new value to conditionally store; most archs would ignore the old
> > value argument) then we could model x86 with ll being a plain load and
> > sc being cmpxchg to allow any new custom primitives to work using
> > cmpxchg. Then we would just continue providing custom versions of all
> > the old a_* ops (a_cas, a_fetch_add, a_inc, a_dec, a_and, a_or,
> > a_swap) to take advantage of the x86 instructions. These versions
> > could probably be shared by all x86 variants (i386, x86_64, x32) since
> > they're operating on 32-bit values and the asm should be the same.
> 
> I wonder if calling that kind of emulation ll()/sc() would be
> misleading. Load-linked/store-conditional has stronger guarantees: sc
> will fail if the cache line was invalidated in between, the thread was
> pre-empted, etc.
> 
> cmpxchg can be used to emulate it only when the user is aware of the
> ABA problem (some other thread may have changed the value behind us
> multiple times). Such emulation is of course ok for a_fetch_add, etc.
> But one needs to be more careful when using pointers (and trying to
> make sure the same pointer was not first removed and later re-added).
> 
> And if you want to optimize the above mentioned cases, one really needs
> to know if it's true ll+sc, or write the synchronization differently.
> In these cases the algorithm is often implemented twice with the
> different available atomics.
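
To make the distinction above concrete, here is a minimal sketch of
the three-argument sc() emulated with compare-and-swap, written with
C11 <stdatomic.h> rather than inline asm. The names ll, sc, fetch_add
and aba_demo are hypothetical and are not musl's internal API. The
fetch_add loop is safe under this emulation, while aba_demo shows the
case where the emulated sc succeeds even though the value changed and
changed back in between.

```c
#include <stdatomic.h>

/* Hypothetical emulation for an arch without real ll/sc (e.g. x86). */

/* "ll" is just a plain atomic load; no reservation is taken. */
static inline int ll(_Atomic int *p)
{
    return atomic_load(p);
}

/* Three-argument "sc": stores new_val only if *p still equals old.
 * Returns nonzero on success. A real sc would ignore `old`. */
static inline int sc(_Atomic int *p, int old, int new_val)
{
    return atomic_compare_exchange_strong(p, &old, new_val);
}

/* Value-only operation: ABA is harmless here, because the result is
 * correct whenever the current value equals the one we loaded. */
static int fetch_add(_Atomic int *p, int v)
{
    int old;
    do old = ll(p);
    while (!sc(p, old, old + v));
    return old;
}

/* ABA illustration: the two stores stand in for another thread
 * changing the value 1 -> 2 -> 1 between our ll and sc. */
static int aba_demo(void)
{
    _Atomic int x = 1;
    int old = ll(&x);
    atomic_store(&x, 2);
    atomic_store(&x, 1);
    /* The emulated sc succeeds because only the value is compared.
     * A true sc would be expected to fail after intervening stores. */
    return sc(&x, old, 5);
}
```

For plain integers this difference does not matter. It matters when
the word is a pointer whose identity carries meaning, as in the
removed-and-re-added case described above.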

And yes, an alternative would be not to provide fake ll/sc for archs
without it, but instead to use the existing generic cas-based
implementations when ll/sc is not available. Then we'd have 2 generic
implementations of everything instead of just one, but it would
probably be cleaner.
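
A rough sketch of what the two generic implementations could look
like for a single operation. Everything here is an assumption made
for illustration: the ARCH_HAS_LLSC macro, the arch_ll/arch_sc/arch_cas
names, and the use of C11 atomics to stand in for the arch-provided
cas. It is not the actual musl source layout.

```c
#include <stdatomic.h>

#ifdef ARCH_HAS_LLSC   /* hypothetical feature macro */

/* The arch would supply real two-argument ll/sc primitives. */
int arch_ll(_Atomic int *p);
int arch_sc(_Atomic int *p, int v);   /* nonzero on success */

/* Generic implementation #1: built directly on ll/sc. */
static int generic_fetch_or(_Atomic int *p, int v)
{
    int old;
    do old = arch_ll(p);
    while (!arch_sc(p, old | v));
    return old;
}

#else

/* Stand-in for an arch-provided cas that returns the previous value.
 * On failure the C11 call writes the current value into t, and on
 * success t is unchanged, so t is the previous value either way. */
static int arch_cas(_Atomic int *p, int t, int s)
{
    atomic_compare_exchange_strong(p, &t, s);
    return t;
}

/* Generic implementation #2: built on cas, with no pretense of ll/sc. */
static int generic_fetch_or(_Atomic int *p, int v)
{
    int old;
    do old = atomic_load(p);
    while (arch_cas(p, old, old | v) != old);
    return old;
}

#endif
```

Callers see the same generic_fetch_or either way. The cost is that
each generic operation is written twice, but neither version has to
pretend the underlying primitive is something it is not.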

Rich
