Message-ID: <20221004164557.GO29905@brightrain.aerifal.cx>
Date: Tue, 4 Oct 2022 12:45:57 -0400
From: Rich Felker <dalias@...c.org>
To: James Y Knight <jyknight@...gle.com>
Cc: musl@...ts.openwall.com
Subject: Re: Illegal killlock skipping when transitioning to
 single-threaded state

On Tue, Oct 04, 2022 at 12:24:14PM -0400, James Y Knight wrote:
> On Tue, Oct 4, 2022 at 10:13 AM Rich Felker <dalias@...c.org> wrote:
> 
> > If this is actually the case, it's disturbing that GCC does not seem
> > to be getting it right either...
> >
> 
> The __sync_* builtins are legacy and were never particularly well-defined
> -- especially for non-x86 platforms. (Note that they don't include atomic
> load/store operations, which are effectively unnecessary on x86, but vital
> on most other architectures).
> 
> I would suggest that musl (and anyone else) really ought to migrate from
> its homegrown atomics support to the standard C11 atomic memory model,
> which _is_ well-defined and extensively studied. Such a migration will
> certainly be a Project, of course...

We do not use the __sync builtins. Atomics in musl are implemented
entirely in asm, because the compilers do not get theirs right and do
not support the runtime selection of methods necessary for some of the
archs we support (especially 32-bit arm and sh).

The atomics in musl implement the "POSIX memory model" which is much
simpler to understand and less error-prone than the C11 one (with the
tradeoff being that it admits a lot less optimization for
performance), and is a valid implementation choice for the C11 one. It
has only one relationship, "synchronizes memory", that all
synchronization primitives and atomics entail.
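
To illustrate the "valid implementation choice" point in C11 terms, here
is a minimal sketch in which every atomic operation is sequentially
consistent and bracketed by full fences, so each one "synchronizes
memory". This is not musl's code: musl's atomics are written in asm, and
the names strong_cas, strong_store and strong_fetch_add are hypothetical,
chosen only for this example.

```c
#include <stdatomic.h>

/* Hypothetical helpers, NOT musl's actual primitives (those are asm).
 * Each operation is seq_cst and surrounded by full fences, so it acts
 * as a full barrier, which is stronger than C11 requires. */

/* Compare-and-swap; returns the value observed in *p before the call. */
static inline int strong_cas(atomic_int *p, int expected, int desired)
{
    atomic_thread_fence(memory_order_seq_cst);
    /* On failure, 'expected' is overwritten with the current value;
     * on success it already equals the old value. */
    atomic_compare_exchange_strong_explicit(p, &expected, desired,
        memory_order_seq_cst, memory_order_seq_cst);
    atomic_thread_fence(memory_order_seq_cst);
    return expected;
}

/* Store with full-barrier semantics. */
static inline void strong_store(atomic_int *p, int v)
{
    atomic_thread_fence(memory_order_seq_cst);
    atomic_store_explicit(p, v, memory_order_seq_cst);
    atomic_thread_fence(memory_order_seq_cst);
}

/* Fetch-and-add; returns the previous value. */
static inline int strong_fetch_add(atomic_int *p, int v)
{
    atomic_thread_fence(memory_order_seq_cst);
    int old = atomic_fetch_add_explicit(p, v, memory_order_seq_cst);
    atomic_thread_fence(memory_order_seq_cst);
    return old;
}
```

Code written against the weaker C11 orders remains correct when every
operation is this strong; the cost is the lost optimization opportunity
mentioned above.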

The migration that might happen at some point is to a weaker model
for the C11 synchronization primitives, and possibly for the POSIX ones
in the future if POSIX adopts a weaker model. Of course we could also
do this for our own implementation-internal locks, independent of
whether POSIX makes any changes, but among those the only ones that
have any significant effect on performance are the ones in malloc.
It's likely that mallocng could benefit a lot from relaxed-order
atomics on the free bitmasks and weaker acquire/release semantics on
the malloc lock. But this would only help archs where weaker forms are
available.
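
As a rough sketch of what those weaker forms look like when spelled in
C11, assuming a bitmask of freed slots and a simple spinlock: this is
not mallocng's actual code, and struct group_sketch, mark_freed,
take_freed, lock_acquire and lock_release are hypothetical names. It
only shows the spelling; whether relaxed order is sufficient in a real
allocator depends on what other synchronization orders the accesses to
the freed memory itself.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for per-group allocator metadata. */
struct group_sketch {
    _Atomic uint32_t freed_mask;   /* bit i set => slot i has been freed */
};

/* Mark slot idx (0..31) as freed using a relaxed read-modify-write.
 * The update itself is atomic, but it imposes no ordering on
 * surrounding memory accesses. */
static void mark_freed(struct group_sketch *g, int idx)
{
    atomic_fetch_or_explicit(&g->freed_mask, UINT32_C(1) << idx,
                             memory_order_relaxed);
}

/* Atomically take the whole set of freed slots, leaving the mask empty. */
static uint32_t take_freed(struct group_sketch *g)
{
    return atomic_exchange_explicit(&g->freed_mask, 0,
                                    memory_order_relaxed);
}

/* A minimal spinlock with acquire/release ordering instead of a full
 * barrier on each side. */
static atomic_flag sketch_lock = ATOMIC_FLAG_INIT;

static void lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&sketch_lock,
                                             memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void lock_release(void)
{
    atomic_flag_clear_explicit(&sketch_lock, memory_order_release);
}
```

On an arch like x86, where atomic read-modify-write operations are
already full barriers, code like this would compile to essentially the
same thing as the strong versions.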

Rich
