|
|
Message-ID: <20170618145339.GU1627@brightrain.aerifal.cx>
Date: Sun, 18 Jun 2017 10:53:39 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] a new lock algorithm with lock value and CS
counts in the same atomic int
On Sun, Jun 18, 2017 at 04:40:56PM +0300, Alexander Monakov wrote:
> Some style suggestions below.
>
> On Fri, 16 Jun 2017, Jens Gustedt wrote:
> > --- a/src/thread/__lock.c
> > +++ b/src/thread/__lock.c
> > @@ -1,15 +1,55 @@
> > #include "pthread_impl.h"
> >
> > +
> > +// INT_MIN for the lock, +1 for the count of this thread in the
> > +// critical section, has it that the difference between locked and
> > +// unlocked is ??INT_MAX.
> > +#if (INT_MIN+1) != -INT_MAX
> > +#error we need -INT_MAX to be INT_MIN+1
> > +#endif
>
> I suggest dropping this and spelling 'INT_MIN + current' as 'current - 1 -
> INT_MAX' below. (all you need is that INT_MIN <= -INT_MAX)
I think this can be dropped entirely for musl's purposes and
essentially for all practical purposes.
> > +
> > void __lock(volatile int *l)
> > {
> > - if (libc.threads_minus_1)
> > - while (a_swap(l, 1)) __wait(l, l+1, 1, 1);
> > + /* This test is mostly useless, now. Leaving it to the first CAS is
> > + probably just as efficient. */
> > + if (libc.threads_minus_1) {
>
> I suggest 'if (!libc.threads_minus_1) return;' and unindenting the rest of the
> function.
Agree.
> > + int current = 0;
> > + /* A first spin lock acquisition loop, for the case of
> > + no or medium congestion. */
> > + for (unsigned i = 0; i < 10; ++i) {
> > + /* This is purely speculative. */
>
> This comment, also duplicated below, doesn't look helpful.
>
> > + if (current < 0) current += INT_MAX;
> > + // assertion: current >= 0
> > + int val = a_cas(l, current, current - INT_MAX);
> > + if (val == current) return;
> > + current = val;
>
> Might be nice to use a_spin here.
I think so. At one point (before we used volatile to model atomics) it
was probably necessary to keep the spin loop from being optimized out
entirely; now it's just likely to be lighter on cpu/bus.
> > + }
> > + // Spinning failed, so mark ourselves as being inside the CS.
> > + current = a_fetch_add(l, 1) + 1;
> > + /* The main lock acquisition loop for heavy congestion. The only
> > + change to the value performed inside that loop is a successful
> > + lock via the CAS that acquires the lock. */
> > + for (;;) {
> > + /* We can only go into wait, if we know that somebody holds the
> > + lock and will eventually wake us up, again. */
> > + if (current < 0) {
> > + __syscall(SYS_futex, l, FUTEX_WAIT|FUTEX_PRIVATE, current, 0) != -ENOSYS
> > + || __syscall(SYS_futex, l, FUTEX_WAIT, current, 0);
>
> It's probably better to put this into a new static inline function in
> pthread_impl.h next to __wake (e.g. __futexwait) and use it here and in __wait.
Is there a reason __wait doesn't work?
> > + /* This is purely speculative. */
> > + current += INT_MAX;
> > + }
> > + /* assertion: current > 0, because the count
> > + includes us already. */
> > + int val = a_cas(l, current, INT_MIN + current);
> > + if (val == current) return;
> > + current = val;
> > + }
> > + }
> > }
> >
> > void __unlock(volatile int *l)
> > {
> > - if (l[0]) {
> > - a_store(l, 0);
> > - if (l[1]) __wake(l, 1, 1);
> > + if (a_fetch_add(l, INT_MAX) != -INT_MAX) {
> > + __syscall(SYS_futex, l, FUTEX_WAKE|FUTEX_PRIVATE, 1);
>
> This change lost FUTEX_PRIVATE fallback handling; I don't see any reason not to
> continue using __wake here.
Agreed.
> > --- a/src/thread/pthread_detach.c
> > +++ b/src/thread/pthread_detach.c
> > @@ -6,7 +6,7 @@ int __pthread_join(pthread_t, void **);
> > static int __pthread_detach(pthread_t t)
> > {
> > /* Cannot detach a thread that's already exiting */
> > - if (a_swap(t->exitlock, 1))
> > + if (a_cas(t->exitlock, 0, -INT_MAX))
> > return __pthread_join(t, 0);
> > t->detached = 2;
> > __unlock(t->exitlock);
>
> A good way to catch this would be to introduce a struct for the lock (out of
> scope of this patch, of course).
I see reasons for and against that. I'm not strongly against it but
just making a policy not to poke directly at these locks outside of
"lock implementation" files (perhaps adding a macro or inline function
to libc.h to do this instead?) seems like it would work just as well.
It was a poor choice to write the above direct a_cas to begin with, I
think.
Rich
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.