|
Message-ID: <20170618145339.GU1627@brightrain.aerifal.cx> Date: Sun, 18 Jun 2017 10:53:39 -0400 From: Rich Felker <dalias@...c.org> To: musl@...ts.openwall.com Subject: Re: [PATCH] a new lock algorithm with lock value and CS counts in the same atomic int On Sun, Jun 18, 2017 at 04:40:56PM +0300, Alexander Monakov wrote: > Some style suggestions below. > > On Fri, 16 Jun 2017, Jens Gustedt wrote: > > --- a/src/thread/__lock.c > > +++ b/src/thread/__lock.c > > @@ -1,15 +1,55 @@ > > #include "pthread_impl.h" > > > > + > > +// INT_MIN for the lock, +1 for the count of this thread in the > > +// critical section, has it that the difference between locked and > > +// unlocked is ??INT_MAX. > > +#if (INT_MIN+1) != -INT_MAX > > +#error we need -INT_MAX to be INT_MIN+1 > > +#endif > > I suggest dropping this and spelling 'INT_MIN + current' as 'current - 1 - > INT_MAX' below. (all you need is that INT_MIN <= -INT_MAX) I think this can be dropped entirely for musl's purposes and essentially for all practical purposes. > > + > > void __lock(volatile int *l) > > { > > - if (libc.threads_minus_1) > > - while (a_swap(l, 1)) __wait(l, l+1, 1, 1); > > + /* This test is mostly useless, now. Leaving it to the first CAS is > > + probably just as efficient. */ > > + if (libc.threads_minus_1) { > > I suggest 'if (!libc.threads_minus_1) return;' and unindenting the rest of the > function. Agree. > > + int current = 0; > > + /* A first spin lock acquisition loop, for the case of > > + no or medium congestion. */ > > + for (unsigned i = 0; i < 10; ++i) { > > + /* This is purely speculative. */ > > This comment, also duplicated below, doesn't look helpful. > > > + if (current < 0) current += INT_MAX; > > + // assertion: current >= 0 > > + int val = a_cas(l, current, current - INT_MAX); > > + if (val == current) return; > > + current = val; > > Might be nice to use a_spin here. I think so. At one point (before we used volatile to model atomics) it was probably necessary to keep the spin loop from being optimized out entirely; now it's just likely to be lighter on cpu/bus. > > + } > > + // Spinning failed, so mark ourselves as being inside the CS. > > + current = a_fetch_add(l, 1) + 1; > > + /* The main lock acquisition loop for heavy congestion. The only > > + change to the value performed inside that loop is a successful > > + lock via the CAS that acquires the lock. */ > > + for (;;) { > > + /* We can only go into wait, if we know that somebody holds the > > + lock and will eventually wake us up, again. */ > > + if (current < 0) { > > + __syscall(SYS_futex, l, FUTEX_WAIT|FUTEX_PRIVATE, current, 0) != -ENOSYS > > + || __syscall(SYS_futex, l, FUTEX_WAIT, current, 0); > > It's probably better to put this into a new static inline function in > pthread_impl.h next to __wake (e.g. __futexwait) and use it here and in __wait. Is there a reason __wait doesn't work? > > + /* This is purely speculative. */ > > + current += INT_MAX; > > + } > > + /* assertion: current > 0, because the count > > + includes us already. */ > > + int val = a_cas(l, current, INT_MIN + current); > > + if (val == current) return; > > + current = val; > > + } > > + } > > } > > > > void __unlock(volatile int *l) > > { > > - if (l[0]) { > > - a_store(l, 0); > > - if (l[1]) __wake(l, 1, 1); > > + if (a_fetch_add(l, INT_MAX) != -INT_MAX) { > > + __syscall(SYS_futex, l, FUTEX_WAKE|FUTEX_PRIVATE, 1); > > This change lost FUTEX_PRIVATE fallback handling; I don't see any reason not to > continue using __wake here. Agreed. > > --- a/src/thread/pthread_detach.c > > +++ b/src/thread/pthread_detach.c > > @@ -6,7 +6,7 @@ int __pthread_join(pthread_t, void **); > > static int __pthread_detach(pthread_t t) > > { > > /* Cannot detach a thread that's already exiting */ > > - if (a_swap(t->exitlock, 1)) > > + if (a_cas(t->exitlock, 0, -INT_MAX)) > > return __pthread_join(t, 0); > > t->detached = 2; > > __unlock(t->exitlock); > > A good way to catch this would be to introduce a struct for the lock (out of > scope of this patch, of course). I see reasons for and against that. I'm not strongly against it but just making a policy not to poke directly at these locks outside of "lock implementation" files (perhaps adding a macro or inline function to libc.h to do this instead?) seems like it would work just as well. It was a poor choice to write the above direct a_cas to begin with, I think. Rich
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.