musl - Re: [PATCH] a new lock algorithm with lock value and CS counts in the same atomic int

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LNX.2.20.13.1706181620430.21867@monopod.intra.ispras.ru>
Date: Sun, 18 Jun 2017 16:40:56 +0300 (MSK)
From: Alexander Monakov <amonakov@...ras.ru>
To: musl@...ts.openwall.com
cc: Jens Gustedt <jens.gustedt@...ia.fr>
Subject: Re: [PATCH] a new lock algorithm with lock value and CS counts
 in the same atomic int

Some style suggestions below.

On Fri, 16 Jun 2017, Jens Gustedt wrote:
> --- a/src/thread/__lock.c
> +++ b/src/thread/__lock.c
> @@ -1,15 +1,55 @@
>  #include "pthread_impl.h"
>  
> +
> +// INT_MIN for the lock, +1 for the count of this thread in the
> +// critical section, has it that the difference between locked and
> +// unlocked is ??INT_MAX.
> +#if (INT_MIN+1) != -INT_MAX
> +#error we need -INT_MAX to be INT_MIN+1
> +#endif

I suggest dropping this and spelling 'INT_MIN + current' as 'current - 1 -
INT_MAX' below.  (all you need is that INT_MIN <= -INT_MAX)

> +
>  void __lock(volatile int *l)
>  {
> -	if (libc.threads_minus_1)
> -		while (a_swap(l, 1)) __wait(l, l+1, 1, 1);
> +	/* This test is mostly useless, now. Leaving it to the first CAS is
> +	   probably just as efficient. */
> +	if (libc.threads_minus_1) {

I suggest 'if (!libc.threads_minus_1) return;' and unindenting the rest of the
function.

> +		int current = 0;
> +		/* A first spin lock acquisition loop, for the case of
> +		   no or medium congestion. */
> +		for (unsigned i = 0; i < 10; ++i) {
> +			/* This is purely speculative. */

This comment, also duplicated below, doesn't look helpful.

> +			if (current < 0) current += INT_MAX;
> +			// assertion: current >= 0
> +			int val = a_cas(l, current, current - INT_MAX);
> +			if (val == current) return;
> +			current = val;

Might be nice to use a_spin here.

> +		}
> +		// Spinning failed, so mark ourselves as being inside the CS.
> +		current = a_fetch_add(l, 1) + 1;
> +		/* The main lock acquisition loop for heavy congestion. The only
> +		   change to the value performed inside that loop is a successful
> +		   lock via the CAS that acquires the lock. */
> +		for (;;) {
> +			/* We can only go into wait, if we know that somebody holds the
> +			   lock and will eventually wake us up, again. */
> +			if (current < 0) {
> +				__syscall(SYS_futex, l, FUTEX_WAIT|FUTEX_PRIVATE, current, 0) != -ENOSYS
> +					|| __syscall(SYS_futex, l, FUTEX_WAIT, current, 0);

It's probably better to put this into a new static inline function in
pthread_impl.h next to __wake (e.g. __futexwait) and use it here and in __wait.

> +				/* This is purely speculative. */
> +				current += INT_MAX;
> +			}
> +			/* assertion: current > 0, because the count
> +			   includes us already. */
> +			int val = a_cas(l, current, INT_MIN + current);
> +			if (val == current) return;
> +			current = val;
> +		}
> +	}
>  }
>  
>  void __unlock(volatile int *l)
>  {
> -	if (l[0]) {
> -		a_store(l, 0);
> -		if (l[1]) __wake(l, 1, 1);
> +	if (a_fetch_add(l, INT_MAX) != -INT_MAX) {
> +		__syscall(SYS_futex, l, FUTEX_WAKE|FUTEX_PRIVATE, 1);

This change lost FUTEX_PRIVATE fallback handling; I don't see any reason not to
continue using __wake here.

> --- a/src/thread/pthread_detach.c
> +++ b/src/thread/pthread_detach.c
> @@ -6,7 +6,7 @@ int __pthread_join(pthread_t, void **);
>  static int __pthread_detach(pthread_t t)
>  {
>  	/* Cannot detach a thread that's already exiting */
> -	if (a_swap(t->exitlock, 1))
> +	if (a_cas(t->exitlock, 0, -INT_MAX))
>  		return __pthread_join(t, 0);
>  	t->detached = 2;
>  	__unlock(t->exitlock);

A good way to catch this would be to introduce a struct for the lock (out of
scope of this patch, of course).

Alexander
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.