Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20240724165917.GD10433@brightrain.aerifal.cx>
Date: Wed, 24 Jul 2024 12:59:18 -0400
From: Rich Felker <dalias@...c.org>
To: Li JinCheng <naiveli233@...look.com>
Cc: "musl@...ts.openwall.com" <musl@...ts.openwall.com>
Subject: Re: Re: 回复: RE: Maybe A Bug about timer_create and pthread_barrier_wait

On Tue, Jul 23, 2024 at 07:51:00PM -0400, Rich Felker wrote:
> On Thu, Jul 11, 2024 at 02:56:55PM +0000, Li JinCheng wrote:
> > Hi
> > >pthread_barrier_destroy() waits for all threads to be done using the
> > >barrier before destroying it.  Without it, the storage for the barrier
> > >can be deallocated when timer_create() returns while the other thread is
> > >still using it inside the pthread_barrier_wait() implementation.
> > 
> > Sorry, I have changed to use outlook to send emails. Back to the
> > original topic, I don't think add pthread_barrier_destroy in
> > timer_create can solve this bug. In timer_create, b->_b_limit should
> > be 1, and it will never meet the waiting condition (b->_b_limit < 0)
> > in pthread_barrier_destroy.
> > And I think add additional synchronization to timer_create is a
> > temporary method. Maybe some bugfix is needed in
> > pthread_barrier_wait.
> 
> There's an old unverified suspicion that our pthread barrier
> implementation has serious flaws. Unfortunately I hadn't gotten around
> to looking into it, partly because it's such an obscure sync primitive
> that almost nothing uses. We really DO need to fix it, but I don't
> want fixing timer_create to be blocked on that. Attached is a patch
> I'll probably apply (not yet tested) that just uses semaphores instead
> of a barrier. I think this will be an improvement even after barriers
> are fixed, since it will avoid pulling in the barrier code in static
> linking.
> 
> Rich

> diff --git a/src/time/timer_create.c b/src/time/timer_create.c
> index 9216b3ab..9dc0fb05 100644
> --- a/src/time/timer_create.c
> +++ b/src/time/timer_create.c
> @@ -1,6 +1,7 @@
>  #include <time.h>
>  #include <setjmp.h>
>  #include <limits.h>
> +#include <semaphore.h>
>  #include "pthread_impl.h"
>  #include "atomic.h"
>  
> @@ -12,7 +13,7 @@ struct ksigevent {
>  };
>  
>  struct start_args {
> -	pthread_barrier_t b;
> +	sem_t sem1, sem2;
>  	struct sigevent *sev;
>  };
>  
> @@ -42,7 +43,8 @@ static void *start(void *arg)
>  	void (*notify)(union sigval) = args->sev->sigev_notify_function;
>  	union sigval val = args->sev->sigev_value;
>  
> -	pthread_barrier_wait(&args->b);
> +	sem_post(&args->sem1);
> +	while (sem_wait(&args->sem2));

This code as written is wrong. Since the parent owns the object
lifetimes, the last operation in the child (timer) must be posting to
the semaphore the parent is waiting on. New draft attached.

Rich

View attachment "0001-timer_create-replace-pthread-barrier-with-semaphores.patch" of type "text/plain" (2889 bytes)

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.