|
Message-ID: <20240710191142.85448-1-tavianator@tavianator.com> Date: Wed, 10 Jul 2024 15:11:42 -0400 From: Tavian Barnes <tavianator@...ianator.com> To: musl@...ts.openwall.com Cc: 250200715@...com Subject: RE: Maybe A Bug about timer_create and pthread_barrier_wait (I cleaned up the HTML entities in this email, but please use plain text mode next time) > Hello: > > I had a low-probability crash in the child thread when using the > timer_create interface. After debug, I found that the crash occured > when the sub-thread accessed in code "if (b->_b_waiters)" which is a > stack variable created in the main thread and passed to child thread > by args. It looks like the main thread's timer_create has finished > executing at this point, so the variables (start_args) on the stack > have been cleaned up. I take a look at the pthread_barrier_wait code > and I think it should be a scheduling problem in pthread_barrier_wait. > > Take the timer_create as an example, when the child thread is the > first thread for "pthread_barrier_wait" and it is suspened after it > executes the code "a_store(&b->_b_lock, 0)", then the main thread in > timer_create will arrive as the last thread, it will nerver wait for > the child thread to be rescheduled, the main thread can pass the > barrier and continue execution, the args created in timer_create will > be cleaned up. when the child thread is finally rescheduled, it access > the "b->_b_waiters" which has already been cleaned up by main thread > and the crash will occur. > > Is there a bug here? Looking forward to your reply. > > /* First thread to enter the barrier becomes the "instance owner" */ > if (!inst) { > struct instance new_inst = { 0 }; > int spins = 200; > b->_b_inst = inst = &new_inst; > a_store(&b->_b_lock, 0); > if (b->_b_waiters) __wake(&b->_b_lock, 1, 1); // crash here b->_b_waiters > while (spins-- && !inst->finished) > > /* First thread to enter the barrier becomes the "instance owner" */ > if (!inst) { > struct instance new_inst = { 0 }; > int spins = 200; > b->_b_inst = inst = &new_inst; > a_store(&b->_b_lock, 0); > // when the child thread is the first thread and is scheduled out here > > if (b->_b_waiters) __wake(&b->_b_lock, 1, 1); > while (spins-- && !inst->finished) This looks like a real bug in timer_create() to me. Here's an untested fix: From: Tavian Barnes <tavianator@...ianator.com> Date: Wed, 10 Jul 2024 14:27:21 -0400 Subject: [PATCH] timer_create: destroy the barrier before returning pthread_barrier_destroy() waits for all threads to be done using the barrier before destroying it. Without it, the storage for the barrier can be deallocated when timer_create() returns while the other thread is still using it inside the pthread_barrier_wait() implementation. Link: https://www.openwall.com/lists/musl/2024/07/08/1 --- src/time/timer_create.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/time/timer_create.c b/src/time/timer_create.c index 9216b3ab..42c69848 100644 --- a/src/time/timer_create.c +++ b/src/time/timer_create.c @@ -121,6 +121,7 @@ int timer_create(clockid_t clk, struct sigevent *restrict evp, timer_t *restrict } td->timer_id = timerid; pthread_barrier_wait(&args.b); + pthread_barrier_destroy(&args.b); if (timerid < 0) return -1; *res = (void *)(INTPTR_MIN | (uintptr_t)td>>1); break;
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.