Message-ID: <ed0d1c123249e9ed8bc82b0b513b82cf@ispras.ru>
Date: Wed, 05 Oct 2022 15:10:09 +0300
From: Alexey Izbyshev <izbyshev@...ras.ru>
To: musl@...ts.openwall.com
Subject: Re: Illegal killlock skipping when transitioning to single-threaded state

On 2022-10-05 04:00, Rich Felker wrote:
> On Wed, Sep 07, 2022 at 03:46:53AM +0300, Alexey Izbyshev wrote:
>> Reordering the "libc.need_locks = -1" assignment and
>> UNLOCK(E->killlock) and providing a store barrier between them
>> should fix the issue.
>
> Back to this, because it's immediately actionable without resolving
> the aarch64 atomics issue:
>
> Do you have something in mind for how this reordering is done, since
> there are other intervening steps that are potentially ordered with
> respect to either or both? I don't think there is actually any
> ordering constraint at all on the unlocking of killlock (with the
> accompanying assignment self->tid=0 kept with it) except that it be
> past the point where we are committed to the thread terminating
> without executing any more application code. So my leaning would be to
> move this block from the end of pthread_exit up to right after the
> point-of-no-return comment.
>
This was my conclusion as well back when I looked at it before sending
the report. I was initially concerned about whether reordering with
a_store(&self->detach_state, DT_EXITED) could cause an unwanted
observable change (pthread_tryjoin_np() returning EBUSY after a pthread
function acting on tid like pthread_getschedparam() returns ESRCH), but
no, pthread_tryjoin_np() will block/trap if the thread is not
DT_JOINABLE.

> Unfortunately while reading this I found another bug, this time a lock
> order one. __dl_thread_cleanup() takes a lock while the thread list
> lock is already held, but fork takes these in the opposite order. I
> think the lock here could be dropped and replaced with an atomic-cas
> list head, but that's rather messy and I'm open to other ideas.
>
I'm not sure why using a lock-free list is messy; it seems like a
perfect fit here to me.

However, doesn't __dl_vseterr() use the libc-internal allocator after
34952fe5de44a833370cbe87b63fb8eec61466d7? If so, the problem that
freebuf_queue was originally solving doesn't exist anymore. We still
can't call the allocator after __tl_lock(), but maybe this whole free
deferral approach can be reconsidered?

Alexey
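
For concreteness, a minimal sketch of the kind of atomic-cas list head
discussed above. It uses C11 atomics and invented names
(freebuf_push/freebuf_take_all) rather than musl's internal primitives,
so it only illustrates the approach of pushing deferred buffers with a
CAS on a head pointer and draining the whole list later where calling
free() is safe; it is not a proposed patch.

#include <stdatomic.h>
#include <stddef.h>

/* Deferred-free list: nodes are pushed with a CAS on the head pointer,
 * and the whole list is detached in one atomic exchange at a point
 * where freeing is known to be safe. */
struct freebuf {
        struct freebuf *next;
        /* buffer payload would follow */
};

static _Atomic(struct freebuf *) freebuf_head;

/* Safe to call with arbitrary locks held: never blocks, never
 * allocates. */
static void freebuf_push(struct freebuf *b)
{
        struct freebuf *old = atomic_load_explicit(&freebuf_head,
                memory_order_relaxed);
        do {
                b->next = old;
        } while (!atomic_compare_exchange_weak_explicit(&freebuf_head,
                &old, b, memory_order_release, memory_order_relaxed));
}

/* Called where no problematic locks are held: detach the whole list
 * at once; the caller then walks it and frees each node. */
static struct freebuf *freebuf_take_all(void)
{
        return atomic_exchange_explicit(&freebuf_head, NULL,
                memory_order_acquire);
}

Since the consumer side only ever detaches the whole list with an
exchange (there is no pop-one operation), the usual ABA concern with
CAS-based stacks does not arise here.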