Message-ID: <4408deeb62fe668bf720d3c6c8bedda2@ispras.ru>
Date: Sat, 11 Feb 2023 23:14:33 +0300
From: Alexey Izbyshev <izbyshev@...ras.ru>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] mq_notify: fix close/recv race on failure path

On 2023-02-11 22:49, Rich Felker wrote:
> On Sat, Feb 11, 2023 at 10:28:20PM +0300, Alexey Izbyshev wrote:
>> On 2023-02-11 21:35, Rich Felker wrote:
>> >On Sat, Feb 11, 2023 at 09:08:53PM +0300, Alexey Izbyshev wrote:
>> >>On 2023-02-11 20:59, Rich Felker wrote:
>> >>>On Sat, Feb 11, 2023 at 08:50:15PM +0300, Alexey Izbyshev wrote:
>> >>>>On 2023-02-11 20:13, Markus Wichmann wrote:
>> >>>>>On Sat, Feb 11, 2023 at 10:06:03AM -0500, Rich Felker wrote:
>> >>>>>>--- a/src/thread/pthread_detach.c
>> >>>>>>+++ b/src/thread/pthread_detach.c
>> >>>>>>@@ -5,8 +5,12 @@ static int __pthread_detach(pthread_t t)
>> >>>>>> {
>> >>>>>>     /* If the cas fails, detach state is either already-detached
>> >>>>>>      * or exiting/exited, and pthread_join will trap or cleanup. */
>> >>>>>>-    if (a_cas(&t->detach_state, DT_JOINABLE, DT_DETACHED) !=
>> >>>>>>DT_JOINABLE)
>> >>>>>>+    if (a_cas(&t->detach_state, DT_JOINABLE, DT_DETACHED) !=
>> >>>>>>DT_JOINABLE) {
>> >>>>>>+        int cs;
>> >>>>>>+        __pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &cs);
>> >>>>>>         return __pthread_join(t, 0);
>> >>>>>          ^^^^^^ I think you forgot to rework this.
>> >>>>>>+        __pthread_setcancelstate(cs, 0);
>> >>>>>>+    }
>> >>>>>>     return 0;
>> >>>>>> }
>> >>>>>>
>> >>>>>
>> >>>>>I see no other obvious missteps, though.
>> >>>>>
>> >>>>Same here, apart from this and misspelled "pthred_detach" in the
>> >>>>commit message, the patches look good to me.
>> >>>>
>> >>>>Regarding the POSIX requirement to run sigev_notify_function in the
>> >>>>context of a detached thread, while it's possible to observe the
>> >>>>wrong detachstate for a short while via pthread_getattr_np after
>> >>>>these patches, I'm not sure there is a standard way to do that. Even
>> >>>>if it exists, this minor issue may be not worth caring about.
>> >>>
>> >>>Would this just be if the notification callback executes before
>> >>>mq_notify returns in the parent?
>> >>
>> >>Yes, it seems so.
>> >>
>> >>>I suppose we could have the newly
>> >>>created thread do the work of making the syscall, handling the error
>> >>>case, detaching itself on success and and reporting back to the
>> >>>mq_notify function whether it succeeded or failed via the
>> >>>semaphore/args structure. Thoughts on that?
>> >>>
>> >>Could we just move pthread_detach call to the worker thread to the
>> >>point after pthread_cleanup_pop?
>> >
>> >I thought that sounded dubious, in that it might lead to an attempt to
>> >join a detached thread, but maybe it's safe to assume recv will never
>> >return if the mq_notify syscall failed...?
>> >
>> Actually, because app signals are not blocked when the worker thread
>> is created, recv can indeed return early with EINTR. But this looks
>> like just a bug.
>
> Yes. While it's not a conformance bug to run with signals unblocked
> ("The signal mask of this thread is implementation-defined.") it's a
> functional bug to ever introduce threads that don't block all
> application signals, since these interfere with sigwait & other
> application control of where signals are delivered. This is an
> oversight. I'll make it mask all signals.
>
>> Otherwise, mq_notify already assumes that recv can't return before
>> SYS_mq_notify (if it did, the syscall would try to register a closed
>> fd). I haven't tried to prove it (e.g. maybe recv may need to
>> allocate something before blocking and hence can fail with ENOMEM?),
>> but if it's true, I don't see how a failed SYS_mq_notify could cause
>> recv to return, so joining a detached thread should be impossible if
>> we make pthread_detach follow recv.
>
> I'm thinking for now maybe we should just drop the joining on error,
> and leave it starting out detached. While recv should not fail, it's
> obviously possible to make it fail in a seccomp sandbox, and you don't
> want that to turn into UB inside the implementation. If it does fail,
> the thread should still exit, but we have no way to synchronize with
> the mq_notify parent to decide whether it's being joined or not in
> this case without extra sync machinery...
>
By dropping pthread_join we'd avoid introducing a new UB case if recv
fails unexpectedly, but the existing case that I mentioned
(SYS_mq_notify trying to register a closed fd) would remain. It seems
to me that moving SYS_mq_notify into the worker thread as you suggested
earlier is the cleanest option if we're worrying about recv.

Alexey