Message-ID: <20230212003158.GO4163@brightrain.aerifal.cx>
Date: Sat, 11 Feb 2023 19:32:01 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] mq_notify: fix close/recv race on failure path

On Sat, Feb 11, 2023 at 11:14:33PM +0300, Alexey Izbyshev wrote:
> On 2023-02-11 22:49, Rich Felker wrote:
> >On Sat, Feb 11, 2023 at 10:28:20PM +0300, Alexey Izbyshev wrote:
> >>On 2023-02-11 21:35, Rich Felker wrote:
> >>>On Sat, Feb 11, 2023 at 09:08:53PM +0300, Alexey Izbyshev wrote:
> >>>>On 2023-02-11 20:59, Rich Felker wrote:
> >>>>>On Sat, Feb 11, 2023 at 08:50:15PM +0300, Alexey Izbyshev wrote:
> >>>>>>On 2023-02-11 20:13, Markus Wichmann wrote:
> >>>>>>>On Sat, Feb 11, 2023 at 10:06:03AM -0500, Rich Felker wrote:
> >>>>>>>>--- a/src/thread/pthread_detach.c
> >>>>>>>>+++ b/src/thread/pthread_detach.c
> >>>>>>>>@@ -5,8 +5,12 @@ static int __pthread_detach(pthread_t t)
> >>>>>>>> {
> >>>>>>>>     /* If the cas fails, detach state is either already-detached
> >>>>>>>>      * or exiting/exited, and pthread_join will trap or cleanup. */
> >>>>>>>>-    if (a_cas(&t->detach_state, DT_JOINABLE, DT_DETACHED) !=
> >>>>>>>>DT_JOINABLE)
> >>>>>>>>+    if (a_cas(&t->detach_state, DT_JOINABLE, DT_DETACHED) !=
> >>>>>>>>DT_JOINABLE) {
> >>>>>>>>+        int cs;
> >>>>>>>>+        __pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &cs);
> >>>>>>>>         return __pthread_join(t, 0);
> >>>>>>>          ^^^^^^ I think you forgot to rework this.
> >>>>>>>>+        __pthread_setcancelstate(cs, 0);
> >>>>>>>>+    }
> >>>>>>>>     return 0;
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>
> >>>>>>>I see no other obvious missteps, though.
> >>>>>>>
> >>>>>>Same here, apart from this and misspelled "pthred_detach" in the
> >>>>>>commit message, the patches look good to me.
> >>>>>>
> >>>>>>Regarding the POSIX requirement to run sigev_notify_function in the
> >>>>>>context of a detached thread, while it's possible to observe the
> >>>>>>wrong detachstate for a short while via pthread_getattr_np after
> >>>>>>these patches, I'm not sure there is a standard way to do that. Even
> >>>>>>if it exists, this minor issue may be not worth caring about.
> >>>>>
> >>>>>Would this just be if the notification callback executes before
> >>>>>mq_notify returns in the parent?
> >>>>
> >>>>Yes, it seems so.
> >>>>
> >>>>>I suppose we could have the newly
> >>>>>created thread do the work of making the syscall, handling the error
> >>>>>case, detaching itself on success and reporting back to the
> >>>>>mq_notify function whether it succeeded or failed via the
> >>>>>semaphore/args structure. Thoughts on that?
> >>>>>
> >>>>Could we just move pthread_detach call to the worker thread to the
> >>>>point after pthread_cleanup_pop?
> >>>
> >>>I thought that sounded dubious, in that it might lead to an attempt to
> >>>join a detached thread, but maybe it's safe to assume recv will never
> >>>return if the mq_notify syscall failed...?
> >>>
> >>Actually, because app signals are not blocked when the worker thread
> >>is created, recv can indeed return early with EINTR. But this looks
> >>like just a bug.
> >
> >Yes. While it's not a conformance bug to run with signals unblocked
> >("The signal mask of this thread is implementation-defined.") it's a
> >functional bug to ever introduce threads that don't block all
> >application signals, since these interfere with sigwait & other
> >application control of where signals are delivered. This is an
> >oversight. I'll make it mask all signals.
> >
> >>Otherwise, mq_notify already assumes that recv can't return before
> >>SYS_mq_notify (if it did, the syscall would try to register a closed
> >>fd). I haven't tried to prove it (e.g. maybe recv may need to
> >>allocate something before blocking and hence can fail with ENOMEM?),
> >>but if it's true, I don't see how a failed SYS_mq_notify could cause
> >>recv to return, so joining a detached thread should be impossible if
> >>we make pthread_detach follow recv.
> >
> >I'm thinking for now maybe we should just drop the joining on error,
> >and leave it starting out detached. While recv should not fail, it's
> >obviously possible to make it fail in a seccomp sandbox, and you don't
> >want that to turn into UB inside the implementation. If it does fail,
> >the thread should still exit, but we have no way to synchronize with
> >the mq_notify parent to decide whether it's being joined or not in
> >this case without extra sync machinery...
> >
> By dropping pthread_join we'd avoid introducing a new UB case if
> recv fails unexpectedly, but the existing case that I mentioned
> (SYS_mq_notify trying to register a closed fd) would remain. It
> seems to me that moving SYS_mq_notify into the worker thread as you
> suggested earlier is the cleanest option if we're worrying about
> recv.

OK, I've done and reworked this series in a way that I think addresses
all the problems.

Rich

View attachment "0001-fix-pthread_detach-inadvertently-acting-as-cancellat.patch" of type "text/plain" (1361 bytes)
View attachment "0002-mq_notify-use-semaphore-instead-of-barrier-to-sync-a.patch" of type "text/plain" (2163 bytes)
View attachment "0003-mq_notify-rework-to-fix-use-after-close-double-close.patch" of type "text/plain" (2744 bytes)
View attachment "0004-mq_notify-join-worker-thread-before-returning-in-err.patch" of type "text/plain" (1613 bytes)
View attachment "0005-mq_notify-block-all-application-signals-in-the-worke.patch" of type "text/plain" (1869 bytes)