|
Message-ID: <1a0289c15879bef6d538c0066f58545c@ispras.ru> Date: Wed, 14 Dec 2022 09:49:26 +0300 From: Alexey Izbyshev <izbyshev@...ras.ru> To: musl@...ts.openwall.com Subject: Re: [PATCH] mq_notify: fix close/recv race on failure path On 2022-12-14 05:26, Rich Felker wrote: > On Wed, Nov 09, 2022 at 01:46:13PM +0300, Alexey Izbyshev wrote: >> In case of failure mq_notify closes the socket immediately after >> sending a cancellation request to the worker thread that is going to >> call or have already called recv on that socket. Even if we don't >> consider the kernel behavior when the only descriptor to an object >> that >> is being used in a system call is closed, if the socket descriptor is >> closed before the kernel looks at it, another thread could open a >> descriptor with the same value in the meantime, resulting in recv >> acting on a wrong object. >> >> Fix the race by moving pthread_cancel call before the barrier wait to >> guarantee that the cancellation flag is set before the worker thread >> enters recv. >> --- >> Other ways to fix this: >> >> * Remove the racing close call from mq_notify and surround recv >> with pthread_cleanup_push/pop. >> >> * Make the worker thread joinable initially, join it before closing >> the socket on the failure path, and detach it on the happy path. >> This would also require disabling cancellation around join/detach >> to ensure that mq_notify itself is not cancelled in an inappropriate >> state. > > I'd put this aside for a while because of the pthread barrier > involvement I kinda didn't want to deal with. The fix you have sounds > like it works, but I think I'd rather pursue one of the other > approaches, probably the joinable thread one. > > At present, the implementation of barriers seems to be buggy (I need > to dig back up the post about that), and they're also a really > expensive synchronization tool that goes both directions where we > really only need one direction (notifying the caller we're done > consuming the args). I'd rather switch to a semaphore, which is the > lightest and most idiomatic (at least per present-day musl idioms) way > to do this. > This sounds good to me. The same approach can also be used in timer_create (assuming it's acceptable to add dependency on pthread_cancel to that code). > Using a joinable thread also lets us ensure we don't leave around > threads that are waiting to be scheduled just to exit on failure > return. Depending on scheduling attributes, this probably could be > bad. > I also prefer this approach, though mostly for aesthetic reasons (I haven't thought about the scheduling behavior). I didn't use it only because I felt it's a "logically larger" change than simply moving the pthread_barrier_wait call. And I wasn't aware that barriers are buggy in musl. Thanks, Alexey
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.