|
Message-ID: <20201027211735.GV534@brightrain.aerifal.cx> Date: Tue, 27 Oct 2020 17:17:35 -0400 From: Rich Felker <dalias@...c.org> To: musl@...ts.openwall.com Subject: Re: Status report and MT fork On Sun, Oct 25, 2020 at 08:59:20PM -0400, Rich Felker wrote: > On Sun, Oct 25, 2020 at 08:50:29PM -0400, Rich Felker wrote: > > I just pushed a series of changes (up through 0b87551bdf) I've had > > queued for a while now, some of which had minor issues that I think > > have all been resolved now. They cover a range of bugs found in the > > process of reviewing the possibility of making fork provide a > > consistent execution environment for the child of a multithreaded > > parent, and a couple unrelated fixes. > > > > Based on distro experience with musl 1.2.1, I'm working on getting the > > improved fork into 1.2.2. Despite the fact that 1.2.1 did not break > > anything that wasn't already broken (apps invoking UB in MT-forked > > children), prior to it most of the active breakage was hit with very > > low probability, so there were a lot of packages people *thought* were > > working, that weren't, and feedback from distros seems to be that > > getting everything working as reliably as before (even if it was > > imperfect and dangerous before) is not tractable in any reasonable > > time frame. And in particular, I'm concerned about language runtimes > > like Ruby that seem to have a contract with applications they host to > > support MT-forked children. Fixing these is not a matter of fixing a > > finite set of bugs but fixing a contract, which is likely not > > tractable. > > > > Assuming it goes through, the change here will be far more complete > > than glibc's handling of MT-forked children, where most things other > > than malloc don't actually work, but fail sufficiently infrequently > > that they seem to work. While there are a lot of things I dislike > > about this path, one major thing I do like is that it really makes > > internal use of threads by library code (including third party libs) > > transparent to the application, rather than "transparent, until you > > use fork". > > > > Will follow up with draft patch for testing. > > Patch attached. It should suffice for testing and review of whether > there are any locks/state I overlooked. It could possibly be made less > ugly too. > > Note that this does not strictly conform to past and current POSIX > that specify fork as AS-safe. POSIX-future has removed fork from the > AS-safe list, and seems to have considered its original inclusion > erroneous due to pthread_atfork and existing implementation practice. > The patch as written takes care to skip all locking in single-threaded > parents, so it does not break AS-safety property in single-threaded > programs that may have made use of it. -Dfork=_Fork can also be used > to get an AS-safe fork, but it's not equivalent to old semantics; > _Fork does not run atfork handlers. It's also possible to static-link > an alternate fork implementation that provides its own pthread_atfork > and calls _Fork, if really needed for a particular application. > > Feedback from distro folks would be very helpful -- does this fix all > the packages that 1.2.1 "broke"? Another bug: > diff --git a/src/process/fork.c b/src/process/fork.c > index a12da01a..ecf7f376 100644 > --- a/src/process/fork.c > +++ b/src/process/fork.c > @@ -1,13 +1,81 @@ > #include <unistd.h> > #include "libc.h" > +#include "lock.h" > +#include "pthread_impl.h" > +#include "fork_impl.h" > + > +static volatile int *const dummy_lockptr = 0; > + > +weak_alias(dummy_lockptr, __at_quick_exit_lockptr); > +weak_alias(dummy_lockptr, __atexit_lockptr); > +weak_alias(dummy_lockptr, __dlerror_lockptr); > +weak_alias(dummy_lockptr, __gettext_lockptr); > +weak_alias(dummy_lockptr, __random_lockptr); > +weak_alias(dummy_lockptr, __sem_open_lockptr); > +weak_alias(dummy_lockptr, __stdio_ofl_lockptr); > +weak_alias(dummy_lockptr, __syslog_lockptr); > +weak_alias(dummy_lockptr, __timezone_lockptr); > +weak_alias(dummy_lockptr, __bump_lockptr); > + > +weak_alias(dummy_lockptr, __vmlock_lockptr); > + > +static volatile int *const *const atfork_locks[] = { > + &__at_quick_exit_lockptr, > + &__atexit_lockptr, > + &__dlerror_lockptr, > + &__gettext_lockptr, > + &__random_lockptr, > + &__sem_open_lockptr, > + &__stdio_ofl_lockptr, > + &__syslog_lockptr, > + &__timezone_lockptr, > + &__bump_lockptr, > +}; > > static void dummy(int x) { } > weak_alias(dummy, __fork_handler); > +weak_alias(dummy, __malloc_atfork); > +weak_alias(dummy, __ldso_atfork); > + > +static void dummy_0(void) { } > +weak_alias(dummy_0, __tl_lock); > +weak_alias(dummy_0, __tl_unlock); > > pid_t fork(void) > { > + sigset_t set; > __fork_handler(-1); > + __block_app_sigs(&set); > + int need_locks = libc.need_locks > 0; > + if (need_locks) { > + __ldso_atfork(-1); > + __inhibit_ptc(); > + for (int i=0; i<sizeof atfork_locks/sizeof *atfork_locks; i++) > + if (atfork_locks[i]) LOCK(*atfork_locks[i]); ^^^^^^^^^^^^^^^ Always non-null because it's missing a level of indirection; causes static linked program to crash. Should be if (*atfork_locks[i]). > + __malloc_atfork(-1); > + __tl_lock(); > + } > + pthread_t self=__pthread_self(), next=self->next; > pid_t ret = _Fork(); > + if (need_locks) { > + if (!ret) { > + for (pthread_t td=next; td!=self; td=td->next) > + td->tid = -1; > + if (__vmlock_lockptr) { > + __vmlock_lockptr[0] = 0; > + __vmlock_lockptr[1] = 0; > + } > + } > + __tl_unlock(); > + __malloc_atfork(!ret); > + for (int i=0; i<sizeof atfork_locks/sizeof *atfork_locks; i++) > + if (atfork_locks[i]) ^^^^^^^^^^^^^^^ And same here. > + if (ret) UNLOCK(*atfork_locks[i]); > + else **atfork_locks[i] = 0; > + __release_ptc(); > + __ldso_atfork(!ret); > + } > + __restore_sigs(&set); > __fork_handler(!ret); > return ret; > }
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.