musl - Re: Status report and MT fork

Message-ID: <20201027211735.GV534@brightrain.aerifal.cx>
Date: Tue, 27 Oct 2020 17:17:35 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Status report and MT fork

On Sun, Oct 25, 2020 at 08:59:20PM -0400, Rich Felker wrote:
> On Sun, Oct 25, 2020 at 08:50:29PM -0400, Rich Felker wrote:
> > I just pushed a series of changes (up through 0b87551bdf) I've had
> > queued for a while now, some of which had minor issues that I think
> > have all been resolved now. They cover a range of bugs found in the
> > process of reviewing the possibility of making fork provide a
> > consistent execution environment for the child of a multithreaded
> > parent, and a couple of unrelated fixes.
> > 
> > Based on distro experience with musl 1.2.1, I'm working on getting the
> > improved fork into 1.2.2. Despite the fact that 1.2.1 did not break
> > anything that wasn't already broken (apps invoking UB in MT-forked
> > children), prior to it most of the active breakage was hit with very
> > low probability, so there were a lot of packages people *thought* were
> > working, that weren't, and feedback from distros seems to be that
> > getting everything working as reliably as before (even if it was
> > imperfect and dangerous before) is not tractable in any reasonable
> > time frame. And in particular, I'm concerned about language runtimes
> > like Ruby that seem to have a contract with applications they host to
> > support MT-forked children. Fixing these is not a matter of fixing a
> > finite set of bugs but fixing a contract, which is likely not
> > tractable.
> > 
> > Assuming it goes through, the change here will be far more complete
> > than glibc's handling of MT-forked children, where most things other
> > than malloc don't actually work, but fail sufficiently infrequently
> > that they seem to work. While there are a lot of things I dislike
> > about this path, one major thing I do like is that it really makes
> > internal use of threads by library code (including third party libs)
> > transparent to the application, rather than "transparent, until you
> > use fork".
> > 
> > Will follow up with draft patch for testing.
> 
> Patch attached. It should suffice for testing and review of whether
> there are any locks/state I overlooked. It could possibly be made less
> ugly too.
> 
> Note that this does not strictly conform to past and current POSIX
> that specify fork as AS-safe. POSIX-future has removed fork from the
> AS-safe list, and seems to have considered its original inclusion
> erroneous due to pthread_atfork and existing implementation practice.
> The patch as written takes care to skip all locking in single-threaded
> parents, so it does not break the AS-safety property in single-threaded
> programs that may have made use of it. -Dfork=_Fork can also be used
> to get an AS-safe fork, but it's not equivalent to the old semantics;
> _Fork does not run atfork handlers. It's also possible to static-link
> an alternate fork implementation that provides its own pthread_atfork
> and calls _Fork, if really needed for a particular application.
> 
> Feedback from distro folks would be very helpful -- does this fix all
> the packages that 1.2.1 "broke"?
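
For readers less familiar with the atfork handlers mentioned above, here
is a minimal sketch (not part of the patch) of what fork does with them:
the prepare handler runs in the parent before the fork, then the parent
and child handlers run in their respective processes afterwards. The
names on_prepare, on_parent, on_child and fork_with_handlers are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Per-process counters; the child gets its own copy after fork. */
static int prepare_calls, parent_calls, child_calls;

static void on_prepare(void) { prepare_calls++; }
static void on_parent(void)  { parent_calls++; }
static void on_child(void)   { child_calls++; }

/* Registers the three handlers, forks once, and waits for the child.
 * The child exits with status 0 if it saw exactly the handlers it
 * should have seen (prepare inherited from the parent, plus child).
 * Returns the child's exit status, or -1 on any error. */
static int fork_with_handlers(void)
{
	if (pthread_atfork(on_prepare, on_parent, on_child))
		return -1;

	pid_t pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		int ok = prepare_calls == 1 && child_calls == 1
		      && parent_calls == 0;
		_exit(ok ? 0 : 1);
	}

	int status;
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling fork_with_handlers() once from main should return 0 and leave
prepare_calls and parent_calls at 1 in the parent. A fork that skips
the handlers, as described above for _Fork, would leave all three
counters untouched.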

Another bug:

> diff --git a/src/process/fork.c b/src/process/fork.c
> index a12da01a..ecf7f376 100644
> --- a/src/process/fork.c
> +++ b/src/process/fork.c
> @@ -1,13 +1,81 @@
>  #include <unistd.h>
>  #include "libc.h"
> +#include "lock.h"
> +#include "pthread_impl.h"
> +#include "fork_impl.h"
> +
> +static volatile int *const dummy_lockptr = 0;
> +
> +weak_alias(dummy_lockptr, __at_quick_exit_lockptr);
> +weak_alias(dummy_lockptr, __atexit_lockptr);
> +weak_alias(dummy_lockptr, __dlerror_lockptr);
> +weak_alias(dummy_lockptr, __gettext_lockptr);
> +weak_alias(dummy_lockptr, __random_lockptr);
> +weak_alias(dummy_lockptr, __sem_open_lockptr);
> +weak_alias(dummy_lockptr, __stdio_ofl_lockptr);
> +weak_alias(dummy_lockptr, __syslog_lockptr);
> +weak_alias(dummy_lockptr, __timezone_lockptr);
> +weak_alias(dummy_lockptr, __bump_lockptr);
> +
> +weak_alias(dummy_lockptr, __vmlock_lockptr);
> +
> +static volatile int *const *const atfork_locks[] = {
> +	&__at_quick_exit_lockptr,
> +	&__atexit_lockptr,
> +	&__dlerror_lockptr,
> +	&__gettext_lockptr,
> +	&__random_lockptr,
> +	&__sem_open_lockptr,
> +	&__stdio_ofl_lockptr,
> +	&__syslog_lockptr,
> +	&__timezone_lockptr,
> +	&__bump_lockptr,
> +};
>  
>  static void dummy(int x) { }
>  weak_alias(dummy, __fork_handler);
> +weak_alias(dummy, __malloc_atfork);
> +weak_alias(dummy, __ldso_atfork);
> +
> +static void dummy_0(void) { }
> +weak_alias(dummy_0, __tl_lock);
> +weak_alias(dummy_0, __tl_unlock);
>  
>  pid_t fork(void)
>  {
> +	sigset_t set;
>  	__fork_handler(-1);
> +	__block_app_sigs(&set);
> +	int need_locks = libc.need_locks > 0;
> +	if (need_locks) {
> +		__ldso_atfork(-1);
> +		__inhibit_ptc();
> +		for (int i=0; i<sizeof atfork_locks/sizeof *atfork_locks; i++)
> +			if (atfork_locks[i]) LOCK(*atfork_locks[i]);
                            ^^^^^^^^^^^^^^^

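Always non-null because it's missing a level of indirection:
atfork_locks[i] is the address of the lock pointer, not the lock pointer
itself. This causes statically linked programs to crash when a lock
pointer is null. Should be if (*atfork_locks[i]).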
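
A standalone sketch of the difference, using hypothetical stand-ins
(present_lockptr, absent_lockptr, lock_table) rather than the real musl
symbols. It assumes one subsystem is linked in and one is not, so that
one lock pointer is null in the way the weak dummy_lockptr alias is:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the patch's lock pointers: one points at
 * a real lock, the other is null like the dummy_lockptr fallback. */
static volatile int real_lock[1];
static volatile int *const present_lockptr = real_lock;
static volatile int *const absent_lockptr = 0;

/* Same shape as atfork_locks[]: each slot is the address of a lock
 * pointer variable, so the slot itself can never be null. */
static volatile int *const *const lock_table[] = {
	&present_lockptr,
	&absent_lockptr,
};

#define TABLE_LEN (sizeof lock_table / sizeof *lock_table)

/* Check as written in the patch: tests the slot, not the pointer. */
static size_t count_slot_nonnull(void)
{
	size_t n = 0;
	for (size_t i = 0; i < TABLE_LEN; i++)
		if (lock_table[i]) n++;
	return n;
}

/* Corrected check: dereference once to test the lock pointer. */
static size_t count_lockptr_nonnull(void)
{
	size_t n = 0;
	for (size_t i = 0; i < TABLE_LEN; i++)
		if (*lock_table[i]) n++;
	return n;
}

/* Sanity check of the two counts for the table above. */
static void check_lock_table(void)
{
	assert(count_slot_nonnull() == 2);    /* both slots pass the test */
	assert(count_lockptr_nonnull() == 1); /* only the real lock does */
}
```

With the first form, the loop would go on to lock through the null
pointer; with the second, that entry is skipped.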

> +		__malloc_atfork(-1);
> +		__tl_lock();
> +	}
> +	pthread_t self=__pthread_self(), next=self->next;
>  	pid_t ret = _Fork();
> +	if (need_locks) {
> +		if (!ret) {
> +			for (pthread_t td=next; td!=self; td=td->next)
> +				td->tid = -1;
> +			if (__vmlock_lockptr) {
> +				__vmlock_lockptr[0] = 0;
> +				__vmlock_lockptr[1] = 0;
> +			}
> +		}
> +		__tl_unlock();
> +		__malloc_atfork(!ret);
> +		for (int i=0; i<sizeof atfork_locks/sizeof *atfork_locks; i++)
> +			if (atfork_locks[i])
                            ^^^^^^^^^^^^^^^

And same here.

> +				if (ret) UNLOCK(*atfork_locks[i]);
> +				else **atfork_locks[i] = 0;
> +		__release_ptc();
> +		__ldso_atfork(!ret);
> +	}
> +	__restore_sigs(&set);
>  	__fork_handler(!ret);
>  	return ret;
>  }