Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200815030202.GT3265@brightrain.aerifal.cx>
Date: Fri, 14 Aug 2020 23:02:02 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Restrictions on child context after multithreaded fork

On Fri, Aug 14, 2020 at 08:07:26PM -0600, Ariadne Conill wrote:
> Hello,
> 
> On 2020-08-14 15:41, Rich Felker wrote:
> >musl 1.2.1 has exposed bugs in several applications and libraries
> >caused by async-signal-unsafe code between (multithreaded) fork and
> >subsequent exec. So far, dbus library code, pulseaudio library code,
> >and libvirt have been found to be affected. A couple of the bug
> >reports (with incomplete information) are:
> 
> I suspect the dbus library code has been broken for some amount of
> time because we've been tracking an issue where dbus stalls on KDE
> Wayland for a short period of time causing the session to come up
> strangely. 1.2.1 did make this stall more pronounced though.

Yes, the open bug on their tracker goes back something like 3 years or
more, with no resolution or progress.

> >https://gitlab.alpinelinux.org/alpine/aports/-/issues/11602
> >https://gitlab.alpinelinux.org/alpine/aports/-/issues/11815
> >
> >Fixing the affected library code looks very straightforward; it's just
> >a matter of doing proper iterations of existing data/state rather than
> >allocating lists and using opendir on procfs and such. I've discussed
> >fixes with Alpine Linux folks and I believe fixes have been tested,
> >but I don't see any patches in aports yet.
> 
> I'll work on getting patches in next week.

Great!

> >I've seen suspicions that the switch to mallocng exposed this, but I'm
> >pretty sure it was:
> >
> >https://git.musl-libc.org/cgit/musl/commit/?id=e01b5939b38aea5ecbe41670643199825874b26c
> 
> We carried this patch before upgrade to 1.2.1 and it was mostly
> fine, has been carried since 1.1.24-r7 AFAIK.

A. Wilcox reported similar findings, so either something more subtle
is going on, or the per-bin locks in oldmalloc just made it less
likely to hit the same lock in the child.

> >In case we do want to follow a direction of trying to provide some
> >degree of relaxation of restrictions on the child (taking the liberty
> >of POSIX-future drop of fork's AS-safety requirement), I did a quick
> >survey of libc-internal locks, and found:
> 
> I think it is better to fix programs to not depend on AS-unsafe
> functions at fork time.  Being lax on this requirement is an
> indicator of other bad engineering decisions in these programs,
> especially libvirt and pulseaudio.

My leaning is the same -- the code is doing sketchy stuff the authors
know is wrong, and should be fixed. And with current musl, it should
reliably deadlock if there's a problem, rather than corrupting memory
like in the past, so there's not a safety motivation for changing
this.

Since POSIX is dropping the AS-safety requirement from fork, though,
there is more legitimacy to arguments that libc "should" support this
usage (even though POSIX does not seem to be taking any action to
support such an expectation, just to make it so glibc is no longer
non-conforming in this regard).

With that said, I think it's at least something where we should watch
what other implementations are doing, and whether it becomes a QoI
issue. One compelling argument for doing it is that it makes it more
possible to link code with conflicting expectations, and so use of
threads as an implementation detail in libraries doesn't "leak"
impositions on the application using the library. Right now there are
sort of two competing perspectives:

1. Traditional use of fork is a very normal thing, and because
   secretly using threads behind the scenes can break that, you
   shouldn't secretly use threads behind the scenes.

2. Using threads as an implementation detail of library code is a very
   normal thing, and because forking and doing non-AS-safe stuff after
   you fork can break that, you shouldn't use fork in that way.

I'm strongly in camp 2 (for other reasons as well; I have plenty
reasons to hate fork and want to see it deprecated and replaced by
whatever enhancements to posix_spawn are needed to make users happy)
but I do see value in reconciling the two camps, too.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.