musl - Re: Protect pthreads' mutexes against use-after-destroy

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJgzZoqpWme-SeWtrggQaxiCNE_6jr88u4mw68ZV4cqzDx6kew@mail.gmail.com>
Date: Fri, 12 Jan 2024 08:53:54 -0800
From: enh <enh@...gle.com>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com, jvoisin <julien.voisin@...tri.org>
Subject: Re: Protect pthreads' mutexes against use-after-destroy

On Tue, Jan 9, 2024 at 5:58 PM Rich Felker <dalias@...c.org> wrote:
>
> On Tue, Jan 09, 2024 at 03:27:37PM -0800, enh wrote:
> > On Tue, Jan 9, 2024 at 11:07 AM Rich Felker <dalias@...c.org> wrote:
> > >
> > > On Tue, Jan 09, 2024 at 03:37:17PM +0100, jvoisin wrote:
> > > > Ohai,
> > > >
> > > > as discussed on irc, Android's bionic has a check to prevent
> > > > use-after-destroy on phtread mutexes
> > > > (https://github.com/LineageOS/android_bionic/blob/e0aac7df6f58138dae903b5d456c947a3f8092ea/libc/bionic/pthread_mutex.cpp#L803),
> > > > and musl doesn't.
> > > >
> > > > While odds are that this is a super-duper common bug, it would still be
> > > > nice to have this kind of protection, since it's cheap, and would
> > > > prevent/make it easy to diagnose weird states.
> > > >
> > > > Is this something that should/could be implemented?
> > > >
> > > > o/
> > >
> > > I think you meant that the odds are it's not common.
> >
> > it was common enough (and hard enough to debug) that we added this
> > "best effort" error detection to bionic :-)
>
> Thanks! That's useful information (and disturbing...)

yeah ... dan albert (who owns the Android NDK) points out that with a
large enough ecosystem, any mistake you can imagine -- and many you
can't -- aren't just possible, they're probable :-)

> > > There's already
> > > enough complexity in the code paths for supporting all the different
> > > mutex types that my leaning would be, if we do any hardening for
> > > use-after-destroy, that it should probably just take the form of
> > > putting the object in a state that will naturally deadlock or error
> > > rather than adding extra checks to every path where it's used.
> >
> > yeah, the _other_ reason we have the abort is that we've struggled
> > over the years to make it clear to the _callers_ that -- just because
> > you crash/hang in libc -- it's the _caller's_ bug. explicitly saying
> > so helps. (though we still get a decent number of people who don't
> > read/don't understand.)
>
> That's a problem we've just not tried to address in musl, but indeed
> it does happen fairly often. If we did try at some point, I think it
> would be by setting up a state to make the reason more clear in the
> debugger, not producing any output. But personally I'd like to keep
> driving home the message that the point of crash does not mean
> anything about responsibility for the crash, even if it means
> responding to lots of bogus "bug reports"...

from the other thread on the musl list right now where someone doesn't
realize they need to type `run` in the debugger, i'll let you draw
your own conclusion about how successful i think that's likely to be
:-) the trouble with the "someone's wrong on the internet, i must
correct them" approach is that it doesn't scale...

but, yes, we've had some value from a related idea of having the
software that generates a crash dump do the "first level triage" of a
crash and explicitly say things like "null pointer dereference" or
"this code aborted --- see if it logged anything first" and so on.

> Rich
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.