Date: Tue, 10 Apr 2018 11:35:50 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: tcmalloc compatibility

On Tue, Apr 10, 2018 at 10:45:03AM -0400, Bobby Powers wrote:
> On Tue, Apr 10, 2018 at 2:34 PM Rich Felker <dalias@...c.org> wrote:
> > This claim doesn't seem to be well-justified. Myself and members of
> > our community have written a lot on why existing malloc interposition
> > hacks are broken, but there's also an interest in what it would take to
> > make it work, and I particularly am interested in this from a
> > standpoint that musl's malloc is not very good, and that being able to
> > dynamically interpose it would facilitate developing and testing a
> > replacement.
> 
> This sounds super interesting -- what needs to happen to make progress
> on this?  I would love to help out.

For allowing interposition, it's mainly working out policy so that
it's clear what can and can't be supported and we don't get stuck
seemingly promising something impossible. The actual code changes are
fairly small. We'd need to switch from -Wl,-Bsymbolic-functions when
linking to -Wl,--dynamic-list in order to exclude the malloc functions
from being bound at link time, and some changes might be necessary in
the dynamic linker in how it deals with donating gaps to malloc and
early allocations before the interposed malloc is available. There's
also a question of whether the dynamic linker should have code to
detect and refuse to run with incorrect malloc interposition (some but
not all functions interposed).

But back to the point, the main issue is specifying the constraints on
the interposing functions.

> > Note however that if malloc interposition is supported at some point,
> > there will be a specification for constraints on the malloc
> > implementation including what functions you can call from it (e.g.
> > something like AS-safety), and bug reports for implementations that do
> > things outside this spec (and thereby inherently can't work safely or
> > reliably) will not be considered bugs.
> 
> That sounds reasonable.  Some existing software (like Hoard) goes out
> of its way to interpose on all functions that might call into malloc
> to ensure the system allocator isn't called indirectly:
> 
> https://github.com/emeryberger/Heap-Layers/blob/master/wrappers/wrapper.cpp

This is really impossible to do correctly, for multiple reasons:

1. Some such functions are fundamentally not replaceable, like the
   dynamic linker functions (dlopen).

2. There is no specification for which libc functions call into
   malloc; this is an implementation detail. The only related things
   that are parts of the public contract are whether they return
   memory "as if by malloc" and whether they're AS-safe or AC-safe (in
   which case it's not formally correct, but it's fairly reasonable to
   assume they don't call malloc). For example, on glibc, qsort and
   printf call malloc (but qsort has, and necessarily has to have, a
   fallback for when malloc fails, since qsort can't fail).

3. Some functions which use malloc are sufficiently heavy that you'd
   be replacing (and possibly changing or reducing functionality in)
   whole major libc components if you wanted to replace them. For
   example, getaddrinfo (the whole resolver infrastructure), iconv,
   regex, ...
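
To make the qsort remark in point 2 concrete, here is a minimal sketch
of a sort that prefers malloc'd scratch space but cannot fail when the
allocation does. It is not glibc's code; sort_ints, alloc_fn, free_fn
and failing_alloc are hypothetical names, and the allocator is passed
in only so the failure path can be exercised.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator hooks, so the failure path can be exercised. */
typedef void *(*alloc_fn)(size_t);
typedef void (*free_fn)(void *);

/* In-place fallback: slower, but needs no extra memory and cannot fail. */
static void insertion_sort(int *a, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int key = a[i];
        size_t j = i;
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;
    }
}

/* Bottom-up merge sort using a scratch buffer of n ints. */
static void merge_sort(int *a, int *tmp, size_t n)
{
    for (size_t width = 1; width < n; width *= 2) {
        for (size_t lo = 0; lo < n; lo += 2 * width) {
            size_t mid = lo + width < n ? lo + width : n;
            size_t hi = lo + 2 * width < n ? lo + 2 * width : n;
            size_t i = lo, j = mid, k = lo;
            while (i < mid && j < hi)
                tmp[k++] = a[j] < a[i] ? a[j++] : a[i++];
            while (i < mid)
                tmp[k++] = a[i++];
            while (j < hi)
                tmp[k++] = a[j++];
        }
        memcpy(a, tmp, n * sizeof *a);
    }
}

/* Sorts a[0..n). Returns 1 if scratch memory was used, 0 if the
 * in-place fallback ran. Either way the array ends up sorted. */
int sort_ints(int *a, size_t n, alloc_fn alloc, free_fn release)
{
    assert(a != NULL || n == 0);
    int *tmp = (n > 1 && n <= SIZE_MAX / sizeof *a)
                   ? alloc(n * sizeof *a) : NULL;
    if (!tmp) {
        insertion_sort(a, n);
        return 0;
    }
    merge_sort(a, tmp, n);
    release(tmp);
    return 1;
}

/* Allocator that always fails, standing in for an exhausted heap. */
void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}
```

Calling sort_ints(a, n, malloc, free) takes the fast path, while
sort_ints(a, n, failing_alloc, free) takes the fallback. A caller
cannot tell from the sorted result whether malloc was involved, which
is why the set of libc functions that call malloc stays an
implementation detail.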

Note that, unless the malloc replacement and the system malloc somehow
step on each other's state, there's no harm in having both present and
getting called, as long as the libc functions that return memory "as if
by malloc" (memory that is therefore permissible to pass to realloc and
free) use the interposed malloc replacement. This would just be things
like strdup. So if you only replace those, it's a more manageable task.
But I still think it's a wrong approach.
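
As a small illustration of the "as if by malloc" contract, the sketch
below passes strdup's result straight to realloc and free. The helper
name dup_and_append is hypothetical. The calls are only valid if
strdup, realloc and free all belong to the same allocator, which is
the constraint described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate s with strdup, then grow the copy
 * with realloc to append suffix. Returns NULL on allocation failure.
 * The caller releases the result with free(). */
char *dup_and_append(const char *s, const char *suffix)
{
    assert(s != NULL && suffix != NULL);

    /* strdup returns memory "as if by malloc"... */
    char *copy = strdup(s);
    if (!copy)
        return NULL;

    size_t len = strlen(copy);
    size_t extra = strlen(suffix);

    /* ...so handing it to realloc is legal, but only if both calls
     * resolve to the same allocator implementation. */
    char *grown = realloc(copy, len + extra + 1);
    if (!grown) {
        free(copy);
        return NULL;
    }
    memcpy(grown + len, suffix, extra + 1); /* includes the NUL */
    return grown;
}
```

If libc's strdup used an internal allocator while the application's
realloc resolved to an interposed one, the realloc call here would be
handed a pointer it never produced.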

If musl does add support for malloc interposition, I'm strongly
leaning towards using the interposed malloc everywhere so that this
kind of issue does not matter. Otherwise there are too many
opportunities for subtle errors. The main argument for not calling the
interposed malloc from libc except where required is that it avoids
reentrancy and inconsistent-state issues that are prone to incorrect
usage. But getline() inherently has to return memory as if by malloc,
so you're already stuck with at least one function that has to call
the interposed malloc with stdio locks held. The alternative is to
work in a temp buffer managed by internal malloc and only move to a
public-malloc-allocated buffer after finishing, but that's an awful
hack, and imposing libc implementation constraints like that around
the allowance for interposing malloc is exactly the type of nasty
situation I don't want to get into.
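
For concreteness, the temp-buffer workaround described above might
look roughly like the sketch below. This is not musl code.
internal_alloc, internal_free, public_alloc and read_line_two_stage
are hypothetical names, and here both allocators simply map onto the
system heap so that the example runs.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for two distinct allocators. In a libc with
 * interposition, the first would be private and the second would be
 * the interposed malloc. */
static void *internal_alloc(void *p, size_t n) { return realloc(p, n); }
static void internal_free(void *p) { free(p); }
static void *public_alloc(size_t n) { return malloc(n); }

/* Reads one line from f, including the newline if present. Returns a
 * NUL-terminated string the caller releases with free(), or NULL on
 * end of file or allocation failure. */
char *read_line_two_stage(FILE *f)
{
    assert(f != NULL);
    size_t cap = 0, len = 0;
    char *tmp = NULL;
    int c;

    /* Stage 1: accumulate in the internal buffer. In a real stdio
     * this is the part that would run with the FILE lock held. */
    while ((c = fgetc(f)) != EOF) {
        if (len + 2 > cap) { /* room for this byte plus the NUL */
            size_t ncap = cap ? cap * 2 : 64;
            char *grown = internal_alloc(tmp, ncap);
            if (!grown) {
                internal_free(tmp);
                return NULL;
            }
            tmp = grown;
            cap = ncap;
        }
        tmp[len++] = (char)c;
        if (c == '\n')
            break;
    }
    if (len == 0) {
        internal_free(tmp);
        return NULL;
    }
    tmp[len] = '\0';

    /* Stage 2: only now touch the public allocator, then copy. */
    char *out = public_alloc(len + 1);
    if (out)
        memcpy(out, tmp, len + 1);
    internal_free(tmp);
    return out;
}
```

The extra copy and the second allocator are the cost of keeping the
public malloc out of the locked region, and every function with a
similar contract would need the same treatment.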

Rich
