musl - Re: dlsym(handle) may search in unrelated libraries

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190207183637.GF5469@voyager>
Date: Thu, 7 Feb 2019 19:36:38 +0100
From: Markus Wichmann <nullplan@....net>
To: musl@...ts.openwall.com
Subject: Re: dlsym(handle) may search in unrelated libraries

On Thu, Feb 07, 2019 at 11:54:47AM -0500, Rich Felker wrote:
> Yes, it needs to contain all direct and indirect dependencies, in the
> correct order. Right now I think it incorrectly contains a superset of
> that. Is that your assessment of the situation too?
> 

Yes, it is a superset, if the original argument to load_deps() is not
the tail of the chain of libraries.

> However, a depth-first list of dependencies is also needed to solve
> the longstanding ctor-order issue discussed in several past threads
> (which I can look up if search doesn't turn them up for you). This is
> not a requirement of POSIX (which doesn't even have ctors), but it's a
> reasonable expectation we currently get wrong and I think it might be
> specified in ELF or some related sysv "standards".
> 

Requirements first, nice-to-haves later, right? Once we have the dlsym()
conformance issue squared away, we can still deal with this issue.

You mean, people expect their dependencies to be initialized in their
own initializers? When it is well-known that there is no order to
initializers, and e.g. the C++ standard says there is no order to
constructors of static objects.

> I'm not sure if it makes sense to build both of these lists at the
> same time. We currently try to avoid allocating dependency lists for
> libraries loaded at startup in order not to allocate and setup data
> that might never be used, defering until if/when dlopen is called on
> them. I've wanted to do the ctor order walk without allocating a
> dependency list, but I don't know a good way to do so. Note that the
> code that runs ctors cannot allocate because, at the time it runs,
> dlopen has already "committed" the load and can't back it out. It has
> to have all resources it needs for the walk precommitted.
> 

Depth first means stack. Means recursion or explicit stack. Explicit is
going to be hard, considering there is no known limit to dependency
nesting depth. We could argue that the user better make their stack size
ulimit large enough for the main thread, but I hazard a guess that
nobody expects dlopen() to use overly much stack in other threads.

Alright, what's the algorithm here?

init(lib):
    if (!lib.inited):
        foreach d in lib.deps init(d)
        start_init(lib)
        lib.inited = 1

That about right? Because that means we need a stack as high as the
nesting depth of dependencies. We can get a pessimistic estimation from
the number of deps loaded, but that may be way too high (see below).

> Since the beginning of a list of breadth-first dependencies is just
> the direct dependencies, if we recorded the number of direct
> dependencies for each dso, the breadth-first lists could be used to
> perform a depth-first walk to run ctors; this can be made
> non-recursive, at the cost of quadratic time but with trivially-tiny
> constant factor, by simply restarting at the root of the tree after
> each node and finding the deepest not-yet-constructed dso. This
> suggests always allocating the breadth-first dependency lists might be
> a good idea. That "feels" unfortunate since in principle they can get
> large, but it's only going to be large for ridiculous bloatware with
> hundreds of libraries, in which case these dependency lists are very
> small on a relative scale.
> 

$ ldd =mpv | wc -l
292

And that, I will remind you, is the "suckless" alternative to mplayer.

Anyway, that is breadth, not depth. Experimentally, it is hard to find a
chain of libraries deeper than ten. They all eventually just lead back
to libc.

So what you could do, when it comes time to run the initializers, is to
walk the dyn vectors again. Those are already allocated, and all the
libs they reference are already loaded, so load_library() would only be
a search operation. By the time the initializers run, that must be the
case, both at load time and at runtime. Unless loading the library
failed the first time, in which case at runtime we have already bailed.
And at load time we... haven't. Are you sure continuing on load failure
of a dependency during load time is a good idea?

> I don't think recursive can be done safely since you don't have a
> small bound on levels of recursion. It could be done with an explicit
> stack in allocated storage though, since the operation has to allocate
> memory anyway and has to fail if allocation fails.
> 
> Rich

Nevermind, it was a misunderstanding of the requirements.

Ciao,
Markus
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.