Message-ID: <20190207053327.GD5469@voyager>
Date: Thu, 7 Feb 2019 06:33:27 +0100
From: Markus Wichmann <nullplan@....net>
To: Alexey Izbyshev <izbyshev@...ras.ru>
Cc: musl@...ts.openwall.com
Subject: Re: dlsym(handle) may search in unrelated libraries

On Thu, Feb 07, 2019 at 12:23:06AM +0300, Alexey Izbyshev wrote:
> On 2019-02-06 23:25, Markus Wichmann wrote:
> > Right you are. It took me a while to understand what the deps array was
> > even for (since musl's dlclose() doesn't do anything, tracking
> > dependencies is mostly pointless), but I found it is needed for lazy
> > relocation processing. So it is necessary for all libs opened by
> > dlopen() directly to contain a list of all their dependencies. All the
> > other libs can have an empty list.
> 
> Actually, dso->deps is used in dlsym(handle) because it must use the
> dependency order for symbol search, so it's incorrect to have deps empty for
> "all the other" libs. Consider the following modification of my previous
> example:
> 
> $ cat bazdep.c
> int bazdep = 1;
> extern int bazdepdep;
> int *p = &bazdepdep;
> $ cat bazdepdep.c
> int bazdepdep = 2;
> $ cat main.c
> #include <dlfcn.h>
> #include <stdio.h>
> 
> int main(void) {
>   if (!dlopen("libbaz.so", RTLD_NOW|RTLD_LOCAL))
>     return 1;
>   if (!dlopen("libfoo.so", RTLD_NOW|RTLD_LOCAL))
>     return 1;
>   void *h = dlopen("libbazdep.so", RTLD_NOW|RTLD_LOCAL);
>   printf("%p\n", dlsym(h, "bar"));
>   printf("%p\n", dlsym(h, "bazdepdep"));
> }
> 
> The correct output is zero in the first line and some non-zero address in
> the second. Vanilla musl 1.1.21 prints two non-zero addresses. But with your
> patch the output is two zeros because dlsym() can't search in dependencies
> of "libbazdep.so" anymore.
> 
> Alexey

OK, so life just got more interesting. I gather the deps handling was
always incorrect.

Let's consider the original code. liba depends on libb, which depends on
libc. dlopen("liba") returns a handle with libb and libc in the deps,
but libb->deps == 0. If we now call dlopen("libb"), that does the right
thing, but only because libb happens to be the last lib in the chain. If
we had loaded libx, liby, and libz before trying libb, it would add
all the symbols of libs x, y, and z to the libb handle.
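
To make the failure mode concrete, here is a toy model of it. This is not
musl's code: struct lib, its next and needed fields, the MAX_* limits and
collect_by_chain_walk() are all made up for illustration, and the model
assumes the handle's deps are gathered by walking from the opened lib to
the end of the load chain.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NEEDED 4    /* hypothetical limit on direct dependencies */
#define MAX_DEPS   16   /* hypothetical limit on a handle's deps list */

/* Hypothetical model of a loaded library; not musl's struct dso. */
struct lib {
    const char *name;
    struct lib *next;                /* next library in load order */
    struct lib *needed[MAX_NEEDED];  /* direct dependencies, NULL-terminated */
};

/* Model of the problem: start at the opened library and walk to the END
 * of the load chain, recording the direct dependencies of every library
 * passed on the way, whether or not it is related to the opened one. */
static size_t collect_by_chain_walk(struct lib *opened, struct lib **out)
{
    size_t n = 0;
    for (struct lib *p = opened; p; p = p->next)
        for (size_t i = 0; i < MAX_NEEDED && p->needed[i]; i++) {
            assert(n < MAX_DEPS);
            out[n++] = p->needed[i];
        }
    return n;
}
```

With a load chain of libb, libc, libx, liby, where libb needs libc and
libx needs liby, calling collect_by_chain_walk() on libb yields libc and
liby, even though liby has nothing to do with libb.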

I guess the hope was that this situation never arises. So how do we fix
this?

I think the easiest is probably going to be to patch up load_deps, but
avoiding recursion is going to be the fun part. My plan is to make
dso->deps contain all direct and indirect dependencies (which is what
the code seems to depend on, anyway). This is going to consume more
memory, but we are talking about a few pointers, and we are dealing with
shared libs, anyway.

As you said, order is important. What is the correct order, depth-first
or breadth-first? I think it should be depth-first, but I lack any
authoritative knowledge on this. It would make the most sense, anyway
(if, from the point of view of a user, a library contains all the symbols
of its dependencies, then those dependencies must also contain all the
symbols of their dependencies). So with the following dependency tree:

liba->libb->libc
    `>libx->liby

the handle for liba would list libc before libx.
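
As a rough sketch of that ordering (a toy model, not a patch against
load_deps): struct lib, needed, the MAX_* limits, add_unique() and
collect_depth_first() are hypothetical names, and the fixed-size arrays
are only there to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NEEDED 4    /* hypothetical limit on direct dependencies */
#define MAX_DEPS   16   /* hypothetical limit on a handle's deps list */

/* Hypothetical model of a loaded library; not musl's struct dso. */
struct lib {
    const char *name;
    struct lib *needed[MAX_NEEDED];  /* direct dependencies, NULL-terminated */
};

/* Append lib to out[] unless it is already listed.
 * Returns 1 if it was appended, 0 if it was already there. */
static int add_unique(struct lib **out, size_t *n, struct lib *lib)
{
    for (size_t i = 0; i < *n; i++)
        if (out[i] == lib)
            return 0;
    assert(*n < MAX_DEPS);
    out[(*n)++] = lib;
    return 1;
}

/* Depth-first: list a direct dependency, then everything below it,
 * before moving on to the next direct dependency. Recursing only into
 * newly listed libraries also keeps shared or cyclic dependencies from
 * being walked twice. */
static void collect_depth_first(struct lib *lib, struct lib **out, size_t *n)
{
    for (size_t i = 0; i < MAX_NEEDED && lib->needed[i]; i++)
        if (add_unique(out, n, lib->needed[i]))
            collect_depth_first(lib->needed[i], out, n);
}
```

For the tree above, starting at liba with an empty list, this fills the
list in the order libb, libc, libx, liby.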

The easiest implementation is probably still going to be recursive. Let's
hope the dependency trees don't get too wild.
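
Should the recursion turn out to be a problem, a depth-first walk can
also be done with an explicit stack. The following is only a sketch under
assumptions: struct lib, needed, the MAX_* limits, listed() and
collect_iterative() are hypothetical names, and a real version would have
to size the stack and the list dynamically.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NEEDED 4    /* hypothetical limit on direct dependencies */
#define MAX_DEPS   16   /* hypothetical limit on a handle's deps list */

/* Hypothetical model of a loaded library; not musl's struct dso. */
struct lib {
    const char *name;
    struct lib *needed[MAX_NEEDED];  /* direct dependencies, NULL-terminated */
};

/* Returns 1 if lib is already in out[0..n-1]. */
static int listed(struct lib **out, size_t n, const struct lib *lib)
{
    for (size_t i = 0; i < n; i++)
        if (out[i] == lib)
            return 1;
    return 0;
}

/* Depth-first collection without recursion. Each stack frame remembers
 * a library and the index of its next unvisited direct dependency. */
static size_t collect_iterative(struct lib *root, struct lib **out)
{
    struct { struct lib *lib; size_t next; } stack[MAX_DEPS + 1];
    size_t depth = 0, n = 0;

    stack[depth].lib = root;
    stack[depth].next = 0;
    depth++;

    while (depth > 0) {
        struct lib *cur = stack[depth - 1].lib;
        size_t i = stack[depth - 1].next;

        if (i == MAX_NEEDED || !cur->needed[i]) {
            depth--;                 /* all of cur's dependencies handled */
            continue;
        }
        stack[depth - 1].next = i + 1;

        struct lib *dep = cur->needed[i];
        if (dep == root || listed(out, n, dep))
            continue;                /* already recorded, or a cycle to root */

        assert(n < MAX_DEPS);
        out[n++] = dep;
        stack[depth].lib = dep;      /* descend before cur's next sibling */
        stack[depth].next = 0;
        depth++;
    }
    return n;
}
```

The stack never gets deeper than the number of listed libraries plus one,
because a frame is only pushed for a library that has just been added to
the list.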

I'll look into it after work.

Ciao,
Markus
