Message-ID: <20151115004942.GA31291@brightrain.aerifal.cx>
Date: Sat, 14 Nov 2015 19:49:42 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: undef SHARED, and refactoring dynlink.c

The project to remove the SHARED macro, which makes it impossible to
use the same set of .o files for libc.a and libc.so even when you want
both built as PIC, is almost complete; all that's left is in the
dynamic linker, and the only nontrivial usage is in dynlink.c.

Once this transition is done, all source files in src/* should produce
object files which are suitable for static linking into an application
or for inclusion in libc.so for dynamic linking. The only difference
will be that libc.so will also link one or more additional object
files from a new "ldso" directory under the top-level directory (i.e.
not under src), yielding 3 top-level directories (not counting the
arch dirs) with source files:

- crt: source for installable .o files
- src: source for libc
- ldso: source for dynamic linker component of libc.so

The big question is how to get to that point. Right now dynlink.c (and
to a lesser extent, other src/ldso/*.c files) has a lot of code that's
presently only relevant to dynamic linking, but which conceptually
could be used in libc.a if/when we add a working static-linked dlopen.

The short-term cheat I have in mind is the following: leave dynlink.c
alone, with #ifdef SHARED, except make the stub dl* functions in the
#ifndef SHARED case weak so that strong definitions can override them.
Then the top level ldso directory can contain a file which simply
does:

#define SHARED
#include "../src/ldso/dynlink.c"

When libc.so is linked, this will replace the weak definitions of the
dl* functions with the working versions (built when SHARED is defined)
and add the dynamic linker startup code.
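
To make the weak-stub idea concrete, here is a minimal sketch, not the
actual dynlink.c code. The names demo_dlopen and demo_dlerror are
hypothetical stand-ins for the real dl* functions (chosen so the
example doesn't collide with libc's own symbols), and it assumes GCC
or Clang on an ELF target for __attribute__((weak)).

```c
#include <stddef.h>

/* Hypothetical stand-ins for dlopen/dlerror; not musl's real code. */
static const char *demo_errmsg;

/* Weak stub, as would be built from src/* without SHARED defined.
 * A strong definition of demo_dlopen in another object file (for
 * example one built from the ldso directory with SHARED defined)
 * would replace this one at link time, with no #ifdef needed here. */
__attribute__((weak))
void *demo_dlopen(const char *file, int mode)
{
	(void)file;
	(void)mode;
	demo_errmsg = "Dynamic loading not supported";
	return NULL;
}

/* Like dlerror, report a pending error once and then clear it. */
const char *demo_dlerror(void)
{
	const char *msg = demo_errmsg;
	demo_errmsg = NULL;
	return msg;
}
```

Linked on its own, the stub is what callers get, so demo_dlopen
always fails and sets an error message. Linking in an object with a
non-weak demo_dlopen silently swaps in the working version.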

This achieves the build-system/architectural goal of removing the
SHARED/!SHARED distinction at the src/* level, without requiring
immediate invasive code changes.

As part of the same project, since the dlstart.c code does not belong
in libc.a and thus not in src/*, I'd like to move it to crt/rcrt1.c,
and then have ldso/dlstart.c include ../crt/rcrt1.c, with some macros
defined to adapt it to act as the dynamic linker entry point. This
reverses the current direction of inclusion and makes the preprocessor
logic less messy, I think.

In the longer term, I think heavier refactoring of the dynamic linker
code is in order, and will allow us to offer some reasonable level of
dynamic loading in static-linked programs. The code can be broken up
into the following parts:

- Module bookkeeping. Needed by all components. This requires both a
  stub implementation in src/* used for static linking (initially it
  could support only one module, the main program, with dlopen
  integration coming later) and support code in ldso/* for the dynamic
  linker to initialize the module list at startup.

- Library loading and relocation processing. Needed if dlopen is
  linked (or for the dynamic linker entry point, but if that's
  linked, dlopen is too anyway).

- Symbol lookup. Needed if dlsym is linked, or for dynamic linker or
  dlopen symbol resolution at load time. This component has the
  special constraint that it must be usable by stage-2 of the dynamic
  linker bootstrap, so it can't itself reference symbols that need
  symbolic relocation. We could use hidden visibility rather than
  static functions to make this code callable from stage-2 while
  moving it to its own shared translation unit, though.

- TLS support code. This can almost entirely be moved out of the
  dynamic linker by using the new TLS module list instead of the DSO
  module list.

- Dynamic linker startup/main code, including code to replace weak
  definitions of libc init/fini functions, etc.

- Additional "libdl" functions which are largely independent but use
  the above.
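
For the symbol lookup item above, the hidden-visibility option might
look roughly like the following. This is only a sketch: the name
demo_sysv_hash is hypothetical, the hash is the standard System V ELF
hash used for DT_HASH lookups, and the visibility attribute assumes
GCC or Clang.

```c
#include <stdint.h>

/* Hidden visibility: the function can be called from other
 * translation units inside the same shared object, but it is not
 * exported, and references to it are bound when the object is
 * linked rather than looked up by name at runtime. */
__attribute__((visibility("hidden")))
uint32_t demo_sysv_hash(const char *s0)
{
	const unsigned char *s = (const unsigned char *)s0;
	uint32_t h = 0;
	while (*s) {
		h = 16*h + *s++;       /* shift in the next byte */
		h ^= h>>24 & 0xf0;     /* fold the top nibble back in */
	}
	return h & 0xfffffff;          /* result is 28 bits wide */
}
```

A static function would give the same binding guarantee but would tie
the lookup code to a single translation unit; hidden visibility keeps
the guarantee while letting the code live in its own file.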

All of the above design ideas are purely architectural; they don't
address the questions of semantics for dlopen in static-linked
programs, of which there are many, but I'd like to leave those
questions for if/when we do static dlopen. Even without making static
dlopen actually do something useful, the above concepts apply as a way
to make the static versions of the dl* functions fail, or work
minimally, simply as a consequence of having a trivial module list
rather than via a hard-coded #ifdef that forces failure.
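
As a rough sketch of what "a trivial module list" could mean in the
static case: every name below (demo_module, demo_sym, demo_dlsym and
so on) is hypothetical and the real structures would differ. The
point is only that the lookup code is generic and fails naturally
when the single listed module lacks the symbol.

```c
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical, for illustration only. */
struct demo_sym {
	const char *name;
	void *addr;
};

struct demo_module {
	struct demo_module *next;
	const struct demo_sym *syms;
	size_t nsyms;
};

static int demo_answer = 42;

static const struct demo_sym main_syms[] = {
	{ "demo_answer", &demo_answer },
};

/* Static-link case: the list contains only the main program. */
static struct demo_module main_module = {
	NULL, main_syms, sizeof main_syms / sizeof main_syms[0]
};
static struct demo_module *demo_head = &main_module;

/* Generic lookup over the module list; no #ifdef SHARED needed. */
void *demo_dlsym(const char *name)
{
	for (struct demo_module *m = demo_head; m; m = m->next)
		for (size_t i = 0; i < m->nsyms; i++)
			if (!strcmp(m->syms[i].name, name))
				return m->syms[i].addr;
	return NULL;
}
```

With only the main module in the list, looking up "demo_answer"
succeeds and anything else returns a null pointer; a dynamic linker
would populate the same list with more modules at startup.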

Rich
