Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20121208225237.GV20323@brightrain.aerifal.cx>
Date: Sat, 8 Dec 2012 17:52:37 -0500
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: static linking and dlopen

On Sat, Dec 08, 2012 at 08:09:35PM +0200, Paul Schutte wrote:
> Hi,
> 
> I have a strong preference towards static linking these days because the
> running program use so much less memory.
> 
> When I link a binary statically and that binary then use dlopen, would that
> work 100% ?

Presently, it does not work at all. At best, it loses all the
advantages of static linking.

> What would open if the shared object that was dlopened want's to call
> functions in other shared libraries ?

Dependencies of any loaded library also get loaded.

> I understand that when using dynamic linking those libraries would just get
> loaded, but I am not sure what would happen with static linking.

With static linking, they would have to be loaded too. This means a
static-linked program using dlopen would have to contain the entire
dynamic linker logic. What's worse, it would also have to contain at
least the entire libc, and if you were using static versions of any
other library in the main program, and a loaded module referenced a
dynamic version of the same library, you'd probably run into
unpredictable crashing when the versions do not match exactly.

The source of all these problems is basically the same as the benefit
of static linking: the fact that the linker resolves, statically at
link time, which object files are needed to satisfy the needs of the
program. With dlopen, however, there is no static answer; *any* object
is potentially-needed, not directly by the main program, but possibly
by loaded modules. Consider what happens now if you only link part of
libc into the main program statically: additional modules loaded at
runtime won't necessarily have all the stuff they need, so dlopen
would also have to load libc.so. But now you're potentially using two
different versions of libc in the same program; if
implementation-internal data structures like FILE or the pthread
structure are not identical between the 2 versions, you'll hit an ABI
incompatibility, despite the fact that these data structures were
intended to be implementation-internal and never affect ABI. Even
without that issue, you have issues like potentially 2 copies of
malloc trying to manage the heap without being aware of one another,
and thus clobbering it.

For libc, the issues are all fixable by making sure that a static
version of dlopen depends on every single function in libc, so that
another copy never needs to get loaded. However, for other static
libraries pulled into the main program, there is really no fix without
help from the linker (it would have to pull in the entire library, and
somehow leave a note for dlopen to see that library is already loaded
and avoid loading it dynamically too).

Note that, even if we could get this working with a reasonable level
of robustness, almost all the advantages of static linking would be
gone. Static-linked programs using dlopen would be huge and ugly.

If you really want to make single-file binaries with no dependencies
and dlopen support, I think the solution is to first build them
dynamically linked, then merge the main program and all .so files into
a single ELF file. I don't know of any tools capable of doing this,
but in principle it's possible to write one. There are at least 2
different approaches to this. One is to process the ELF files and
merge their list of LOAD segments, symbol and relocation tables, etc.
all into a single ELF file, leaving the relocations in place for the
dynamic linker to perform at startup. This would require some
modification to the dynamic linker still. The other approach is the
equivalent of emacs' unexec dumper: place some kind of hook to run
after the dynamic linker loads everything, but before any other
application code runs, which dumps the entire memory space to an ELF
file which, when run, will reconstruct itself.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.