Message-ID: <20200423122233.GH23945@port70.net>
Date: Thu, 23 Apr 2020 14:22:34 +0200
From: Szabolcs Nagy <nsz@...t70.net>
To: Paul Sokolovsky <pmiscml@...il.com>
Cc: Rich Felker <dalias@...c.org>, musl@...ts.openwall.com
Subject: Re: foreign-dlopen: dlopen() from static binary, again (and
 not the way you think!)

* Paul Sokolovsky <pmiscml@...il.com> [2020-04-23 12:16:26 +0300]:
> Hello,
> 
> On Wed, 22 Apr 2020 22:39:41 -0400
> Rich Felker <dalias@...c.org> wrote:
> 
> []
> 
> > > Oh, forgot to say that I'm not looking for a way to load a
> > > particular musl-dynlinked shared library into musl-staticlinked
> > > binary. So, arguments like "but you'll need to carry around musl's
> > > libc.so" don't apply. What I'm looking for is a way to have a
> > > static closed-world application, but let it, at the user's request,
> > > to interface with whatever system may be outside.
> []
> > > of concept code is at
> https://github.com/pfalcon/foreign-dlopen .  
> > 
> > In your example it looks like you're foreign_dlopen'ing glibc. That
> > simply *can't* work, because part of the interface contract of all
> > glibc functions is that they're called with the thread pointer
> > register (%gs or %fs on i386 or x86_64 respectively) pointing to a
> > glibc TCB, which will not be the case when they're invoked from a
> > musl-linked (or other non-glibc-linked) program.
> 
> Thanks for the response and for the word of warning. As I mentioned,
> this is essentially a proof of concept, and so far was tested only by
> calling glibc's printf() from a host app which was either linked with
> glibc itself or -nostdlib and static. And that was already more than
> with any other ELF loader which I tried (which worked for simple
> functions like write(), but crashed in anything more complex like
> printf()).
> 
> But it certainly doesn't touch a case you describe, when "foreign" vs
> local libc expect different values of %gs/%fs (so apparently, "foreign
> function call" facility would need to swap them around a call).

yes, libc functions should be called on libc-owned
threads, and your code can only run on the same thread if
you follow the same abi (which is more than just the
calling convention). swapping the thread pointer means that
the foreign libc has to create the thread on which you
invoke the foreign function (or it has to be the main
thread), since the data structures at tp are set up at
thread creation (or during early libc init for the main thread).
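
a minimal sketch of the "set up at thread creation" point, using only
one libc and a plain thread-local variable. it assumes gcc or clang
(for __thread) and posix threads; the names tls_marker, tls_probe and
probe_tls are made up for illustration. each thread finds its own copy
of the variable through its thread pointer, and the new thread's copy
already holds the initial value before any user code runs in it,
because libc prepared that block when it created the thread:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Lives in the calling thread's TLS block, which is located via the
 * thread pointer and initialized by libc when the thread is created.
 * (__thread is a GCC/Clang extension; assumed available here.) */
static __thread int tls_marker = 42;

/* Hypothetical helper type: what one thread observed. */
struct tls_probe {
    uintptr_t addr;   /* address of tls_marker as seen by the thread */
    int initial;      /* value found before the thread wrote to it */
};

static void *probe_thread(void *arg)
{
    struct tls_probe *p = arg;

    p->addr = (uintptr_t)&tls_marker;
    p->initial = tls_marker;
    tls_marker = 7;   /* only changes this thread's copy */
    return NULL;
}

/* Fills *self for the calling thread and *other for a new thread. */
void probe_tls(struct tls_probe *self, struct tls_probe *other)
{
    pthread_t t;
    int rc;

    self->addr = (uintptr_t)&tls_marker;
    self->initial = tls_marker;

    rc = pthread_create(&t, NULL, probe_thread, other);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
}
```

calling probe_tls() from the main thread should report two different
addresses for the same variable. a thread whose tp was merely swapped
to point at memory the foreign libc never initialized would have no
such block to find.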

what's worse is that some process-global state also
has to be under the control of libc (e.g. libc-internal
signal handlers, global state controlled via prctl, or
libc may want fd 0,1,2 in a particular state), so
cross-calling a different libc involves system calls.

(the go runtime gets this wrong for an obvious reason:
otherwise calling c from go would be really slow. this
is why you normally try to avoid using your own
libc-independent runtime. go gets away with it because
libc-internal signals are rarely relevant and most
process state is per-thread on linux, so if you let the
foreign libc create the os threads and take over the
signal handlers and signal masks, then things work.)
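
a small sketch of the split between process-wide and per-thread signal
state mentioned above, assuming a posix system with pthreads. all names
(sig_view, snapshot, compare_signal_state) are hypothetical. a handler
installed by one thread is visible from every thread, while a mask
change stays local to the thread that made it:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>

static void on_usr1(int sig) { (void)sig; }

/* Hypothetical helper type: signal state as seen from one thread. */
struct sig_view {
    int handler_is_ours;   /* is on_usr1 installed for SIGUSR1? */
    int usr2_blocked;      /* is SIGUSR2 in this thread's mask? */
};

static void snapshot(struct sig_view *v)
{
    struct sigaction cur;
    sigset_t mask;

    sigaction(SIGUSR1, NULL, &cur);
    v->handler_is_ours = (cur.sa_handler == on_usr1);
    pthread_sigmask(SIG_SETMASK, NULL, &mask);   /* query only */
    v->usr2_blocked = sigismember(&mask, SIGUSR2);
}

static void *worker(void *arg)
{
    struct sigaction sa;
    sigset_t block;

    /* Disposition: process-wide, so the main thread sees it too. */
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, NULL);

    /* Mask: per-thread, so the main thread is unaffected. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR2);
    pthread_sigmask(SIG_BLOCK, &block, NULL);

    snapshot(arg);
    return NULL;
}

void compare_signal_state(struct sig_view *in_worker,
                          struct sig_view *in_main)
{
    struct sigaction saved_action;
    sigset_t usr2, saved_mask;
    pthread_t t;
    int rc;

    /* Remember the current state and start with SIGUSR2 unblocked. */
    sigaction(SIGUSR1, NULL, &saved_action);
    sigemptyset(&usr2);
    sigaddset(&usr2, SIGUSR2);
    pthread_sigmask(SIG_UNBLOCK, &usr2, &saved_mask);

    rc = pthread_create(&t, NULL, worker, in_worker);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);

    snapshot(in_main);

    /* Restore what the caller had before. */
    sigaction(SIGUSR1, &saved_action, NULL);
    pthread_sigmask(SIG_SETMASK, &saved_mask, NULL);
}
```

both views should report the handler as installed, but only the worker
should report SIGUSR2 as blocked. two libcs in one process would share
the handler table the same way, which is why only one of them can own
it.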

> > If you relax to the case where you're not doing that, and instead only
> > opening *pure library* code which has no tie-in to global state or TLS
> > contracts, then it should be able to work.

it's not documented which apis are implemented as pure
library code, and in principle libc code may call
other libc code via the plt, where lazy binding can
happen, which is not pure. (glibc tries to avoid this
of course, but it does have some runtime-loaded
components, e.g. for locale-specific character
conversions, so things that seem pure from the outside
can end up impure.)
