Message-Id: <03A11083-A7D8-409E-BA70-AC42F52FF7B2@mac.com>
Date: Wed, 27 Jul 2022 19:06:24 -0400
From: Christopher Sean Morrison <brlcad@....com>
To: musl@...ts.openwall.com
Subject: dynamic linker is capturing "reserved" library names erroneously


The gist of this bug report / change request is best demonstrated by the following code setup:

$ cat > librt.cpp <<'EOF'
#include <stdio.h>
void foo(void) { printf("hello\n"); }
EOF
$ cat > test.cpp <<'EOF'
#include <stdio.h>
int main(int ac, char *av[]) { extern void foo(void); foo(); return 0; }
EOF
$ g++ -shared -o librt.so librt.cpp
$ g++ -o test test.cpp -L. -lrt -Wl,-rpath=.
$ ldd test
	/lib/ld-musl-x86_64.so.1 (0x7f35cf52a000)
	librt.so => /lib/ld-musl-x86_64.so.1 (0x7f35cf52a000)
	libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f35cf52a000)
Error relocating test: _Z3foov: symbol not found
$ ./test
Error relocating ./test: _Z3foov: symbol not found

In brief, the dynamic linker appears to resolve librt to ld-musl-x86_64 itself at runtime, even though -L resolved the library correctly at link time (and the rpath is ignored).  Any symbol expected from a library with such a name then fails with a runtime "symbol not found" error.  The offending code in musl appears to be https://git.musl-libc.org/cgit/musl/tree/ldso/dynlink.c#n1011
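
To make the name capture described above concrete, here is a minimal sketch of a prefix check of that kind.  It is an illustration only, not musl's code: the function name is_reserved_name and the stem list are assumptions chosen for the example, and the authoritative list is whatever dynlink.c actually contains.

```cpp
#include <cassert>
#include <string>
#include <vector>

// ASSUMPTION: illustrative stem list; consult dynlink.c for the real one.
static const std::vector<std::string> kReservedStems = {
    "c", "pthread", "rt", "m", "dl", "util", "xnet"};

// Hypothetical helper: true when `name` looks like "lib<stem>.<anything>"
// for one of the stems above, e.g. "librt.so" or "libm.so.6".
bool is_reserved_name(const std::string& name) {
    assert(!name.empty() && "library name must not be empty");
    if (name.compare(0, 3, "lib") != 0) return false;
    for (const std::string& stem : kReservedStems) {
        // Require the '.' right after the stem so that a name such as
        // "libraytrace.so" is not mistaken for "rt".
        const std::string needle = stem + ".";
        if (name.compare(3, needle.size(), needle) == 0) return true;
    }
    return false;
}
```

Under these assumptions, "librt.so" is captured purely by its name, before any search path or rpath is consulted, while "libraytrace.so" is not.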

First consideration: the code seems to take the position that those library names are universally reserved, and I believe that to be incorrect.  Reserved library names were mentioned long ago, specifically for the behavior of the “c89” and “c99” C compiler command lines in their POSIX.1 man pages, but as far as I’m aware that did not carry over to prescribing dynamic library behavior, C++ behavior, or any other aspect of the standard.  Perhaps a citation can be provided, but I was unable to find a relevant mention of “rt”, “librt”, “-lrt”, etc. (or any of the other libs from line 1011) in the latest version of the standard.

Second consideration, the block’s preceding comment seems to document the primary intention as being to reduce app porting burden where -lm, -lc, and friends have long-since been embedded in build systems.  Given those libraries are all combined in musl’s libc implementation, that seems reasonable.  Automatically binding requests for -lm to musl’s libc, for example, certainly makes sense to ease porting.

Given those two considerations, I would request that the library resolution behavior be changed so that these names are not captured as reserved, but serve merely as a fallback when normal searching would otherwise end in not-found.  That is, move the logic in dynlink.c so that it runs later, after the usual search.  That way, the encoded rpath and the libraries specified at link time will be respected, and build systems specifying -lm will still automatically resolve to musl’s libc (unless there really is a libm).
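
The ordering requested above can be sketched as follows.  This is a simplified model under stated assumptions, not a patch against dynlink.c: resolve_with_fallback, Resolution, the stem list, and the set standing in for the filesystem are all hypothetical, and a real linker would also consult LD_LIBRARY_PATH, the rpath, and the system path file.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Outcome of a lookup in this simplified model (hypothetical type).
struct Resolution {
    enum Kind { Found, LibcFallback, NotFound } kind;
    std::string path;  // set only when kind == Found
};

// ASSUMPTION: illustrative stem list; consult dynlink.c for the real one.
static const std::vector<std::string> kReservedStems = {
    "c", "pthread", "rt", "m", "dl", "util", "xnet"};

bool is_reserved_name(const std::string& name) {
    if (name.compare(0, 3, "lib") != 0) return false;
    for (const std::string& stem : kReservedStems) {
        const std::string needle = stem + ".";
        if (name.compare(3, needle.size(), needle) == 0) return true;
    }
    return false;
}

// Proposed order: search the directories first, and treat a reserved
// name as libc only when nothing was found on disk.
Resolution resolve_with_fallback(const std::string& name,
                                 const std::vector<std::string>& search_dirs,
                                 const std::set<std::string>& existing_files) {
    assert(!name.empty() && "library name must not be empty");
    for (const std::string& dir : search_dirs) {
        const std::string candidate = dir + "/" + name;
        if (existing_files.count(candidate)) {
            return {Resolution::Found, candidate};
        }
    }
    if (is_reserved_name(name)) return {Resolution::LibcFallback, ""};
    return {Resolution::NotFound, ""};
}
```

In this model a user-provided ./librt.so is found by the search and wins, whereas a request for libm.so with no such file on disk still falls back to libc, which keeps the porting convenience for existing build systems.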

My expectation is that user applications should be able to specify any library name (even libc), link against it, and resolve to it at runtime as in the example above.  The gnu and bsd dynamic linkers do not make any presumption or restriction on library name resolution (at least not any more afaik), even with regards to auto-linking standard libraries.  Changing this behavior would also address a number of related reports I came across regarding this issue (e.g., search "error relocating" "symbol not found" musl <https://www.google.com/search?q=%22error+relocating%22+%22symbol+not+found%22+musl&client=safari&rls=en&biw=1413&bih=1095&sxsrf=ALiCzsbL2SRfDzU58gJPcpF0YXZM1Q2WwQ%3A1658961343157&ei=v73hYqOYCY2s5NoP8rmEwA8&ved=0ahUKEwjj87fMkJr5AhUNFlkFHfIcAfgQ4dUDCA0&uact=5&oq=%22error+relocating%22+%22symbol+not+found%22+musl&gs_lcp=Cgdnd3Mtd2l6EAMyBggAEB4QFjIFCAAQhgMyBQgAEIYDMgUIABCGAzoFCAAQgAQ6BggAEB4QB0oECEEYAUoECEYYAFD-A1i2HWD8HWgBcAB4AIABsQGIAdAEkgEDNC4ymAEAoAEBwAEB&sclient=gws-wiz>).

For background context, I maintain BRL-CAD, a large open source CAD system that has been in development for over 40 years.  BRL-CAD’s flagship API, with thousands of integrations around the world, is the “librt” ray tracing library.  With development going all the way back to 1983, it predates both ANSI C and POSIX.  Renaming isn’t likely anytime soon, as it would be exceptionally cost-prohibitive and would affect a great many other codes in production use.  We’ve maintained BRL-CAD’s portability across dozens of architectures, operating systems, and compilation environments over the years, and librt conflicts are not new.  What is novel here is musl’s runtime linker override, a behavior we have not seen elsewhere for quite some time.  I found workarounds we can employ, but hopefully the behavior can be improved to benefit future musl development.

Thank you for everyone’s efforts on an alternative to the established, and thank you for consideration of this issue.

Cheers!
Sean Morrison
BRL-CAD

