Message-ID: <20220804224455.GA3232548@juliacomputing.com>
Date: Thu, 4 Aug 2022 18:44:55 -0400
From: Keno Fischer <keno@...iacomputing.com>
To: libc-coord@...ts.openwall.com
Subject: Proposing dl* extensions with explicit caller specification

Dear libc maintainers,

I'm hoping to coordinate consensus on a dlfcn API extension to address a
common paper cut that users encounter when attempting to use various
instrumentation tooling such as the {address, memory, thread} sanitizers
(and others). I don't think the implementation is particularly difficult,
but as it touches core dlfcn API surface, some consensus would be required
among libc implementations to avoid making a mess.

# The problem

A little-known quirk of the dlsym and (on certain implementations)
dl(m)open APIs is that their behavior depends on the calling shared
object. This shared object is usually determined using
__builtin_return_address, or a hand-coded equivalent (e.g. reading the
top of the stack on x86_64 or accessing the lr register on aarch64).

This implicit dependence on the return address (apart from feeling a bit
like an API smell) breaks the ability to use symbol interposition on
these functions, as the usual interposition/RTLD_NEXT pattern will result
in the call appearing to come from a different shared object (the
interposer) than the non-interposed call.

This is a regular cause of end user complaints (see e.g. [1-7]). A common
suggestion is to use LD_LIBRARY_PATH in order to work around the missing
caller-dependent RUNPATH lookup. However, as I will survey below, RUNPATH
is not the only caller-dependent property (so the workaround is
incomplete) and setting LD_LIBRARY_PATH may affect lookups in other parts
of the application (or any spawned children) in undesirable ways (so the
workaround is potentially harmful to correct operation).

A different suggestion that was previously made (e.g. in [7]) is to
switch the interceptors to a tail call. Where possible, this does indeed
address the issue (e.g.
rr's interceptor [8] does this and doesn't suffer from the same problem).
Unfortunately, this is not always possible. For example, the memory
sanitizer interceptor [9] needs to introspect the loaded object in order
to set up shadow memory for all newly added mappings.

The tail call issue also brings up a related concern: compiler
optimizations do not model the return-address dependence of these
functions and will thus happily move them into tail call position when
possible, raising the possibility that a compiler upgrade will cause
dynamic linker behavior to change.

# A brief survey of current caller-dependence in libcs

How the return address is used is not consistent between different libcs.
Perhaps the most consistent use of the return address is in RTLD_NEXT.
POSIX specifies that:

```
RTLD_NEXT
    Specifies the next executable object file after this one that
    defines name. This one refers to the executable object file
    containing the invocation of dlsym().
```

Because of the above-mentioned tail-call issue, arguably the
implementation using __builtin_return_address is not POSIX compliant,
because the return address may not necessarily be in the `object
containing the invocation of dlsym`. Nevertheless, this is a minor issue
and not generally what users run into.

The more common situation of return-address dependence is in `dlopen`.
POSIX makes no mention of return-address dependence in dlopen, so
implementations differ somewhat in their use of the return address in
dlopen context. For implementations that provide the `dlmopen` extension
(e.g. Solaris/Illumos, glibc), the return address is generally used by
`dlopen` to identify the calling object's namespace. Implementations
without this extension that I surveyed (e.g. musl libc, FreeBSD libc)
generally do not have caller dependence in dlopen (if there is one, I
would love to know about it so I can add it to the list).

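To make the mechanism concrete, here is a minimal sketch of how an
implementation might map a return address to the calling object. It
assumes GCC or Clang (for `__builtin_return_address`) and a libc that
provides the `dladdr` extension; `object_containing` and
`implicit_caller_object` are hypothetical names chosen for illustration
and are not taken from any libc's sources.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* glibc: exposes dladdr() and Dl_info */
#endif
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: path of the loaded object that contains addr,
 * or NULL if addr does not fall inside any loaded object. */
static const char *object_containing(void *addr)
{
    Dl_info info;

    if (addr == NULL || dladdr(addr, &info) == 0)
        return NULL;
    return info.dli_fname;
}

/* Hypothetical stand-in for a caller-dependent entry point such as
 * dlsym(): it identifies "the caller" purely from its own return
 * address. noinline keeps the function in its own frame so that the
 * return address really points into the calling code. */
__attribute__((noinline))
const char *implicit_caller_object(void)
{
    void *ret = __builtin_return_address(0);

    assert(ret != NULL);       /* level 0 is always available */
    return object_containing(ret);
}
```

Called directly from an application, `implicit_caller_object()` reports
the application's object. Called through an interposing wrapper that
lives in a preloaded library, it would report the preloaded library
instead, which is exactly the behavior change described above.
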
For implementations that do look at the calling object inside dlopen, it
is generally used for a few other purposes as well, including
RUNPATH/RPATH handling, lookup of certain flags, determining whether the
calling object is an audit object, etc. The RUNPATH/RPATH handling is
usually the one that users complain about, but of course the remaining
uses could also introduce hard-to-diagnose issues. Implementations that
do not look at the caller in dlopen generally use the main executable for
all of these queries.

Illumos also appears to have caller-dependence in `dlclose`, `dlerror`
and `dlinfo`. I assume this is because lookup of this information is
per-namespace, but I did not look into it too closely.

# Proposed API

The proposal here (previously made independently by other people in
various forums) is to add new variants of the caller-dependent dlfcn
functions that take an explicit `dl_caller` pointer that is used in place
of the return address, e.g. for dlsym:

```
#include <dlfcn.h>

void *dlsym_caller(void *restrict handle, const char *restrict symbol,
                   void *restrict dl_caller);
```

Naturally there would be a `dlvsym_caller` for libcs that provide the
`dlvsym` extension (and analogously for e.g. `dlfunc` on FreeBSD).

For `dlopen`, since not all implementations have caller dependence, my
proposal would be to not have `dlopen_caller`, but instead only provide
`dlmopen_caller` (since caller-dependence seems to be pretty closely tied
to the dlmopen extension):

```
#include <dlfcn.h>

void *dlmopen_caller(Lmid_t lmid, const char *restrict filename,
                     int flags, void *restrict dl_caller);
```

In order to ensure that the dlopen behavior can be emulated with this
function, I would propose promoting `LM_ID_CALLER` to an exported flag
(glibc already has an internal version of this):

```
LM_ID_CALLER
    Load the shared object in the namespace of the calling object
    (determined implicitly by `dlmopen` or explicitly from the
    `dl_caller` argument to `dlmopen_caller`).
```

# Next steps

I'm hoping this overview was useful as a discussion of the problem I'm
hoping to address and the current state of implementation. I'm not wedded
to the specifics of the proposal, so suggestions for different names or
semantics would be appreciated. I am particularly interested to know if
there are additional complications in one implementation or another that
I failed to pick up on in my survey above.

Otherwise, assuming that people generally like this proposal, I would
hope to be able to implement this in short order. I think in most
implementations, this is simply a matter of adding the appropriate
symbols as the functionality already exists. I recognize that it will
probably take 10 years before this has propagated enough to be widely
available to end users, but on the other hand, people have been
complaining about this for the better part of 10 years, so if we'd fixed
it at the time, we'd already be done - better late than never ;).

Cheers,
Keno

[1] https://bugs.llvm.org/show_bug.cgi?id=27790
[2] https://sourceware.org/bugzilla/show_bug.cgi?id=27504
[3] https://sourceware.org/bugzilla/show_bug.cgi?id=25114
[4] https://sourceware.org/bugzilla/show_bug.cgi?id=28008
[5] https://sourceware.org/bugzilla/show_bug.cgi?id=28927
[6] https://github.com/google/sanitizers/issues/1219
[7] https://bugzilla.redhat.com/show_bug.cgi?id=1449604
[8] https://github.com/rr-debugger/rr/blob/master/src/preload/overrides.c#L136-L143
[9] https://github.com/llvm/llvm-project/blob/8e7acb670b3830a2c72ed2a47b93f88be971eed2/compiler-rt/lib/msan/msan_interceptors.cpp#L1332-L1337