Message-ID: <20240207012247.1121273-1-mmayer@broadcom.com>
Date: Tue, 6 Feb 2024 17:22:42 -0800
From: Markus Mayer <mmayer@...adcom.com>
To: Musl Mailing List <musl@...ts.openwall.com>
Cc: Markus Mayer <mmayer@...adcom.com>
Subject: [PATCH 0/1] ldso: continue searching if wrong architecture is found

Hi all,

We have just discovered an interesting issue after recently transitioning from glibc to musl-libc. This is an AArch64 platform with a 64-bit userland that also supports running 32-bit applications. We are using GCC 12.3 and musl 1.2.3.

Some application teams are still developing their apps as 32-bit applications. As mentioned, the base system is 64-bit. To ensure all app dependencies are met, all shared libraries the app needs are bundled with the app in a tar-ball by the application developers. The launch script of the app will set LD_LIBRARY_PATH to the directory containing these 32-bit libraries before calling the actual application. This way, the app can find all its shared libraries.

Some of these 32-bit libraries are "standard" libraries (libz, libcrypto) that also exist as 64-bit versions in /usr/lib.

What we have discovered is that 64-bit applications launched by the 32-bit app will fail due to a shared library mismatch. The 32-bit app needs to call system utilities, which are 64-bit. So, this needs to work.

It took some digging, but I think I know what is going on. Let me summarize how to reproduce the problem, why it occurs and, hopefully, what can be done about it.

The ingredients:

* A 64-bit system
* A directory containing 32-bit versions of "standard" libraries also present on the system as 64-bit versions (libz, libcrypto, etc.)
* LD_LIBRARY_PATH pointing to the custom 32-bit library directory
* Attempting to launch a 64-bit application that uses one of the libraries that exist in both a 32-bit and a 64-bit version

The problem:

If LD_LIBRARY_PATH is set to a directory containing 32-bit libraries and then a 64-bit binary is invoked, the shared library loader will pick up the 32-bit version of a library first, because it'll look at LD_LIBRARY_PATH before anything else. Mapping the 32-bit library into the 64-bit process will fail. This much is expected.

However, even though the correct library resides on the system, the shared library loader never attempts to look for it. The 64-bit process will fail to launch, even though there is no reason for the failure. The problem only exists because the shared library loader doesn't look in the remaining shared library directories.

The solution:

The shared library loader needs to keep searching the rest of the library search path if the library it found in LD_LIBRARY_PATH could not be mapped. If the library loader does this, everything will work fine as long as the library resides on the system in a well-known path.

How to reproduce:

The problem can be simulated easily in the shell as follows.

1) Baseline: call /sbin/lsmod normally. Everything is working.

    $ /sbin/lsmod
    Module                  Size  Used by
    bdc                    53248  0
    udc_core               49152  1 bdc

2) Set LD_LIBRARY_PATH to the 32-bit directory and try again.

    $ LD_LIBRARY_PATH=/path/to/app /sbin/lsmod
    Error loading shared library libz.so.1: Exec format error (needed by /sbin/lsmod)
    Error loading shared library libcrypto.so.3: Exec format error (needed by /sbin/lsmod)

Suddenly the 64-bit binary fails to run, because the copies of libz and libcrypto the shared library loader finds are the 32-bit versions residing in the app directory. It never tries looking in /usr/lib.
Potential solution (the proposed patch):

    # LD_LIBRARY_PATH=/path/to/app /path/to/custom/libc.so /sbin/lsmod
    Module                  Size  Used by
    bdc                    53248  0
    udc_core               49152  1 bdc

With the attached patch applied, everything is working again.

Here, I am still setting LD_LIBRARY_PATH to the offending directory, but then I am using a patched version of musl-libc to launch /sbin/lsmod. This version of libc.so contains the patch, so it *will* search the rest of the system directories after discovering that the 32-bit versions of libz and libcrypto didn't work out. So, /sbin/lsmod is able to run fine.

We can confirm this using strace:

    # LD_LIBRARY_PATH=. /path/to/custom/libc.so \
        /usr/bin/strace -e openat /path/to/custom/libc.so --list /sbin/lsmod
    openat(AT_FDCWD, "/sbin/lsmod", O_RDONLY|O_LARGEFILE) = 3
            /lib/ld-musl-aarch64.so.1 (0x7fb72b1000)
    openat(AT_FDCWD, "./liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/etc/ld-musl-aarch64.path", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/usr/local/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/usr/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
            liblzma.so.5 => /usr/lib/liblzma.so.5 (0x7fb7265000)
    openat(AT_FDCWD, "./libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
    openat(AT_FDCWD, "/lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/usr/local/lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/usr/lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
            libz.so.1 => /usr/lib/libz.so.1 (0x7fb7251000)
    openat(AT_FDCWD, "./libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
    openat(AT_FDCWD, "/lib/libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/usr/local/lib/libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/usr/lib/libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
            libcrypto.so.3 => /usr/lib/libcrypto.so.3 (0x7fb6e92000)
            libc.so => /lib/ld-musl-aarch64.so.1 (0x7fb72b1000)

This call to strace is invoking the patched libc.so twice: once so strace itself won't fail, and once to launch /sbin/lsmod.

We can see it finds the 32-bit versions of a number of libraries, but then keeps searching once it finds it is unable to map them. In the end, it finds the proper libraries, and everything is working.

Conversely, the unpatched libc.so will not try any other location besides the one pointed to by LD_LIBRARY_PATH if the library it is looking for exists in LD_LIBRARY_PATH. (It'll find the proper liblzma, because that does NOT exist as a 32-bit version in LD_LIBRARY_PATH. But libz and libcrypto do, and that's where it fails.)

    openat(AT_FDCWD, "/sbin/lsmod", O_RDONLY|O_LARGEFILE) = 3
    openat(AT_FDCWD, "./liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/etc/ld-musl-aarch64.path", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/usr/local/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/usr/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
    openat(AT_FDCWD, "./libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
    openat(AT_FDCWD, "./libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
    +++ exited with 127 +++

My proposal is, of course, only one possibility of many to achieve the goal of searching the system library locations after mapping a library from LD_LIBRARY_PATH fails.

What is your take? Does the idea make sense in principle? Does my patch make sense?
Continuing to search the system directories does seem to be the right thing to do under the circumstances described here. Also, it is what glibc does.

Regards,
-Markus

Markus Mayer (1):
  ldso: continue searching if wrong architecture is found first

 ldso/dynlink.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

-- 
2.43.0