Message-ID: <CAMbhsRTaOwWuFss_ncpOnphQ1--8OT5H9=w-4e07gFcZhi4uug@mail.gmail.com>
Date: Wed, 7 Feb 2024 10:09:27 -0800
From: Colin Cross <ccross@...gle.com>
To: musl@...ts.openwall.com
Cc: Markus Mayer <mmayer@...adcom.com>
Subject: Re: [PATCH 0/1] ldso: continue searching if wrong architecture is found

On Tue, Feb 6, 2024 at 5:23 PM Markus Mayer <mmayer@...adcom.com> wrote:
>
> Hi all,
>
> We have just discovered an interesting issue after recently
> transitioning from glibc to musl-libc.
>
> This is an AArch64 platform with 64-bit userland that also supports
> running 32-bit applications. We are using GCC 12.3 and musl 1.2.3.
>
> Some application teams are still developing their apps as 32-bit
> applications. As mentioned, the base system is 64-bit. To ensure all app
> dependencies are met, all shared libraries the app needs are bundled
> with the app in a tar-ball by the application developers. The launch
> script of the app will set LD_LIBRARY_PATH to the directory containing
> these 32-bit libraries before calling the actual application. This way,
> the app can find all its shared libraries. Some of these 32-bit
> libraries are "standard" libraries (libz, libcrypto) that also exist as
> 64-bit versions in /usr/lib.
>
> What we have discovered is that 64-bit applications launched by the
> 32-bit app will fail due to a shared library mismatch. The 32-bit app
> needs to call system utilities, which are 64-bit. So, this needs to
> work.
>
> It took some digging, but I think I know what is going on. Let me
> summarize how to reproduce the problem, why it occurs and, hopefully,
> what can be done about it.
>
> The ingredients:
>
> * A 64-bit system
> * A directory containing 32-bit versions of "standard" libraries also
>   present on the system as 64-bit versions (libz, libcrypto, etc.)
> * LD_LIBRARY_PATH pointing to the custom 32-bit library directory
> * Attempting to launch a 64-bit application that uses one of the
>   libraries that exist as both 32-bit and 64-bit versions
>
> The problem:
>
> If LD_LIBRARY_PATH is set to a directory containing 32-bit libraries and
> then a 64-bit binary is invoked, the shared library loader will pick up
> the 32-bit version of a library first, because it'll look at
> LD_LIBRARY_PATH before anything else. Mapping the 32-bit library into
> the 64-bit process will fail. This much is expected.
>
> However, even though the correct library resides on the system, the
> shared library loader never attempts to look for it. The 64-bit process
> will fail to launch, even though there is no reason for the failure. The
> problem only exists because the shared library loader doesn't look in
> the remaining shared library directories.
>
> The solution:
>
> The shared library loader needs to keep searching the rest of the
> library search path if the library it found in LD_LIBRARY_PATH could not
> be mapped. If the library loader does this, everything will work fine as
> long as the library resides on the system in a well-known path.
>
> How to reproduce:
>
> The problem can be simulated easily in the shell as follows.
>
> 1) Baseline: call /sbin/lsmod normally. Everything is working.
>
>     $ /sbin/lsmod
>     Module                  Size  Used by
>     bdc                    53248  0
>     udc_core               49152  1 bdc
>
> 2) Set LD_LIBRARY_PATH to the 32-bit directory and try again.
>
>     $ LD_LIBRARY_PATH=/path/to/app /sbin/lsmod
>     Error loading shared library libz.so.1: Exec format error (needed by /sbin/lsmod)
>     Error loading shared library libcrypto.so.3: Exec format error (needed by /sbin/lsmod)
>
> Suddenly the 64-bit binary fails to run, because the copies of libz and
> libcrypto the shared library loader finds are the 32-bit versions
> residing in the app directory. It never tries looking in /usr/lib.
>
> Potential solution (the proposed patch):
>
>     # LD_LIBRARY_PATH=/path/to/app /path/to/custom/libc.so /sbin/lsmod
>     Module                  Size  Used by
>     bdc                    53248  0
>     udc_core               49152  1 bdc
>
> With the attached patch applied, everything is working again. Here, I am
> still setting LD_LIBRARY_PATH to the offending directory, but then I am
> using a patched version of musl-libc to launch /sbin/lsmod. This version
> of libc.so contains the patch, so it *will* search the rest of the
> system directories after discovering that the 32-bit versions of libz
> and libcrypto didn't work out. So, /sbin/lsmod is able to run fine.
>
> We can confirm this using strace:
>
>     # LD_LIBRARY_PATH=. /path/to/custom/libc.so \
>         /usr/bin/strace -e openat /path/to/custom/libc.so --list /sbin/lsmod
>     openat(AT_FDCWD, "/sbin/lsmod", O_RDONLY|O_LARGEFILE) = 3
>         /lib/ld-musl-aarch64.so.1 (0x7fb72b1000)
>     openat(AT_FDCWD, "./liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
>     openat(AT_FDCWD, "/etc/ld-musl-aarch64.path", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
>     openat(AT_FDCWD, "/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
>     openat(AT_FDCWD, "/usr/local/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
>     openat(AT_FDCWD, "/usr/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
>         liblzma.so.5 => /usr/lib/liblzma.so.5 (0x7fb7265000)
>     openat(AT_FDCWD, "./libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
>     openat(AT_FDCWD, "/lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
>     openat(AT_FDCWD, "/usr/local/lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
>     openat(AT_FDCWD, "/usr/lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
>         libz.so.1 => /usr/lib/libz.so.1 (0x7fb7251000)
>     openat(AT_FDCWD, "./libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
>     openat(AT_FDCWD, "/lib/libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
>     openat(AT_FDCWD, "/usr/local/lib/libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
>     openat(AT_FDCWD, "/usr/lib/libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
>         libcrypto.so.3 => /usr/lib/libcrypto.so.3 (0x7fb6e92000)
>         libc.so => /lib/ld-musl-aarch64.so.1 (0x7fb72b1000)
>
> This call to strace invokes the patched libc.so twice: once so that
> strace itself won't fail, and once to launch /sbin/lsmod. We can see
> that it finds the 32-bit versions of a number of libraries, but then
> keeps searching once it finds it is unable to map them. In the end, it
> finds the proper libraries, and everything is working.
>
> Conversely, the unpatched libc.so will not try any other location
> besides the one pointed to by LD_LIBRARY_PATH if the library it is
> looking for exists in LD_LIBRARY_PATH. (It'll find the proper liblzma,
> because that does NOT exist as a 32-bit version in LD_LIBRARY_PATH. But
> libz and libcrypto do, and that's where it fails.)
>
>     openat(AT_FDCWD, "/sbin/lsmod", O_RDONLY|O_LARGEFILE) = 3
>     openat(AT_FDCWD, "./liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
>     openat(AT_FDCWD, "/etc/ld-musl-aarch64.path", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
>     openat(AT_FDCWD, "/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
>     openat(AT_FDCWD, "/usr/local/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
>     openat(AT_FDCWD, "/usr/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
>     openat(AT_FDCWD, "./libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
>     openat(AT_FDCWD, "./libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
>     +++ exited with 127 +++
>
> My proposal is, of course, only one of many possible ways to achieve
> the goal of searching the system library locations after mapping a
> library from LD_LIBRARY_PATH fails.
>
> What is your take? Does the idea make sense in principle? Does my patch
> make sense?
>
> Continuing to search the system directories does seem to be the right
> thing to do under the circumstances described here. Also, it is what
> glibc does.
>
> Regards,
> -Markus
>
> Markus Mayer (1):
>   ldso: continue searching if wrong architecture is found first
>
>  ldso/dynlink.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
>
> --
> 2.43.0
>

There is a previous discussion of the same issue at
https://www.openwall.com/lists/musl/2023/02/07/3.
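
For readers following the thread, the behaviour requested in the quoted
message (keep walking the search path when a candidate library has the
wrong architecture) can be sketched in C roughly as follows. This is only
an illustration and is not the patch to ldso/dynlink.c, which is not
reproduced in this message. The function names elf_matches and
open_matching are hypothetical, and the check assumes nothing beyond the
ELF header layout described in <elf.h>.

```c
#include <elf.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if fd starts with an ELF header whose
 * class (32/64-bit) and machine match what the caller wants, else 0.
 * e_machine sits at byte offset 18 in both ELF32 and ELF64 headers. */
int elf_matches(int fd, unsigned char want_class, unsigned want_machine)
{
    unsigned char h[20];
    unsigned machine;

    if (read(fd, h, sizeof h) != (ssize_t)sizeof h)
        return 0;
    lseek(fd, 0, SEEK_SET);            /* leave fd rewound for the caller */
    if (memcmp(h, ELFMAG, SELFMAG) != 0)
        return 0;
    if (h[EI_CLASS] != want_class)
        return 0;
    /* Decode e_machine in the byte order the file itself declares. */
    if (h[EI_DATA] == ELFDATA2MSB)
        machine = (unsigned)h[18] << 8 | h[19];
    else
        machine = (unsigned)h[19] << 8 | h[18];
    return machine == want_machine;
}

/* Hypothetical helper: walk a colon-separated search path and return an
 * open fd for the first copy of `name` that matches the wanted class and
 * machine. Candidates with the wrong architecture are closed and skipped
 * rather than treated as a fatal error. The chosen path is left in `out`.
 * Returns -1 if no directory holds a usable copy. */
int open_matching(const char *name, const char *search_path,
                  unsigned char want_class, unsigned want_machine,
                  char *out, size_t out_len)
{
    const char *p = search_path;

    while (*p) {
        size_t len = strcspn(p, ":");
        if (len > 0 &&
            (size_t)snprintf(out, out_len, "%.*s/%s", (int)len, p, name) < out_len) {
            int fd = open(out, O_RDONLY);
            if (fd >= 0) {
                if (elf_matches(fd, want_class, want_machine))
                    return fd;
                close(fd);             /* wrong architecture: keep searching */
            }
        }
        p += len;
        if (*p == ':')
            p++;
    }
    return -1;
}
```

With a search path such as "/path/to/app:/usr/lib" and ELFCLASS64 /
EM_AARCH64 as the wanted values, this sketch would skip a 32-bit
libz.so.1 in the first directory and return the copy in /usr/lib, if one
is present there.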