Date: Wed, 7 Feb 2024 10:09:27 -0800
From: Colin Cross <ccross@...gle.com>
To: musl@...ts.openwall.com
Cc: Markus Mayer <mmayer@...adcom.com>
Subject: Re: [PATCH 0/1] ldso: continue searching if wrong architecture
 is found

On Tue, Feb 6, 2024 at 5:23 PM Markus Mayer <mmayer@...adcom.com> wrote:
>
> Hi all,
>
> We have just discovered an interesting issue after recently
> transitioning from glibc to musl-libc.
>
> This is an Aarch64 platform with 64-bit userland that also supports
> running 32-bit applications. We are using GCC 12.3 and musl 1.2.3.
>
> Some application teams are still developing their apps as 32-bit
> applications. As mentioned, the base system is 64-bit. To ensure all app
> dependencies are met, all shared libraries the app needs are bundled
> with the app in a tar-ball by the application developers. The launch
> script of the app will set LD_LIBRARY_PATH to the directory containing
> these 32-bit libraries before calling the actual application. This way,
> the app can find all its shared libraries. Some of these 32-bit
> libraries are "standard" libraries (libz, libcrypto) that also exist as
> 64-bit versions in /usr/lib.
>
> What we have discovered is that 64-bit applications launched by the
> 32-bit app will fail due to a shared library mismatch. The 32-bit app
> needs to call system utilities, which are 64-bit. So, this needs to
> work.
>
> It took some digging, but I think I know what is going on. Let me
> summarize how to reproduce the problem, why it occurs and, hopefully,
> what can be done about it.
>
> The ingredients:
>
> * A 64-bit system
> * A directory containing 32-bit versions of "standard" libraries also
>   present on the system as 64-bit versions (libz, libcrypto, etc.)
> * LD_LIBRARY_PATH pointing to the custom 32-bit library directory
> * Attempting to launch a 64-bit application that uses one of the
>   libraries that exist in both 32-bit and 64-bit versions
>
> The problem:
>
> If LD_LIBRARY_PATH is set to a directory containing 32-bit libraries and
> then a 64-bit binary is invoked, the shared library loader will pick up
> the 32-bit version of a library first, because it'll look at
> LD_LIBRARY_PATH before anything else. Mapping the 32-bit library into
> the 64-bit process will fail. This much is expected.
>
> However, even though the correct library resides on the system, the
> shared library loader never attempts to look for it. The 64-bit process
> will fail to launch, even though there is no reason for the failure. The
> problem exists only because the shared library loader doesn't look in
> the remaining shared library directories.
>
> The solution:
>
> The shared library loader needs to keep searching the rest of the
> library search path if the library it found in LD_LIBRARY_PATH could not
> be mapped. If the library loader does this, everything will work fine as
> long as the library resides on the system in a well known path.
>
> How to reproduce:
>
> The problem can be simulated easily in the shell as follows.
>
> 1) Baseline: call /sbin/lsmod normally. Everything is working.
>
> $ /sbin/lsmod
> Module                  Size  Used by
> bdc                    53248  0
> udc_core               49152  1 bdc
>
> 2) Set LD_LIBRARY_PATH to the 32-bit directory and try again.
>
> $ LD_LIBRARY_PATH=/path/to/app /sbin/lsmod
> Error loading shared library libz.so.1: Exec format error (needed by /sbin/lsmod)
> Error loading shared library libcrypto.so.3: Exec format error (needed by /sbin/lsmod)
>
> Suddenly the 64-bit binary fails to run, because the copies of libz and
> libcrypto the shared library loader finds are the 32-bit versions
> residing in the app directory. It never tries looking in /usr/lib.
>
> Potential solution (the proposed patch):
>
> # LD_LIBRARY_PATH=/path/to/app /path/to/custom/libc.so /sbin/lsmod
> Module                  Size  Used by
> bdc                    53248  0
> udc_core               49152  1 bdc
>
> With the attached patch applied, everything is working again. Here, I am
> still setting LD_LIBRARY_PATH to the offending directory, but then I am
> using a patched version of musl-libc to launch /sbin/lsmod. This version
> of libc.so contains the patch, so it *will* search the rest of the
> system directories after discovering that the 32-bit versions of libz
> and libcrypto didn't work out. So, /sbin/lsmod is able to run fine.
>
> We can confirm this using strace:
>
> # LD_LIBRARY_PATH=. /path/to/custom/libc.so \
>     /usr/bin/strace -e openat /path/to/custom/libc.so --list /sbin/lsmod
> openat(AT_FDCWD, "/sbin/lsmod", O_RDONLY|O_LARGEFILE) = 3
>         /lib/ld-musl-aarch64.so.1 (0x7fb72b1000)
> openat(AT_FDCWD, "./liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
> openat(AT_FDCWD, "/etc/ld-musl-aarch64.path", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
> openat(AT_FDCWD, "/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
> openat(AT_FDCWD, "/usr/local/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
> openat(AT_FDCWD, "/usr/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
>         liblzma.so.5 => /usr/lib/liblzma.so.5 (0x7fb7265000)
> openat(AT_FDCWD, "./libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
> openat(AT_FDCWD, "/lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
> openat(AT_FDCWD, "/usr/local/lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
> openat(AT_FDCWD, "/usr/lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
>         libz.so.1 => /usr/lib/libz.so.1 (0x7fb7251000)
> openat(AT_FDCWD, "./libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
> openat(AT_FDCWD, "/lib/libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
> openat(AT_FDCWD, "/usr/local/lib/libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
> openat(AT_FDCWD, "/usr/lib/libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
>         libcrypto.so.3 => /usr/lib/libcrypto.so.3 (0x7fb6e92000)
>         libc.so => /lib/ld-musl-aarch64.so.1 (0x7fb72b1000)
>
> This call to strace is invoking the patched libc.so twice. Once, so
> strace itself won't fail, and once to launch /sbin/lsmod. We can see it
> finds the 32-bit versions of a number of libraries, but then keeps
> searching, once it finds it is unable to map them. In the end, it finds
> the proper libraries, and everything is working.
>
> Conversely, the unpatched libc.so will not try any other location
> besides the one pointed to by LD_LIBRARY_PATH if the library it is
> looking for exists in LD_LIBRARY_PATH. (It'll find the proper liblzma,
> because that does NOT exist as a 32-bit version in LD_LIBRARY_PATH. But
> libz and libcrypto do, and that's where it fails.)
>
> openat(AT_FDCWD, "/sbin/lsmod", O_RDONLY|O_LARGEFILE) = 3
> openat(AT_FDCWD, "./liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
> openat(AT_FDCWD, "/etc/ld-musl-aarch64.path", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
> openat(AT_FDCWD, "/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
> openat(AT_FDCWD, "/usr/local/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
> openat(AT_FDCWD, "/usr/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
> openat(AT_FDCWD, "./libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
> openat(AT_FDCWD, "./libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
> +++ exited with 127 +++
>
> My proposal is, of course, only one possibility of many to achieve the
> goal of searching the system library locations after mapping a library
> from LD_LIBRARY_PATH fails.
>
> What is your take? Does the idea make sense in principle? Does my patch
> make sense?
>
> Continuing to search the system directories does seem to be the right
> thing to do under the circumstances described here. Also, it is what
> glibc does.
>
> Regards,
> -Markus
>
> Markus Mayer (1):
>   ldso: continue searching if wrong architecture is found first
>
>  ldso/dynlink.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
>
> --
> 2.43.0
>

There is a previous discussion of the same issue at
https://www.openwall.com/lists/musl/2023/02/07/3.
