Message-ID: <20240207012247.1121273-1-mmayer@broadcom.com>
Date: Tue,  6 Feb 2024 17:22:42 -0800
From: Markus Mayer <mmayer@...adcom.com>
To: Musl Mailing List <musl@...ts.openwall.com>
Cc: Markus Mayer <mmayer@...adcom.com>
Subject: [PATCH 0/1] ldso: continue searching if wrong architecture is found

Hi all,

We have just discovered an interesting issue after recently
transitioning from glibc to musl-libc.

This is an AArch64 platform with a 64-bit userland that also supports
running 32-bit applications. We are using GCC 12.3 and musl 1.2.3.

Some application teams are still developing their apps as 32-bit
applications. As mentioned, the base system is 64-bit. To ensure all app
dependencies are met, all shared libraries the app needs are bundled
with the app in a tar-ball by the application developers. The launch
script of the app will set LD_LIBRARY_PATH to the directory containing
these 32-bit libraries before calling the actual application. This way,
the app can find all its shared libraries. Some of these 32-bit
libraries are "standard" libraries (libz, libcrypto) that also exist as
64-bit versions in /usr/lib.

What we have discovered is that 64-bit applications launched by the
32-bit app will fail due to a shared library mismatch. The 32-bit app
needs to call system utilities, which are 64-bit. So, this needs to
work.

It took some digging, but I think I know what is going on. Let me
summarize how to reproduce the problem, why it occurs and, hopefully,
what can be done about it.

The ingredients:

* A 64-bit system
* A directory containing 32-bit versions of "standard" libraries also
  present on the system as 64-bit versions (libz, libcrypto, etc.)
* LD_LIBRARY_PATH pointing to the custom 32-bit library directory
* Attempting to launch a 64-bit application that uses one of the
  libraries that exist in both 32-bit and 64-bit versions

The problem:

If LD_LIBRARY_PATH is set to a directory containing 32-bit libraries and
then a 64-bit binary is invoked, the shared library loader will pick up
the 32-bit version of a library first, because it'll look at
LD_LIBRARY_PATH before anything else. Mapping the 32-bit library into
the 64-bit process will fail. This much is expected.

However, even though the correct library resides on the system, the
shared library loader never attempts to look for it. The 64-bit process
fails to launch even though there is no real reason for the failure.
The problem exists only because the shared library loader doesn't look
in the remaining shared library directories.

The solution:

The shared library loader needs to keep searching the rest of the
library search path if the library it found in LD_LIBRARY_PATH could not
be mapped. If the library loader does this, everything will work fine as
long as the library resides on the system in a well-known path.
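
As an illustration only, the "keep searching" idea can be modelled in a
few lines of Python. This is not the attached patch and not musl code:
the names elf_class() and find_library() are hypothetical, and the
model checks only the ELF class byte, whereas a real loader validates
more of the header before mapping a file.

```python
import os

ELFCLASS32, ELFCLASS64 = 1, 2  # values of e_ident[EI_CLASS] in the ELF header


def elf_class(path):
    """Return ELFCLASS32 or ELFCLASS64 for an ELF file, else None."""
    try:
        with open(path, "rb") as f:
            ident = f.read(5)
    except OSError:
        return None  # missing or unreadable file
    if len(ident) < 5 or ident[:4] != b"\x7fELF":
        return None  # not an ELF file
    return ident[4] if ident[4] in (ELFCLASS32, ELFCLASS64) else None


def find_library(name, search_dirs, want_class):
    """Return the first candidate whose ELF class matches want_class.

    A candidate that exists but has the wrong class is skipped and the
    search moves on to the next directory instead of giving up.
    """
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if elf_class(candidate) == want_class:
            return candidate
    return None
```

With search_dirs set to the LD_LIBRARY_PATH entries followed by the
system directories, a 32-bit libz.so.1 in the app directory no longer
hides the 64-bit copy further down the list.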

How to reproduce:

The problem can be simulated easily in the shell as follows.

1) Baseline: call /sbin/lsmod normally. Everything is working.

$ /sbin/lsmod
Module                  Size  Used by
bdc                    53248  0
udc_core               49152  1 bdc

2) Set LD_LIBRARY_PATH to the 32-bit directory and try again.

$ LD_LIBRARY_PATH=/path/to/app /sbin/lsmod
Error loading shared library libz.so.1: Exec format error (needed by /sbin/lsmod)
Error loading shared library libcrypto.so.3: Exec format error (needed by /sbin/lsmod)

Suddenly the 64-bit binary fails to run, because the copies of libz and
libcrypto the shared library loader finds are the 32-bit versions
residing in the app directory. It never tries looking in /usr/lib.
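
To check which copy of a library is which, the ELF header can be
inspected directly. The sketch below is a hypothetical helper, not
part of musl or of the patch. It assumes the 32-bit libraries are
built for 32-bit ARM (EM_ARM), which this report does not state
explicitly, and it reads only the class byte and the e_machine field.

```python
import struct

EI_CLASS_NAMES = {1: "ELF32", 2: "ELF64"}
E_MACHINE_NAMES = {40: "ARM", 183: "AArch64"}  # EM_ARM, EM_AARCH64


def describe_elf(header):
    """Describe the first 20 bytes of a file as (class, machine) strings.

    Returns None if the bytes do not start with the ELF magic.
    """
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    # EI_DATA: 1 means little-endian; anything else is read as big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18 in both ELF32 and ELF64.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return (EI_CLASS_NAMES.get(header[4], "unknown"),
            E_MACHINE_NAMES.get(machine, "machine %d" % machine))


def describe_file(path):
    """Read the start of a file and describe it with describe_elf()."""
    with open(path, "rb") as f:
        return describe_elf(f.read(20))
```

Calling describe_file() on the libz.so.1 in the app directory and on
the one in /usr/lib would show whether the two differ in class and
machine.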

Potential solution (the proposed patch):

# LD_LIBRARY_PATH=/path/to/app /path/to/custom/libc.so /sbin/lsmod
Module                  Size  Used by
bdc                    53248  0
udc_core               49152  1 bdc

With the attached patch applied, everything is working again. Here, I am
still setting LD_LIBRARY_PATH to the offending directory, but then I am
using a patched version of musl-libc to launch /sbin/lsmod. This version
of libc.so contains the patch, so it *will* search the rest of the
system directories after discovering that the 32-bit versions of libz
and libcrypto didn't work out. So, /sbin/lsmod is able to run fine.

We can confirm this using strace:

# LD_LIBRARY_PATH=. /path/to/custom/libc.so \
    /usr/bin/strace -e openat /path/to/custom/libc.so --list /sbin/lsmod
openat(AT_FDCWD, "/sbin/lsmod", O_RDONLY|O_LARGEFILE) = 3
	/lib/ld-musl-aarch64.so.1 (0x7fb72b1000)
openat(AT_FDCWD, "./liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld-musl-aarch64.path", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/local/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
	liblzma.so.5 => /usr/lib/liblzma.so.5 (0x7fb7265000)
openat(AT_FDCWD, "./libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/local/lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
	libz.so.1 => /usr/lib/libz.so.1 (0x7fb7251000)
openat(AT_FDCWD, "./libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/local/lib/libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
	libcrypto.so.3 => /usr/lib/libcrypto.so.3 (0x7fb6e92000)
	libc.so => /lib/ld-musl-aarch64.so.1 (0x7fb72b1000)

This call to strace invokes the patched libc.so twice: once so that
strace itself won't fail, and once to launch /sbin/lsmod. We can see
that the loader finds the 32-bit versions of several libraries, but
keeps searching once it finds that it is unable to map them. In the
end, it finds the proper libraries, and everything is working.

Conversely, the unpatched libc.so will not try any other location
besides the one pointed to by LD_LIBRARY_PATH if the library it is
looking for exists in LD_LIBRARY_PATH. (It'll find the proper liblzma,
because that does NOT exist as a 32-bit version in LD_LIBRARY_PATH. But
libz and libcrypto do, and that's where it fails.)

openat(AT_FDCWD, "/sbin/lsmod", O_RDONLY|O_LARGEFILE) = 3
openat(AT_FDCWD, "./liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld-musl-aarch64.path", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/local/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/liblzma.so.5", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
openat(AT_FDCWD, "./libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
openat(AT_FDCWD, "./libcrypto.so.3", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
+++ exited with 127 +++
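
For longer traces, the comparison can be made mechanical. The
following is a small, hypothetical Python helper (the name
attempts_by_library() is made up for this example) that groups the
openat() lines of an strace log by file name. It assumes the line
format shown in the traces above and ignores everything else.

```python
import re

# Matches lines such as:
#   openat(AT_FDCWD, "./libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
OPENAT = re.compile(r'^openat\(AT_FDCWD, "([^"]+)", [^)]*\) = (-?\d+)')


def attempts_by_library(trace_lines):
    """Group openat() attempts by file name, in the order they were tried.

    Returns a dict mapping each base name to a list of (path, opened)
    pairs, where opened is True if openat() returned a file descriptor.
    """
    grouped = {}
    for line in trace_lines:
        match = OPENAT.match(line)
        if match is None:
            continue  # loader --list output, exit status, blank lines
        path, result = match.group(1), int(match.group(2))
        name = path.rsplit("/", 1)[-1]
        grouped.setdefault(name, []).append((path, result >= 0))
    return grouped
```

In the unpatched trace, libz.so.1 has a single successful attempt in
the LD_LIBRARY_PATH directory and nothing after it. In the patched
trace, the same name is followed by further attempts that end in
/usr/lib. Files such as /sbin/lsmod appear as their own entries.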

My proposal is, of course, only one possibility of many to achieve the
goal of searching the system library locations after mapping a library
from LD_LIBRARY_PATH fails.

What is your take? Does the idea make sense in principle? Does my patch
make sense?

Continuing to search the system directories does seem to be the right
thing to do under the circumstances described here. Also, it is what
glibc does.

Regards,
-Markus

Markus Mayer (1):
  ldso: continue searching if wrong architecture is found first

 ldso/dynlink.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

-- 
2.43.0
