Message-ID: <20200727215001.GO6949@brightrain.aerifal.cx>
Date: Mon, 27 Jul 2020 17:50:01 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: friendly errors for ABI mismatch

On Mon, Jul 27, 2020 at 02:57:10PM -0600, Ariadne Conill wrote:
> Hello,
> 
> On Monday, July 27, 2020 10:03:30 AM MDT Rich Felker wrote:
> > On Mon, Jul 27, 2020 at 09:27:28AM -0600, Ariadne Conill wrote:
> > > Hello,
> > > 
> > > > On 32-bit systems, musl 1.2 has a new ABI (due to time64).  This results in
> > > > programs built against musl 1.2 failing to run against musl 1.1.  That
> > > part is fine, but you get an error message about being unable to relocate
> > > symbols, which is not really insightful if you don't know about the ABI
> > > break.
> > > 
> > > glibc, on the other hand, has a minimum version specified in every binary,
> > > and prints an error message saying the glibc is too old if this situation
> > > is encountered.
> > > 
> > > I think we should add this feature to musl, so that in the future if we
> > > have another ABI break, users will be given useful advice about how to
> > > fix it.  Due to the relocation error message, a few Alpine contributors
> > > have been tripped up while trying to debug their work...
> > 
> > What you're seeing here is just a special case of the general property
> > that, if you've linked to a version of libc (or any library) that has
> > a new symbol and attempt to run with an older version, you'll get a
> > missing symbol error. It's very intentional (see libc comparison and
> > "forward compatibility") that we don't encode "minimum version number"
> > required anywhere. If you attempt to run with a library that has all
> > the symbols, it will run, subject to any bugs in the library version
> > you have and any functionality that returns with failure because it's
> > not supported in the version you have, etc.
> > 
> > There is no way to give a more high-level reason for the runtime link
> > failure like "your program needs time64 and you're running with an old
> > musl" because the code reporting the error *is the old musl* that's
> > not aware of whatever it is that the new binary is missing. Maybe you
> > have something else in mind that I don't fully understand, but
> > whatever it is it would only address future missing symbol errors, not
> > the ones you're seeing right now.
> 
> What I have in mind is simply having friendly errors in the future; obviously
> we cannot do it for time64.

I'm still not sure what that would look like. ELF dynamic linking,
modeling C static linking semantics, does not bind symbol resolution
to a particular library, so there's no way to know that an unresolved
symbol was "supposed to be defined in libc" and that this means your
libc.so is too old. All the dynamic linker can tell is that the
program being loaded needs the symbol to be defined and that it's not
defined in any libraries present. And that's what the existing error
message tells you.
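
As a rough illustration of the lookup described above, here is a toy Python
model, not musl's actual loader code. The library names, the symbol sets, and
the helper names `resolve` and `relocate` are all hypothetical, and the error
string only loosely imitates a loader message. The point it tries to show is
that an undefined reference carries just a symbol name, so a failed lookup
cannot say which library was "supposed" to provide it.

```python
"""Toy model of unversioned symbol lookup (illustrative only)."""

# Hypothetical loaded objects in search order: (name, set of defined symbols).
LOADED = [
    ("libfoo.so", {"foo_init"}),
    ("libc.so", {"printf", "time"}),
]


def resolve(symbol, loaded=LOADED):
    """Return the name of the first loaded object defining `symbol`, else None.

    The reference is only a name: nothing in it says which library the
    link-time definition originally came from.
    """
    for lib_name, defined in loaded:
        if symbol in defined:
            return lib_name
    return None


def relocate(program, needed_symbols, loaded=LOADED):
    """Return one error string for each needed symbol that no object defines."""
    errors = []
    for symbol in needed_symbols:
        if resolve(symbol, loaded) is None:
            # All the loader can report: the symbol is needed and not defined.
            errors.append(
                f"Error relocating {program}: {symbol}: symbol not found"
            )
    return errors


if __name__ == "__main__":
    # "__time64" stands in for a symbol that only a newer libc would define.
    for line in relocate("app", ["printf", "__time64"]):
        print(line)
```

In this sketch the failing lookup for `__time64` has walked every loaded
object, so the only information left to report is the symbol name itself.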

Symbol versioning, if used, changes this somewhat by binding to a
particular version string (which by convention usually contains a
library name too) *if* the library used to resolve it at runtime has
versioning, but for very good reasons we have not used and do not want
to use symbol versioning. (In short, as in this case, it's an
"approximate solution" for most things people want to use it for: it
doesn't actually achieve those things precisely, messes other things up
in the process, and has really really bad tooling support.)
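
To make the contrast with versioned lookup concrete, here is a deliberately
simplified Python sketch. It is a toy model under stated assumptions, not the
real ELF versioning algorithm: the version strings such as `EXAMPLELIBC_2.0`,
the library contents, and the function name `resolve_versioned` are
hypothetical. It models only the two properties mentioned above: a reference
can carry a version string, and that string is only checked when the library
providing the definition has version information.

```python
"""Toy model of versioned symbol lookup (illustrative only)."""


def resolve_versioned(symbol, version, loaded):
    """Return the name of the first object satisfying (symbol, version).

    `loaded` is a list of (library_name, definitions) pairs. `definitions`
    maps a symbol name to a set of version strings, or to None when the
    library carries no version information at all.
    """
    for lib_name, definitions in loaded:
        if symbol not in definitions:
            continue
        versions = definitions[symbol]
        if versions is None:
            # Unversioned library: the requested version cannot be checked,
            # so any definition of the name is accepted.
            return lib_name
        if version in versions:
            return lib_name
    return None


if __name__ == "__main__":
    old_versioned = [("libc.so", {"time": {"EXAMPLELIBC_1.0"}})]
    unversioned = [("libc.so", {"time": None})]
    print(resolve_versioned("time", "EXAMPLELIBC_2.0", old_versioned))
    print(resolve_versioned("time", "EXAMPLELIBC_2.0", unversioned))
```

In this model the version string, which by convention names a library, is what
would let an error message point at a library. The unversioned case shows why
that binding is only approximate.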

Rich
