Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210717132543.GG13220@brightrain.aerifal.cx>
Date: Sat, 17 Jul 2021 09:25:44 -0400
From: Rich Felker <dalias@...c.org>
To: Timo Teras <timo.teras@....fi>
Cc: musl@...ts.openwall.com
Subject: Re: option to enable eh_frame

On Sat, Jul 17, 2021 at 10:33:38AM +0300, Timo Teras wrote:
> On Fri, 16 Jul 2021 11:57:35 -0400
> Rich Felker <dalias@...c.org> wrote:
> 
> > On Fri, Jul 16, 2021 at 12:16:25PM +0300, Timo Teras wrote:
> > 
> > The debugger already can do debugging/unwinding because it has access
> > to the debug information (if you've installed it) and there is a
> > clear, understood-by-users contract that this information is not an
> > inherent part of the program but something optional for external
> > debugging tools only.
> 
> This apparently the fundamental difference in thinking. The debugging
> information is often huge, and people don't want to install it on
> production or embedded devices. It ships ton of other features which
> are not needed. However, the unwind/backtrace feature does provide
> significant value for error reports and profiling.
> 
> > > > Arguments to have .eh_frame:
> > >  - It allows debugging things even if musl-dbg is not or cannot be
> > >    installed for some reason (e.g. gdb, valgrind), or is no longer
> > >    available
> > >  - libunwind/libexecinfo will start to work and give meaningful
> > >    backtraces  
> > 
> > This is explicitly a reason not to. backtrace() considered harmful.
> 
> Please elaborate. I agree it's harmful from SEGV handler when program
> might be targeted. But used in other situations where it is useful. Do
> you have additional reasons why backtrace() is harmful?

That's the main one in which it's explicitly *harmful*, but in
general, introspection is not part of the C language and puts a lot of
constraints on implementation if you want to programs to be able to do
it in a meaningful way. This has never been something musl intended to
support, for that reason.

Note that we're even cautious (or regretful) over leaky abstractions
like RTLD_NEXT which fails to work right if you happen to call dlsym
in a tail call position.

> Do note that libunwind --enable-debug-frame is enabled default by
> upstream package on ARM and aarch64, and this would work if the debug
> package is installed.
> 
> This gives inconsistent behaviour based on arch and whether -dbg is
> installed. Developer might think the unwind stuff works, only to find
> it breaks on different install, and make the application depend on -dbg.

This is an important distinction and why it should be in -dbg.
"Application depending on.." is not functionality musl supports.
"Application interfacing with a debugging tool that's loading it's own
-dbg files" is visibly something different.

> The .debug_frame in main is a possibility, but no one else does it.
> Well, apparently some ship the full debug build always to achieve some
> of the same things. I think this also achieves the same divergence from
> the normal expectation.

We have a lot of divergence from "normal expectations" when normal
expectations come from questionable things GNU projects pushed.

> As small side note, the -fno-*unwind-tables flags are also in "tryflags"
> so if compiler would not support it, you don't get it. So depending on
> compiler you might still get .eh_frame ;)

In practice I'm not aware of any that support generation at all but
don't honor it.

However, more to the point, C is a language with undefined behavior.
There is no way to enforce that the tooling prevent applications from
doing broken things. Rather there's only avoiding explicitly doing
things to give the impression that something works when it's not
intended to.

"It's a fluke and that happened to work for your particular
environment only" is very different from "we actively made a change to
make that work". The latter comes with some implied obligation to
support it. For example if we changed freeaddrinfo to support freeing
a "null sublist" (which we arguably should; see past discussion), that
would entail an obligation not to revert it. But if the null pointer
deref present now just happened not to crash on some arch, or if we
just made a random change where the deref no longer happened in the
code path, that would not entail an obligation to keep it "working".

> The .eh_frame is a ELF abi feature normally turned on, and usually
> needs good reasons to turn it off. Could you Rich please explain in
> detail all the reasons why you think it's so evil? Mostly I heard
> "feature X is evil and people should not use it" and I find that
> your personal opinion, not a valid technical reason. But listening
> more and earlier things, I've heard:
> 
> 1. Having it makes it seem an application can jump through C library
>    from e.g. qsort callback. Not having .eh_frame makes C++ exception
>    fail, but does not help with e.g. setjmp/longjmp. So this is not
>    complete solution.

The same here. You seem to think this is an effort to make technical
measures to "block" doing something undefined. That's fundamentally
impossible, at least without a very different type of C
implementation.

> 2. Explicitly disabling backtrace() universally. I assume this is to
>    disable to widespread usage of it from SEGV handler. But there
>    seem to be also implementations reading .debug_frame, so this does
>    not really fix that either fully. And I think there are valid use
>    cases for backtrace() also, e.g. when having debug build and having
>    internal debug printing trigger it.

*If* this is the intent, that's more of a reason to have it load debug
packages to operate: so it's explicit that it's a debugging-only
operation and doesn't belong in production, and that you're using
features outside of what the implementation provides/supports.

> 3. Adds little bit if size to the binary. And also the runtime memory
>    usage as it goes to LOAD segment. But 80kB is not a concern on
>    modern environment even on embedded platforms.

There are embedded platforms with only 8MB or so, half or more
consumed by kernel, and entire fs in initramfs. I don't think this is
the most significant reason not to have it, but you're coming from a
very "large system" perspective here.

> 4. You want to still enforce this so people don't do 1 or 2.
>    For this, time has taught me, people need education, not enforcement.
>    If you don't teach them, they'll still go and do the same stupid
>    things even more stupid ways.
> 
> Please add in any reasons I may have missed. I would like to have your
> complete list of reasons why to remove .eh_frame. So far it has been
> coming out in pieces. I'd like constructive discussion if some of these
> items could be implemented stronger in other means than removing
> ..eh_frame.

You're coming at it from the wrong direction. For new, nonstandard
functionality requests, musl has a well established process of
criteria for inclusion and exclusion, based on invasiveness (this is
not just a matter of code change size, but of constraints it imposes),
size, how often the lack of the functionality impacts portable or
important FOSS programs users want to run on musl, and whether there
are other ways the applications that want it could achieve what they
want. In this case, all of those criteria indicate against doing it.

> But fundamentally I think if we ship .debug_frame, all people wanting
> do backtrace() and the silly stuff, will just enable .debug_frame
> support in their code and still do the silly stuff in worse way, than
> if .eh_frame was enabled. Since technically they are the same, even if
> the intended use case is different. As seen libunwind does have
> --enable-debug-frame. 

They might, but then they're clearly using a debugging interface that
only works when debug files are available (or if you include this in
main libc.so, when libc.so is not stripped).

> For me it's quite unusual to have .debug_frame but not other debug
> sections. I assume it works. But the tooling certain does not endorse
> this setup. I suppose it can be achieved with some objcopy magic, but
> requires care.

If tooling doesn't like .debug_frame in main library but additional
debug info split out, one way to do this would be providing separate
versions of the -dbg package, one that's .debug_frame only and one
that's detailed. I like this because there's an option not to install
it at all and it's clear that this is debug functionality. But either
way should be ok.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.