Message-ID: <CACg5n_P1HxMUqUrt9dF_QmHBuvwp9Sw5GhgXi5Un6CJDssC1dg@mail.gmail.com>
Date: Sat, 1 Apr 2023 22:57:09 -0400
From: Matt Wozniski <godlygeek@...il.com>
To: Matt Wozniski <godlygeek@...il.com>, musl@...ts.openwall.com
Cc: nsz@...t70.net
Subject: Re: Unwinding multithreaded musl applications with elfutils fails

On Fri, Mar 31, 2023 at 7:40 AM Szabolcs Nagy <nsz@...t70.net> wrote:
>
> * Matt Wozniski <godlygeek@...il.com> [2023-03-30 22:43:28 -0400]:
> > Using the elfutils eu-stack program or libdw's dwfl_getthread_frames
> > API to unwind multithreaded applications linked against musl libc on
> > x86-64 fails, getting stuck on `__clone`:
>
> musl has limited cfi debug info support (target specific), likely the
> unwinder needs a
>
>   .cfi_undefined rip
>
> in the clone start function to know where the stack frames end.
...
> musl supports building things without any cfi debug info since c
> does not require unwind support, but linux systems nowadays assume
> unwind tables are part of the platform abi so musl based distros
> should probably include it.
...
> musl does not guarantee frame-pointers either

So, if I understand you correctly: musl itself doesn't guarantee that
it can be unwound through at all (neither with DWARF unwind tables nor
with frame pointers), but musl-based distros like Alpine ought to
include proper unwind tables. Does that mean that you don't consider
the lack of CFI for `__clone` a defect in musl, but that it's still
worth reporting to the Alpine musl maintainers as a defect in Alpine's
musl build?

If so, what would distro maintainers have to do to remedy that defect?
Would they need to patch the (target-specific) `clone.s` to add
appropriate CFI when building musl for the distro?

> (it could figure out the end with the same heuristic that gdb uses,
> but apparently elfutils is not smart enough).
>
> some backtracers may want cleared frame-pointer (rbp=0) to detect
> the end.
...
> rbp=0 may be the reason why backtrace in the main thread works, so it
> may be enough to do that in threads too.

And it sounds like both of these are workarounds that elfutils might
be able to pursue in the absence of correct unwind information built
into musl itself. Thanks, that gives a useful direction to dig in.
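
To check my understanding of the rbp=0 suggestion, here is a toy C
sketch of a frame-pointer walk that stops at a cleared frame pointer.
This is not elfutils or musl code: `walk_frames`, the word-array model
of memory, and the frame layout (saved rbp at index i, return address
at index i + 1) are all assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: "memory" is an array of 64-bit words and a frame
 * pointer is an index into it. For a frame at index i, mem[i] holds the
 * caller's saved frame pointer and mem[i + 1] holds the return address.
 * A saved frame pointer of 0 marks the outermost frame.
 *
 * Returns the number of return addresses written to out.
 */
size_t walk_frames(const uint64_t *mem, size_t mem_words, uint64_t rbp,
                   uint64_t *out, size_t max_frames)
{
    size_t n = 0;

    assert(mem != NULL && out != NULL);
    if (mem_words < 2)
        return 0;

    /* rbp == 0 is the end-of-stack marker; max_frames is a safety cap. */
    while (rbp != 0 && n < max_frames) {
        if (rbp > mem_words - 2)   /* frame would fall outside memory */
            break;
        out[n++] = mem[rbp + 1];   /* record the return address */
        rbp = mem[rbp];            /* step to the caller's frame */
    }
    return n;
}
```

In this model, a chain whose outermost frame has a saved frame pointer
of 0 ends cleanly, while a chain whose outermost frame points at stale
data keeps going until the cap is hit, which I assume is roughly the
"stuck" behavior I'm seeing on `__clone`.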

Thanks for the reply!
