Message-ID: <CANmYxDvS2W2LwW=o62bcJmK+e3k5FMMQNXGozyJ0LH73Y6T5kQ@mail.gmail.com>
Date: Mon, 1 Feb 2021 08:55:08 -0800
From: Nick Bray <ncbray@...gle.com>
To: musl@...ts.openwall.com
Cc: Rich Felker <dalias@...c.org>
Subject: Re: Can’t build musl with lto=thin
Small warning:
When I statically linked in Musl and performed LTO on the entire executable
with Clang, I ran into a few bugs. Specifically, when Clang could rewrite
libc calls (memcmp => bcmp, for example) the rewritten call no longer had
any calling convention information in the IR. If LTO then tried to inline
the call, Clang got confused and emitted a UDF instead of an actual call.
I was working on aarch64. I believe the loss of calling convention is
architecture independent, but I don't know what the other backends do with
the malformed IR.
I was hesitant to mail the list because I _think_ there's a patch in flight
and it may have landed, but I have been too busy to follow up. I figured I
should at least mention the problems that I have seen when inlining across
the libc boundary.
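
A minimal sketch of the kind of call site involved, for anyone who wants to check their own build. This is an illustration only: `same_bytes` is a made-up name, and whether Clang actually turns this `memcmp` into `bcmp` depends on the target and the compiler version. The relevant property is that the result is only compared against zero, which is what makes the rewrite legal.

```c
#include <string.h>

/* Hypothetical helper: only equality of the result with zero is used,
 * so the compiler is free to call bcmp() instead of memcmp() here. */
int same_bytes(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n) == 0;
}
```

Compiling this with `clang -O2 -S -emit-llvm` and searching the output for `bcmp` should show whether the rewrite happened for a given target.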
On Mon, Feb 1, 2021 at 5:28 AM Jiahao XU <Jiahao_XU@...look.com> wrote:
> Thank you very much for helping me, Rich.
>
> Jiahao XU
> ------------------------------
> *From:* Jiahao XU <Jiahao_XU@...look.com>
> *Sent:* Monday, February 1, 2021 11:50:04 AM
> *To:* Rich Felker <dalias@...c.org>
> *Cc:* musl@...ts.openwall.com <musl@...ts.openwall.com>
> *Subject:* Re: [musl] Can’t build musl with lto=thin
>
> Correction: the right numbers are that the executable is cut from 47KB to
> 5.0KB.
>
> The previous numbers were retrieved using du -hs, which reports disk
> usage, not file size.
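
The difference between the two measurements can be seen directly from `stat()`. The sketch below is only an illustration: `file_sizes` is a hypothetical helper, and it assumes that `st_blocks` counts 512-byte units, as on Linux. A file of about 5 KB on a filesystem with 4 KiB blocks would presumably occupy two blocks, which would be consistent with du reporting 8K.

```c
#include <sys/stat.h>

/* Fills in the apparent size (st_size, what `ls -l` shows) and the
 * allocated size (derived from st_blocks, what `du` is based on) of `path`.
 * Returns 0 on success, -1 on error. */
int file_sizes(const char *path, long long *apparent, long long *allocated)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    *apparent = (long long)st.st_size;
    /* Assumption: st_blocks counts 512-byte units, as on Linux. */
    *allocated = (long long)st.st_blocks * 512;
    return 0;
}
```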
>
> Jiahao XU
> ------------------------------
> *From:* Jiahao XU <Jiahao_XU@...look.com>
> *Sent:* Monday, February 1, 2021 11:44:49 AM
> *To:* Rich Felker <dalias@...c.org>
> *Cc:* musl@...ts.openwall.com <musl@...ts.openwall.com>
> *Subject:* Re: [musl] Can’t build musl with lto=thin
>
> Correction: The executable is cut from 48KB to 8KB.
>
> I mixed up the numbers from du and ls.
>
> Jiahao XU
> ------------------------------
> *From:* Jiahao XU <Jiahao_XU@...look.com>
> *Sent:* Monday, February 1, 2021 11:31:47 AM
> *To:* Rich Felker <dalias@...c.org>
> *Cc:* musl@...ts.openwall.com <musl@...ts.openwall.com>
> *Subject:* Re: [musl] Can’t build musl with lto=thin
>
> Interestingly enough, I found that --gc-sections used along with -flto can
> cut the size of the final hello_world executable from 48KB to 5KB.
>
> After investigating with bloaty, I found that --gc-sections along with -flto
> is able to cut .text from 25.4 KiB to 3.04 KiB, and .rodata from
> 19.5 KiB to 120 bytes.
> The .data section, however, saw its file size increase from 316 bytes to
> 372 bytes, while its VM size is cut from 252 to 244 bytes.
>
> $ bloaty gc-section-a.out -- no-gc-section.a.out
>
>   FILE SIZE        VM SIZE
> --------------  --------------
>   +18%     +56   -3.2%      -8    .data
>  [NEW]      +6   [NEW]      +6    [LOAD #2 [RX]]
>  [DEL]      -4  -66.7%      -8    [LOAD #4 [RW]]
> -72.7%      -8   [ = ]       0    [Unmapped]
> -32.0%     -64   [ = ]       0    .comment
> -99.4% -19.4Ki  -99.7% -19.4Ki    .rodata
> -88.0% -22.3Ki  -88.2% -22.3Ki    .text
> -89.4% -41.8Ki  -88.5% -41.8Ki    TOTAL
>
> Jiahao XU
> ------------------------------
> *From:* Jiahao XU <Jiahao_XU@...look.com>
> *Sent:* Monday, February 1, 2021 11:22:21 AM
> *To:* Rich Felker <dalias@...c.org>
> *Cc:* musl@...ts.openwall.com <musl@...ts.openwall.com>
> *Subject:* Re: [musl] Can’t build musl with lto=thin
>
> I have finally succeeded in producing a statically linked hello_world
> executable.
>
> I modified the last line of musl-clang-lld to:
>
> exec $($cc -print-prog-name=ld.lld) -nostdlib "$@" -l:libc.a --no-dynamic-linker
>
> and everything works fine now.
>
> Jiahao XU
> ------------------------------
> *From:* Rich Felker <dalias@...c.org>
> *Sent:* Monday, February 1, 2021 8:01:06 AM
> *To:* Jiahao XU <Jiahao_XU@...look.com>
> *Cc:* musl@...ts.openwall.com <musl@...ts.openwall.com>
> *Subject:* Re: [musl] Can’t build musl with lto=thin
>
> On Sun, Jan 31, 2021 at 05:32:45AM +0000, Jiahao XU wrote:
> > I used `musl-clang -Oz -flto -s -fuse-ld=musl-clang-lld-static -Wl,--plugin-opt=O3,-O3 hello.c` to produce the executable.
>
> Where is -static? Normally it does *not* work to add -static just to
> the ld command line. The compiler driver has to know that it's
> requesting static linking because it will pass a different command
> line to the linker based on that.
>
> > The content of `/usr/local/musl/bin/ld.musl-clang-lld-static` is the same
> > as the generated `ld.musl-clang`, except for the last line, which I
> > modified to:
> >
> > exec $($cc -print-prog-name=ld.lld) -nostdlib "$@" -static -lc -dynamic-linker "$ldso"
>
> Try moving -static out from here (i.e. using the script unmodified
> except for requesting ld.lld) and see if that works. Note that a
> correctly linked executable will not have any INTERP in readelf -a
> output, so as long as you see INTERP anywhere there you're doing
> something wrong.
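
The INTERP check can also be done programmatically. The following is a rough sketch and not part of musl or binutils: `has_interp` is a hypothetical helper that walks the program headers of a 64-bit ELF file and reports whether any of them has type PT_INTERP, which is the entry readelf prints as INTERP. It assumes a Linux-style `<elf.h>` and a file whose byte order matches the host.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the 64-bit ELF file at `path` has a PT_INTERP program
 * header, 0 if it has none, and -1 on I/O error or if the file is not
 * a 64-bit ELF file. Assumes file and host byte order match. */
int has_interp(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    Elf64_Ehdr eh;
    int result = -1;
    if (fread(&eh, sizeof eh, 1, f) == 1 &&
        memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0 &&
        eh.e_ident[EI_CLASS] == ELFCLASS64) {
        result = 0;
        for (unsigned i = 0; i < eh.e_phnum; i++) {
            Elf64_Phdr ph;
            long off = (long)(eh.e_phoff + (unsigned long)i * eh.e_phentsize);
            if (fseek(f, off, SEEK_SET) != 0 ||
                fread(&ph, sizeof ph, 1, f) != 1) {
                result = -1;   /* truncated or unreadable header table */
                break;
            }
            if (ph.p_type == PT_INTERP) {
                result = 1;    /* a program interpreter is requested */
                break;
            }
        }
    }
    fclose(f);
    return result;
}
```

For a correctly static-linked executable, this helper would be expected to return 0.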
>
> Rich
>
>
> > ________________________________
> > From: Rich Felker <dalias@...c.org>
> > Sent: Sunday, January 31, 2021 4:01:22 PM
> > To: Jiahao XU <Jiahao_XU@...look.com>
> > Cc: musl@...ts.openwall.com <musl@...ts.openwall.com>
> > Subject: Re: [musl] Can’t build musl with lto=thin
> >
> > On Sat, Jan 30, 2021 at 11:44:48PM +0000, Jiahao XU wrote:
> > > (gdb) bt
> > > #0 0x00007ffff7ff5498 in decode_vec () from /lib/ld-musl-x86_64.so.1
> > > #1 0x00007ffff7ff58cb in decode_dyn () from /lib/ld-musl-x86_64.so.1
> > > #2 0x0000000000000000 in ?? ()
> >
> > This is not a static-linked program. decode_dyn is part of the dynamic
> > linker. It looks to me like you've created some sort of weird hybrid
> > executable that's not valid. Can you show the command lines you used
> > to produce it?
> >
> > Also please keep list on CC when replying.
> >
> > Rich
> >
> >
> > > (gdb) info r
> > > rax            0x1                 1
> > > rbx            0x7ffff7ffe2d8      140737354130136
> > > rcx            0x5000              20480
> > > rdx            0x200238            2097720
> > > rsi            0x7fffffffd8d0      140737488345296
> > > rdi            0x0                 0
> > > rbp            0x7ffff7ffe2d8      0x7ffff7ffe2d8 <__dls3.app>
> > > rsp            0x7fffffffd8c8      0x7fffffffd8c8
> > > r8             0x0                 0
> > > r9             0xfffffffffffff000  -4096
> > > r10            0x800000            8388608
> > > r11            0x200000            2097152
> > > r12            0x7fffffffdcb8      140737488346296
> > > r13            0x0                 0
> > > r14            0x7fffffffd8d0      140737488345296
> > > r15            0x7fffffffdcb8      140737488346296
> > > rip            0x7ffff7ff5498      0x7ffff7ff5498 <decode_vec+21>
> > > eflags         0x10246             [ PF ZF IF RF ]
> > > cs             0x33                51
> > > ss             0x2b                43
> > > ds             0x0                 0
> > > es             0x0                 0
> > > fs             0x0                 0
> > > gs             0x0                 0
> > >
> > >
> > > Disassembly of decode_dyn:
> > >
> > >    0x00007ffff7ff58af <+0>:   push   %r14
> > >    0x00007ffff7ff58b1 <+2>:   push   %rbx
> > >    0x00007ffff7ff58b2 <+3>:   sub    $0x108,%rsp
> > >    0x00007ffff7ff58b9 <+10>:  mov    %rdi,%rbx
> > >    0x00007ffff7ff58bc <+13>:  mov    0x10(%rdi),%rdi
> > >    0x00007ffff7ff58c0 <+17>:  mov    %rsp,%r14
> > >    0x00007ffff7ff58c3 <+20>:  mov    %r14,%rsi
> > >    0x00007ffff7ff58c6 <+23>:  call   0x7ffff7ff5483 <decode_vec>
> > >    0x00007ffff7ff58cb <+28>:  mov    (%rbx),%rax
> > >
> > > Disassembly of decode_vec:
> > >
> > >    0x00007ffff7ff5483 <+0>:   xor    %eax,%eax
> > >    0x00007ffff7ff5485 <+2>:   cmp    $0x20,%rax
> > >    0x00007ffff7ff5489 <+6>:   je     0x7ffff7ff5495 <decode_vec+18>
> > >    0x00007ffff7ff548b <+8>:   andq   $0x0,(%rsi,%rax,8)
> > >    0x00007ffff7ff5490 <+13>:  inc    %rax
> > >    0x00007ffff7ff5493 <+16>:  jmp    0x7ffff7ff5485 <decode_vec+2>
> > >    0x00007ffff7ff5495 <+18>:  push   $0x1
> > >    0x00007ffff7ff5497 <+20>:  pop    %rax
> > > => 0x00007ffff7ff5498 <+21>:  mov    (%rdi),%rcx
> > >    0x00007ffff7ff549b <+24>:  test   %rcx,%rcx
> > >    0x00007ffff7ff549e <+27>:  je     0x7ffff7ff54c3 <decode_vec+64>
> > >    0x00007ffff7ff54a0 <+29>:  lea    -0x1(%rcx),%rdx
> > >    0x00007ffff7ff54a4 <+33>:  cmp    $0x1e,%rdx
> > >    0x00007ffff7ff54a8 <+37>:  ja     0x7ffff7ff54bd <decode_vec+58>
> > >    0x00007ffff7ff54aa <+39>:  shlx   %rcx,%rax,%rcx
> > >    0x00007ffff7ff54af <+44>:  or     %rcx,(%rsi)
> > >    0x00007ffff7ff54b2 <+47>:  mov    (%rdi),%rcx
> > >    0x00007ffff7ff54b5 <+50>:  mov    0x8(%rdi),%rdx
> > >    0x00007ffff7ff54b9 <+54>:  mov    %rdx,(%rsi,%rcx,8)
> > >    0x00007ffff7ff54bd <+58>:  add    $0x10,%rdi
> > >    0x00007ffff7ff54c1 <+62>:  jmp    0x7ffff7ff5498 <decode_vec+21>
> > >    0x00007ffff7ff54c3 <+64>:  ret
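
For readers who prefer C to the listing, here is a sketch of what decode_vec appears to do. It is reconstructed from the disassembly above rather than copied from the musl sources, and the name `decode_vec_sketch` and the parameter names are made up for illustration. In the listing the table size is the constant 0x20. The register dump shows rdi equal to 0, so the first load of `v[0]` at <+21> dereferences a null pointer, which would explain the segmentation fault.

```c
#include <stddef.h>

/* Sketch reconstructed from the disassembly, not copied from musl.
 * v: (tag, value) pairs terminated by a zero tag (passed in %rdi).
 * a: output table of cnt entries (passed in %rsi); a[0] collects a
 *    bitmask of the tags that were stored. */
void decode_vec_sketch(const size_t *v, size_t *a, size_t cnt)
{
    for (size_t i = 0; i < cnt; i++)
        a[i] = 0;                       /* <+2>..<+16>: clear the table */

    for (; v[0] != 0; v += 2) {         /* <+21>: faults if v is NULL */
        if (v[0] - 1 < cnt - 1) {       /* <+29>..<+37>: tags 1..cnt-1 only */
            a[0] |= (size_t)1 << v[0];  /* <+39>..<+44>: record the tag */
            a[v[0]] = v[1];             /* <+47>..<+54>: store the value */
        }
    }
}
```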
> > >
> > >
> > > Jiahao XU
> > >
> > > ________________________________
> > > From: Rich Felker <dalias@...c.org>
> > > Sent: Sunday, January 31, 2021 10:30:12 AM
> > > To: Jiahao XU <Jiahao_XU@...look.com>
> > > Cc: musl@...ts.openwall.com <musl@...ts.openwall.com>
> > > Subject: Re: [musl] Can’t build musl with lto=thin
> > >
> > > On Sat, Jan 30, 2021 at 11:04:32PM +0000, Jiahao XU wrote:
> > > > > So something like (in config.mak):
> > > > >
> > > > > obj/ldso/dlstart.lo: CFLAGS_ALL += -fno-lto
> > > >
> > > > Thanks, with this I was able to build libc.so successfully with
> > > > clang and created a 3.5 KB hello world program using clang and lld.
> > > >
> > > > However, I still wasn’t able to link statically against libc.
> > > >
> > > > Once I added '-static' to the compiler flags, the executable failed
> > > > with 'Segmentation fault (core dumped)'.
> > >
> > > It's libc.a, not libc.so, that will be involved in making a
> > > static-linked binary. It's hard to know what's going wrong without
> > > more information. Can you run under a debugger and provide a
> > > backtrace, disassembly, and register dump for where the crash occurs?
> > >
> > > Rich
> > >
> > >
> > > > ________________________________
> > > > From: Rich Felker <dalias@...c.org>
> > > > Sent: Sunday, January 31, 2021 7:12:31 AM
> > > > To: Jiahao XU <Jiahao_XU@...look.com>
> > > > Cc: musl@...ts.openwall.com <musl@...ts.openwall.com>
> > > > Subject: Re: [musl] Can’t build musl with lto=thin
> > > >
> > > > On Fri, Jan 29, 2021 at 12:19:42PM +0000, Jiahao XU wrote:
> > > > > musl-1.2.2 compilation with clang-11 failed to build libc.so at
> > > > > the final linking stage:
> > > > >
> > > > > ld.lld: error: undefined hidden symbol: __dls2
> > > > > >>> referenced by ld-temp.o
> > > > > >>> lto.tmp:(_dlstart_c)
> > > > > >>> did you mean: __dls3
> > > > > >>> defined in: lto.tmp
> > > > >
> > > > > I am using CFLAGS='-march=native -mtune=native -Oz -flto
> > > > > -fmerge-all-constants -fomit-frame-pointer' and LDFLAGS='-flto
> > > >   ^^^^^^^^^^^^^^^^^^^^^
> > > >
> > > > The -fmerge-all-constants option gives non-conforming language
> > > > semantics and should not be used, but that's a separate issue.
> > > >
> > > > > -fuse-ld=lld -Wl,--plugin-opt=O3,-O3,--icf=safe'.
> > > >
> > > > > No configure option is supplied.
> > > >
> > > > Otherwise, it's a known issue that LTO misses references from asm
> > > > (both top-level and in functions). I think dlstart.lo and a few other
> > > > files should just be built with LTO disabled; any LTO-type
> > > > optimization in code that runs at this stage is inherently invalid,
> > > > anyway. So something like (in config.mak):
> > > >
> > > > obj/ldso/dlstart.lo: CFLAGS_ALL += -fno-lto
> > > >
> > > > Rich
>