Message-ID: <20240723235911.GA10433@brightrain.aerifal.cx>
Date: Tue, 23 Jul 2024 19:59:11 -0400
From: Rich Felker <dalias@...c.org>
To: Alex Rønne Petersen <alex@...xrp.com>
Cc: musl@...ts.openwall.com
Subject: Re: Stack pointer is misaligned when invoking the musl dynamic linker directly to run a program without start files

On Wed, Jul 24, 2024 at 01:42:28AM +0200, Alex Rønne Petersen wrote:
> On Wed, Jul 24, 2024 at 1:07 AM Rich Felker <dalias@...c.org> wrote:
> >
> > On Wed, Jul 24, 2024 at 12:55:18AM +0200, Alex Rønne Petersen wrote:
> > > On Wed, Jul 24, 2024 at 12:46 AM Rich Felker <dalias@...c.org> wrote:
> > > >
> > > > On Tue, Jul 23, 2024 at 11:42:51PM +0200, Alex Rønne Petersen wrote:
> > > > > Hi,
> > > > >
> > > > > Repro:
> > > > >
> > > > > $ cat test.s
> > > > > .global _start
> > > > > _start:
> > > > >     mov %rsp, %rdi
> > > > >     and $15, %rdi
> > > > >     call exit
> > > > > $ musl-gcc test.s -nostartfiles
> > > > > $ ./a.out; echo $?
> > > > > 0
> > > > > $ /lib64/ld-linux-x86-64.so.2 ./a.out; echo $?
> > > > > 0
> > > > > $ /lib/ld-musl-x86_64.so.1 ./a.out; echo $?
> > > > > 8
> > > > > $ /lib/ld-musl-x86_64.so.1 --version
> > > > > musl libc (x86_64)
> > > > > Version 1.2.3
> > > > >
> > > > > I could well be missing something here, but at first glance, this
> > > > > *seems* like an ABI violation; the x86-64 psABI [0] states in §3.4.1
> > > > > that RSP is guaranteed to be 16-byte aligned on process entry. The
> > > > > same is true of many other architectures (though the amount obviously
> > > > > differs).
> > > > >
> > > > > I suppose it's debatable whether a program interpreter ought to be
> > > > > required to uphold the same guarantees as the kernel on process
> > > > > initialization?
> > > > >
> > > > > [0] https://gitlab.com/x86-psABIs/x86-64-ABI
> > > >
> > > > This is intentional. _start is not a C function subject to psABI
> > > > calling convention. It's an entry point with its own convention that
> > > > the stack pointer register point at the start of the ELF argument
> > > > packing, and that has a requirement to align the stack before calling
> > > > into C code.
> > >
> > > Can you elaborate on this point?
> > >
> > > To be clear on my end, I'm not suggesting that `_start` is a normal C
> > > function and should be subject to the calling convention. I'm
> > > specifically referring to §3.4.1 which deals with register state upon
> > > process initialization, i.e. when control is transferred to the ELF
> > > entry point by the kernel (or dynamic linker, in this case). This is
> > > the part I believe musl is in contravention of.
> >
> > OK, that's just not something we claim to conform to though. 3.4.1 is
> > documenting a contract between the kernel and the userspace runtime
> > (e_entry point of application or dynamic linker). Not the internal
> > contract between one part of musl (crt1) and another (ldso).
>
> Yeah, that's the part I thought might be debatable.
>
> The ABI is not clear on whether the program interpreter loading the
> program and transferring control to its entry point constitutes
> "process initialization". If I felt strongly about it, I would
> probably argue that it does as a practical matter, but given the level
> of precision usually seen in this particular document, I think it's
> also entirely reasonable to take the opposite position.
>
> In any case, I think just having this officially on record as
> not-a-bug is fine. :)
>
> >
> > The psABI document does not cover enough requirements to be able to
> > portably write your own custom replacement for crt1.o -- for example,
> > that it call __libc_start_main, or the contract for how
> > __libc_start_main is called. So if you want to do that, you need to be
> > aware of libc-specific requirements. And one of those, at least for
> > now, is tolerating that the stack pointer might not be aligned
> > (because it makes a lot more sense to just use the argument vector
> > in-place rather than rewriting the whole thing).
>
> By the way, I'm not actually clear on *how* the stack pointer ends up
> misaligned vs what it is when given by the kernel. Just for my
> curiosity, would you be able to explain why?

Sure. On entry, the stack pointer points to an array of words that
look like:

argc, argv0, argv1, argv2, ..., 0, env0, env1, ..., 0, [aux]

In the case of "/lib/ld-musl-x86_64.so.1 ./a.out", these look like
(from the kernel):

2, &"/lib/ld-musl-x86_64.so.1", &"./a.out", 0, ...

In order to pass control to the application, ldso skips past the part
of the command line that belongs to ldso, decrementing argc by the
number of slots skipped, and writes the new argc there overwriting
whatever pointer was there before:

2, 1, &"./a.out", 0, ...

Then it passes execution to the main program entry point with the
stack pointer pointing to the newly written argc slot. That slot's
address is whatever it was before, so it will be 8 mod 16 if an odd
number of slots were skipped.

The only way this could be changed is by shifting the whole array with
memmove. Normally that wouldn't cost much, but it would be at least
slightly nontrivial time if you had a gigantic number of arguments or
environment slots.

Rich
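
For readers following along, here is a minimal sketch in C of the in-place rewrite described in the message above. It is not musl's code: the names `skip_args_in_place` and `misalignment16` are hypothetical, and the stack is modeled as an array of 8-byte words as on x86-64. It only illustrates why skipping an odd number of slots moves the entry stack pointer from 0 mod 16 to 8 mod 16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the entry stack on x86-64: an array of 8-byte
 * words laid out as argc, argv[0], argv[1], ..., 0, envp..., 0, [aux].
 *
 * Drop the first `skip` argv entries without moving anything: the
 * reduced argc is written into the slot just before the first surviving
 * argv pointer, and that slot's address becomes the new stack pointer.
 */
static uint64_t *skip_args_in_place(uint64_t *sp, size_t skip)
{
    uint64_t argc = sp[0];

    /* At least one argument (the program to run) must remain. */
    assert(skip < argc);

    /* Overwrites argc itself (skip == 0) or a consumed argv pointer. */
    sp[skip] = argc - skip;

    /* The address advances by 8 * skip bytes, so alignment can change. */
    return sp + skip;
}

/* Low four bits of an address; 0 means 16-byte aligned. */
static unsigned misalignment16(const void *p)
{
    return (unsigned)((uintptr_t)p & 15u);
}
```

With a 16-byte-aligned array holding `2, argv0, argv1, 0` and `skip = 1`, the returned pointer sees `1, argv1, 0` and `misalignment16` reports 8 for it, which matches the exit status in the repro. Skipping two slots would land back on a 16-byte boundary.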