Message-ID: <20231016215307.GE1427497@port70.net>
Date: Mon, 16 Oct 2023 23:53:07 +0200
From: Szabolcs Nagy <nsz@...t70.net>
To: Rich Felker <dalias@...c.org>
Cc: Farid Zakaria <fmzakari@...c.edu>, musl@...ts.openwall.com
Subject: Re: Getting access to section data during dynlink.c

* Rich Felker <dalias@...c.org> [2023-10-16 10:26:04 -0400]:
> On Sun, Oct 15, 2023 at 06:06:48PM -0700, Farid Zakaria wrote:
> > Hi!
> > 
> > I'd like to read some section data during dynlink.c
> > Does anyone have any good suggestions on the best way to do so?
> > I believe most ELF files ask for the load to start from the start of the
> > ELF file.
> > 
> > I see in dynlink.c the kernel sends AT_PHDR as an auxiliary vector --
> > Should I try applying a fixed offset from it to get to the start of the
> > ehdr ?
> > 
> > Any advice is appreciated.
> > 
> > Please include me in the CC for the reply.
> > I can't recall if I've subscribed.
> 
> Neither the Ehdrs nor sections are "loadable" parts of an executable
> ELF file. They may happen to be present in the mapped pages due to
> page granularity of mappings, but that doesn't mean they're guaranteed
> to be there; the Ehdrs are for the program loader's use, and the
> sections are for the use of linker (non-dynamic), debugger, etc.
> 
> In musl we use Ehdrs in a couple places: the dynamic linker finds its
> own program headers via assuming they're mapped, but this is rather
> reasonable since we built it and it's either going to always-succeed
> or always-fail and get caught before deployment if that build-time
> assumption somehow isn't met. It's not contingent on properties of a
> program encountered at runtime. We also use Ehdrs when loading a
> program (invoking ldso as a command) or shared library, but in that
> case we are the loader and have access to them via the file being
> loaded.
> 
> Depending on what you want to do, and whether you just need to be
> compatible with your own binaries or arbitrary ones, it may suffice to
> do some sort of hack like rounding down from the program header
> address to the start of the page and hoping the Ehdrs live there. But
> it might make sense to look for other ways to do what you're trying to
> do, without needing to access non-runtime data structures.
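
A minimal sketch of the rounding hack described above, for a normal user
program rather than dynlink.c itself. The function name guess_ehdr_from_auxv
is hypothetical, and the result is only a guess: it assumes the ELF header
sits at the start of the page that holds the program headers, which is common
but not guaranteed.

```c
#include <elf.h>
#include <stdint.h>
#include <string.h>
#include <sys/auxv.h>

/* Hypothetical helper: guess where the main program's ELF header is by
 * rounding AT_PHDR down to a page boundary. Returns NULL if the auxv
 * entries are missing or the guess does not start with the ELF magic. */
const void *guess_ehdr_from_auxv(void)
{
	uintptr_t phdr = getauxval(AT_PHDR);
	uintptr_t pagesz = getauxval(AT_PAGESZ);
	if (!phdr || !pagesz)
		return NULL;

	/* Same page as the program headers, so it is readable whenever
	 * the program headers themselves are mapped. */
	const unsigned char *base =
		(const unsigned char *)(phdr & ~(pagesz - 1));

	/* Assumption check: only accept the page if it looks like an Ehdr. */
	if (memcmp(base, ELFMAG, SELFMAG) != 0)
		return NULL;
	return base;
}
```

A NULL result means the assumption did not hold for this binary, so callers
need a fallback.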

note that (not too old) bfd ld and lld define a hidden linker symbol,
__ehdr_start, that at runtime resolves to where the ehdr is.

example:

#include <elf.h>
#include <stdio.h>

__attribute__((visibility("hidden"), weak)) extern char __ehdr_start[];

int main(void)
{
	if (__ehdr_start) {
		Elf64_Ehdr *ehdr = (void *)__ehdr_start;
		printf("ehdr %p\n", (void *)ehdr);
		Elf64_Phdr *phdr = (void *)(__ehdr_start + ehdr->e_phoff);
		printf("phdr %p\n", (void *)phdr);
	} else
		printf("__ehdr_start is undefined\n");

	// to compare against the actual mappings
	char buf[9999];
	FILE *f = fopen("/proc/self/maps","r");
	if (!f) {
		perror("/proc/self/maps");
		return 1;
	}
	size_t n = fread(buf, 1, sizeof buf, f);
	fwrite(buf, 1, n, stdout);
	fclose(f);
	return 0;
}

this should work for a 64-bit ELF executable as long as the ehdr is mapped
into memory.
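
for code that should build for both 32-bit and 64-bit targets, a possible
variant uses the ElfW() macro from <link.h> instead of hard-coding Elf64_*.
the helper names self_ehdr and self_phdr are made up for this sketch, and the
magic and e_phentsize checks are extra sanity checks, not something the
linker requires.

```c
#include <elf.h>
#include <link.h>	/* ElfW() picks Elf32_* or Elf64_* for the target */
#include <string.h>

__attribute__((visibility("hidden"), weak)) extern char __ehdr_start[];

/* Hypothetical helper: the program's own ELF header, or NULL if the
 * linker did not define __ehdr_start or the bytes there lack the magic. */
const ElfW(Ehdr) *self_ehdr(void)
{
	if (!__ehdr_start || memcmp(__ehdr_start, ELFMAG, SELFMAG) != 0)
		return NULL;
	return (const void *)__ehdr_start;
}

/* Hypothetical helper: the program header table located via e_phoff,
 * or NULL if the header is unavailable or has an unexpected entry size. */
const ElfW(Phdr) *self_phdr(void)
{
	const ElfW(Ehdr) *ehdr = self_ehdr();
	if (!ehdr || ehdr->e_phentsize != sizeof(ElfW(Phdr)))
		return NULL;
	return (const void *)(__ehdr_start + ehdr->e_phoff);
}
```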

if you want a link-time error on an old linker instead of a 0 __ehdr_start,
then just drop "weak" and the runtime check. (the code as written assumes
the ehdr is not at address 0 exactly, which usual linux setups guarantee.)
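
since section headers are not part of the loaded image, another option for
an ordinary program is to read them from the file itself. a rough sketch,
assuming the file has the same ELF class as the running program (true for
/proc/self/exe), that section headers have not been stripped, and that the
extended section numbering (SHN_XINDEX) is not in use. find_section is a
hypothetical name, and this relies on open/pread/malloc, so it is not code
that could go into dynlink.c as is.

```c
#define _GNU_SOURCE
#include <elf.h>
#include <fcntl.h>
#include <link.h>	/* ElfW() */
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: look up a section header by name in an ELF file.
 * Returns 0 and fills *out on success, -1 on any failure. */
int find_section(const char *path, const char *name, ElfW(Shdr) *out)
{
	int ret = -1;
	ElfW(Ehdr) eh;
	ElfW(Shdr) *sh = NULL;
	char *names = NULL;
	size_t shsz, nsz;

	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* Read and sanity-check the ELF header. */
	if (pread(fd, &eh, sizeof eh, 0) != (ssize_t)sizeof eh ||
	    memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
	    eh.e_shentsize != sizeof *sh || eh.e_shnum == 0 ||
	    eh.e_shstrndx >= eh.e_shnum)
		goto done;

	/* Read the whole section header table. */
	shsz = (size_t)eh.e_shnum * sizeof *sh;
	sh = malloc(shsz);
	if (!sh || pread(fd, sh, shsz, eh.e_shoff) != (ssize_t)shsz)
		goto done;

	/* Read the section name string table and force NUL termination. */
	nsz = sh[eh.e_shstrndx].sh_size;
	names = malloc(nsz + 1);
	if (!names ||
	    pread(fd, names, nsz, sh[eh.e_shstrndx].sh_offset) != (ssize_t)nsz)
		goto done;
	names[nsz] = 0;

	for (size_t i = 0; i < eh.e_shnum; i++) {
		if (sh[i].sh_name < nsz &&
		    strcmp(names + sh[i].sh_name, name) == 0) {
			*out = sh[i];
			ret = 0;
			break;
		}
	}
done:
	free(names);
	free(sh);
	close(fd);
	return ret;
}
```

e.g. find_section("/proc/self/exe", ".text", &shdr) fills shdr with the
file offset and size of .text when it returns 0; the section contents can
then be read with pread at shdr.sh_offset.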
