|
Message-ID: <20170216023935.GM1520@brightrain.aerifal.cx> Date: Wed, 15 Feb 2017 21:39:35 -0500 From: Rich Felker <dalias@...c.org> To: musl@...ts.openwall.com Subject: Re: Reviving planned ldso changes On Thu, Feb 16, 2017 at 02:58:24AM +0100, Szabolcs Nagy wrote: > * Szabolcs Nagy <nsz@...t70.net> [2017-01-04 11:51:09 +0100]: > > * Rich Felker <dalias@...c.org> [2017-01-03 00:43:51 -0500]: > > > One item that's been on the agenda for a long time (maybe 1.5 years > > > now?) is planned dynamic linker improvements. The thread "Further > > > dynamic linker optimizations" from summer 2015 covered parts but maybe > > > not all of what I had in mind. > .... > > - symbol lookup logic is duplicated between do_dlsym and find_sym, > > it would be nice to do that with common code path if possible. > > i looked into this and found some issues > > i attached my current patches for comments, but the bugs may be > fixed separately from the refactoring. OK. I don't mind fixing them together as long as it's documented in the commit log that we're fixing bugs. Yes it's nice to have independent fixes, but sometimes it's not worth it if doing so involves extra duplicate code. > >From 5dd896dcdf53e50fc2d7c1957d62df7b8930ff01 Mon Sep 17 00:00:00 2001 > From: Szabolcs Nagy <nsz@...t70.net> > Date: Sat, 3 Dec 2016 20:52:43 +0000 > Subject: [PATCH 1/2] treat STB_WEAK and STB_GNU_UNIQUE like STB_GLOBAL in > find_sym > > A weak symbol definition is not special during dynamic linking, so > don't let a strong definition in a later module override it. > (glibc dynamic linker allows overriding weak definitions if > LD_DYNAMIC_WEAK is set, musl does not.) > > STB_GNU_UNIQUE means that the symbol is global, even if it is in a > module that's loaded with RTLD_LOCAL, and all references resolve to > the same definition. This semantics is only relevant for c++ plugin > systems and even there it's often not what the user wants (so it can > be turned off in g++ by -fno-gnu-unique when the c++ shared lib is > compiled). In musl just treat it like STB_GLOBAL. I agree with all the policy aspects here. > --- > ldso/dynlink.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/ldso/dynlink.c b/ldso/dynlink.c > index c6890845..476cd6a8 100644 > --- a/ldso/dynlink.c > +++ b/ldso/dynlink.c > @@ -286,11 +286,9 @@ static struct symdef find_sym(struct dso *dso, const char *s, int need_def) > continue; > if (!(1<<(sym->st_info&0xf) & OK_TYPES)) continue; > if (!(1<<(sym->st_info>>4) & OK_BINDS)) continue; > - > - if (def.sym && sym->st_info>>4 == STB_WEAK) continue; > def.sym = sym; > def.dso = dso; > - if (sym->st_info>>4 == STB_GLOBAL) break; > + break; > } This looks ok, but it might be nice to reformat the loop in a separate subsequent patch, since the break/continue usage is rather "unconventional" now. :-) > >From b5e20dbf30e67a12bc0931f12126553d036bbd31 Mon Sep 17 00:00:00 2001 > From: Szabolcs Nagy <nsz@...t70.net> > Date: Sat, 3 Dec 2016 23:08:44 +0000 > Subject: [PATCH 2/2] make relocation time symbol lookup and dlsym consistent > > Using common code path for all symbol lookups fixes three dlsym issues: > - st_shndx of STT_TLS symbols were not checked and thus an undefined > tls symbol reference could be incorrectly treated as a definition > (the sysv hash lookup returns undefined symbols, gnu does not, so should > be rare in practice). > - symbol binding was not checked so a hidden symbol may be returned > (in principle STB_LOCAL symbols may appear in the dynamic symbol table > for hidden symbols, but linkers most likely don't produce it). > - mips specific behaviour was not applied (ARCH_SYM_REJECT_UND). > > TODO: not benchmarked, common code should be inlined probably. Yes, I suspect we should have: #ifdef __GNUC__ __attribute__((__always_inline__)) #endif on this function or similar. But maybe it doesn't matter. We should check. > --- > ldso/dynlink.c | 83 +++++++++++++++++++--------------------------------------- > 1 file changed, 27 insertions(+), 56 deletions(-) > > diff --git a/ldso/dynlink.c b/ldso/dynlink.c > index 476cd6a8..e5a382b0 100644 > --- a/ldso/dynlink.c > +++ b/ldso/dynlink.c > @@ -256,14 +256,15 @@ static Sym *gnu_lookup_filtered(uint32_t h1, uint32_t *hashtab, struct dso *dso, > #define ARCH_SYM_REJECT_UND(s) 0 > #endif > > -static struct symdef find_sym(struct dso *dso, const char *s, int need_def) > +static struct symdef find_sym(struct dso *dso, const char *s, int need_def, int use_deps) > { > uint32_t h = 0, gh, gho, *ght; > size_t ghm = 0; > struct symdef def = {0}; > - for (; dso; dso=dso->next) { > + struct dso **deps = use_deps ? dso->deps : 0; > + for (; dso; dso=use_deps ? *deps++ : dso->next) { Spacing should really be changed here. a=b ? c : d is really bad. > Sym *sym; > - if (!dso->global) continue; > + if (!dso->global && !use_deps) continue; This will have to change a bit with the planned change to have a separate global dso chain. > if ((ght = dso->ghashtab)) { > if (!ghm) { > gh = gnu_hash(s); > @@ -332,7 +333,7 @@ static void do_relocs(struct dso *dso, size_t *rel, size_t rel_size, size_t stri > ctx = type==REL_COPY ? head->next : head; > def = (sym->st_info&0xf) == STT_SECTION > ? (struct symdef){ .dso = dso, .sym = sym } > - : find_sym(ctx, name, type==REL_PLT); > + : find_sym(ctx, name, type==REL_PLT, 0); > if (!def.sym && (sym->st_shndx != SHN_UNDEF > || sym->st_info>>4 != STB_WEAK)) { > error("Error relocating %s: %s: symbol not found", > @@ -1377,7 +1378,7 @@ void __dls2(unsigned char *base, size_t *sp) > /* Call dynamic linker stage-3, __dls3, looking it up > * symbolically as a barrier against moving the address > * load across the above relocation processing. */ > - struct symdef dls3_def = find_sym(&ldso, "__dls3", 0); > + struct symdef dls3_def = find_sym(&ldso, "__dls3", 0, 0); And one place we really don't need/want the always_inline, so maybe we need to be more clever about it. Rich
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.