musl - Re: Further dynamic linker optimizations

Follow @Openwall on Twitter for new release announcements and other news

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20150701084129.4b73bfd6@vostro>
Date: Wed, 1 Jul 2015 08:41:29 +0300
From: Timo Teras <timo.teras@....fi>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: Further dynamic linker optimizations

On Tue, 30 Jun 2015 16:04:54 -0400
Rich Felker <dalias@...c.org> wrote:

> Discussion on #musl with Timo Teräs has produced the following
> results:

Nice summary. Thanks!

> - The whole outer for loop in find_sym is the hot path for
>   performance. As such, eliminating the lazy calculation of gnu_hash
>   and simply doing it before the loop should be a measurable win, just
>   by removing the if (!ghm) branch.

Additional thought. We could do a skip list here. If we calculate the
gnu-hash unconditionally, we could bloom filter bits to construct a
skip list.

That is, we have next_symlookup[] array that has pointer for each
wordsize bits (or potentially a small multiple of it). And we would link
each dso in next_symlookup array corresponding to each bloom filter
bit (for dso without gnu-hash it'd have to go to all of them). Then on
lookup we could just use the calculated bloomfilter to follow the
correct symlookup chain next pointers.

If the pointer array size is less than the bloom filter size, the bloom
filter can be always reduced by |= individual elements together.
Though, it'd probably need some analysis on how this would work out. If
ORring all elements together always yields all bits set, this is kinda
useless.

This should be significant win on cases like clang when there are tens
of thousands of symbol lookups, and 100+ dsos. Trade off is of course
little memory and little extra time to setup the additional chains.

Thoughts?

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.