musl - Re: ldso : dladdr support

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20120807114627.GG30810@port70.net>
Date: Tue, 7 Aug 2012 13:46:27 +0200
From: Szabolcs Nagy <nsz@...t70.net>
To: musl@...ts.openwall.com
Subject: Re: ldso : dladdr support

* musl <b.brezillon.musl@...il.com> [2012-08-07 11:04:19 +0200]:
> This patch adds support for dladdr function.
> It is based on my previous patch (gnu hash support).
> 

i havent checked the content of your patch yet
just had a quick glance

i think before you make substantial changes
it's better to have some discussion about
the design

i'll just have some stylistic comments for now
(i think there is no official guideline and
we are not terribly anal about it but i'll
still point out a few things for future reference)

>  struct hash_algo {
>  	uint32_t (*hash) (const char *);
> -	Sym *(*lookup) (const char *s, uint32_t h, struct dso *dso);
> +	Sym *(*lookup) (const char *, uint32_t, struct dso *);
> +	void (*iterate) (struct dso *, int (*) (struct dso *, Sym *, void *), void *);
>  };
>  

use
  foo(arg)
instead of
  foo (arg)

in case of keywords it's not clear what's better
(musl usually uses spaces after if,while,for,..)
but for function calls no space is clearer

(in general musl code rarely uses unnecessary spaces and parentheses)

actually i'm not sure if the function pointer approach is
the right one here, but that's a design question


> +	uint32_t *hashtab = dso->hashtab;
> +	uint32_t *buckets = hashtab + 4 + (hashtab[2] * (sizeof(size_t) / sizeof(uint32_t)));

i don't know the algorithm but the sizeof magic looks suspicious to me

> +static uint32_t gnu_hash (const char *s0)
> +{
> +	const unsigned char *s = (void *)s0;
> +	uint_fast32_t h = 5381;
> +	for (unsigned char c = *s; c != '\0'; c = *++s)
> +		h = h * 33 + c;
> +	return h & 0xffffffff;
> +}
> +

i think c != '\0' is not very idiomatic
and i'd avoid using c99 style declaration in for

use

for (; *s; s++)
	h = 33*h + *s;


> +	uint32_t shift2 = hashtab[3];
> +	uint32_t h2 = h1 >> shift2;

i'm not sure if input validation makes sense in ldso
but shifts can be tricky (hashtab[3] must be in 0..31 here)

> +	for (h1 &= ~1; 1; sym++) {

the 1; in the middle is unnecessary

h1 &= ~1; is problematic if signed int representation is
not two's complement
(~1 is an implementation defined negative value which is
then converted to uint32_t according to well defined rules)

so i'd use

  for (h1 &= -2;; sym++) {

which is probably less clear but more correct
(maybe an explicit (uint32_t)-2 cast helps with that)

> -	if (h==0x6b366be && !strcmp(s, "dlopen")) rtld_used = 1;
> -	if (h==0x6b3afd && !strcmp(s, "dlsym")) rtld_used = 1;
> -	if (h==0x595a4cc && !strcmp(s, "__stack_chk_fail")) ssp_used = 1;
> +	uint32_t computed[HASH_ALG_CNT / 32 + 1];
> +	uint32_t hashes[HASH_ALG_CNT];
> +	memset (computed, 0, sizeof (computed));
> +	if (!strcmp(s, "dlopen")) rtld_used = 1;
> +	if (!strcmp(s, "dlsym")) rtld_used = 1;
> +	if (!strcmp(s, "__stack_chk_fail")) ssp_used = 1;
...
> +		if (!(computed[dso->hashalg / 32] & (1 << (dso->hashalg % 32)))) {
> +			h = hashalgs[dso->hashalg].hash(s);
> +			hashes[dso->hashalg] = h;
> +			computed[dso->hashalg / 32] |= (1 << (dso->hashalg % 32));
> +		}
> +		else {
> +			h = hashes[dso->hashalg];
> +		}

i think we can have more efficient code here if
only gnu and sysv hash algos are supported

(i don't think it's realistic to assume new hash algorithms
even gnu hash is a fairly useless addition to elf)

the if else style is

  if (cond) {
  ...
  } else {
  ...
  }
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.