|
Message-ID: <1432535489.2715.1.camel@inria.fr>
Date: Mon, 25 May 2015 08:31:29 +0200
From: Jens Gustedt <jens.gustedt@...ia.fr>
To: musl@...ts.openwall.com
Subject: Re: ppc soft-float regression
Am Sonntag, den 24.05.2015, 20:36 -0400 schrieb Rich Felker:
> There's a simple alternative I just came up with though: have
> dlstart.c compute the number of REL entries that need their addends
> saved and allocate a VLA on its stack for stages 2 and 3 to use. While
> the number of addends could be significant, it's many orders of
> magnitude smaller than the smallest practical stack sizes we could
> actually run with, so it's perfectly safe to put it on the stack.
>
> Here're the basic changes I'd like to make to dlstart.c to implement
> this:
>
> 1. Remove processing of DT_JMPREL REL/RELA table. All entries in this
> table are necessarily JMP_SLOT type, and thus symbolic, so there's
> nothing stage 1 can do with them anyway. Also, being JMP_SLOT type,
> they have implicit addends of zero if they're REL-type, so there's
> no need to save addends.
>
> 2. Remove the loop in dlstart.c that works like a fake function call 3
> times to process DT_JMPREL, DT_REL, and DT_RELA. Instead we just
> need 2 iterations, and now the stride is constant in each, so they
> should simplify down a lot more inline.
>
> 3. During the loop that processes DT_REL, count the number of
> non-relative relocations (ones we skip at this stage), then make a
> VLA this size and pass its address to __dls2 as a second argument.
>
> 4. Have the do_relocs in stage 2 save addends in this provided array
> before overwriting them, and save its address for use by stage 3.
>
> 5. Have the do_relocs in stage 3 (for ldso/libc only) pull addends
> from this array instead of of from inline.
>
> Steps 1 and 2 are purely code removal/simplification and should be
> done regardless of whether we move forward on the above program, I
> think. Steps 3-5 add some complexity but hardly any code, just a few
> lines here and there.
>
> Comments?
I like it.
The thing that is a bit critical here, is the VLA. Not because it is a
VLA as such, but because it is a dynamic allocation on the stack. We
already have a similar strategy in pthread_create for TLS. The
difference is that there we have
- a sanity check
- an alternative strategy if the sanity check fails
Would there be a possibility to have both here, too?
Thanks
Jens
--
:: INRIA Nancy Grand Est ::: Camus ::::::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536 ::
:: :::::::::::::::::::::: gsm France : +33 651400183 ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::
Download attachment "signature.asc" of type "application/pgp-signature" (182 bytes)
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.