Message-ID: <5C60D05C95724A36B3DB9942D06CFE5F@H270>
Date: Wed, 11 Aug 2021 00:53:37 +0200
From: "Stefan Kanthak" <stefan.kanthak@...go.de>
To: "Szabolcs Nagy" <nsz@...t70.net>
Cc: <musl@...ts.openwall.com>
Subject: Re: [PATCH] Properly simplified nextafter()

Szabolcs Nagy <nsz@...t70.net> wrote:

>* Stefan Kanthak <stefan.kanthak@...go.de> [2021-08-10 08:23:46 +0200]:
>> <https://git.musl-libc.org/cgit/musl/plain/src/math/nextafter.c>
>> has quite some superfluous statements:
>>
>> 1. there's absolutely no need for 2 uint64_t holding |x| and |y|;
>> 2. IEEE-754 specifies -0.0 == +0.0, so (x == y) is equivalent to
>>    (ax == 0) && (ay == 0): the latter 2 tests can be removed;
>
> you replaced 4 int cmps with 4 float cmps (among other things).

and hinted that the result of the second pair of comparisons is
already known from the first pair.

> it's target dependent if float compares are fast or not.

It's also target dependent whether the floating-point registers
can be accessed by integer instructions, or need to be copied:
some win, some lose!
Just let the compiler/optimizer do its job!

> (the i386 machine where i originally tested this preferred int
> cmp and float cmp was very slow in the subnormal range and
> iirc it also raises the non-standard input denormal exception,
> which is fine i guess.

This exception resp. the (sticky) flag is explicitly raised/set
in the part following the patch.

> of course soft float abis much prefer int cmp so your code is
> likely much slower and bigger there).

0. Doesn't musl provide target specific routines for targets
   with soft FP?
1. If not: the compiler knows the target ABI and SHOULD generate
   the proper integer comparisons there.

> but i'm not against the change, it is likely better on modern
> machines. did you try to benchmark it? or check the code size?

I STILL don't run a system supported by musl.
The code is of course smaller ... but not as small and fast as a
proper i386 or AMD64 assembly implementation ... which I can post
upon request.

regards
Stefan

>> 3. there's absolutely no need to compare the signs of x and y
>>    with the sign of the direction: it's sufficient to test that
>>    direction and sign of x match;
>> 4. a proper compiler/optimizer should be able to reuse the results
>>    of the comparison (x == y) for (x < y) or (x > y) and
>>    (x == 0.0) for (x < 0.0) or (x > 0.0).
>>
>> JFTR: if ((x < 0.0) == (x < y)) is equivalent to
>>       if ((x > 0.0) == (x > y))
>>
>> --- -/src/math/nextafter.c
>> +++ +/src/math/nextafter.c
>> @@ -3,20 +3,15 @@
>>  double nextafter(double x, double y)
>>  {
>>      union {double f; uint64_t i;} ux={x}, uy={y};
>> -    uint64_t ax, ay;
>>      int e;
>>
>>      if (isnan(x) || isnan(y))
>>          return x + y;
>> -    if (ux.i == uy.i)
>> +    if (x == y)
>>          return y;
>> -    ax = ux.i & -1ULL/2;
>> -    ay = uy.i & -1ULL/2;
>> -    if (ax == 0) {
>> -        if (ay == 0)
>> -            return y;
>> +    if (x == 0.0)
>>          ux.i = (uy.i & 1ULL<<63) | 1;
>> -    } else if (ax > ay || ((ux.i ^ uy.i) & 1ULL<<63))
>> +    else if ((x < 0.0) == (x < y))
>>          ux.i--;
>>      else
>>          ux.i++;
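
For readers who want to try the patched logic outside of musl, here is a
minimal standalone sketch in C. It only mirrors the "+" side of the diff
quoted above. The names nextafter_patched, from_bits, same_bits and
self_check are hypothetical and not part of musl, and the part of the
function following the patch (the exception/flag handling mentioned in
the thread) is deliberately left out, so this is not a drop-in
replacement.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name: the patched comparison logic only.
   The tail of the musl function that raises FP exceptions is omitted. */
double nextafter_patched(double x, double y)
{
    union {double f; uint64_t i;} ux = {x}, uy = {y};

    if (isnan(x) || isnan(y))
        return x + y;
    if (x == y)                          /* also covers -0.0 vs. +0.0 */
        return y;
    if (x == 0.0)                        /* smallest subnormal, sign of y */
        ux.i = (uy.i & 1ULL<<63) | 1;
    else if ((x < 0.0) == (x < y))       /* moving towards zero */
        ux.i--;
    else                                 /* moving away from zero */
        ux.i++;
    return ux.f;
}

/* Helper (assumption): build a double from its bit pattern. */
double from_bits(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Helper (assumption): bitwise equality, distinguishes -0.0 from +0.0. */
int same_bits(double a, double b)
{
    return memcmp(&a, &b, sizeof a) == 0;
}

/* A few spot checks covering each branch. */
void self_check(void)
{
    assert(same_bits(nextafter_patched(1.0, 2.0), 1.0 + DBL_EPSILON));
    assert(same_bits(nextafter_patched(-0.0, 0.0), 0.0));
    assert(same_bits(nextafter_patched(0.0, -1.0),
                     from_bits(0x8000000000000001ULL)));
    assert(same_bits(nextafter_patched(INFINITY, 0.0), DBL_MAX));
    assert(same_bits(nextafter_patched(from_bits(1), 0.0), 0.0));
}
```

Calling self_check() from a test program exercises the x == y, x == 0.0,
decrement and increment paths; it says nothing about speed or code size
on any particular target.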