musl - Re: [PATCH] math: move i386 sqrt to C

Follow @Openwall on Twitter for new release announcements and other news

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20200321175351.GJ11469@brightrain.aerifal.cx>
Date: Sat, 21 Mar 2020 13:53:51 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] math: move i386 sqrt to C

On Tue, Jan 07, 2020 at 04:06:05PM +0300, Alexander Monakov wrote:
> ---
> Since union ldshape does not have a dedicated field for 32 least significant
> bits of the x87 long double mantissa, keeping the original approach with
> 
>     ux.i.m -= (fpsr & 0x200) - 0x100;
> 
> would lead to a 64-bit subtraction that is not trivial for the compiler to
> optimize to 32-bit subtraction as done in the original assembly. Therefore
> I have elected to change the approach and use
> 
>     ux.i.m ^= (fpsr & 0x200) + 0x200;
> 
> which is easier to optimize to a 32-bit rather than 64-bit xor.
> 
> Thoughts?

I'm getting test failures with sqrt and this seems to be the culprit
-- I don't think it's equivalent. The original version could offset
the value by +0x100 or -0x100 before rounding, and offsets in the
opposite direction of the rounding that already occurred. Your version
can only offset it by +0x200 or -0x400.

The (well, one) particular failing case is:

src/math/ucb/sqrt.h:49: RU sqrt(0x1.fffffffffffffp+1023) want 0x1p+512
got 0x1.fffffffffffffp+511 ulperr -0.250 = -0x1p-1 + 0x1p-2

Here the mantissa is

fffffffffffffc00

and offset by -0x400 yields:

fffffffffffff800

which has exactly 53 bits and therefore does not round up like it
should.

I still like your approach better if there's a way to salvage it. Do
you see one?

Rich

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.