Message-ID: <20200321175351.GJ11469@brightrain.aerifal.cx>
Date: Sat, 21 Mar 2020 13:53:51 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] math: move i386 sqrt to C

On Tue, Jan 07, 2020 at 04:06:05PM +0300, Alexander Monakov wrote:
> ---
> Since union ldshape does not have a dedicated field for 32 least significant
> bits of the x87 long double mantissa, keeping the original approach with
> 
>     ux.i.m -= (fpsr & 0x200) - 0x100;
> 
> would lead to a 64-bit subtraction that is not trivial for the compiler to
> optimize to 32-bit subtraction as done in the original assembly. Therefore
> I have elected to change the approach and use
> 
>     ux.i.m ^= (fpsr & 0x200) + 0x200;
> 
> which is easier to optimize to a 32-bit rather than 64-bit xor.
> 
> Thoughts?

I'm getting test failures with sqrt and this seems to be the culprit
-- I don't think it's equivalent. The original version offsets the
mantissa by +0x100 or -0x100 before the final rounding, in the
direction opposite to the rounding that already occurred. Your version
can only offset it by +0x200 or -0x400.
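
To make the difference concrete, here is a minimal sketch of the two
adjustments side by side. The helper names are hypothetical and this is
not the musl code. It assumes the adjustment is only applied when the
low 11 bits of the mantissa are 0x400, and that bit 0x200 of fpsr is the
x87 C1 flag, set when the preceding operation rounded up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not musl code. m is the 64-bit x87 mantissa,
 * fpsr the x87 status word. Assumption: both are only called when the
 * low 11 bits of m are exactly 0x400. */

/* Subtraction form: nudge by 0x100 against the earlier rounding. */
static uint64_t adjust_sub(uint64_t m, unsigned fpsr)
{
    assert((m & 0x7ff) == 0x400);
    return (fpsr & 0x200) ? m - 0x100 : m + 0x100;
}

/* Xor form from the patch: flips bit 0x200 (clear, so +0x200) or
 * bit 0x400 (set, so -0x400). */
static uint64_t adjust_xor(uint64_t m, unsigned fpsr)
{
    assert((m & 0x7ff) == 0x400);
    return m ^ (uint64_t)((fpsr & 0x200) + 0x200);
}
```

With C1 set, adjust_sub leaves low bits 0x300 while adjust_xor leaves
low bits 0x000, so the xor form erases the information that the result
was inexact.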

The (well, one) particular failing case is:

src/math/ucb/sqrt.h:49: RU sqrt(0x1.fffffffffffffp+1023) want 0x1p+512
got 0x1.fffffffffffffp+511 ulperr -0.250 = -0x1p-1 + 0x1p-2

Here the mantissa is

fffffffffffffc00

and offset by -0x400 yields:

fffffffffffff800

which fits in exactly 53 bits, so it is exactly representable as a
double and does not round up like it should.
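
As a rough illustration of why the low bits matter in this case, the
sketch below narrows a 64-bit mantissa to 53 bits under upward rounding.
The function name is hypothetical, it only handles positive normalized
values, and it models the rounding step rather than reproducing what
the FPU or musl actually executes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not musl code: narrow a normalized 64-bit x87
 * mantissa to the 53-bit double mantissa with rounding toward +inf
 * (positive values only). *carry is set when the result overflows
 * 53 bits, i.e. the value moves up to the next power of two. */
static uint64_t narrow_round_up(uint64_t m, int *carry)
{
    assert(m >> 63);            /* normalized: top bit must be set */
    uint64_t top = m >> 11;     /* keep the upper 53 bits */
    if (m & 0x7ff)              /* any discarded bit set: inexact, bump */
        top++;
    *carry = (top >> 53) != 0;
    return top;
}
```

Fed fffffffffffff800, this returns the all-ones 53-bit mantissa with no
carry, matching the 0x1.fffffffffffffp+511 result above. Fed
fffffffffffffb00 (the -0x100 offset), it carries out, matching the
wanted 0x1p+512.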

I still like your approach better if there's a way to salvage it. Do
you see one?

Rich
