Message-ID: <20190223203621.GY21289@port70.net>
Date: Sat, 23 Feb 2019 21:36:23 +0100
From: Szabolcs Nagy <nsz@...t70.net>
To: musl@...ts.openwall.com, Shane Seelig <stseelig@...l.com>
Subject: Re: x87 asin and acos

* Szabolcs Nagy <nsz@...t70.net> [2019-02-23 21:30:31 +0100]:

> * Shane Seelig <stseelig@...l.com> [2019-02-23 09:21:08 -0500]:
> > Currently 'asin' uses the algorithm:
> > 	arcsin(x) == arctan(x/(sqrt((1-x)(1+x))))
> > If the following algorithm were to be used instead, an 'fadd' could be
> > removed.
> > 	arcsin(x) == arctan(x/(sqrt(1-x**2)))
> 
> that change seems valid as far as the result is concerned.
> (the worst case rounding error of the sqrt argument should
> be around LDBL_EPS in both cases)
> 
> but the fenv behaviour is not valid:
> for tiny x, x*x raises spurious underflow exception
> for large x, x*x raises spurious overflow exception
> 
> (1-x)(1+x) avoids these issues.

hm, actually this does not solve the overflow problem:
the current asinl(0x1p10000L) also raises a spurious
overflow, so the existing code is not correct either.

(it's unlikely to matter in practice, the valid input
domain is [-1,1] anyway, but we should probably fix
that)

> 
> note that there is not much performance difference between
> the two expressions: 1-x and 1+x are independent computations
> so the latency is 1 add + 1 mul in both cases, and the entire
> function is likely dominated by fpsqrt followed by fpatan
> both of which have huge latency compared to add or mul.
