Message-ID: <20211227180437.GE1949@voyager>
Date: Mon, 27 Dec 2021 19:04:37 +0100
From: Markus Wichmann <nullplan@....net>
To: musl@...ts.openwall.com
Subject: Re: ASM-to-C conversion for i386

On Mon, Dec 27, 2021 at 11:30:56AM -0500, Rich Felker wrote:
> One thought, and I'm not sure if this is a good idea or a bad one but
> worth discussing:
>
> Using your acos.c as an example, where you have the comment:
>
> 	atan2(fabs(sqrt((1-x)*(1+x))), x)
>

That comment was copied from acos.s. In general, I have tried to
preserve comments. The exception is fenv.s, where the same comment
preceded every test of __hwcap; a symbolic constant gets its point
across far more easily.
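
As an illustration of the symbolic-constant idea, here is a minimal
sketch. It assumes the repeated test checks the SSE bit (bit 25 of the
CPUID feature flags, which is what Linux reports in AT_HWCAP on i386).
The names HWCAP_SSE and cpu_has_sse are hypothetical, and the hwcap
value is passed in as a parameter so the sketch stands alone.

```c
#include <stddef.h>

/* Hypothetical name. Assumption: the bit being tested is SSE,
 * i.e. bit 25 of the CPUID.1:EDX feature flags. */
#define HWCAP_SSE 0x02000000

/* Returns nonzero if the given hwcap word advertises SSE. */
static int cpu_has_sse(size_t hwcap)
{
	return (hwcap & HWCAP_SSE) != 0;
}
```

With a named constant, each call site reads as "if the CPU has SSE"
without needing the comment repeated.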

> The actual code could be written as:
>
> 	return (double)x87_fpatan(x, x87_fabs(x87_fsqrt((1-x)*(1+x))));
>
> with the appropriate "x87.h" defining each of these with the
> appropriate asm & constraints. This kinda makes the individual
> functions self-documenting and non-error-prone (repetition of
> error-prone constraints, especially the hidden requirement that, in
> "=t"(x), x have type long double).
>

That's probably an even better idea than what I am currently doing:
Moving the "core" functions into a new header file (as static inline
functions), and using these in the function implementations. I could not
get all the duplication out; in some cases the duplication is only
conceptual (hypot() and hypotf() have the same idea, but it needs to be
implemented differently due to the different precisions/representations).

I think I can combine both approaches, because what I'm doing appears to
have the effect of moving the __asm__ statements entirely out of the C
files into the new header file. And it appears that we are only using a
couple of instructions, anyway.
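
A minimal sketch of what such a combined header might look like,
using the acos example from above. This is not the actual patch: the
helper bodies, the argument order of x87_fpatan (taken from the quoted
call), and the name acos_x87 are assumptions. The non-x86 branch is
only there so the sketch builds elsewhere (it may need -lm).

```c
/* Sketch of an "x87.h"-style header; not the actual musl code. */
#if defined(__i386__) || defined(__x86_64__)

/* "t" is the top of the x87 stack, st(0); "u" is st(1). */
static inline long double x87_fabs(long double x)
{
	__asm__("fabs" : "+t"(x));
	return x;
}

static inline long double x87_fsqrt(long double x)
{
	__asm__("fsqrt" : "+t"(x));
	return x;
}

/* fpatan computes atan2(st(1), st(0)) and pops the stack once, so
 * the popped input register st(1) is listed as a clobber.
 * Returns atan2(y, x). */
static inline long double x87_fpatan(long double x, long double y)
{
	long double r;
	__asm__("fpatan" : "=t"(r) : "0"(x), "u"(y) : "st(1)");
	return r;
}

#else
/* Fallback so the sketch still builds on non-x86 hosts. */
#include <math.h>
static inline long double x87_fabs(long double x) { return fabsl(x); }
static inline long double x87_fsqrt(long double x) { return sqrtl(x); }
static inline long double x87_fpatan(long double x, long double y)
{
	return atan2l(y, x);
}
#endif

/* Hypothetical name, chosen to avoid clashing with the libc acos(). */
static double acos_x87(double x)
{
	/* acos(x) = atan2(fabs(sqrt((1-x)*(1+x))), x) */
	return (double)x87_fpatan(x, x87_fabs(x87_fsqrt((1-x)*(1+x))));
}
```

All constraints live in the helpers, so the function body contains no
__asm__ statement and reads like the comment it was derived from.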

The downside is that implementing

static inline long double x87_fabs(long double x) {
    __asm__("fabs" : "+t"(x));
    return x;
}

now actually implies that the result has double-extended precision and
needs rounding before being returned, which the current version does
not do. However, to my knowledge that will not actually be wrong, only
slower, so a solution that preserves the current behavior for these
few instructions can probably be considered a micro-optimization.
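
The difference can be sketched as follows, assuming the current
version applies the "+t" constraint to a double operand directly. The
names fabs_direct and fabs_via_helper are hypothetical, and the
non-x86 branch exists only so the sketch builds elsewhere.

```c
#if defined(__i386__) || defined(__x86_64__)

static inline long double x87_fabs(long double x)
{
	__asm__("fabs" : "+t"(x));
	return x;
}

/* Operates on the double directly. fabs only clears the sign bit,
 * so the compiler sees no excess precision to round away. */
static double fabs_direct(double x)
{
	__asm__("fabs" : "+t"(x));
	return x;
}

#else
/* Fallback so the sketch still builds on non-x86 hosts. */
static inline long double x87_fabs(long double x)
{
	return __builtin_fabsl(x);
}
static double fabs_direct(double x)
{
	return __builtin_fabs(x);
}
#endif

/* Goes through the long double helper. The cast tells the compiler
 * to convert a double-extended value back to double, which it may
 * implement as a store and reload. */
static double fabs_via_helper(double x)
{
	return (double)x87_fabs(x);
}
```

Both variants return the same value; only the conversion the compiler
may emit around the instruction differs.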

Ciao,
Markus
