Message-ID: <20211227184137.GY7074@brightrain.aerifal.cx>
Date: Mon, 27 Dec 2021 13:41:38 -0500
From: Rich Felker <dalias@...c.org>
To: Markus Wichmann <nullplan@....net>
Cc: musl@...ts.openwall.com
Subject: Re: ASM-to-C conversion for i386

On Mon, Dec 27, 2021 at 07:04:37PM +0100, Markus Wichmann wrote:
> On Mon, Dec 27, 2021 at 11:30:56AM -0500, Rich Felker wrote:
> > One thought, and I'm not sure if this is a good idea or a bad one but
> > worth discussing:
> >
> > Using your acos.c as an example, where you have the comment:
> >
> >     atan2(fabs(sqrt((1-x)*(1+x))), x)
> >
>
> That comment was copied from acos.s. In general, I have tried to
> preserve comments. Except in fenv.s, where each time __hwcap was tested,
> the same comment was prefixed, and its point should be coming across ten
> times more easily by just creating a symbolic constant.
>
> > The actual code could be written as:
> >
> >     return (double)x87_fpatan(x, x87_fabs(x87_fsqrt((1-x)*(1+x))));
> >
> > with the appropriate "x87.h" defining each of these with the
> > appropriate asm & constraints. This kinda makes the individual
> > functions self-documenting and non-error-prone (repetition of
> > error-prone constraints, especially the hidden requirement that, in
> > "=t"(x), x have type long double).
> >
>
> That's probably an even better idea than what I am currently doing:
> Moving the "core" functions into a new header file (as static inline
> functions), and using these in the function implementations. I could not
> get all the duplication out; in some cases the duplication is only
> conceptual (hypot() and hypotf() have the same idea, but it needs to be
> implemented differently due to the different precisions/representations).
>
> I think I can combine both approaches, because what I'm doing appears to
> have the effect of moving the __asm__ statements entirely out of the C
> files into the new header file. And it appears that we are only using a
> couple of instructions, anyway.
>
> Downside is that implementing
>
> static inline long double x87_fabs(long double x) {
>     __asm__("fabs" : "+t"(x));
>     return x;
> }
>
> now actually carries the connotation that the result is of
> double-extended precision and needs rounding before being returned.
> Unlike the current version which does not do that. However, to my
> knowledge that will not actually be wrong, only slower, so a solution
> that preserves the current connotations for these few instructions can
> probably be considered a micro-optimization.

Yes, I think the insns that can emit other precisions probably would
need 3 versions, but there are very few of these -- just fabs and
fmod?

Rich
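
For illustration, here is a minimal sketch of what the "x87.h" idea discussed above might look like. Only the names x87_fabs, x87_fsqrt and x87_fpatan, the fabs wrapper body, and the acos expression come from the thread. The argument order of x87_fpatan, the fsqrt and fpatan constraint choices, and the name sketch_acos are assumptions made here, not musl's actual code. It assumes GCC or Clang on an x86 target.

```c
/* Hypothetical "x87.h"-style wrappers: a sketch, not musl's implementation. */
#if !defined(__i386__) && !defined(__x86_64__)
#error "x87 inline asm requires an x86 target"
#endif

/* "t" names st(0). The operand must have type long double; putting the
 * asm behind a typed function states that requirement in one place. */
static inline long double x87_fabs(long double x)
{
	__asm__("fabs" : "+t"(x));
	return x;
}

static inline long double x87_fsqrt(long double x)
{
	__asm__("fsqrt" : "+t"(x));
	return x;
}

/* fpatan computes atan2(st(1), st(0)), stores the result in st(1) and
 * pops, so the result ends up in st(0). "u" names st(1); because the
 * instruction pops it, it is listed as clobbered. The (x, y) argument
 * order is an assumption chosen to match the call in the thread. */
static inline long double x87_fpatan(long double x, long double y)
{
	__asm__("fpatan" : "+t"(x) : "u"(y) : "st(1)");
	return x;
}

/* acos as atan2(fabs(sqrt((1-x)*(1+x))), x). Named sketch_acos
 * (hypothetical) to avoid clashing with the libm acos. */
double sketch_acos(double x)
{
	return (double)x87_fpatan(x, x87_fabs(x87_fsqrt((1-x)*(1+x))));
}
```

In this sketch every wrapper returns long double, so a double or float caller rounds on return. That is the extra cost mentioned in the thread for instructions such as fabs, whose result would not otherwise need rounding.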