Message-ID: <alpine.LRH.2.02.2001171401570.27694@key0.esi.com.au>
Date: Fri, 17 Jan 2020 14:36:20 +1100 (AEDT)
From: Damian McGuckin <damianm@....com.au>
To: musl@...ts.openwall.com
Subject: Re: Considering x86-64 fenv.s to C

Feedback/Discussion please, especially in terms of what extra comments
I need to make. I hope I have not missed anything.

General Comments
****************

Except where noted, the approach taken to invalid input is to mask out
the invalid data, use what data is left, and never inform the calling
program of invalid data.

The i386 (sometimes), X32 and X86-64 routines generally need to realise
that they have both the X87 FPU and the SSE. Are there scenarios where
this will not be the case, or do we need to plan for future scenarios
where this will not be the case?

Do we need to consider what is in the latest IEEE 754 2019 standard to
see what enhancements are needed, or just wait for C2X?

Other Architectures
*******************

Should we look at what is needed for Sparc and Power9 to ensure that the
(eventually-) chosen abstraction will work with these? Are there any
other chips which need to be considered?

If you look at more recent chipset designs, they have all been able to
leverage the experience of working with IEEE 754 exceptions and rounding,
and they follow the same style of using an exception status and rounding
control register. So I think catering for the current crop, plus those 2
mentioned above, should be adequate. But am I wrong?

Is Power9 the same as PowerPC64? I have never seen one. I know I do not
know enough about this chip, as the 128-bit floating point discussion
talks about Rounding-To-Odd mode. I have tried to read the 1358 pages of
the ISA 3.0 architecture manual but I have a long way to go before I know
even 10% of what is in there.

Are the newer beefy ARMs likely to change what they do now in the
context of 'fenv' routines?

Also, and I could be wrong, currently MUSL assumes that there is an
integral type for every floating type.
On some architectures, I believe this is not always the case for 128-bit
floating point numbers. On some Sparcs, I am not sure it was even the
case for 64-bit numbers, but that was a long time ago. I do not think
that this restriction will influence anything here. How it affects MUSL
in general is another question, irrelevant to this discussion.

Summary
*******

aarch64 (arm)
    * All assembler

arm (bare)
    * Empty

i386
    * All assembler
    * The fldenv instruction to update the status registers has a
      serious overhead which cannot be avoided in 'feraiseexcept'.
      No attempt is made to optimize any unnecessary usage (as occurs
      in feclearexcept). Note that fldenv also makes the
      'feclearexcept' routine unavoidably complex.
    * What is the best way to query '__hwcap' from an inline __asm__
      statement, specifically to verify if SSE instructions have to be
      supported?

m68k
    * In C.
    * Very clear
    * feclearexcept and feraiseexcept

          if (exception_mask & ~FE_ALL_EXCEPT)
                  return (-1);

      Different to the way others handle invalid input. Is this
      behaviour cast in stone based on standard documentation?

mips/mips64/mipsn32
    * All assembler
    * Not overly complex.

powerpc
    * All assembler
    * I think that this architecture has more exception bits than
      IEEE 754 specifies. It has lots of specific cases of FE_INVALID.
      This needs to be considered when dealing with FE_INVALID.
    * My knowledge of this assembler is poor. Please expand these
      comments!!

powerpc64
    * In C.
    * Very clear
    * Note that this architecture has more exception bits than
      IEEE 754 specifies. It has lots of specific cases of FE_INVALID.
      This needs to be considered when dealing with FE_INVALID.
    * This is the first time I have seen this style of coding, which
      casts a double to a union and then extracts the data as a long:

          return (union {double f; long i;}) {get_fpscr_f()}.i;

      Is this style of coding universally accepted within MUSL? From
      my reading of other routines, it is normally done as

          union {double f; long i;} f = { get_fpscr_f() };
          return f.i;

      Just curious.

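
For anyone comparing the two union styles mentioned under powerpc64,
here is a minimal sketch that puts them side by side. It is an
illustration under assumptions, not MUSL code: read_reg_as_double() is a
hypothetical stand-in for get_fpscr_f() (which needs PowerPC hardware),
and uint64_t replaces long so that both union members have the same size
on 32-bit targets as well.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* The pun only makes sense if both union members are the same size. */
static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Hypothetical stand-in for get_fpscr_f(): the real routine reads the
   PowerPC FPSCR into a double; this one simply passes a value through. */
static double read_reg_as_double(double v)
{
	return v;
}

/* Style 1: compound literal of an unnamed union type, with the integer
   member read straight from the temporary. */
static uint64_t bits_compound_literal(double v)
{
	return (union {double f; uint64_t i;}) {read_reg_as_double(v)}.i;
}

/* Style 2: named local union, initialised through its first member and
   then read through the other member. */
static uint64_t bits_named_union(double v)
{
	union {double f; uint64_t i;} u = { read_reg_as_double(v) };
	return u.i;
}
```

Both functions should yield the same bit pattern for any input; the
compound literal only avoids naming the temporary. Reading a union
member other than the one last written reinterprets the stored bytes in
C99 and later, so either form appears to be a matter of style rather
than of correctness.
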
riscv64
    * All assembler.
    * Very clear.
    * The architecture has obviously been done after a review of lots
      of experience with the IEEE 754 standard.

s390x
    * In C.
    * Very clear.
    * Why is __fesetround(int) 'hidden'? Where is fesetround()?

sh (SuperH??)
    * In assembler
    * I know zero about this assembler
    * There is some peculiarity about updating the environment. I have
      no idea what is going on here. Anybody care to elaborate?

x32
    * In assembler
    * Why does 'feclearexcept' disrespect the flags by clearing ALL
      x86 bits?
    * Is this really much the same as x86-64 (or am I wrong)?

x86_64
    * In assembler
    * Why does 'feclearexcept' disrespect the flags by clearing ALL
      x86 bits?

*** FINISH