Message-ID: <alpine.LRH.2.02.2001240755500.28245@key0.esi.com.au>
Date: Fri, 24 Jan 2020 11:42:02 +1100 (AEDT)
From: Damian McGuckin <damianm@....com.au>
To: musl@...ts.openwall.com
Subject: Re: Considering x86-64 fenv.s to C

In an attempt to address this issue, the following can be considered an
attempt at a 'discussion paper'.

The following are assumptions. Whether the assumptions are valid enough
is debatable and open for discussion by virtue of this initial post. I
will put forward the generic details and, once there is some agreement
on those, the details can be attacked, including nomenclature. The names
are not locked in stone. They are just what is being used to illustrate
the design. Comments relating to accepted practice in MUSL are welcome.

These assumptions do not hold for i386, x32, and x86_64, largely because
of their historical origins, which at the time were quite ground-breaking.
The 3 architectures need their own special handling, although background
research suggests that all three can be made generic with respect to each
other. But that is another story.

I have written this at a level that was meant to be read by those with a
low level of knowledge. That may be a mistake but it meant that I could
get some level of initial peer review before posting this to the list!

Ignoring Intel FPUs for the moment at least, it is assumed that all FPUs
expose their state to, and allow control of themselves by, user programs
through

a) a status register which contains bits which are set when a floating
   point exception occurs, the exception bits, and

b) a control register which contains a bit field in which is stored the
   currently active rounding mode.

These registers may be one and the same. It is assumed that they can be
stored in, or loaded from, an unsigned int (of 32 bits). This appears to
be the case for all considered architectures (see the list below).
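As a rough illustration of the assumption above, here is a minimal sketch
of a combined status and control word held in an unsigned int. Nothing in
it comes from a real FPU: the bit layout, the 'sim_' names and the static
variable standing in for the hardware register are all hypothetical, and
real code would use a line of inline assembler instead.

```c
/* Hypothetical bit layout, for illustration only; every real FPU differs. */
#define SIM_EXCEPT_MASK 0x1fu                     /* five sticky exception flags */
#define SIM_ROUND_SHIFT 8
#define SIM_ROUND_MASK  (0x3u << SIM_ROUND_SHIFT) /* two-bit rounding-mode field */

/* Stands in for the hardware register (an assumption, not real hardware). */
static unsigned int sim_register;

static inline unsigned int sim_get_csr(void) { return sim_register; }
static inline void sim_set_csr(unsigned int csr) { sim_register = csr; }

/* Status view: which exception flags are currently set. */
static inline unsigned int sim_status(void)
{
    return sim_get_csr() & SIM_EXCEPT_MASK;
}

/* Control view: the currently selected rounding-mode field. */
static inline unsigned int sim_rounding(void)
{
    return sim_get_csr() & SIM_ROUND_MASK;
}

/* Set exception flags without disturbing the rounding field. */
static inline void sim_raise_flags(unsigned int flags)
{
    sim_set_csr(sim_get_csr() | (flags & SIM_EXCEPT_MASK));
}

/* Replace the rounding field without disturbing the exception flags. */
static inline void sim_set_rounding(unsigned int mode)
{
    sim_set_csr((sim_get_csr() & ~SIM_ROUND_MASK) | (mode & SIM_ROUND_MASK));
}
```

The point of the sketch is only that, when both kinds of bits share one
word, every update has to be a read-modify-write which masks off the
field it does not own.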
The interrogation and modification of both the status register and the
control register can be done via the respective pairs of 'routines'
getsr/setsr and getcr/setcr. The 'get' routine copies the register's
contents into a local variable, an unsigned int, and the 'set' routine
sets the register's contents from its unsigned int argument. These are
simple #defines, being mapped to

a) if the status and control bits are in the same register, one of

    static inline unsigned int get_csr() { .... }
    static inline void set_csr(unsigned int csr) { .... }

   where get_csr is mapped to both 'getsr' and 'getcr' (and set_csr to
   both 'setsr' and 'setcr'), and

b) if the registers are in fact distinct, one of

    static inline unsigned int get_sr() { .... }
    static inline void set_sr(unsigned int csr) { .... }
    static inline unsigned int get_cr() { .... }
    static inline void set_cr(unsigned int csr) { .... }

There exists a 'const' default floating point environment which can be
defined as a local variable as:

    const fenv_t fe_dfl_env = FE_DFL_ENV_DATA;

where FE_DFL_ENV_DATA is a #define which is created automatically if the
variable 'fe_dfl_env' is an unsigned int.

There also exists a pair of macros, 'gete' and 'sete', and an assumed
memory object '*e' which are defined such that

    #define gete(e) ... // extracts the FPU's environment into *e
    #define sete(e) ... // replaces the FPU's environment with *e
    fenv_t *e;

Some examples of these architecture dependent routines and the associated
definitions were crafted for m68k, powerpc64, and s390x, by copying the
appropriate bits from how 'musl' currently does it. Approximate examples
for sparc64, mips64 and sh are also provided; remember that these are
approximations only. The implementation for these cases is likely to be
less than optimal at best, if not wrong. Please amend as appropriate.

Also, at least one architecture has an expanded idea of the concept of an
INVALID exception.
For such scenarios, 'feclearexcept' and 'feraiseexcept' need macros to be
defined which massage the exception bits 'excepts' that come from a user's
program, respectively

    #define FE_QUALIFY_CLEAR_EXCEPT(excepts) ....
and
    #define FE_QUALIFY_RAISE_EXCEPT(excepts) ....

For any other scenarios, these macros never need to be defined explicitly;
the code will define them as empty by itself.

All routines validate the exception bits 'excepts' coming from a user's
program. This is achieved with a macro, the default of which is

    #define FE_VALIDATE_EXCEPT(except) (except) &= FE_ALL_EXCEPT

This simply discards any bits which are not a valid exception bit. It is
created automatically and does not need to be defined. However, where
such an approach for the case of illegal exceptions is deemed
unsatisfactory and, for example, routine failure and a non-zero return is
deemed a better approach, that macro can be redefined explicitly as

    #define FE_VALIDATE_EXCEPT(except) if ((except) & ~FE_ALL_EXCEPT) return(-1)

Historically it seems only one architecture took this approach, 'm68k',
and then only on Linux. On any other architecture, and on *BSD, the
earlier, default approach is used.

An attempt at an implementation is shown below, WITHOUT the architecture
dependent macro definitions and 'static inline' routines which largely
just comprise one line of embedded assembler.

Note that raising exceptions is done with arithmetic expressions for
trivial cases and by directly modifying the floating point status
register for the non-trivial case. The approach is open to question.
While it allows testing in isolation, it does not use any of the latest
barrier function approaches and routines like 'math_uflow' and
colleagues. It might be advantageous to focus on the overall design first
and then discuss how to raise exceptions later so as to not detract from
the discussion of the overall design of the generic interface.
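To make the difference between the two validation policies concrete, here
is a small sketch written as ordinary functions rather than macros. The
function names are hypothetical and exist only for this illustration; the
only thing taken from the system is FE_ALL_EXCEPT from <fenv.h>.

```c
#include <fenv.h>

/*
 * Default policy: silently discard any bits that are not valid
 * exception bits and carry on with whatever remains.
 */
static int validate_by_masking(int excepts)
{
    return excepts & FE_ALL_EXCEPT;
}

/*
 * Alternative policy: report failure (-1) if any bit outside
 * FE_ALL_EXCEPT is present, otherwise report success (0).
 */
static int validate_by_rejecting(int excepts)
{
    return (excepts & ~FE_ALL_EXCEPT) ? -1 : 0;
}
```

With the first policy a caller passing stray bits gets the valid subset
acted upon; with the second the same call fails outright. Which of the
two is preferable is exactly the question that is open for discussion.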
Please consider:

/*
 * default inspection and modification of FPU status and control registers
 */
#ifndef getcr
#define getcr get_csr
#endif
#ifndef setcr
#define setcr set_csr
#endif
#ifndef getsr
#define getsr get_csr
#endif
#ifndef setsr
#define setsr set_csr
#endif
#ifndef FE_DFL_ENV_DATA
#define FE_DFL_ENV_DATA 0 /* assume this is an unsigned int - cast?? */
#endif
#ifndef gete
#define gete(e) (*(e) = get_csr()) /* assume this is a basic type */
#endif
#ifndef sete
#define sete(e) set_csr(*(e)) /* assume this is a basic type */
#endif
#ifndef FE_ALL_EXCEPT
#define FE_VALID_EXCEPT (FE_INEXACT|FE_DIVBYZERO|FE_UNDERFLOW|FE_OVERFLOW)
#define FE_ALL_EXCEPT (FE_INVALID|FE_VALID_EXCEPT)
#endif
/*
 * The usual approach to validating an exception mask is to throw out
 * anything which is illegal. Some systems will NOT do this, choosing
 * instead to return a non-zero result on encountering an illegal mask
 */
#ifndef FE_VALIDATE_EXCEPT
#define FE_VALIDATE_EXCEPT(e) ((e) &= FE_ALL_EXCEPT)
#endif
/*
 * There is usually no need to mess with the generic valid exception bits
 * given to 'feclearexcept' and 'feraiseexcept' so define a 'nothing'
 * macro for such a scenario - is this the MUSL accepted style????
 */
#ifndef FE_QUALIFY_CLEAR_EXCEPT
#define FE_QUALIFY_CLEAR_EXCEPT(excepts) ((void) 0)
#endif
#ifndef FE_QUALIFY_RAISE_EXCEPT
#define FE_QUALIFY_RAISE_EXCEPT(excepts) ((void) 0)
#endif
/*
 * Compute the rounding mask with some rigid logic
 */
#ifndef FE_TONEAREST
#define FE_TONEAREST 0
#define ROUND_MASK ((unsigned int) 0)
#else
#ifndef FE_TOWARDZERO
#define ROUND_MASK ((unsigned int) 0)
#else
#ifdef FE_DOWNWARD
#ifdef FE_UPWARD
#define ROUND_MASK ((unsigned int) (FE_DOWNWARD|FE_UPWARD|FE_TOWARDZERO))
#else
#error "UNSUPPORTED rounding: FE_DOWNWARD without FE_UPWARD"
#endif
#else
#ifdef FE_UPWARD
#error "UNSUPPORTED rounding: FE_UPWARD without FE_DOWNWARD"
#else
#define ROUND_MASK ((unsigned int) (FE_TOWARDZERO))
#endif
#endif
#endif
#endif

int feclearexcept(int excepts)
{
    FE_VALIDATE_EXCEPT(excepts);
    FE_QUALIFY_CLEAR_EXCEPT(excepts);
    setsr(getsr() & ~((unsigned int) excepts));
    return(0);
}

static inline int __raisearithmetically(int excepts)
{
    /*
     * assume single OP is faster than double OP
     */
    const float one = (float) 1;
    const float zero = (float) 0;
    const float tiny = (float) 0x1.0p-126;
    const float huge = (float) 0x1.0p+126;
    volatile float x;

    /*
     * if it is just a simple exception, arithmetic expressions are optimal
     */
    switch(excepts)
    {
    case FE_INVALID:
        x = zero, x /= x;
        break;
    case FE_DIVBYZERO:
        x = zero, x = one / x;
        break;
    case FE_INEXACT:
        x = tiny, x += one;
        break;
    case (FE_OVERFLOW | FE_INEXACT):
        x = huge, x *= x;
        break;
    case (FE_UNDERFLOW | FE_INEXACT):
        x = tiny, x *= x;
        break;
    default:
        /* if more than one exception exists, a sledgehammer is viable */
        setsr(getsr() | ((unsigned int) excepts));
        break;
    }
    return(0);
}

int feraiseexcept(int excepts)
{
    FE_VALIDATE_EXCEPT(excepts);
    FE_QUALIFY_RAISE_EXCEPT(excepts);
    return __raisearithmetically(excepts);
}

int fetestexcept(int excepts)
{
    FE_VALIDATE_EXCEPT(excepts);
    return (int) (getsr() & ((unsigned int) excepts));
}

int fegetround(void)
{
    return (int) (getcr() & ROUND_MASK);
}

int fesetround(int rounding_mode)
{
    if ((rounding_mode & ~ROUND_MASK) == 0)
    {
        unsigned int mode = ((unsigned int) rounding_mode);

        return (setcr((getcr() & ~ROUND_MASK) | (mode & ROUND_MASK)), 0);
    }
    return(-1);
}

int fegetenv(fenv_t *envp)
{
    return ((gete(envp)), 0);
}

int fesetenv(const fenv_t *envp)
{
    /* substitute the default environment when FE_DFL_ENV is requested */
    const fenv_t envpd = FE_DFL_ENV_DATA, *e = envp == FE_DFL_ENV ? &envpd : envp;

    return ((sete(e)), 0);
}

Regards - Damian

Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
Views & opinions here are mine and not those of any past or present employer