Message-ID: <alpine.LRH.2.02.2001261420060.4568@key0.esi.com.au>
Date: Sun, 26 Jan 2020 14:28:17 +1100 (AEDT)
From: Damian McGuckin <damianm@....com.au>
To: musl@...ts.openwall.com
Subject: Re: Considering x86-64 fenv.s to C


Latest version below. Apologies if I missed anything we discussed.

*	Lots more comments about architecture.

*	Better explanation of how those PowerPC issues are dealt with.

*	Some naming is improved.

I will shortly post some guesses at what I think some hardware-specific 
versions of fenv.c need to be.

I am not sure how best to handle these, because the current approach is to 
pull in a template from the directory above, as seen in

 	.../musl-1.1.24/src/fenv/arm/fenv.c

which just does

 	#include "../fenv.c"

which includes the most trivial of fenv(3) implementations possible.

The following passes 'splint'.

/*
  * fenv-generic.c:
  *
  * All generic architectures are assumed to have
  *
  * a)	a status register in which the exception flags exist
  * b)	a control register in which the rounding mode exists
  * c)	an environment structure, fenv_t
  *
  * The architecture-specific fenv.c must provide routines to both store
  * (copy) these registers into memory and load them from memory.
  *
  * If the control & status registers are distinct, then the definitions
  *
  *	static unsigned int fe_get_sr_arch(void) { ... }
  *	static void fe_set_sr_arch(unsigned int) { ... }
  *	static unsigned int fe_get_cr_arch(void) { ... }
  *	static void fe_set_cr_arch(unsigned int) { ... }
  *
  * and the macros that use these internal routines
  *
  *	#define fe_get_sr	fe_get_sr_arch
  *	#define fe_set_sr	fe_set_sr_arch
  *	#define fe_get_cr	fe_get_cr_arch
  *	#define fe_set_cr	fe_set_cr_arch
  *
  * need to be provided.
  *
  * If the control & status registers are one and the same, then the definitions
  *
  *	static unsigned int fe_get_csr_arch(void) { ... }
  *	static void fe_set_csr_arch(unsigned int) { ... }
  *
  * need to be provided. The fe_[gs]et_[sc]r macros are then auto-created.
  *
  * Also needed are definitions of two macros that respectively
  *
  * a) allow a (memory) fenv_t object *e to be filled with the contents
  *    of the current floating point environment, and
  * b) allow the current floating point environment to mirror the state
  *    of the contents of some (memory) fenv_t object *e.
  *
  *	#define fe_get_e(e)	...
  *	#define fe_set_e(e)	...
  *
  * The default state of the floating point environment must be defined as
  *
  *	#define FE_DFL_ENV_DATA	....
  *
  * Where an architecture like a PowerPC64 supports a list of causes for an
  * INVALID exception, the bit mask encapsulating all those causes must be
  * defined, FE_ALL_INVALID_CAUSES, as well as the bit field associated with
  * an FE_INVALID exception being caused 'by (an explicit software) request',
  * FE_INVALID_BY_REQUEST. They will commonly be defined using the existing
  * ABI as:
  *
  *	#define FE_ALL_INVALID_CAUSES	FE_ALL_INVALID
  *	#define FE_INVALID_BY_REQUEST	FE_INVALID_SOFTWARE
  *
  * Where no causes exist, the above 2 lines must NEVER be defined!
  *
  * Once the above inline functions and appropriate macros have been defined
  * within an 'fenv.c' for a given architecture, that 'fenv.c' should pull
  * this file into its compilation space following existing practice like
  *
  *	#include	"../fenv-generic.c"
  */

/*
  * default inspection and modification of FPU status and control registers
  */
#ifndef	fe_get_cr
#define	fe_get_cr	fe_get_csr_arch
#endif
#ifndef	fe_set_cr
#define	fe_set_cr	fe_set_csr_arch
#endif
#ifndef	fe_get_sr
#define	fe_get_sr	fe_get_csr_arch
#endif
#ifndef	fe_set_sr
#define	fe_set_sr	fe_set_csr_arch
#endif
/*
  * Compute the rounding mask with some rigid logic
  */
#ifndef	FE_TONEAREST
#define	FE_TONEAREST	0
#define	ROUND_MASK	((unsigned int) 0)
#else
#ifndef	FE_TOWARDZERO
#define	ROUND_MASK	((unsigned int) 0)
#else
#ifdef	FE_DOWNWARD
#ifdef	FE_UPWARD
#define	ROUND_MASK	((unsigned int) (FE_DOWNWARD|FE_UPWARD|FE_TOWARDZERO))
#else
#error	"unsupported rounding: FE_DOWNWARD is defined without FE_UPWARD"
#endif
#else
#ifdef	FE_UPWARD
#error	"unsupported rounding: FE_UPWARD is defined without FE_DOWNWARD"
#else
#define	ROUND_MASK	((unsigned int) (FE_TOWARDZERO))
#endif
#endif
#endif
#endif

/*
  * The PowerPC architecture has the concept of an INVALID exception cause,
  * basically the reason why an FE_INVALID exception has occurred. There are
  * nine such causes, encapsulated in a non-contiguous mask of 9 bits, the
  * macro called (herein) FE_ALL_INVALID_CAUSES for clarity. Instructions
  * which involve invalid operations will set the CAUSE bit first, that
  * action implicitly raising the FE_INVALID exception, i.e. an FE_INVALID
  * implies the existence of a CAUSE bit, and vice versa.  Even on a
  * PowerPC, musl supports only one of those 9 hardware causes, the macro
  * called (herein) FE_INVALID_BY_REQUEST. This nomenclature is used to
  * document that not only do these macros refer to some cause, but that the
  * FE_INVALID exception has been caused 'by (an explicit software) request'.
  *
  * On any other architecture, no such definitions will exist of the mask of
  * all such causes, the macro FE_ALL_INVALID_CAUSES, nor will the macro
  * FE_INVALID_BY_REQUEST exist. In that case, the former is just defined as
  * zero, implying that there are NO causes, while the latter will be just
  * defined as FE_INVALID to keep the compiler happy (as it is never used
  * in any logic).  See fenv.c for the PowerPC64 for more detailed words on
  * this interesting, useful, fascinating, but highly non-standard feature.
  *
  * The above is important because ignoring it stabs musl in the back. When
  * an FE_INVALID exception is to be cleared on an architecture with a bit
  * mask of causes as to why any FE_INVALID occurred, every CAUSE bit field
  * must also be cleared. If this is not done, any set CAUSE bit, i.e. one
  * left uncleared, will cause an FE_INVALID exception to be re-raised when
  * the register image containing those uncleared bits is fed back into
  * 'fe_set_sr' where the instruction 'Move to FPSCR' is used. The single
  * line of code doing such handling costs nothing on architectures which
  * do not need it because the zero value of the FE_ALL_INVALID_CAUSES mask
  * should mean that this line of code is removed by compile-time
  * optimization.
  *
  * On architectures where a raised FE_INVALID exception implies some CAUSE
  * bit is also raised (or set), consistency is provided by saying that the
  * FE_INVALID is caused 'by (an explicit software) request', the bit named
  * (herein) as FE_INVALID_BY_REQUEST.  The single line of code doing such
  * handling again costs nothing on architectures which do not need it
  * because the zero value of the FE_ALL_INVALID_CAUSES mask should mean
  * that this line of code is optimized away at compile-time. To keep the
  * compiler happy, FE_INVALID_BY_REQUEST is simply set to FE_INVALID.
  */
#ifndef	FE_ALL_INVALID_CAUSES
#define	FE_ALL_INVALID_CAUSES	0
#define	FE_INVALID_BY_REQUEST	FE_INVALID
#endif

static inline void
__cleargeneric(int excepts)
{
 	/*
 	 * Handle the case where FE_INVALID is to be cleared on an architecture
 	 * with a matching CAUSE to say why that FE_INVALID occurred. Every one
 	 * of the CAUSE bits must then also be cleared - see above.
 	 */
 	if (FE_ALL_INVALID_CAUSES != 0 && (excepts & FE_INVALID) != 0)
 		excepts |= FE_ALL_INVALID_CAUSES;

 	fe_set_sr(fe_get_sr() & ~excepts);
}

static inline void
__raisegeneric(int excepts)
{
 	/*
 	 * Handle the case where FE_INVALID is to be raised on an architecture
 	 * demanding a matching CAUSE to say why the FE_INVALID is being raised.
 	 * Say it is caused 'By (Explicit Software) Request' - see above.
 	 */
 	if (FE_ALL_INVALID_CAUSES != 0 && (excepts & FE_INVALID) != 0)
 		excepts |= FE_INVALID_BY_REQUEST;

 	fe_set_sr(fe_get_sr() | excepts);
}

static inline void
__raisecleverly(int excepts)
{
 	/*
 	 * assume single OP is faster than double OP
 	 */
 	const float zero = (float) 0;
 	const float one = (float) 1;
 	const float large = 2.076919e+34; // 2^(+114)
 	const float small = 4.814826e-35; // 2^(-114)
 	volatile float x;
 	/*
 	 * if it is just a simple exception, arithmetic expressions are optimal
 	 */
 	switch(excepts)
 	{
 	case FE_INVALID:
 		x = zero, x /= x;
 		break;
 	case FE_DIVBYZERO:
 		x = zero, x = one / x;
 		break;
 	case FE_INEXACT:
 		x = small, x += one;
 		break;
 	case FE_OVERFLOW | FE_INEXACT:
 		x = large, x *= x;
 		break;
 	case FE_UNDERFLOW | FE_INEXACT:
 		x = small, x *= x;
 		break;
 	default:
 		/*
 		 * if more than one exception exists, a sledgehammer is viable
 		 */
 		__raisegeneric(excepts);
 		break;
 	}
}

int
feclearexcept(int excepts)
{
 	__cleargeneric(excepts & FE_ALL_EXCEPT);
 	return 0;
}

int
feraiseexcept(int excepts)
{
 	__raisecleverly(excepts & FE_ALL_EXCEPT);
 	return 0;
}

int
fetestexcept(int excepts)
{
 	return (int) (fe_get_sr() & (excepts & FE_ALL_EXCEPT));
}

int
fegetround(void)
{
 	return (int) (fe_get_cr() & ROUND_MASK);
}

int
fesetround(int rounding_mode)
{
 	if ((rounding_mode & ~ROUND_MASK) != 0)
 		return -1;

 	fe_set_cr((fe_get_cr() & ~ROUND_MASK) | (rounding_mode & ROUND_MASK));
 	return 0;
}

int
fegetenv(fenv_t *envp)
{
 	fe_get_e(envp);
 	return 0;
}

int
fesetenv(const fenv_t *envp)
{
 	const fenv_t envpd = FE_DFL_ENV_DATA;
 	const fenv_t *e = envp == FE_DFL_ENV ? &envpd : envp;

 	fe_set_e(e);
 	return 0;
}
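
To illustrate why the CAUSE bits matter when clearing FE_INVALID, here is a minimal sketch run against a simulated register rather than real hardware. Everything prefixed SIM_ or sim_ is hypothetical, as are the bit values; only the fe_get_csr_arch/fe_set_csr_arch names follow the template above. The "any set CAUSE bit implies INVALID" rule inside fe_set_csr_arch is an assumption that models the 'Move to FPSCR' behaviour described in the comments, not a statement of what any particular CPU does.

```c
#include <assert.h>

/* Hypothetical bit layout; real register layouts differ. */
#define SIM_INVALID		0x01u
#define SIM_DIVBYZERO		0x02u
#define SIM_ALL_EXCEPT		(SIM_INVALID | SIM_DIVBYZERO)
#define SIM_INVALID_SOFTWARE	0x10u	/* cause: explicit software request */
#define SIM_INVALID_ZERODIV	0x20u	/* cause: 0/0 */
#define SIM_ALL_INVALID_CAUSES	(SIM_INVALID_SOFTWARE | SIM_INVALID_ZERODIV)

/* stand-in for a combined control and status register */
static unsigned int sim_csr;

static unsigned int
fe_get_csr_arch(void)
{
	return sim_csr;
}

/* assumed hardware rule: any set CAUSE bit implies the INVALID flag */
static void
fe_set_csr_arch(unsigned int image)
{
	if ((image & SIM_ALL_INVALID_CAUSES) != 0)
		image |= SIM_INVALID;
	sim_csr = image;
}

/* naive clear: leaves the CAUSE bits alone, so INVALID comes straight back */
static void
sim_clear_naive(unsigned int excepts)
{
	assert((excepts & ~SIM_ALL_EXCEPT) == 0);
	fe_set_csr_arch(fe_get_csr_arch() & ~excepts);
}

/* same logic as __cleargeneric: clearing INVALID clears every CAUSE bit */
static void
sim_clear(unsigned int excepts)
{
	assert((excepts & ~SIM_ALL_EXCEPT) == 0);
	if ((excepts & SIM_INVALID) != 0)
		excepts |= SIM_ALL_INVALID_CAUSES;
	fe_set_csr_arch(fe_get_csr_arch() & ~excepts);
}

/* same logic as __raisegeneric: raising INVALID records a software cause */
static void
sim_raise(unsigned int excepts)
{
	assert((excepts & ~SIM_ALL_EXCEPT) == 0);
	if ((excepts & SIM_INVALID) != 0)
		excepts |= SIM_INVALID_SOFTWARE;
	fe_set_csr_arch(fe_get_csr_arch() | excepts);
}

/* same logic as fetestexcept: report only the standard exception flags */
static unsigned int
sim_test(unsigned int excepts)
{
	return fe_get_csr_arch() & excepts & SIM_ALL_EXCEPT;
}
```

With the register starting at zero, sim_raise(SIM_INVALID) followed by sim_clear_naive(SIM_INVALID) still leaves sim_test(SIM_INVALID) non-zero, because the surviving software cause bit re-raises the flag on the write back. sim_clear(SIM_INVALID) returns the register to zero.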

Regards - Damian

Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
Views & opinions here are mine and not those of any past or present employer
