Message-ID: <20180923150933.GC10209@port70.net>
Date: Sun, 23 Sep 2018 17:09:33 +0200
From: Szabolcs Nagy <nsz@...t70.net>
To: musl@...ts.openwall.com
Subject: [PATCH 0/5] add FP_FAST_FMA to math.h

lightly tested; the code generated for the new fma inline asm is the
same as with __builtin_fma().
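
for illustration, a minimal sketch of the kind of single instruction
wrapper meant here, shown for x86_64 only. the names hw_fma,
builtin_fma and HAVE_HW_FMA are hypothetical and are not taken from
the patches; the asm path is only compiled when the compiler targets
FMA (e.g. with -mfma), which is an assumption about the build flags.

```c
#if defined(__x86_64__) && defined(__FMA__)
#define HAVE_HW_FMA 1

/* hypothetical name, chosen to avoid clashing with libc's fma() */
static double hw_fma(double x, double y, double z)
{
	/* AT&T operand order: x = x*y + z, rounded once */
	__asm__ ("vfmadd132sd %1, %2, %0" : "+x"(x) : "x"(y), "x"(z));
	return x;
}

/* the compiler builtin, for comparing the generated code */
static double builtin_fma(double x, double y, double z)
{
	return __builtin_fma(x, y, z);
}
#else
/* no compile-time FMA support: the wrappers are not provided */
#define HAVE_HW_FMA 0
#endif
```

comparing the two functions in the compiler's -S output is one way
to check that the asm and the builtin produce the same instruction.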

doing runtime dispatch for fma on x86 is tricky (it requires cpuid
handling); on arm it's enough to check the VFPv4 hwcap.
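
a rough sketch of what such an arm hwcap check could look like on
linux (not part of this series). have_vfpv4 is a hypothetical helper
name; the fallback value of HWCAP_VFPv4 is the bit used by the linux
arm uapi headers and is only defined here if the libc headers did not
provide it. on non-arm targets the helper simply reports 0.

```c
#if defined(__linux__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#endif

/* bit 16 in the linux arm hwcap word; used only as a fallback */
#ifndef HWCAP_VFPv4
#define HWCAP_VFPv4 (1UL << 16)
#endif

/* hypothetical helper: 1 if the kernel reports VFPv4, else 0 */
static int have_vfpv4(void)
{
#if defined(__arm__) && defined(__linux__)
	return (getauxval(AT_HWCAP) & HWCAP_VFPv4) != 0;
#else
	return 0;
#endif
}
```

a dispatcher would call this once and then pick between a hardware
fma and the generic software one.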

on other targets fma is either part of the base isa or not available at all.

(for various math code it's important to have FP_FAST_FMA defined
when hw fma allows a much more efficient implementation.
some such code uses gcc's __FP_FAST_FMA with __builtin_fma,
but that does not work with clang. bionic does not seem to
have a correct setting for FP_FAST_FMA at all. there are other
problematic platforms, but i think musl should support it.

it's annoying that there is no way to tell, using preprocessor
macros, which standard functions the compiler treats as builtins
and which of those builtins may get inlined.  users will
invent their own fast fma ifdef hacks, which is bad.)
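
as an illustration of how math code might use the macro, here is a
small sketch that switches on FP_FAST_FMA in a horner polynomial
evaluation. the function name horner and its signature are made up
for this example and do not come from the patches.

```c
#include <math.h>

/* hypothetical helper: evaluates c[0] + c[1]*x + ... + c[n-1]*x^(n-1) */
static double horner(const double *c, int n, double x)
{
	double r = 0.0;
	for (int i = n - 1; i >= 0; i--) {
#ifdef FP_FAST_FMA
		/* fma is about as fast as a multiply and an add here,
		   and each step is rounded only once */
		r = fma(r, x, c[i]);
#else
		/* no fast fma: avoid a possibly slow software fma */
		r = r * x + c[i];
#endif
	}
	return r;
}
```

the two branches can differ in the last bits of the result, so code
that needs bit-identical output on all targets should not switch on
the macro this way.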

Szabolcs Nagy (5):
  s390x: add single instruction fma and fmaf
  powerpc: add single instruction fabs, fabsf, fma, fmaf, sqrt, sqrtf
  arm: add single instruction fma
  x86_64: add single instruction fma
  define FP_FAST_FMA and FP_FAST_FMAF when fma and fmaf can be inlined

 arch/aarch64/bits/math.h   |  2 ++
 arch/arm/bits/math.h       |  4 ++++
 arch/generic/bits/math.h   |  0
 arch/powerpc/bits/math.h   |  4 ++++
 arch/powerpc64/bits/math.h |  2 ++
 arch/s390x/bits/math.h     |  2 ++
 arch/x32/bits/math.h       |  4 ++++
 arch/x86_64/bits/math.h    |  4 ++++
 include/math.h             |  2 ++
 src/math/arm/fma.c         | 15 +++++++++++++++
 src/math/arm/fmaf.c        | 15 +++++++++++++++
 src/math/powerpc/fabs.c    | 15 +++++++++++++++
 src/math/powerpc/fabsf.c   | 15 +++++++++++++++
 src/math/powerpc/fma.c     | 15 +++++++++++++++
 src/math/powerpc/fmaf.c    | 15 +++++++++++++++
 src/math/powerpc/sqrt.c    | 15 +++++++++++++++
 src/math/powerpc/sqrtf.c   | 15 +++++++++++++++
 src/math/s390x/fma.c       |  7 +++++++
 src/math/s390x/fmaf.c      |  7 +++++++
 src/math/x32/fma.c         | 23 +++++++++++++++++++++++
 src/math/x32/fmaf.c        | 23 +++++++++++++++++++++++
 src/math/x86_64/fma.c      | 23 +++++++++++++++++++++++
 src/math/x86_64/fmaf.c     | 23 +++++++++++++++++++++++
 23 files changed, 250 insertions(+)
 create mode 100644 arch/aarch64/bits/math.h
 create mode 100644 arch/arm/bits/math.h
 create mode 100644 arch/generic/bits/math.h
 create mode 100644 arch/powerpc/bits/math.h
 create mode 100644 arch/powerpc64/bits/math.h
 create mode 100644 arch/s390x/bits/math.h
 create mode 100644 arch/x32/bits/math.h
 create mode 100644 arch/x86_64/bits/math.h
 create mode 100644 src/math/arm/fma.c
 create mode 100644 src/math/arm/fmaf.c
 create mode 100644 src/math/powerpc/fabs.c
 create mode 100644 src/math/powerpc/fabsf.c
 create mode 100644 src/math/powerpc/fma.c
 create mode 100644 src/math/powerpc/fmaf.c
 create mode 100644 src/math/powerpc/sqrt.c
 create mode 100644 src/math/powerpc/sqrtf.c
 create mode 100644 src/math/s390x/fma.c
 create mode 100644 src/math/s390x/fmaf.c
 create mode 100644 src/math/x32/fma.c
 create mode 100644 src/math/x32/fmaf.c
 create mode 100644 src/math/x86_64/fma.c
 create mode 100644 src/math/x86_64/fmaf.c

-- 
2.18.0
