Message-ID: <20180923150933.GC10209@port70.net>
Date: Sun, 23 Sep 2018 17:09:33 +0200
From: Szabolcs Nagy <nsz@...t70.net>
To: musl@...ts.openwall.com
Subject: [PATCH 0/5] add FP_FAST_FMA to math.h

lightly tested, generated code for the new fma inline asm is the same
as with __builtin_fma().

doing runtime dispatch for fma on x86 is tricky (requires cpuid
handling), on arm it's enough to check the VFPv4 hwcap. on other
targets fma is either part of the base isa or not available at all.

(for various math code it's important to have FP_FAST_FMA defined
when hw fma allows a much more efficient implementation. some such
code uses gcc's __FP_FAST_FMA with __builtin_fma, but that does not
work with clang. bionic does not seem to have a correct setting for
FP_FAST_FMA at all, and there are other problematic platforms, but i
think musl should support it. it's annoying that there is no way to
tell using preprocessor macros which standard functions are treated
as builtins by the compiler and then which builtins may get inlined.
users will invent their own fast fma ifdef hacks, which is bad.)
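
for illustration only, a minimal sketch of how user code might key off
the standard FP_FAST_FMA macro instead of a compiler-specific ifdef.
the helper names mul_add and have_fast_fma are hypothetical and not
part of this series; the sketch only assumes a C99 <math.h>.

```c
#include <math.h>

/* Hypothetical helper (not from the patch series): computes x*y + z.
 * fma() is used only when the implementation advertises it as fast
 * via the standard C99 macro FP_FAST_FMA. */
static double mul_add(double x, double y, double z)
{
#ifdef FP_FAST_FMA
	/* fused: the product is not rounded before the addition */
	return fma(x, y, z);
#else
	/* plain expression: avoids calling a possibly slow software fma();
	 * the compiler may or may not contract this into a fused op */
	return x * y + z;
#endif
}

/* Hypothetical helper: returns 1 if FP_FAST_FMA was defined when this
 * file was compiled, 0 otherwise. */
static int have_fast_fma(void)
{
#ifdef FP_FAST_FMA
	return 1;
#else
	return 0;
#endif
}
```

a caller would simply use mul_add(a, b, c) in its inner loops; the two
paths can differ in the last bit of the result, so code that depends on
single rounding should call fma() directly.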

Szabolcs Nagy (5):
  s390x: add single instruction fma and fmaf
  powerpc: add single instruction fabs, fabsf, fma, fmaf, sqrt, sqrtf
  arm: add single instruction fma
  x86_64: add single instruction fma
  define FP_FAST_FMA and FP_FAST_FMAF when fma and fmaf can be inlined

 arch/aarch64/bits/math.h   |  2 ++
 arch/arm/bits/math.h       |  4 ++++
 arch/generic/bits/math.h   |  0
 arch/powerpc/bits/math.h   |  4 ++++
 arch/powerpc64/bits/math.h |  2 ++
 arch/s390x/bits/math.h     |  2 ++
 arch/x32/bits/math.h       |  4 ++++
 arch/x86_64/bits/math.h    |  4 ++++
 include/math.h             |  2 ++
 src/math/arm/fma.c         | 15 +++++++++++++++
 src/math/arm/fmaf.c        | 15 +++++++++++++++
 src/math/powerpc/fabs.c    | 15 +++++++++++++++
 src/math/powerpc/fabsf.c   | 15 +++++++++++++++
 src/math/powerpc/fma.c     | 15 +++++++++++++++
 src/math/powerpc/fmaf.c    | 15 +++++++++++++++
 src/math/powerpc/sqrt.c    | 15 +++++++++++++++
 src/math/powerpc/sqrtf.c   | 15 +++++++++++++++
 src/math/s390x/fma.c       |  7 +++++++
 src/math/s390x/fmaf.c      |  7 +++++++
 src/math/x32/fma.c         | 23 +++++++++++++++++++++++
 src/math/x32/fmaf.c        | 23 +++++++++++++++++++++++
 src/math/x86_64/fma.c      | 23 +++++++++++++++++++++++
 src/math/x86_64/fmaf.c     | 23 +++++++++++++++++++++++
 23 files changed, 250 insertions(+)
 create mode 100644 arch/aarch64/bits/math.h
 create mode 100644 arch/arm/bits/math.h
 create mode 100644 arch/generic/bits/math.h
 create mode 100644 arch/powerpc/bits/math.h
 create mode 100644 arch/powerpc64/bits/math.h
 create mode 100644 arch/s390x/bits/math.h
 create mode 100644 arch/x32/bits/math.h
 create mode 100644 arch/x86_64/bits/math.h
 create mode 100644 src/math/arm/fma.c
 create mode 100644 src/math/arm/fmaf.c
 create mode 100644 src/math/powerpc/fabs.c
 create mode 100644 src/math/powerpc/fabsf.c
 create mode 100644 src/math/powerpc/fma.c
 create mode 100644 src/math/powerpc/fmaf.c
 create mode 100644 src/math/powerpc/sqrt.c
 create mode 100644 src/math/powerpc/sqrtf.c
 create mode 100644 src/math/s390x/fma.c
 create mode 100644 src/math/s390x/fmaf.c
 create mode 100644 src/math/x32/fma.c
 create mode 100644 src/math/x32/fmaf.c
 create mode 100644 src/math/x86_64/fma.c
 create mode 100644 src/math/x86_64/fmaf.c

-- 
2.18.0