musl - Re: [PATCH] math: add LoongArch support for common APIs with inline assembly.

Date: Tue, 23 Apr 2024 11:56:43 -0400
From: Rich Felker <dalias@...c.org>
To: ticat_fp <fanpeng@...ngson.cn>
Cc: musl@...ts.openwall.com, lixing@...ngson.cn, huajingyun@...ngson.cn,
	wanghongliang@...ngson.cn
Subject: Re: [PATCH] math: add LoongArch support for common APIs with
 inline assembly.

On Tue, Apr 23, 2024 at 10:26:19AM +0800, ticat_fp wrote:
> Including: ceil, copysign, fabs, floor, fma, fmax, fmin, llrint,
> lrint, rint, sqrt and their f versions.
> 
> ---
>  src/math/loongarch64/ceil.c      | 25 +++++++++++++++++++++++++
>  src/math/loongarch64/ceilf.c     | 25 +++++++++++++++++++++++++
>  src/math/loongarch64/copysign.c  |  7 +++++++
>  src/math/loongarch64/copysignf.c |  7 +++++++
>  src/math/loongarch64/fabs.c      |  7 +++++++
>  src/math/loongarch64/fabsf.c     |  7 +++++++
>  src/math/loongarch64/floor.c     | 22 ++++++++++++++++++++++
>  src/math/loongarch64/floorf.c    | 22 ++++++++++++++++++++++
>  src/math/loongarch64/fma.c       |  7 +++++++
>  src/math/loongarch64/fmaf.c      |  7 +++++++
>  src/math/loongarch64/fmax.c      |  7 +++++++
>  src/math/loongarch64/fmaxf.c     |  7 +++++++
>  src/math/loongarch64/fmin.c      |  7 +++++++
>  src/math/loongarch64/fminf.c     |  7 +++++++
>  src/math/loongarch64/llrint.c    | 17 +++++++++++++++++
>  src/math/loongarch64/llrintf.c   | 17 +++++++++++++++++
>  src/math/loongarch64/lrint.c     | 17 +++++++++++++++++
>  src/math/loongarch64/lrintf.c    | 17 +++++++++++++++++
>  src/math/loongarch64/rint.c      |  7 +++++++
>  src/math/loongarch64/rintf.c     |  7 +++++++
>  src/math/loongarch64/sqrt.c      |  7 +++++++
>  src/math/loongarch64/sqrtf.c     |  7 +++++++
>  22 files changed, 260 insertions(+)
>  create mode 100644 src/math/loongarch64/ceil.c
>  create mode 100644 src/math/loongarch64/ceilf.c
>  create mode 100644 src/math/loongarch64/copysign.c
>  create mode 100644 src/math/loongarch64/copysignf.c
>  create mode 100644 src/math/loongarch64/fabs.c
>  create mode 100644 src/math/loongarch64/fabsf.c
>  create mode 100644 src/math/loongarch64/floor.c
>  create mode 100644 src/math/loongarch64/floorf.c
>  create mode 100644 src/math/loongarch64/fma.c
>  create mode 100644 src/math/loongarch64/fmaf.c
>  create mode 100644 src/math/loongarch64/fmax.c
>  create mode 100644 src/math/loongarch64/fmaxf.c
>  create mode 100644 src/math/loongarch64/fmin.c
>  create mode 100644 src/math/loongarch64/fminf.c
>  create mode 100644 src/math/loongarch64/llrint.c
>  create mode 100644 src/math/loongarch64/llrintf.c
>  create mode 100644 src/math/loongarch64/lrint.c
>  create mode 100644 src/math/loongarch64/lrintf.c
>  create mode 100644 src/math/loongarch64/rint.c
>  create mode 100644 src/math/loongarch64/rintf.c
>  create mode 100644 src/math/loongarch64/sqrt.c
>  create mode 100644 src/math/loongarch64/sqrtf.c
> 
> diff --git a/src/math/loongarch64/ceil.c b/src/math/loongarch64/ceil.c
> new file mode 100644
> index 00000000..95781f4b
> --- /dev/null
> +++ b/src/math/loongarch64/ceil.c
> @@ -0,0 +1,25 @@
> +#include <math.h>
> +#include <stdint.h>
> +
> +double ceil(double x)
> +{
> +    int32_t old;
> +    int32_t new;
> +    int32_t tmp1;
> +    int32_t tmp2;
> +
> +    __asm__ __volatile__(
> +    "movfcsr2gr %[orig_old],  $r0               \n\t"
> +    "li.d       %[tmp1], 0x200                  \n\t"
> +    "or         %[new],  %[orig_old], %[tmp1]   \n\t"
> +    "li.d       %[tmp2], 0xfffffeff             \n\t"
> +    "and        %[new],  %[new], %[tmp2]        \n\t"
> +    "movgr2fcsr $r0,     %[new]                 \n\t"
> +    "frint.d    %[result],       %[orig_x]      \n\t"
> +    "movgr2fcsr $r0,     %[orig_old]            \n\t"
> +    : [result] "+f"(x), [old]"+r"(old), [new]"+r"(new), [tmp1] "+r"(tmp1), [tmp2] "+r"(tmp2)
> +    : [orig_x] "f"(x), [orig_old]"r"(old), [orig_new]"r"(new), [orig_tmp1] "r"(tmp1), [orig_tmp2] "r"(tmp2)
> +    :);
> +
> +    return x;
> +}

Is it possible to write these with the control register logic in C
rather than a big block of asm?

Also, while probably all versions of gcc and clang that target
loongarch64 support the named-argument inline asm, we generally don't
depend on this extension in musl. I see how it makes the code more
readable with the big asm block, but if we could get rid of the big
asm block, there wouldn't be any need for named arguments to make it
readable. That would leave just a single asm statement to read the old
control register value, C to modify it, and a pair of instructions
(round and restore control register) taking the argument value and the
old control register value to restore as inputs.

Rich
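
As a rough illustration of the restructuring suggested above (this is not
code from the thread or from musl), ceil might be split up as shown below.
The name ceil_sketch, the __loongarch_frlen guard, and the plain-C stand-in
for other targets are assumptions made for this sketch. The rounding-mode
bits 0x200 and 0x100 and the $r0 spelling of the control register are taken
from the quoted patch. To keep the mode change next to the rounding
instruction, the write of the modified value is folded into the last asm
statement, so that statement holds three instructions rather than the pair
described in the message.

```c
/* Hypothetical sketch: ceil_sketch is an illustrative name, not a musl symbol. */
static double ceil_sketch(double x)
{
#if defined(__loongarch__) && __loongarch_frlen >= 64
	unsigned old, up;

	/* 1. One asm statement reads the FP control register. */
	__asm__ __volatile__ ("movfcsr2gr %0, $r0" : "=r"(old));

	/* 2. Plain C selects round-toward-positive: set bit 9, clear bit 8
	 *    (the same 0x200 / 0x100 manipulation as the quoted patch). */
	up = (old | 0x200u) & ~0x100u;

	/* 3. Install the new mode, round, and restore the old value.
	 *    Positional operands only; no named-operand extension. */
	__asm__ __volatile__ (
		"movgr2fcsr $r0, %1\n\t"
		"frint.d    %0, %0\n\t"
		"movgr2fcsr $r0, %2"
		: "+f"(x)
		: "r"(up), "r"(old));
	return x;
#else
	/* Assumed stand-in so the sketch also builds on other targets.
	 * It does not touch any control register and does not preserve
	 * the sign of a zero result. */
	double t;
	if (!(x > -0x1p52 && x < 0x1p52))
		return x;               /* already integral, infinite, or NaN */
	t = (double)(long long)x;       /* truncate toward zero */
	return t < x ? t + 1.0 : t;     /* bump up if truncation went down */
#endif
}
```

The floor and rint variants would presumably differ only in the mask applied
in step 2, which is the part that now lives in C.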