musl - add loongarch64 port v9.

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7aad7a07-9762-3d62-a8c2-4cdf860a7dcb@loongson.cn>
Date: Thu, 16 Nov 2023 10:54:44 +0800
From: Hongliang Wang <wanghongliang@...ngson.cn>
To: musl@...ts.openwall.com
Subject: add loongarch64 port v9.

Hi,

Thank you for your suggestion, I have modified the dynamic linker
name according to the basic ABI types are specified in the ABI
document of the LoongArch, and post 0001-add-loongarch64-port-v9.patch,
as shown in the attachment.

Based on 0001-add-loongarch64-port-v8.patch,the modifications for
0001-add-loongarch64-port-v9.patch are as follows:

---
  arch/loongarch64/reloc.h | 10 ++++++----
  configure                |  4 +++-
  2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/loongarch64/reloc.h b/arch/loongarch64/reloc.h
index a4482b48..6907de8e 100644
--- a/arch/loongarch64/reloc.h
+++ b/arch/loongarch64/reloc.h
@@ -1,7 +1,9 @@
-#ifdef __loongarch_soft_float
-#define FP_SUFFIX "-sf"
-#else
-#define FP_SUFFIX ""
+#if defined __loongarch_double_float
+#define FP_SUFFIX "-lp64d"
+#elif defined __loongarch_single_float
+#define FP_SUFFIX "-lp64f"
+#elif defined __loongarch_soft_float
+#define FP_SUFFIX "-lp64s"
  #endif

  #define LDSO_ARCH "loongarch64"  FP_SUFFIX
diff --git a/configure b/configure
index 55d179f1..93b06287 100755
--- a/configure
+++ b/configure
@@ -673,7 +673,9 @@ trycppif __AARCH64EB__ "$t" && SUBARCH=${SUBARCH}_be
  fi

  if test "$ARCH" = "loongarch64" ; then
-trycppif __loongarch_soft_float "$t" && SUBARCH=${SUBARCH}-sf
+trycppif __loongarch_double_float "$t" && SUBARCH=${SUBARCH}-lp64d
+trycppif __loongarch_single_float "$t" && SUBARCH=${SUBARCH}-lp64f
+trycppif __loongarch_soft_float "$t" && SUBARCH=${SUBARCH}-lp64s
  printf "checking whether compiler support FCSRs... "
  echo "__asm__(\"movfcsr2gr \$t0,\$fcsr0\");" > "$tmpc"
  if $CC -c -o /dev/null "$tmpc" >/dev/null 2>&1 ; then
-- 

Please review again, and point them out if any questions need to be
modified, thanks.


Hongliang Wang

在 2023/11/14 下午9:16, Jingyun Hua 写道:
> hi,
> 
> I have a suggestion regarding the musl dynamic linker name for the
> LoongArch architecture:
> 
> The basic ABI types are specified in the ABI document of the LoongArch,
> which can be seen:
> https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html 
> 
> and the dynamic linker name of glibc distinguishes lp64d, lp64s and
> lp64f with abi types.The v8 patch only defines the value of SUBARCH
> as "-sf" for “__loongarch_soft_float”, which means that only
> “ld-musl-loongarch64.so.1” or “ld-musl-loongarch64-sf.so.1” can be
> generated when compiled with the v8 patch.
> 
> So I think the musl dynamic linker name for the LoongArch architecture
> needs to be modified, maybe defined like glibc?
> 
> Regards,
> Jingyun Hua
> 
> On 11/9/23 9:15 PM, Jingyun Hua wrote:
>> Hi all,
>>
>> Thank you for taking the time to check this email. I'm very interested
>> in musl and alpine, and I'd like to know the current status of this
>> patch and what we can do for it next.
>>
>> After Hongliang updated this patch from v7 to v8, in order to verify it,
>> I built the alpine linux from scratch base on this musl v8 patch on the
>> LoongArch64 machine, and successfully compiled projects such as gcc,
>> rust, llvm, go, etc.. Currently, more than 8,500 packages have been
>> built and running normally in my local repository.
>>
>> The patch has indeed been on hold for a long time. Is it in the
>> immediate plans? Is there anything that needs to be modified and
>> improved?
>>
>> Looking forward to your reply, thanks!
>>
>> Regards,
>> Jingyun Hua
>>
>> 在 2023/9/26 上午11:28, Hongliang Wang 写道:
>>> Hi,
>>>
>>> I have fixed the issues listed below, and post
>>> 0001-add-loongarch64-port-v8.patch,as shown in the attachment.
>>> please review again, if there are anything that need to be modified,
>>> please point out, thank you.
>>>
>>> Hongliang Wang
>>>
>>> 在 2023/9/22 上午9:37, Hongliang Wang 写道:
>>>> Hi,
>>>>
>>>> Thank you for your review, I will modify it according to the review
>>>> suggestions and post v8 patch later.
>>>>
>>>> Hongliang Wang
>>>>
>>>> 在 2023/9/20 下午9:16, Szabolcs Nagy 写道:
>>>>> * Jianmin Lv <lvjianmin@...ngson.cn> [2023-09-20 15:45:39 +0800]:
>>>>>> Sorry to bother you, but I just want to know if there is any 
>>>>>> progress on
>>>>>> this, because Alpine is blocking by this patch for a long time. As 
>>>>>> Hongliang
>>>>>> has explained questions you mentioned one by one, if any question 
>>>>>> need to be
>>>>>> discussed, please point them out, so that the patch can be handled 
>>>>>> further.
>>>>>
>>>>> i think you should post a v8 patch to move
>>>>> this forward. (not on github, but here)
>>>>>
>>>>>> On 2023/8/15 下午4:17, Hongliang Wang wrote:
>>>>>>> 在 2023/8/13 上午9:41, Rich Felker 写道:
>>>>>>>> On Sat, Aug 05, 2023 at 11:43:08AM -0400, Rich Felker wrote:
>>>>>>>>> On Sat, Aug 05, 2023 at 02:18:35PM +0800, 翟小娟 wrote:
>>>>>>>>> -+#define __BYTE_ORDER 1234
>>>>>>>>> ++#define __BYTE_ORDER  __LITTLE_ENDIAN
>>>>>>>>
>>>>>>>> This is gratuitous, mismatches what is done on other archs, and is
>>>>>>>> less safe.
>>>>>>>>
>>>>>>> The modification is based on the following review suggestion(
>>>>>>> WANG Xuerui <i@...0n.name> reviewed in
>>>>>>> https://www.openwall.com/lists/musl/2022/10/12/1):
>>>>>>>
>>>>>>> `#define __BYTE_ORDER __LITTLE_ENDIAN` could be more consistent
>>>>>>> with other arches.
>>>>>
>>>>> please use 1234.
>>>
>>> --- a/arch/loongarch64/bits/alltypes.h.in
>>> +++ b/arch/loongarch64/bits/alltypes.h.in
>>> @@ -2,7 +2,7 @@
>>>   #define _Int64 long
>>>   #define _Reg   long
>>>
>>> -#define __BYTE_ORDER  __LITTLE_ENDIAN
>>> +#define __BYTE_ORDER  1234
>>>   #define __LONG_MAX    0x7fffffffffffffffL
>>>
>>>
>>>>>
>>>>>>>>> -+TYPEDEF unsigned nlink_t;
>>>>>>>>> ++TYPEDEF unsigned int nlink_t;
>>>>>>>>
>>>>>>>> Gratuitous and in opposite direction of coding style used 
>>>>>>>> elsewhere in
>>>>>>>> musl. There are a few other instances of this too.
>>>>>>>>
>>>>>>> Based on the following review question(WANG Xuerui <i@...0n.name>
>>>>>>> reviewed in https://www.openwall.com/lists/musl/2022/10/12/1):
>>>>>>>
>>>>>>> `unsigned int`? Same for other bare `unsigned` usages.
>>>>>>>
>>>>>>> I fixed it to a explicit definition.
>>>>>
>>>>> please use plain unsigned, not unsigned int.
>>>
>>> --- a/arch/loongarch64/bits/alltypes.h.in
>>> +++ b/arch/loongarch64/bits/alltypes.h.in
>>> @@ -2,7 +2,7 @@
>>>
>>> -TYPEDEF unsigned int nlink_t;
>>> +TYPEDEF unsigned nlink_t;
>>>   TYPEDEF int blksize_t;
>>>
>>>>>
>>>>>>>>> -+      register uintptr_t tp __asm__("tp");
>>>>>>>>> -+      __asm__ ("" : "=r" (tp) );
>>>>>>>>> ++      uintptr_t tp;
>>>>>>>>> ++      __asm__ __volatile__("move %0, $tp" : "=r"(tp));
>>>>>>>>>    +      return tp;
>>>>>>>>
>>>>>>>> Not clear what the motivation is here. Is it working around a 
>>>>>>>> compiler
>>>>>>>> bug? The original form was more optimal.
>>>>>>>
>>>>>>> The modification is based on the following review suggestion(
>>>>>>> WANG Xuerui <i@...0n.name> reviewed in
>>>>>>> https://www.openwall.com/lists/musl/2022/10/12/1):
>>>>>>>
>>>>>>> While the current approach works, it's a bit fragile [1], and
>>>>>>> the simple and plain riscv version works too:
>>>>>>>
>>>>>>> uintptr_t tp;
>>>>>>> __asm__ ("move %0, $tp" : "=r"(tp));
>>>>>>>
>>>>>>> [1]:https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables 
>>>>>>>
>>>>>
>>>>> this looks ok to me.
>>>>>
>>>>> (original code looks sketchy to me.)
>>>>>
>>>>>>>>>    +#define CRTJMP(pc,sp) __asm__ __volatile__( \
>>>>>>>>> -+      "move $sp,%1 ; jr %0" : : "r"(pc), "r"(sp) : "memory" )
>>>>>>>>> -+
>>>>>>>>> -+#define GETFUNCSYM(fp, sym, got) __asm__ ( \
>>>>>>>>> -+      ".hidden " #sym "\n" \
>>>>>>>>> -+      ".align 8 \n" \
>>>>>>>>> -+      "       la.local $t1, "#sym" \n" \
>>>>>>>>> -+      "       move %0, $t1 \n" \
>>>>>>>>> -+      : "=r"(*(fp)) : : "memory" )
>>>>>>>>> ++      "move $sp, %1 ; jr %0" : : "r"(pc), "r"(sp) : "memory" )
>>>>>>>>
>>>>>>>> Not clear why this was changed. It was never discussed afaik. It 
>>>>>>>> looks
>>>>>>>> somewhat dubious removing GETFUNCSYM and using the maybe-unreliable
>>>>>>>> default definition.
>>>>>>>>
>>>>>>> Based on the following review question(
>>>>>>> WANG Xuerui <i@...0n.name> reviewed
>>>>>>> in https://www.openwall.com/lists/musl/2022/10/12/1):
>>>>>>>
>>>>>>> Does the generic version residing in ldso/dlstart.c not work?
>>>>>>>
>>>>>>> I found the code logic is consistent with the generic version, So
>>>>>>> I removed the definition here and replaced it with generic version.
>>>>>
>>>>> i would change it back to the asm.
>>>>>
>>>>> generic version should work, but maybe we don't
>>>>> want to trust the compiler here (there may be
>>>>> ways to compile the generic code that is not
>>>>> compatible with incompletely relocated libc.so)
>>>>> the asm is safer.
>>>>>
>>> --- a/arch/loongarch64/reloc.h
>>> +++ b/arch/loongarch64/reloc.h
>>> @@ -18,3 +18,10 @@
>>>
>>>   #define CRTJMP(pc,sp) __asm__ __volatile__( \
>>>          "move $sp, %1 ; jr %0" : : "r"(pc), "r"(sp) : "memory" )
>>> +
>>> +#define GETFUNCSYM(fp, sym, got) __asm__ ( \
>>> +       ".hidden " #sym "\n" \
>>> +       ".align 8 \n" \
>>> +       "       la.local $t1, "#sym" \n" \
>>> +       "       move %0, $t1 \n" \
>>> +       : "=r"(*(fp)) : : "memory" )
>>>
>>>>>>>> It also looks like v5->v6 rewrote sigsetjmp.
>>>>>>>>
>>>>>>> Based on the following review question(
>>>>>>> WANG Xuerui <i@...0n.name> reviewed
>>>>>>> in https://www.openwall.com/lists/musl/2022/10/12/1):
>>>>>>>
>>>>>>> This is crazy complicated compared to the riscv port, why is the
>>>>>>> juggling between a0/a1 and t5/t6 necessary?
>>>>>>>
>>>>>>> I optimized the code implementations.
>>>>>
>>>>> this should be ok.
>>>>>
>>>>>>>> v6->v7:
>>>>>>>>
>>>>>>>> sigcontext/mcontext_t change mostly makes sense, but the
>>>>>>>> namespace-safe and full mcontext_t differ in their use of the 
>>>>>>>> aligned
>>>>>>>> attribute and it's not clear that the attribute is needed (since 
>>>>>>>> the
>>>>>>>> offset is naturally aligned and any instance of the object is 
>>>>>>>> produced
>>>>>>>> by the kernel, not by userspace).
>>>>>>>>
>>>>>>>>> -+      long               __uc_pad;
>>>>>>>>
>>>>>>>> This change to ucontext_t actually makes the struct 
>>>>>>>> ABI-incompatible
>>>>>>>> in the namespace-safe version without the alignment attribute. 
>>>>>>>> IIRC I
>>>>>>>> explicitly requested the explicit padding field to avoid this 
>>>>>>>> kind of
>>>>>>>> footgun. Removing it does not seem like a good idea.
>>>>>>>>
>>>>>>> Initially, we add __uc_pad to ensure uc_mcontext is 16 byte 
>>>>>>> alignment.
>>>>>>> Now, we added __attribute__((__aligned__(16))) to
>>>>>>> uc_mcontext.__extcontext[],this can ensure uc_mcontext is also 16 
>>>>>>> byte
>>>>>>> alignment. so __uc_pad is not used.
>>>>>>>
>>>>>>> Remove __uc_pad, from the point of struct layout, musl and kernel 
>>>>>>> are
>>>>>>> consistent. otherwise, I think it may bring a sense of inconsistency
>>>>>>> between kernel and musl. Due to Loongarch is not merged into musl 
>>>>>>> now,
>>>>>>> Remove it no compatibility issues.
>>>>>
>>>>> please add back the padding field.
>>>
>>> --- a/arch/loongarch64/bits/signal.h
>>> +++ b/arch/loongarch64/bits/signal.h
>>> @@ -40,6 +40,7 @@ typedef struct __ucontext
>>>          struct __ucontext  *uc_link;
>>>          stack_t            uc_stack;
>>>          sigset_t           uc_sigmask;
>>> +       long               __uc_pad;
>>>          mcontext_t         uc_mcontext;
>>>   } ucontext_t;
>>>
>>>>>
>>>>>>>> In stat.h:
>>>>>>>>
>>>>>>>>> -+      unsigned long __pad;
>>>>>>>>> ++      unsigned long __pad1;
>>>>>>>>
>>>>>>>> This is gratuitous and makes the definition gratuitously 
>>>>>>>> mismatch what
>>>>>>>> will become the "generic" version of bits/stat.h. There is no 
>>>>>>>> contract
>>>>>>>> for applications to be able to access these padding fields by 
>>>>>>>> name, so
>>>>>>>> no motivation to make their names match glibc's names or the 
>>>>>>>> kernel's.
>>>>>>>>
>>>>>>> OK.
>>>>>
>>>>> please fix it.
>>>
>>> --- a/arch/loongarch64/bits/stat.h
>>> +++ b/arch/loongarch64/bits/stat.h
>>> @@ -6,7 +6,7 @@ struct stat {
>>>          uid_t st_uid;
>>>          gid_t st_gid;
>>>          dev_t st_rdev;
>>> -       unsigned long __pad1;
>>> +       unsigned long __pad;
>>>          off_t st_size;
>>>          blksize_t st_blksize;
>>>          int __pad2;
>>>
>>>>>
>>>>>>>> In fenv.S:
>>>>>>>>
>>>>>>>>> ++#ifdef __clang__
>>>>>>>>> ++#define FCSR $fcsr0
>>>>>>>>> ++#else
>>>>>>>>> ++#define FCSR $r0
>>>>>>>>> ++#endif
>>>>>>>>
>>>>>>>> It's not clear to me what's going on here. Is there a clang
>>>>>>>> incompatibility in the assembler language you're working around? Or
>>>>>>>> what? If so that seems like a tooling bug that should be fixed.
>>>>>>>>
>>>>>>> The GNU assembler cannot correctly recognize $fcsr0, but the
>>>>>>> LLVM IAS does not have this issue, so make a distinction.
>>>>>>> This issue has been fixed in GNU assembler 2.41. but for compatible
>>>>>>> with GNU assembler 2.40 and below, $r0 need reserved.
>>>>>
>>>>> it sounds like the correct asm is $fcsr0, so that's
>>>>> what should be used on all assemblers, not just on
>>>>> clang as.
>>>>>
>>>>> only broken old gnu as should use different syntax.
>>>>> for this a configure test is needed that adds a
>>>>> CFLAG like -DBROKEN_LOONGARCH_FCSR_ASM when fails.
>>>>> and use that for the ifdef.
>>>>>
>>> --- a/configure
>>> +++ b/configure
>>>
>>> if test "$ARCH" = "loongarch64" ; then
>>> trycppif __loongarch_soft_float "$t" && SUBARCH=${SUBARCH}-sf
>>> +printf "checking whether compiler support FCSRs... "
>>> +echo "__asm__(\"movfcsr2gr \$t0,\$fcsr0\");" > "$tmpc"
>>> +if $CC -c -o /dev/null "$tmpc" >/dev/null 2>&1 ; then
>>> +printf "yes\n"
>>> +else
>>> +printf "no\n"
>>> +CFLAGS_AUTO="$CFLAGS_AUTO -DBROKEN_LOONGARCH_FCSR_ASM"
>>> +fi
>>> fi
>>>
>>> --- a/src/fenv/loongarch64/fenv.S
>>> +++ b/src/fenv/loongarch64/fenv.S
>>> @@ -1,9 +1,9 @@
>>>   #ifndef __loongarch_soft_float
>>>
>>> -#ifdef __clang__
>>> -#define FCSR $fcsr0
>>> -#else
>>> +#ifdef BROKEN_LOONGARCH_FCSR_ASM
>>>   #define FCSR $r0
>>> +#else
>>> +#define FCSR $fcsr0
>>>   #endif
>>>
>>>   .global        feclearexcept
>>>
>>>>>>>
>>>>>>> The linux kernel also has a similar distinction:
>>>>>>>
>>>>>>> commit 38bb46f94544c5385bc35aa2bfc776dcf53a7b5d
>>>>>>> Author: WANG Xuerui <git@...0n.name>
>>>>>>> Date:   Thu Jun 29 20:58:43 2023 +0800
>>>>>>>
>>>>>>>      LoongArch: Prepare for assemblers with proper FCSR class 
>>>>>>> support
>>>>>>>
>>>>>>>      The GNU assembler (as of 2.40) mis-treats FCSR operands as 
>>>>>>> GPRs, but
>>>>>>>      the LLVM IAS does not. Probe for this and refer to FCSRs 
>>>>>>> as"$fcsrNN"
>>>>>>>      if support is present.
>>>>>>>
>>>>>>>      Signed-off-by: WANG Xuerui <git@...0n.name>
>>>>>>>      Signed-off-by: Huacai Chen <chenhuacai@...ngson.cn>
>>>>>>>
>>>>>>>> Otherwise, everything looks as expected, I think. I'm okay with 
>>>>>>>> making
>>>>>>>> any fixups for merging rather than throwing this back on your 
>>>>>>>> team for
>>>>>>>> more revisions, but can you please follow up and clarify the above?
>>>>>
>>>>> all issues look minor.
>>>>>
>>>>> if you post a v8 it likely gets into a release faster.
>>>>>
>>
> 

View attachment "0001-add-loongarch64-port-v9.patch" of type "text/x-patch" (45266 bytes)
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.