Message-ID: <BD7773622145634B952E5B54ACA8E349AA24C18C@PUMAIL01.pu.imgtec.org>
Date: Wed, 30 Mar 2016 09:45:59 +0000
From: Jaydeep Patil <Jaydeep.Patil@...tec.com>
To: Rich Felker <dalias@...c.org>
CC: "musl@...ts.openwall.com" <musl@...ts.openwall.com>
Subject: RE: [PATCH] Fix atomic_arch.h for MIPS32 R6

>-----Original Message-----
>From: Rich Felker [mailto:dalias@...ifal.cx] On Behalf Of Rich Felker
>Sent: 29 March 2016 PM 07:03
>To: Jaydeep Patil
>Cc: musl@...ts.openwall.com
>Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>
>On Tue, Mar 29, 2016 at 07:16:46AM +0000, Jaydeep Patil wrote:
>> >-----Original Message-----
>> >From: Rich Felker [mailto:dalias@...ifal.cx] On Behalf Of Rich Felker
>> >Sent: 29 March 2016 AM 09:41
>> >To: Jaydeep Patil
>> >Cc: musl@...ts.openwall.com
>> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>> >
>> >On Tue, Mar 29, 2016 at 03:54:02AM +0000, Jaydeep Patil wrote:
>> >> >-----Original Message-----
>> >> >From: Rich Felker [mailto:dalias@...ifal.cx] On Behalf Of Rich
>> >> >Felker
>> >> >Sent: 28 March 2016 PM 06:35
>> >> >To: Jaydeep Patil
>> >> >Cc: musl@...ts.openwall.com
>> >> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>> >> >
>> >> >On Mon, Mar 28, 2016 at 05:07:39AM +0000, Jaydeep Patil wrote:
>> >> >> >> >I was just saying it makes the code less cluttered to use
>> >> >> >> >them spuriously even though we don't need to:
>> >> >> >> >
>> >> >> >> >		".set push ; "
>> >> >> >> >#if __mips_isa_rev < 6
>> >> >> >> >		".set mips2 ; "
>> >> >> >> >#endif
>> >> >> >> >		"ll %0, %1 ; .set pop"
>> >> >> >> >
>> >> >> >> >or similar.
>> >> >> >> >
>> >> >> >> >It's also not clear to me whether the "m" constraint is
>> >> >> >> >valid anymore for the R6 ll/sc instructions since they take
>> >> >> >> >a 9-bit offset now instead of a 16-bit offset.
>> >> >> >> >The compiler could generate an address expression whose
>> >> >> >> >offset part does not fit in 9 bits. In that case we may need
>> >> >> >> >to #if the whole function (or at least the __asm__
>> >> >> >> >statement) separately rather than just skipping the
>> >> >> >> >.set mips2....
>> >> >> >> >
>> >> >> >>
>> >> >> >> The "m" constraint is still valid here, as the offset will be 0 in
>> >> >> >> this case.
>> >> >> >
>> >> >> >How can you assume the offset will be 0? It's the compiler's
>> >> >> >choice what to use. For instance, a_cas(&foo->bar, t, s) is
>> >> >> >likely to have an offset equal to
>> >> >> >offsetof(__typeof__(foo),bar). AFAIK this happens in practice
>> >> >> >with small offsets in mutex structures, etc. so the bug may be
>> >> >> >unlikely to be hit, but I think it's still an
>> >> >> >incorrect-constraint bug.
>> >> >>
>> >> >> Compiler generates appropriate LL/SC based on the offset.
>> >> >> Compiler adds the offset to the base register if it does not fit 9bits.
>> >> >
>> >> >The compiler has no way of knowing that the operand will be used
>> >> >with ll with the 9-bit offset restriction; as far as it knows, it
>> >> >will be used in a normal context where a 16-bit offset is valid. I
>> >> >don't have a toolchain that will target r6, but you can try the
>> >> >following program which produces an offset of 4096 for loading p[1024]:
>> >> >
>> >> >unsigned ll1k(volatile unsigned *p) {
>> >> >	unsigned val;
>> >> >	__asm__ __volatile__ ("ll %0, %1" : "=r"(val) : "m"(p[1024]) :
>> >> >		"memory" );
>> >> >	return val;
>> >> >}
>> >> >
>> >> >I would expect this to produce errors at assembly time on r6.
>> >> >Rich
>> >>
>> >> This is what compiler has generated for above function:
>> >>
>> >> $ gcc -c -o main.o main.c -O3 -mips32r6 -mabi=32
>> >>
>> >> Objdump:
>> >>
>> >> 00000000 <ll1k>:
>> >>    0:   24821000        addiu   v0,a0,4096
>> >>    4:   7c420036        ll      v0,0(v0)
>> >>    8:   d81f0000        jrc     ra
>> >>    c:   00000000        nop
>> >
>> >Can you try gcc -S instead of -c (still at -O3) to produce asm output
>> >without assembling it?
>>
>> Generated assembly:
>>
>> #APP
>>  # 4 "test.c" 1
>>         ll $2, 4096($4)
>>  # 0 "" 2
>> #NO_APP
>>         jrc     $31
>>
>> Even if we set "noreorder" before LL, assembler generates addiu+ll:
>>
>> 00000000 <ll1k>:
>>    0:   24821000        addiu   v0,a0,4096
>>    4:   7c420036        ll      v0,0(v0)
>>    8:   d81f0000        jrc     ra
>>    c:   00000000        nop
>
>I see. I suspected the assembler was doing it. "noat", not "noreorder", is the
>way to suppress things like this but I doubt even "noat" does it since a
>separate temp register ("at") is not needed in this case.
>
>If all assemblers that support R6 support this rewriting, then the ZC constraint
>in gcc is really just an optimization, not strictly necessary. We should probably
>check (1) whether clang's internal assembler can do the rewriting, and (2)
>whether clang supports the ZC constraint. I would prefer using ZC but I want
>to do whatever is more compatible; I don't think the codegen efficiency
>matters a lot either way.
>Rich

Clang's integrated assembler does not perform this rewriting, but Clang does support the ZC constraint.
I have updated both atomic_arch.h and pthread_arch.h accordingly.
The patch is at https://github.com/JaydeepIMG/musl-1/tree/fix_inline_asm_for_R6 and is also included below.
I have also added R6 as a subarch.
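
To make the offset argument in the quoted thread concrete, here is a minimal sketch (not part of the patch) of the range check involved. The helper names fits_signed and ll_needs_addr_fixup are hypothetical, and the 9-bit and 16-bit widths are the ll/sc offset sizes mentioned above, assumed here to be signed immediates.

```c
#include <stdbool.h>

/* Hypothetical helper: true if `off` fits in a signed immediate field
 * that is `bits` wide. Assumes 1 <= bits < the width of long. */
static bool fits_signed(long off, unsigned bits)
{
	long lim = 1L << (bits - 1);
	return off >= -lim && off < lim;
}

/* Hypothetical helper: true if an R6 ll/sc with this byte offset cannot
 * be encoded directly, so the address has to be computed into a
 * register first (as the addiu+ll sequence in the objdump shows). */
static bool ll_needs_addr_fixup(long off)
{
	return !fits_signed(off, 9);
}
```

With the offset of 4096 from the ll1k test, ll_needs_addr_fixup(4096) is true even though the same offset fits a 16-bit field, which is consistent with the addiu+ll pair the GNU assembler emitted. A small offset such as 8 fits directly.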



From 20054ee55643d9e81163ca58ac63cc38b5080969 Mon Sep 17 00:00:00 2001
From: Jaydeep Patil <jaydeep.patil@...tec.com>
Date: Wed, 30 Mar 2016 10:37:30 +0100
Subject: [PATCH] [MIPS] Update inline asm for R6 and add R6 as subtarget

---
 arch/mips/atomic_arch.h    | 17 +++--------------
 arch/mips/pthread_arch.h   |  8 +-------
 arch/mips64/atomic_arch.h  | 12 +++++-------
 arch/mips64/pthread_arch.h |  7 +------
 configure                  |  2 ++
 5 files changed, 12 insertions(+), 34 deletions(-)

diff --git a/arch/mips/atomic_arch.h b/arch/mips/atomic_arch.h
index ce2823b..4dbe4bb 100644
--- a/arch/mips/atomic_arch.h
+++ b/arch/mips/atomic_arch.h
@@ -3,10 +3,8 @@ static inline int a_ll(volatile int *p)
 {
 	int v;
 	__asm__ __volatile__ (
-		".set push ; .set mips2\n\t"
 		"ll %0, %1"
-		"\n\t.set pop"
-		: "=r"(v) : "m"(*p));
+		: "=r"(v) : "ZC"(*p));
 	return v;
 }
 
@@ -15,24 +13,15 @@ static inline int a_sc(volatile int *p, int v)
 {
 	int r;
 	__asm__ __volatile__ (
-		".set push ; .set mips2\n\t"
 		"sc %0, %1"
-		"\n\t.set pop"
-		: "=r"(r), "=m"(*p) : "0"(v) : "memory");
+		: "=r"(r), "=ZC"(*p) : "0"(v) : "memory");
 	return r;
 }
 
 #define a_barrier a_barrier
 static inline void a_barrier()
 {
-	/* mips2 sync, but using too many directives causes
-	 * gcc not to inline it, so encode with .long instead. */
-	__asm__ __volatile__ (".long 0xf" : : : "memory");
-#if 0
-	__asm__ __volatile__ (
-		".set push ; .set mips2 ; sync ; .set pop"
-		: : : "memory");
-#endif
+	__asm__ __volatile__ ("sync" : : : "memory");
 }
 
 #define a_pre_llsc a_barrier
diff --git a/arch/mips/pthread_arch.h b/arch/mips/pthread_arch.h
index 8a49965..d8b6955 100644
--- a/arch/mips/pthread_arch.h
+++ b/arch/mips/pthread_arch.h
@@ -1,13 +1,7 @@
 static inline struct pthread *__pthread_self()
 {
-#ifdef __clang__
-	char *tp;
-	__asm__ __volatile__ (".word 0x7c03e83b ; move %0, $3" : "=r" (tp) : : "$3" );
-#else
 	register char *tp __asm__("$3");
-	/* rdhwr $3,$29 */
-	__asm__ __volatile__ (".word 0x7c03e83b" : "=r" (tp) );
-#endif
+	__asm__ __volatile__ ("rdhwr %0,$29" : "=r" (tp));
 	return (pthread_t)(tp - 0x7000 - sizeof(struct pthread));
 }
 
diff --git a/arch/mips64/atomic_arch.h b/arch/mips64/atomic_arch.h
index b468fd9..ac92891 100644
--- a/arch/mips64/atomic_arch.h
+++ b/arch/mips64/atomic_arch.h
@@ -4,7 +4,7 @@ static inline int a_ll(volatile int *p)
 	int v;
 	__asm__ __volatile__ (
 		"ll %0, %1"
-		: "=r"(v) : "m"(*p));
+		: "=r"(v) : "ZC"(*p));
 	return v;
 }
 
@@ -14,7 +14,7 @@ static inline int a_sc(volatile int *p, int v)
 	int r;
 	__asm__ __volatile__ (
 		"sc %0, %1"
-		: "=r"(r), "=m"(*p) : "0"(v) : "memory");
+		: "=r"(r), "=ZC"(*p) : "0"(v) : "memory");
 	return r;
 }
 
@@ -24,7 +24,7 @@ static inline void *a_ll_p(volatile void *p)
 	void *v;
 	__asm__ __volatile__ (
 		"lld %0, %1"
-		: "=r"(v) : "m"(*(void *volatile *)p));
+		: "=r"(v) : "ZC"(*(void *volatile *)p));
 	return v;
 }
 
@@ -34,16 +34,14 @@ static inline int a_sc_p(volatile void *p, void *v)
 	long r;
 	__asm__ __volatile__ (
 		"scd %0, %1"
-		: "=r"(r), "=m"(*(void *volatile *)p) : "0"(v) : "memory");
+		: "=r"(r), "=ZC"(*(void *volatile *)p) : "0"(v) : "memory");
 	return r;
 }
 
 #define a_barrier a_barrier
 static inline void a_barrier()
 {
-	/* mips2 sync, but using too many directives causes
-	 * gcc not to inline it, so encode with .long instead. */
-	__asm__ __volatile__ (".long 0xf" : : : "memory");
+	__asm__ __volatile__ ("sync" : : : "memory");
 }
 
 #define a_pre_llsc a_barrier
diff --git a/arch/mips64/pthread_arch.h b/arch/mips64/pthread_arch.h
index b42edbe..d8b6955 100644
--- a/arch/mips64/pthread_arch.h
+++ b/arch/mips64/pthread_arch.h
@@ -1,12 +1,7 @@
 static inline struct pthread *__pthread_self()
 {
-#ifdef __clang__
-	char *tp;
-	__asm__ __volatile__ (".word 0x7c03e83b ; move %0, $3" : "=r" (tp) : : "$3" );
-#else
 	register char *tp __asm__("$3");
-	__asm__ __volatile__ (".word 0x7c03e83b" : "=r" (tp) );
-#endif
+	__asm__ __volatile__ ("rdhwr %0,$29" : "=r" (tp));
 	return (pthread_t)(tp - 0x7000 - sizeof(struct pthread));
 }
 
diff --git a/configure b/configure
index 213a825..969671d 100755
--- a/configure
+++ b/configure
@@ -612,11 +612,13 @@ trycppif __AARCH64EB__ "$t" && SUBARCH=${SUBARCH}_be
 fi
 
 if test "$ARCH" = "mips" ; then
+trycppif "__mips_isa_rev >= 6" "$t" && SUBARCH=${SUBARCH}r6
 trycppif "_MIPSEL || __MIPSEL || __MIPSEL__" "$t" && SUBARCH=${SUBARCH}el
 trycppif __mips_soft_float "$t" && SUBARCH=${SUBARCH}-sf
 fi
 
 if test "$ARCH" = "mips64" ; then
+trycppif "__mips_isa_rev >= 6" "$t" && SUBARCH=${SUBARCH}r6
 trycppif "_MIPSEL || __MIPSEL || __MIPSEL__" "$t" && SUBARCH=${SUBARCH}el
 trycppif __mips_soft_float "$t" && SUBARCH=${SUBARCH}-sf
 fi
-- 
2.1.4


Thanks,
Jaydeep
