|
Message-ID: <BD7773622145634B952E5B54ACA8E349AA24C18C@PUMAIL01.pu.imgtec.org> Date: Wed, 30 Mar 2016 09:45:59 +0000 From: Jaydeep Patil <Jaydeep.Patil@...tec.com> To: Rich Felker <dalias@...c.org> CC: "musl@...ts.openwall.com" <musl@...ts.openwall.com> Subject: RE: [PATCH] Fix atomic_arch.h for MIPS32 R6 >-----Original Message----- >From: Rich Felker [mailto:dalias@...ifal.cx] On Behalf Of Rich Felker >Sent: 29 March 2016 PM 07:03 >To: Jaydeep Patil >Cc: musl@...ts.openwall.com >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6 > >On Tue, Mar 29, 2016 at 07:16:46AM +0000, Jaydeep Patil wrote: >> >-----Original Message----- >> >From: Rich Felker [mailto:dalias@...ifal.cx] On Behalf Of Rich Felker >> >Sent: 29 March 2016 AM 09:41 >> >To: Jaydeep Patil >> >Cc: musl@...ts.openwall.com >> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6 >> > >> >On Tue, Mar 29, 2016 at 03:54:02AM +0000, Jaydeep Patil wrote: >> >> >-----Original Message----- >> >> >From: Rich Felker [mailto:dalias@...ifal.cx] On Behalf Of Rich >> >> >Felker >> >> >Sent: 28 March 2016 PM 06:35 >> >> >To: Jaydeep Patil >> >> >Cc: musl@...ts.openwall.com >> >> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6 >> >> > >> >> >On Mon, Mar 28, 2016 at 05:07:39AM +0000, Jaydeep Patil wrote: >> >> >> >> >I was just saying it makes the code less cluttered to use >> >> >> >> >them spuriously even though we don't need to: >> >> >> >> > >> >> >> >> > ".set push ; " >> >> >> >> >#if __mips_isa_rev < 6 >> >> >> >> > ".set mips2 ; " >> >> >> >> >#endif >> >> >> >> > "ll %0, %1 ; .set pop" >> >> >> >> > >> >> >> >> >or similar. >> >> >> >> > >> >> >> >> >It's also not clear to me whether the "m" constraint is >> >> >> >> >valid anymore for the R6 ll/sc instructions since they take >> >> >> >> >a 9-bit offset now instead of a >> >> >> >16-bit offset. >> >> >> >> >The compiler could generate an address expression whose >> >> >> >> >offset part does not fit in 9 bits. In that case we may need >> >> >> >> >to #if the whole function (or at least the __asm__ >> >> >> >> >statement) separately rather than just >> >> >> >skipping the .set mips2.... >> >> >> >> > >> >> >> >> >> >> >> >> The "m" constrain is still valid here, as the offset will be 0 in this >case.. >> >> >> > >> >> >> >How can you assume the offset will be 0? It's the compiler's >> >> >> >choice what to use. For instance, a_cas(&foo->bar, t, s) is >> >> >> >likely to have an offset equal to >> >> >> >offsetof(__typeof__(foo),bar). AFAIK this happens in practice >> >> >> >with small offsets in mutex structures, etc. so the bug may be >> >> >> >unlikely to be hit, but I think it's still an incorrect- >> >constraint bug. >> >> >> >> >> >> Compiler generates appropriate LL/SC based on the offset. >> >> >> Compiler adds the offset to the base register if it does not fit 9bits. >> >> > >> >> >The compiler has no way of knowing that the operand will be used >> >> >with ll with the 9-bit offset restriction; as far as it knows, it >> >> >will be used in a normal context where a 16-bit offset is valid. I >> >> >don't have a toolchain that will target r6, but you can try the >> >> >following program which produces an offset of 4096 for loading p[1024]: >> >> > >> >> >unsigned ll1k(volatile unsigned *p) { >> >> > unsigned val; >> >> > __asm__ __volatile__ ("ll %0, %1" : "=r"(val) : "m"(p[1024]) : >> >> >"memory" ); >> >> > return val; >> >> >} >> >> > >> >> >I would expect this to produce errors at assembly time on r6. >> >> >Rich >> >> >> >> This is what compiler has generated for above function: >> >> >> >> $ gcc -c -o main.o main.c -O3 -mips32r6 -mabi=32 >> >> >> >> Objdump: >> >> >> >> 00000000 <ll1k>: >> >> 0: 24821000 addiu v0,a0,4096 >> >> 4: 7c420036 ll v0,0(v0) >> >> 8: d81f0000 jrc ra >> >> c: 00000000 nop >> > >> >Can you try gcc -S instead of -c (still at -O3) to produce asm output >> >without assembling it? >> >> Generated asssembly: >> >> #APP >> # 4 "test.c" 1 >> ll $2, 4096($4) >> # 0 "" 2 >> #NO_APP >> jrc $31 >> >> Even if we set "noreorder" before LL, assembler generates addiu+ll: >> >> 00000000 <ll1k>: >> 0: 24821000 addiu v0,a0,4096 >> 4: 7c420036 ll v0,0(v0) >> 8: d81f0000 jrc ra >> c: 00000000 nop > >I see. I suspected the assembler was doing it. "noat", not "noreorder", is the >way to suppress things like this but I doubt even "noat" does it since a >separate temp register ("at") is not needed in this case. > >If all assembers that support R6 support this rewriting, then the ZC constraint >in gcc is really just an optimization, not strictly necessary. We should probably >check (1) whether clang's internal assembler can do the rewriting, and (2) >whether clang supports the ZC constraint. I would prefer using ZC but I want >to do whatever is more compatible; I don't think the codegen efficiency >matters a lot either way. >Rich Clang's integrated assembler does not support this rewriting. However ZC is supported. I have modified both atomic_arch.h and pthread_arch.h to reflect this. Please refer to https://github.com/JaydeepIMG/musl-1/tree/fix_inline_asm_for_R6 for the patch (also listed below). I have also added R6 as subarch. >From 20054ee55643d9e81163ca58ac63cc38b5080969 Mon Sep 17 00:00:00 2001 From: Jaydeep Patil <jaydeep.patil@...tec.com> Date: Wed, 30 Mar 2016 10:37:30 +0100 Subject: [PATCH] [MIPS] Update inline asm for R6 and add R6 as subtarget --- arch/mips/atomic_arch.h | 17 +++-------------- arch/mips/pthread_arch.h | 8 +------- arch/mips64/atomic_arch.h | 12 +++++------- arch/mips64/pthread_arch.h | 7 +------ configure | 2 ++ 5 files changed, 12 insertions(+), 34 deletions(-) diff --git a/arch/mips/atomic_arch.h b/arch/mips/atomic_arch.h index ce2823b..4dbe4bb 100644 --- a/arch/mips/atomic_arch.h +++ b/arch/mips/atomic_arch.h @@ -3,10 +3,8 @@ static inline int a_ll(volatile int *p) { int v; __asm__ __volatile__ ( - ".set push ; .set mips2\n\t" "ll %0, %1" - "\n\t.set pop" - : "=r"(v) : "m"(*p)); + : "=r"(v) : "ZC"(*p)); return v; } @@ -15,24 +13,15 @@ static inline int a_sc(volatile int *p, int v) { int r; __asm__ __volatile__ ( - ".set push ; .set mips2\n\t" "sc %0, %1" - "\n\t.set pop" - : "=r"(r), "=m"(*p) : "0"(v) : "memory"); + : "=r"(r), "=ZC"(*p) : "0"(v) : "memory"); return r; } #define a_barrier a_barrier static inline void a_barrier() { - /* mips2 sync, but using too many directives causes - * gcc not to inline it, so encode with .long instead. */ - __asm__ __volatile__ (".long 0xf" : : : "memory"); -#if 0 - __asm__ __volatile__ ( - ".set push ; .set mips2 ; sync ; .set pop" - : : : "memory"); -#endif + __asm__ __volatile__ ("sync" : : : "memory"); } #define a_pre_llsc a_barrier diff --git a/arch/mips/pthread_arch.h b/arch/mips/pthread_arch.h index 8a49965..d8b6955 100644 --- a/arch/mips/pthread_arch.h +++ b/arch/mips/pthread_arch.h @@ -1,13 +1,7 @@ static inline struct pthread *__pthread_self() { -#ifdef __clang__ - char *tp; - __asm__ __volatile__ (".word 0x7c03e83b ; move %0, $3" : "=r" (tp) : : "$3" ); -#else register char *tp __asm__("$3"); - /* rdhwr $3,$29 */ - __asm__ __volatile__ (".word 0x7c03e83b" : "=r" (tp) ); -#endif + __asm__ __volatile__ ("rdhwr %0,$29" : "=r" (tp)); return (pthread_t)(tp - 0x7000 - sizeof(struct pthread)); } diff --git a/arch/mips64/atomic_arch.h b/arch/mips64/atomic_arch.h index b468fd9..ac92891 100644 --- a/arch/mips64/atomic_arch.h +++ b/arch/mips64/atomic_arch.h @@ -4,7 +4,7 @@ static inline int a_ll(volatile int *p) int v; __asm__ __volatile__ ( "ll %0, %1" - : "=r"(v) : "m"(*p)); + : "=r"(v) : "ZC"(*p)); return v; } @@ -14,7 +14,7 @@ static inline int a_sc(volatile int *p, int v) int r; __asm__ __volatile__ ( "sc %0, %1" - : "=r"(r), "=m"(*p) : "0"(v) : "memory"); + : "=r"(r), "=ZC"(*p) : "0"(v) : "memory"); return r; } @@ -24,7 +24,7 @@ static inline void *a_ll_p(volatile void *p) void *v; __asm__ __volatile__ ( "lld %0, %1" - : "=r"(v) : "m"(*(void *volatile *)p)); + : "=r"(v) : "ZC"(*(void *volatile *)p)); return v; } @@ -34,16 +34,14 @@ static inline int a_sc_p(volatile void *p, void *v) long r; __asm__ __volatile__ ( "scd %0, %1" - : "=r"(r), "=m"(*(void *volatile *)p) : "0"(v) : "memory"); + : "=r"(r), "=ZC"(*(void *volatile *)p) : "0"(v) : "memory"); return r; } #define a_barrier a_barrier static inline void a_barrier() { - /* mips2 sync, but using too many directives causes - * gcc not to inline it, so encode with .long instead. */ - __asm__ __volatile__ (".long 0xf" : : : "memory"); + __asm__ __volatile__ ("sync" : : : "memory"); } #define a_pre_llsc a_barrier diff --git a/arch/mips64/pthread_arch.h b/arch/mips64/pthread_arch.h index b42edbe..d8b6955 100644 --- a/arch/mips64/pthread_arch.h +++ b/arch/mips64/pthread_arch.h @@ -1,12 +1,7 @@ static inline struct pthread *__pthread_self() { -#ifdef __clang__ - char *tp; - __asm__ __volatile__ (".word 0x7c03e83b ; move %0, $3" : "=r" (tp) : : "$3" ); -#else register char *tp __asm__("$3"); - __asm__ __volatile__ (".word 0x7c03e83b" : "=r" (tp) ); -#endif + __asm__ __volatile__ ("rdhwr %0,$29" : "=r" (tp)); return (pthread_t)(tp - 0x7000 - sizeof(struct pthread)); } diff --git a/configure b/configure index 213a825..969671d 100755 --- a/configure +++ b/configure @@ -612,11 +612,13 @@ trycppif __AARCH64EB__ "$t" && SUBARCH=${SUBARCH}_be fi if test "$ARCH" = "mips" ; then +trycppif "__mips_isa_rev >= 6" "$t" && SUBARCH=${SUBARCH}r6 trycppif "_MIPSEL || __MIPSEL || __MIPSEL__" "$t" && SUBARCH=${SUBARCH}el trycppif __mips_soft_float "$t" && SUBARCH=${SUBARCH}-sf fi if test "$ARCH" = "mips64" ; then +trycppif "__mips_isa_rev >= 6" "$t" && SUBARCH=${SUBARCH}r6 trycppif "_MIPSEL || __MIPSEL || __MIPSEL__" "$t" && SUBARCH=${SUBARCH}el trycppif __mips_soft_float "$t" && SUBARCH=${SUBARCH}-sf fi -- 2.1.4 Thanks, Jaydeep
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.