Message-ID: <a87be4a4-0838-af88-cff1-37b7c5fecdb5@codeaurora.org>
Date: Wed, 22 Jun 2016 12:08:13 -0700
From: "Zhao, Weiming" <weimingz@...eaurora.org>
To: musl@...ts.openwall.com
Subject: Re: build musl for armv7m

Thanks for reviewing.

I added tests for ARMv7m for memcpy.

For atomics.s, I think the two sequences below are equivalent:

      ldr ip,1f  ==> the assembler computes the offset from the current instruction to the label

-    ldr ip,[pc,ip] ==> here the address to be loaded is current PC + ip
+    add ip,pc,ip  ==> here, the PC is the same as above
+    ldr ip,[ip]
But I'm not familiar with the CP15 issue you mentioned.
So, anyway, I've left the atomics.s change out of this patch.
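
One way to check the pc-relative arithmetic is to model it outside the assembler. The sketch below is only an illustration under stated assumptions: ARM state (pc reads as the address of the current instruction plus 8, every instruction is 4 bytes), and made-up values for the code address and for __a_barrier_ptr. The function name and the addresses are hypothetical. Under these assumptions the label "1:" sits one instruction further away in the patched sequence, so the computed address comes out 4 bytes lower than in the original, which may be what the review comment about the offset refers to. Thumb state, where pc reads as +4 and instruction sizes vary, is not modeled here.

```python
# Rough model of the address computed by the pc-relative sequence in
# __a_barrier. Assumes ARM state only: pc reads as (instruction address + 8)
# and every instruction is 4 bytes. All addresses below are hypothetical.

ARM_PC_READ_OFFSET = 8
ARM_INSN_SIZE = 4


def effective_load_address(base, target, pc_insn_index, insns_before_label):
    """Return the address that ends up being loaded from.

    base               -- address of the first instruction (ldr ip,1f)
    target             -- address of __a_barrier_ptr
    pc_insn_index      -- index of the instruction that reads pc
    insns_before_label -- number of instructions before the "1:" label
    """
    label = base + insns_before_label * ARM_INSN_SIZE
    stored_offset = target - label          # 1: .word __a_barrier_ptr-1b
    pc_value = base + pc_insn_index * ARM_INSN_SIZE + ARM_PC_READ_OFFSET
    return pc_value + stored_offset


base = 0x1000      # hypothetical code address
target = 0x2000    # hypothetical address of __a_barrier_ptr

# Original: ldr ip,1f / ldr ip,[pc,ip] / add pc,pc,ip / 1: .word ...
original = effective_load_address(base, target, pc_insn_index=1,
                                  insns_before_label=3)

# Patched: ldr ip,1f / add ip,pc,ip / ldr ip,[ip] / add pc,pc,ip / 1: .word ...
patched = effective_load_address(base, target, pc_insn_index=1,
                                 insns_before_label=4)

print(hex(original), hex(patched), patched - original)
```

The same shift would apply to the later "add pc,pc,ip", since it also moves one instruction down relative to the data it uses.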

Thanks,
Weiming


On 6/20/2016 12:58 PM, Rich Felker wrote:
> On Thu, Jun 16, 2016 at 11:34:28AM -0700, Zhao, Weiming wrote:
>> I tried to build for armv6m (cortex-m0) and I got other build issues
>> with .S and inline asms.
>>
>> Below are the changes for building armv7m:
> I thought I'd already replied to this but I don't see my reply so I'm
> doing it [again?] now.
>
>> diff --git a/src/setjmp/arm/longjmp.s b/src/setjmp/arm/longjmp.s
>> index e28d8f3..e9b9b32 100644
>> --- a/src/setjmp/arm/longjmp.s
>> +++ b/src/setjmp/arm/longjmp.s
>> @@ -8,7 +8,9 @@ longjmp:
>>       mov ip,r0
>>       movs r0,r1
>>       moveq r0,#1
>> -    ldmia ip!, {v1,v2,v3,v4,v5,v6,sl,fp,sp,lr}
>> +    ldmia ip!, {v1,v2,v3,v4,v5,v6,sl,fp}
>> +    ldr sp, [ip]!
>> +    ldr lr, [ip]!
> I think changes like this are ok. They could be conditional on
> __thumb__ if they hurt performance measurably on arm but I doubt it
> matters.
>
>> diff --git a/src/string/arm/memcpy_le.S b/src/string/arm/memcpy_le.S
>> index 4db4844..2517d15 100644
>> --- a/src/string/arm/memcpy_le.S
>> +++ b/src/string/arm/memcpy_le.S
>> @@ -241,7 +241,8 @@ non_congruent:
>>       beq     2f
>>       ldr     r5, [r1], #4
>>       sub     r2, r2, #4
>> -    orr     r4, r3, r5,             lsl lr
>> +    lsl     r4, r5, lr
>> +    orr     r4, r3, r4
>>       mov     r3, r5,                 lsr r12
>>       str     r4, [r0], #4
>>       cmp     r2, #4
> If this is in a hot path it may need to be conditional.
>
>> diff --git a/src/thread/arm/atomics.s b/src/thread/arm/atomics.s
>> index 673fc03..a4bd03a 100644
>> --- a/src/thread/arm/atomics.s
>> +++ b/src/thread/arm/atomics.s
>> @@ -6,7 +6,8 @@
>>   .type __a_barrier,%function
>>   __a_barrier:
>>       ldr ip,1f
>> -    ldr ip,[pc,ip]
>> +    add ip,pc,ip
>> +    ldr ip,[ip]
>>       add pc,pc,ip
>>   1:    .word __a_barrier_ptr-1b
>>   .global __a_barrier_dummy
> As far as I can tell, this does not work at all. The arithmetic on pc
> is assuming the particular offset between the instruction using pc and
> the following code as arm opcodes.
>
> There's also the matter of the cp15 register load in this file that
> doesn't exist on cortex-m. IMO the kernel (or bare-metal trap handler)
> needs to trap and emulate it.
>
> Rich

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation


View attachment "patch.diff" of type "text/plain" (1547 bytes)
