Message-ID: <20130302062102.GP20323@brightrain.aerifal.cx>
Date: Sat, 2 Mar 2013 01:21:02 -0500
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: ARM optimisations

On Fri, Mar 01, 2013 at 10:33:19PM -0600, Rob Landley wrote:
> On 02/28/2013 05:30:51 PM, Rich Felker wrote:
> >On Fri, Mar 01, 2013 at 12:15:21PM +1300, Andre Renaud wrote:
> >> Hi,
> >> Can anyone tell me what the policy for musl is regarding ARM optimised
> >> assembly implementations of functions such as memcpy/memmove? I notice
> >> that there are i386/x86_64 versions for some of these. Doing some
> >> simple testing on an ARM platform I found that an ARM asm
> >> implementation of memcpy is ~80% faster than the C one currently in
> >> MUSL (this is on an ARMv5, so no NEON instructions or similar).
> >>
> >> I don't think I'm capable of writing the optimised version entirely
> >> myself, however there are various implementations floating around in
> >> libraries such as bionic etc... Is it possible to have BSD licensed
> >> code brought in to musl (which is MIT licensed)?
> >
> >ARM optimizations are welcome as long as they're thoroughly tested,
> >not heavily bloated, and support all v4 (including no-thumb) and later
> >cpu models, either by using universally-available features or
> >conditioning use of features on the .hidden __hwcap provided in musl.
> 
> Out of curiosity, why armv4 no thumb?
> 
> I'd actually say that armv5 is probably the one to optimize for,
> because it's somewhere over 80% of the installed base of arm systems
> and generally provides an additional 25% speedup from armv4 to armv5.
> Anything lower than that can use C, anything newer than that can
> benefit from an armv5 version vs C.
> 
> The reason armv4t _without_ thumb isn't interesting is you need at
> least armv4t to use EABI, and I had to patch my compiler to make

This is a compiler bug. If the compiler can be made to generate proper
return code, EABI works with armv4 (non-thumb) too.

> Newer compilers have dropped support for OABI entirely, and armv4t

OABI is not supported by musl at all. The intent is simply not to
_preclude_ use of non-thumb, even though there are other obstacles to
its use now.

> systems aren't that common. (They existed, the tin can tools nail
> board used one, but the generic C code works for them. Point is I'm
> not sure they're worth _optimizing_ for if it costs the vast
> majority of systems a 25% performance hit and we don't want to
> maintain multiple versions. If you _have_ an armv5 version, the
> armv4 one won't/shouldn't get much testing.)

Can you explain why you think a version that's v4 compatible will be
that much slower? If so, v5 code can be used as long as it checks
__hwcap and falls back to a simple working version...

Rich
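
A minimal sketch of the check-and-fall-back pattern described above, not
musl's actual code. Every name here is hypothetical: SKETCH_HWCAP_V5 is a
made-up feature bit, hwcap is passed in as a plain argument rather than
read from musl's hidden __hwcap, and memcpy_v5 is only a stand-in for
where a v5-specific assembly routine would go.

```c
#include <stddef.h>

/* Hypothetical feature bit, for illustration only. */
#define SKETCH_HWCAP_V5 (1UL << 7)

typedef void *(*memcpy_fn)(void *, const void *, size_t);

/* Simple byte-wise copy that works on every CPU model. */
static void *memcpy_simple(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dest;
}

/* Stand-in for a v5-optimised routine. A real one would be written in
 * assembly; this placeholder just reuses the simple copy. */
static void *memcpy_v5(void *dest, const void *src, size_t n)
{
    return memcpy_simple(dest, src, n);
}

/* Pick an implementation from the capability bits: use the v5 routine
 * only when the (hypothetical) bit is set, otherwise fall back. */
static memcpy_fn select_memcpy(unsigned long hwcap)
{
    return (hwcap & SKETCH_HWCAP_V5) ? memcpy_v5 : memcpy_simple;
}
```

The point of this shape is that a v4 (non-thumb) system always takes the
simple path, so the v5 code never has to be v4 compatible itself.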
