Message-ID: <20130302062102.GP20323@brightrain.aerifal.cx>
Date: Sat, 2 Mar 2013 01:21:02 -0500
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: ARM optimisations

On Fri, Mar 01, 2013 at 10:33:19PM -0600, Rob Landley wrote:
> On 02/28/2013 05:30:51 PM, Rich Felker wrote:
> >On Fri, Mar 01, 2013 at 12:15:21PM +1300, Andre Renaud wrote:
> >> Hi,
> >> Can anyone tell me what the policy for musl is regarding ARM
> >optimised
> >> assembly implementations of functions such as memcpy/memmove? I
> >notice
> >> that there are i386/x86_64 versions for some of these. Doing some
> >> simple testing on an ARM platform I found that an ARM asm
> >> implementation of memcpy is ~80% faster than the C one currently in
> >> MUSL (this is on an ARMv5, so no NEON instructions or similar).
> >>
> >> I don't think I'm capable of writing the optimised version entirely
> >> myself, however there are various implementations floating around in
> >> libraries such as bionic etc... Is it possible to have BSD licensed
> >> code brought in to musl (which is MIT licensed)?
> >
> >ARM optimizations are welcome as long as they're thoroughly tested,
> >not heavily bloated, and support all v4 (including no-thumb) and later
> >cpu models, either by using universally-available features or
> >conditioning use of features on the .hidden __hwcap provided in musl.
>
> Out of curiosity, why armv4 no thumb?
>
> I'd actually say that armv5 is probably the one to optimize for,
> because it's somewhere over 80% of the installed base of arm systems
> and generally provides an additonal 25% speedup from armv4 to armv5.
> Anything lower than that can use C, anything newer than that can
> benefit from an armv5 version vs C.
>
> The reason armv4t _without_ thumb isn't interesting is you need at
> least armv4t to use EABI, and I had to patch my compiler to make

This is a compiler bug. If the compiler can be made to generate proper
return code, EABI works with armv4 (non-thumb) too.

> Newer compilers have dropped support for OABI entirely, and armv4t

OABI is not supported by musl at all. The intent is simply not to
_preclude_ use of non-thumb, even though there are other obstacles to
its use now.

> systems aren't that common. (They existed, the tin can tools nail
> board used one, but the generic C code works for them. Point is I'm
> not sure they're worth _optimizing_ for if it costs the vast
> majority of systems a 25% performance hit and we don't want to
> maintain multiple versions. If you _have_ an armv5 version, the
> armv4 one won't/shouldn't get much testing.)

Can you explain why you think a version that's v4 compatible will be
that much slower? If so, v5 code can be used as long as it checks
__hwcap and falls back to a simple working version...

Rich
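
The "check __hwcap and fall back to a simple working version" idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not musl's actual code: the feature bit HYPOTHETICAL_HWCAP_V5 and the names copy_simple, copy_v5 and select_copy are made up for the example, the capability word is passed in as a plain parameter instead of being read from musl's internal hidden __hwcap, and the "v5" routine is ordinary C standing in for what would really be assembly.

```c
#include <stddef.h>

/* Hypothetical feature bit, invented for this sketch. Real bit values
 * are defined by the kernel for the hardware capability word. */
#define HYPOTHETICAL_HWCAP_V5 (1UL << 0)

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Simple fallback that is valid on every supported CPU: byte copy. */
void *copy_simple(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dest;
}

/* Stand-in for a routine that would rely on newer CPU features.
 * Written in portable C here so the sketch runs anywhere; it copies
 * four bytes per loop iteration and then handles the tail. */
void *copy_v5(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    for (; n >= 4; n -= 4, d += 4, s += 4) {
        d[0] = s[0];
        d[1] = s[1];
        d[2] = s[2];
        d[3] = s[3];
    }
    while (n--)
        *d++ = *s++;
    return dest;
}

/* Pick an implementation from the capability word: use the newer
 * routine only when its feature bit is set, otherwise fall back. */
copy_fn select_copy(unsigned long hwcap)
{
    return (hwcap & HYPOTHETICAL_HWCAP_V5) ? copy_v5 : copy_simple;
}
```

With this shape, a CPU that lacks the feature never executes the newer routine, so the simple version stays the correctness baseline and the faster one is purely optional. Outside libc internals, a program on Linux can usually obtain the capability word with getauxval(AT_HWCAP) from <sys/auxv.h>.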