Message-ID: <CAJ86T=Wtvj3E9vGuxHGHmAoJU-0kSUqvL_hntt0zN2a8NT8sLw@mail.gmail.com>
Date: Wed, 15 Jan 2020 10:41:08 -0800
From: Andre McCurdy <armccurdy@...il.com>
To: musl@...ts.openwall.com
Subject: Re: [PATCH 2/2] Add big-endian support to ARM assembler memcpy

On Wed, Jan 15, 2020 at 7:46 AM Rich Felker <dalias@...c.org> wrote:
> On Fri, Sep 13, 2019 at 01:38:34PM -0700, Andre McCurdy wrote:
> > On Fri, Sep 13, 2019 at 11:59 AM Rich Felker <dalias@...c.org> wrote:
> > > On Fri, Sep 13, 2019 at 11:44:32AM -0700, Andre McCurdy wrote:
> > > > Allow the existing ARM assembler memcpy implementation to be used for
> > > > both big and little endian targets.
> > >
> > > Nice. I don't want to merge this just before release, but as long as
> > > it looks ok I should be able to review and merge it afterward.
> > >
> > > Note that I'd really like to replace this giant file with C using
> > > inline asm just for the inner block copies and C for all the flow
> > > control, but I don't mind merging this first as long as it's correct.
> >
> > Sounds good. I'll wait for your feedback after the upcoming release.
>
> Sorry this dropped off my radar. I'd like to merge at least the thumb
> part since it's simple enough to review quickly and users have
> actually complained about memcpy being slow on armv7 with -mthumb as
> default.

Interesting. I wonder what the reference was against which the musl C
code was compared? From my own benchmarking I didn't find the musl
assembler to be much faster than the C code.

There are armv6 and maybe early armv7 CPUs where explicit prefetch
instructions make a huge difference (much more so than C -vs-
assembler). Did the users who complained about musl memcpy() compare
against a memcpy() which uses prefetch? For armv7 using NEON may help,
although the latest armv7 cores seem to perform very well with plain
old C code too. There are lots of trade offs so it's impossible for a
single implementation to be universally optimal.

The "arm-mem" routines used on Raspberry Pi seem to be very fast for
many targets, but unfortunately the armv6 memcpy generates misaligned
accesses, so it isn't suitable for armv5.

https://github.com/bavison/arm-mem/
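
As a rough illustration of the explicit-prefetch point above, here is a
minimal C sketch of a block copy that issues a prefetch hint ahead of
the read pointer. It is not the musl or arm-mem code: the function name
copy_with_prefetch, the 32-byte block size and the 64-byte prefetch
distance are all assumptions chosen for illustration, and a useful
distance would have to be found by benchmarking on the target core.
__builtin_prefetch is a GCC/Clang builtin; on ARM targets that support
it, it typically compiles to a PLD instruction.

```c
#include <stddef.h>

/* Hypothetical tuning values, not taken from musl or arm-mem. */
#define BLOCK_SIZE     32  /* bytes copied per loop iteration */
#define PREFETCH_AHEAD 64  /* how far ahead of the read pointer to hint */

/* Illustrative only: copies n bytes from src to dst (non-overlapping),
 * hinting the cache about upcoming source data as it goes. */
void *copy_with_prefetch(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n >= BLOCK_SIZE) {
        /* Only hint while the prefetch address is still inside the
         * source buffer, so the pointer arithmetic stays valid. */
        if (n > PREFETCH_AHEAD)
            __builtin_prefetch(s + PREFETCH_AHEAD, 0 /* read */, 0 /* low reuse */);

        /* Plain byte loop; a real routine would use word or
         * multi-register copies for the inner block. */
        for (size_t i = 0; i < BLOCK_SIZE; i++)
            d[i] = s[i];

        d += BLOCK_SIZE;
        s += BLOCK_SIZE;
        n -= BLOCK_SIZE;
    }

    /* Copy the remaining tail bytes. */
    while (n--)
        *d++ = *s++;

    return dst;
}
```

Comparing a routine like this against the same loop with the prefetch
line removed would be one way to check whether prefetch, rather than
C versus assembler, accounts for a measured difference on a given CPU.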