Message-ID: <CAPfzE3ZMGwEvs2n_4LCKzMv0FROS55_1N+HdBw7HgNhexgM+eA@mail.gmail.com>
Date: Thu, 11 Jul 2013 16:04:21 +1200
From: Andre Renaud <andre@...ewatersys.com>
To: Rich Felker <dalias@...ifal.cx>
Cc: musl@...ts.openwall.com
Subject: Re: Thinking about release

Hi Rich,

>> Rich - do you have any comments on whether either the C or assembler
>> variants of memcpy might be suitable for inclusion in musl?
>
> I would say either might be, but it looks like if we want competitive
> performance, some asm will be needed (either inline or full). My
> leaning would be to go for something simpler than the asm you've been
> experimenting with, but with same or better performance, if this is
> possible. I realize the code is not that big as-is, in terms of binary
> size, but it's big from an "understanding it" perspective and I don't
> like big asm blobs that are hard for somebody to look at and say "oh
> yeah, this is clearly right".
>
> Anyway, the big questions I'd still like to get answered before moving
> forward is whether the cache line alignment has any benefit.

I certainly appreciate the need for concise, well understood, easily
readable code.

I can't see any obvious reason why this shouldn't work, although the
assembler as it stands makes pretty heavy use of all the registers,
and I can't immediately see how to rework it to free up 2 more (I can
free up 1 by dropping the attempted preload).

Given my (lack of) skills with ARM assembler, I'm not sure I'll be
able to look too deeply into either of these options, but I'll have a
go at the inline ASM version to force 8*4byte loads to see if it
improves things.

Regards,
Andre
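
For illustration only, the "8*4byte loads" idea mentioned above can be sketched in portable C: copy 32 bytes (eight 32-bit words) per loop iteration, then finish the tail bytewise. This is not the code discussed in the thread and not musl's memcpy. The name copy_8x4 is hypothetical, and the fixed-size memcpy calls merely stand in for the word loads and stores that a real libc routine would have to issue itself (presumably with inline asm on ARM, which is an assumption here).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: copy n bytes between non-overlapping buffers,
 * moving 8 x 4-byte words per iteration of the bulk loop. */
void *copy_8x4(void *restrict dst, const void *restrict src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Bulk loop: 32 bytes at a time through a block of eight words. */
    while (n >= 32) {
        uint32_t w[8];
        memcpy(w, s, sizeof w);  /* stands in for eight 4-byte loads */
        memcpy(d, w, sizeof w);  /* stands in for eight 4-byte stores */
        s += 32;
        d += 32;
        n -= 32;
    }

    /* Tail: the remaining 0..31 bytes, one byte at a time. */
    while (n--)
        *d++ = *s++;

    return dst;
}
```

Whether the compiler actually emits eight separate word loads for the block is up to the optimizer, which is why the email talks about forcing them with inline asm. The sketch also ignores the cache line alignment question raised in the quoted text.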