Message-ID: <CAPfzE3ZMGwEvs2n_4LCKzMv0FROS55_1N+HdBw7HgNhexgM+eA@mail.gmail.com>
Date: Thu, 11 Jul 2013 16:04:21 +1200
From: Andre Renaud <andre@...ewatersys.com>
To: Rich Felker <dalias@...ifal.cx>
Cc: musl@...ts.openwall.com
Subject: Re: Thinking about release

Hi Rich,
>> Rich - do you have any comments on whether either the C or assembler
>> variants of memcpy might be suitable for inclusion in musl?
>
> I would say either might be, but it looks like if we want competitive
> performance, some asm will be needed (either inline or full). My
> leaning would be to go for something simpler than the asm you've been
> experimenting with, but with same or better performance, if this is
> possible. I realize the code is not that big as-is, in terms of binary
> size, but it's big from an "understanding it" perspective and I don't
> like big asm blobs that are hard for somebody to look at and say "oh
> yeah, this is clearly right".
>
> Anyway, the big question I'd still like to get answered before moving
> forward is whether the cache line alignment has any benefit.

I certainly appreciate the need for concise, well understood, easily
readable code.

I can't see any obvious reason why this shouldn't work, although the
assembler as it stands makes pretty heavy use of all the registers,
and I can't immediately see how to rework it to free up 2 more (I can
free up 1 by dropping the attempted preload). Given my (lack of)
skills with ARM assembler, I'm not sure I'll be able to look too
deeply into either of these options, but I'll have a go at the inline
ASM version to force 8*4byte loads to see if it improves things.
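
As a rough illustration of the 8*4byte idea, here is a portable C sketch
rather than the inline ASM itself. The function name copy_words_x8 and the
word_t typedef are hypothetical, chosen only for this example, and the
may_alias attribute is a GCC/Clang extension. The hope on ARM would be that
each group of eight loads or stores ends up as one multi-register transfer
(ldm/stm), which the compiler is not guaranteed to do; that is why inline
ASM may still be needed to force it.

```c
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension: a 32-bit word type allowed to alias any object. */
typedef uint32_t __attribute__((__may_alias__)) word_t;

/* Hypothetical sketch, not musl code: copy n bytes, moving 8 words
 * (32 bytes) per iteration whenever both pointers are 4-byte aligned. */
void *copy_words_x8(void *restrict dest, const void *restrict src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    if ((((uintptr_t)d | (uintptr_t)s) & 3) == 0) {
        word_t *dw = (word_t *)d;
        const word_t *sw = (const word_t *)s;

        for (; n >= 32; n -= 32, sw += 8, dw += 8) {
            /* Issue all eight loads before any store, so they can be
             * grouped into a single burst. */
            word_t a = sw[0], b = sw[1], c = sw[2], e = sw[3];
            word_t f = sw[4], g = sw[5], h = sw[6], i = sw[7];
            dw[0] = a; dw[1] = b; dw[2] = c; dw[3] = e;
            dw[4] = f; dw[5] = g; dw[6] = h; dw[7] = i;
        }
        d = (unsigned char *)dw;
        s = (const unsigned char *)sw;
    }

    /* Tail bytes, or the whole copy if the pointers were misaligned. */
    while (n--)
        *d++ = *s++;
    return dest;
}
```

Holding eight words live at once is also where the register pressure
mentioned above comes from: eight data registers plus source, destination
and count leave very little spare on ARM.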

Regards,
Andre
