Message-ID: <CAMSMCxn6n5SYdUUCa8hvqD88EgyQUXc3XvgaQYj5r_aCgW-paw@mail.gmail.com>
Date: Wed, 10 Jul 2013 13:49:50 -0700
From: Nathan McSween <nwmcsween@...il.com>
To: musl@...ts.openwall.com
Subject: Re: Thinking about release
I would think the per-character scan for the zero byte would take the most
time. Even if GCC vectorized the loop without SIMD (i.e. a word at a time),
it would still need to iterate byte by byte to locate the zero within the
word that contains it. Current musl does this as well, though.
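
To make the point concrete, here is a rough sketch of the two approaches
(not musl's actual source; the names naive_strlen and word_strlen and the
macros are only illustrative). The word loop can only tell that some byte
in the word is zero, so a final per-byte scan is still needed to find
which one. Note the word loop may read a few bytes past the terminator
within the same aligned word, which ISO C does not strictly guarantee is
safe; take it as an assumption of the sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative helpers: a word with 0x01 in every byte, and one with 0x80. */
#define ONES  ((size_t)-1 / UCHAR_MAX)
#define HIGHS (ONES * (UCHAR_MAX / 2 + 1))
/* Nonzero if and only if at least one byte of x is zero. */
#define HASZERO(x) (((x) - ONES) & ~(x) & HIGHS)

/* The naive loop: one byte per iteration until the terminator. */
size_t naive_strlen(const char *s)
{
    const char *p = s;
    assert(s != NULL);
    while (*p) p++;
    return (size_t)(p - s);
}

/* Word-at-a-time scan without SIMD. */
size_t word_strlen(const char *s)
{
    const char *p = s;
    assert(s != NULL);

    /* Step byte by byte until p is aligned to the word size. */
    for (; (uintptr_t)p % sizeof(size_t); p++)
        if (!*p) return (size_t)(p - s);

    /* Skip whole words that contain no zero byte.
     * Assumption: reading the full aligned word is harmless even if
     * the string ends inside it. */
    for (;;) {
        size_t w;
        memcpy(&w, p, sizeof w);
        if (HASZERO(w)) break;
        p += sizeof w;
    }

    /* The zero is somewhere in this word: iterate per char to find it. */
    for (; *p; p++);
    return (size_t)(p - s);
}
```

Both functions should return the same value for any valid string; the
last loop in word_strlen is the iterate-till-zero step referred to above.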
On Jul 10, 2013 1:34 PM, "Andre Renaud" <andre@...ewatersys.com> wrote:
> >> What also might be worth testing is whether GCC can compete if you
> >> just give it a naive loop (not the fancy pseudo-vectorized stuff
> >> currently in musl) and good CFLAGS. I know on x86 I was able to beat
> >> the fanciest asm strlen I could come up with simply by writing the
> >> naive loop in C and unrolling it a lot.
> >
> >
> > Duff's device!
>
> That was exactly my first idea too, but interestingly it turns out not
> to have really added any performance improvement. Looking at the
> assembler, with -O3, gcc does a pretty good job of unrolling as it is.
>
> Regards,
> Andre
>