Message-ID: <20130710210149.GG29800@brightrain.aerifal.cx>
Date: Wed, 10 Jul 2013 17:01:49 -0400
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: Thinking about release

On Thu, Jul 11, 2013 at 08:34:03AM +1200, Andre Renaud wrote:
> >> What also might be worth testing is whether GCC can compete if you
> >> just give it a naive loop (not the fancy pseudo-vectorized stuff
> >> currently in musl) and good CFLAGS. I know on x86 I was able to beat
> >> the fanciest asm strlen I could come up with simply by writing the
> >> naive loop in C and unrolling it a lot.
> >
> >
> > Duff's device!
>
> That was exactly my first idea too, but interestingly it turns out not
> to have really added any performance improvement. Looking at the
> assembler, with -O3, gcc does a pretty good job of unrolling as it is.

For what it's worth, my testing showed the current memcpy code in musl
and the naive "while (n--) *d++=*s++;" version performing
near-identically at -O3, and both got about 20% faster with
-funroll-all-loops. With -O2 or -Os, the naive version was about 5
times slower.

Rich
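
For reference, the naive loop quoted above can be wrapped in a function
along the following lines. This is only a minimal sketch, not musl's
memcpy: the names naive_memcpy and naive_matches_libc are hypothetical,
and the helper merely checks the naive copy against the C library's
memcpy on a small buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Naive byte-at-a-time copy: the "while (n--) *d++=*s++;" loop.
 * The name naive_memcpy is an assumption made for this sketch. */
void *naive_memcpy(void *restrict dest, const void *restrict src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (n--) *d++ = *s++;
    return dest;
}

/* Hypothetical helper: copies the first n bytes of src with both
 * naive_memcpy and the library memcpy, then returns 1 if the two
 * results are byte-identical and 0 otherwise. Limited to 256 bytes. */
int naive_matches_libc(const void *src, size_t n)
{
    unsigned char a[256], b[256];

    assert(n <= sizeof a);
    naive_memcpy(a, src, n);
    memcpy(b, src, n);
    return memcmp(a, b, n) == 0;
}
```

To repeat the kind of comparison described above, one would time a loop
of such calls after building the same file at different settings, for
example -Os, -O2, -O3, and -O3 -funroll-all-loops. The sketch says
nothing about the timings themselves, which depend on the compiler
version and the target.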