Message-ID: <20060427220443.GA18501@openwall.com>
Date: Fri, 28 Apr 2006 02:04:43 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Performance tuning

Speaking of MMX vs. x86-64 SSE registers:

On Thu, Apr 27, 2006 at 11:21:03PM +0200, sebastian.rother@...erlin.de wrote:
> So how can 8 64Bit registers outperform 16 128Bit Registers?!

It's not the registers that "perform". There are x86/MMX or x86-64/SSE
instructions, each of which is translated into one or more micro-ops.
Some of those micro-ops may have latencies greater than 1 cycle. Both
the micro-op counts and their latencies may differ between the micro-ops
generated for x86/MMX and those generated for x86-64/SSE.

That's the theory - to answer your question ("how can it be true").
However, I've based my brief analysis primarily on the actual benchmarks
I performed. According to those benchmarks, MMX bitwise ops deliver
better performance per bit than SSE ones do, despite SSE registers being
twice as wide, on Pentium 3 and on AMD processors - but SSE is actually
somewhat faster than MMX per bit on Pentium 4 processors.

In other words, SSE instructions are more than twice as slow as MMX ones
on P3 and AMD, but less than twice as slow on P4.

Of course, this may change with future processors of either or both
vendors.

> Related to the Co-Processors:

Sebastian, Frank - thank you for the links. I'll have a look a bit
later and comment in here if appropriate.

-- 
Alexander Peslyak <solar at openwall.com>
GPG key ID: B35D3598  fp: 6429 0D7E F130 C13E C929  6447 73C3 A290 B35D 3598
http://www.openwall.com - bringing security into open computing environments
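
The per-bit comparison above comes down to simple arithmetic: a 128-bit
SSE op handles twice as many bits as a 64-bit MMX op, so SSE wins per bit
only if each SSE op is less than twice as slow. The following is a
minimal sketch of that break-even rule. The function names and the
slowdown factors (2.5 and 1.5) are hypothetical illustrations, not
figures from the benchmarks mentioned in the message.

```python
MMX_BITS = 64    # width of an MMX register
SSE_BITS = 128   # width of an SSE register


def bits_per_cycle(register_bits: int, cycles_per_op: float) -> float:
    """Bits processed per cycle, given the effective cost of one op."""
    if cycles_per_op <= 0:
        raise ValueError("cycles_per_op must be positive")
    return register_bits / cycles_per_op


def sse_beats_mmx_per_bit(sse_slowdown: float) -> bool:
    """Return True if SSE delivers more bits per cycle than MMX.

    sse_slowdown is how many times slower one SSE op is than one MMX op
    (the MMX op is normalized to a cost of 1.0). The break-even point is
    exactly 2.0, the ratio of the register widths.
    """
    mmx = bits_per_cycle(MMX_BITS, 1.0)
    sse = bits_per_cycle(SSE_BITS, sse_slowdown)
    return sse > mmx


if __name__ == "__main__":
    # Hypothetical slowdown factors chosen only to land on either side
    # of the break-even point; they are not measured values.
    for label, slowdown in [("more than 2x slower", 2.5),
                            ("less than 2x slower", 1.5)]:
        winner = "SSE" if sse_beats_mmx_per_bit(slowdown) else "MMX"
        print(f"SSE op {label} ({slowdown}x): {winner} is faster per bit")
```

With a slowdown above 2.0 the function reports MMX as faster per bit,
which matches the P3 and AMD case described above; below 2.0 it reports
SSE, which matches the P4 case.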