Message-ID: <20120318022929.GA19762@openwall.com>
Date: Sun, 18 Mar 2012 06:29:29 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: XOP for MD5/MD4/SHA-1
magnum -
On Sun, Mar 18, 2012 at 05:27:53AM +0400, Solar Designer wrote:
> Note that I haven't modified MD4 and SHA-1 to actually use XOP yet ...
I did so now. I've attached a patch for all of: MD5, MD4, SHA-1.
> ... for raw MD5 para_2 was a lot better than para_3 (but the latter
> is better for MD5-crypt).
I was wrong about that - somehow I did not notice that para_2 gives even
better speed for MD5-crypt as well. Here is what I get now:
Benchmarking: FreeBSD MD5 [SSE2i 8x]... (8xOMP) DONE
Raw: 203013 c/s real, 25426 c/s virtual
Now this is significantly better than Core i7-2600, which IIRC only
gives under 160k on this test. (Both CPUs benchmarked at stock clocks.)
And here's what I am getting for the raw hashes. With -x86-64i (Intel
compiler's SSE2 code):
Benchmarking: Raw MD5 [SSE2i 12x]... DONE
Raw: 32896K c/s real, 32682K c/s virtual
Benchmarking: Raw MD4 [SSE2i 12x]... DONE
Raw: 37282K c/s real, 37282K c/s virtual
Benchmarking: Raw SHA-1 [SSE2i 8x]... DONE
Raw: 18236K c/s real, 18236K c/s virtual
With -x86-64 (gcc's SSE2 code):
Benchmarking: Raw MD5 [SSE2i 12x]... DONE
Raw: 24432K c/s real, 24197K c/s virtual
Benchmarking: Raw MD4 [SSE2i 12x]... DONE
Raw: 34473K c/s real, 34473K c/s virtual
Benchmarking: Raw SHA-1 [SSE2i 8x]... DONE
Raw: 17567K c/s real, 17567K c/s virtual
With -x86-64-avx:
Benchmarking: Raw MD5 [SSE2i 12x]... DONE
Raw: 23301K c/s real, 23087K c/s virtual
Benchmarking: Raw MD4 [SSE2i 12x]... DONE
Raw: 35444K c/s real, 35696K c/s virtual
Benchmarking: Raw SHA-1 [SSE2i 8x]... DONE
Raw: 19284K c/s real, 19284K c/s virtual
Finally, the improvement with -x86-64-xop (due to this patch):
Benchmarking: Raw MD5 [SSE2i 8x]... DONE
Raw: 32577K c/s real, 32577K c/s virtual
Benchmarking: Raw MD4 [SSE2i 8x]... DONE
Raw: 36872K c/s real, 36872K c/s virtual
Benchmarking: Raw SHA-1 [SSE2i 8x]... DONE
Raw: 23464K c/s real, 23464K c/s virtual
So raw MD5 and raw MD4 with XOP are now about as fast as the Intel
compiler's code, whereas raw SHA-1 is now 28% faster than the Intel
compiler's code and 21% faster than the AVX build.
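The percentages can be re-derived from the Raw SHA-1 figures above. What follows is a minimal sketch in Python and is not part of the patch. The build labels used as dictionary keys and the helper name speedup_percent are made up for illustration. The speeds are the "real" c/s values from the benchmark output, in thousands.

```python
# Raw SHA-1 speeds in thousands of c/s, copied from the benchmark output above.
# The keys are illustrative labels, not John the Ripper identifiers.
sha1_kcps = {
    "icc-sse2": 18236,  # make target -x86-64i (Intel compiler's SSE2 code)
    "gcc-sse2": 17567,  # make target -x86-64 (gcc's SSE2 code)
    "avx": 19284,       # make target -x86-64-avx
    "xop": 23464,       # make target -x86-64-xop (with this patch)
}


def speedup_percent(new, old):
    """Return how much faster `new` is than `old`, in percent."""
    return (new / old - 1.0) * 100.0


for build in ("icc-sse2", "gcc-sse2", "avx"):
    gain = speedup_percent(sha1_kcps["xop"], sha1_kcps[build])
    print(f"XOP vs {build}: {gain:.1f}% faster")
```

The results land between 28% and 29% against the Intel compiler build and between 21% and 22% against the AVX build, so the figures quoted above appear to be rounded down.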
Alexander
View attachment "xop.diff" of type "text/plain" (11690 bytes)