Message-ID: <3a2190474fb7098bf2d1990b306fb6c3@smtp.hushmail.com>
Date: Sun, 05 Apr 2015 20:01:47 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: New SIMD generations, code layout

On 2015-04-03 23:18, magnum wrote:
> For anyone not following the progress on GitHub, I just want to announce
> that we currently have a topic branch named "avx2" that includes support
> for AVX2 and MIC (and maybe other AVX-512 targets if they exist) but is
> nowhere near stable (actually it fails self-test on most formats right
> now but the problems are systematic and will be fixed soon).

First milestone: All raw formats work with AVX2.

Benchmarking: Raw-MD4 [MD4 128/128 SSE4.1 12x]... DONE
Raw:	55972K c/s real, 55972K c/s virtual

Benchmarking: Raw-MD4 [MD4 256/256 AVX2 8x2]... DONE
Raw:	77658K c/s real, 77658K c/s virtual

Benchmarking: Raw-MD5 [MD5 128/128 SSE4.1 12x]... DONE
Raw:	43813K c/s real, 43813K c/s virtual

Benchmarking: Raw-MD5 [MD5 256/256 AVX2 8x2]... DONE
Raw:	63387K c/s real, 63387K c/s virtual

Benchmarking: Raw-SHA1 [SHA1 128/128 SSE4.1 8x]... DONE
Raw:	23807K c/s real, 23807K c/s virtual

Benchmarking: Raw-SHA1 [SHA1 256/256 AVX2 8x2]... DONE
Raw:	41125K c/s real, 41125K c/s virtual

Benchmarking: Raw-SHA224 [SHA224 128/128 SSE4.1 4x]... DONE
Raw:	10773K c/s real, 10773K c/s virtual

Benchmarking: Raw-SHA224 [SHA224 256/256 AVX2 8x]... DONE
Raw:	20447K c/s real, 20447K c/s virtual

Benchmarking: Raw-SHA256 [SHA256 128/128 SSE4.1 4x]... DONE
Raw:	10510K c/s real, 10510K c/s virtual

Benchmarking: Raw-SHA256 [SHA256 256/256 AVX2 8x]... DONE
Raw:	19252K c/s real, 19252K c/s virtual

Benchmarking: Raw-SHA384 [SHA384 128/128 SSE4.1 2x]... DONE
Raw:	4306K c/s real, 4306K c/s virtual

Benchmarking: Raw-SHA384 [SHA384 256/256 AVX2 4x]... DONE
Raw:	8464K c/s real, 8464K c/s virtual

Benchmarking: Raw-SHA512 [SHA512 128/128 SSE4.1 2x]... DONE
Raw:	4259K c/s real, 4259K c/s virtual

Benchmarking: Raw-SHA512 [SHA512 256/256 AVX2 4x]... DONE
Raw:	8211K c/s real, 8211K c/s virtual

Note that many things are suboptimal, e.g. we run 2x interleaving across
the board (except for SHA2, which doesn't yet support it) without caring
whether it helps or hurts (this is for debugging the index macros).

Some iterated ones work too:

Benchmarking: wpapsk, WPA/WPA2 PSK [PBKDF2-SHA1 128/128 SSE4.1 8x]... DONE
Raw:	1679 c/s real, 1696 c/s virtual

Benchmarking: wpapsk, WPA/WPA2 PSK [PBKDF2-SHA1 256/256 AVX2 8x2]... DONE
Raw:	3280 c/s real, 3280 c/s virtual

Benchmarking: PBKDF2-HMAC-SHA1 [PBKDF2-SHA1 8x SSE2]... DONE
Speed for cost 1 (iteration count) of 1000
Raw:	13816 c/s real, 13816 c/s virtual

Benchmarking: PBKDF2-HMAC-SHA1 [PBKDF2-SHA1 8x2 AVX2]... DONE
Speed for cost 1 (iteration count) of 1000
Raw:	26384 c/s real, 26384 c/s virtual

Benchmarking: PBKDF2-HMAC-SHA256, rounds=12000 [PBKDF2-SHA256 128/128 SSE4.1 4x]... DONE
Speed for cost 1 (iteration count) of 12000
Raw:	480 c/s real, 480 c/s virtual

Benchmarking: PBKDF2-HMAC-SHA256, rounds=12000 [PBKDF2-SHA256 256/256 AVX2 8x]... DONE
Speed for cost 1 (iteration count) of 12000
Raw:	942 c/s real, 942 c/s virtual

Benchmarking: PBKDF2-HMAC-SHA512, GRUB2 / OS X 10.8+ [PBKDF2-SHA512 128/128 SSE4.1 2x]... DONE
Speed for cost 1 (iteration count) of 23923 and 37174
Raw:	72.5 c/s real, 72.5 c/s virtual

Benchmarking: PBKDF2-HMAC-SHA512, GRUB2 / OS X 10.8+ [PBKDF2-SHA512 256/256 AVX2 4x]... DONE
Speed for cost 1 (iteration count) of 23923 and 37174
Raw:	143 c/s real, 143 c/s virtual

Benchmarking: sha1crypt, NetBSD's sha1crypt [PBKDF1-SHA1 8x SSE2]... DONE
Speed for cost 1 (iteration count) of 64000 and 40000
Raw:	277 c/s real, 280 c/s virtual

Benchmarking: sha1crypt, NetBSD's sha1crypt [PBKDF1-SHA1 8x2 AVX2]... DONE
Speed for cost 1 (iteration count) of 64000 and 40000
Raw:	549 c/s real, 549 c/s virtual

Making these work for MIC is merely about tweaks in pseudo_intrinsics.h,
if that. Lei, could you give it a try?

magnum
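
The AVX2-over-SSE4.1 gain implied by the raw-format figures above can be worked out with a short script. This is only a minimal sketch in Python and is not part of John the Ripper: the names RAW_BENCH, speedup and SPEEDUPS are hypothetical, and the numbers are the "real" c/s values (in thousands) copied from the benchmark output in this post.

```python
# "real" c/s figures, in thousands, copied from the benchmark output in this post.
# Each entry is (SSE4.1 build, AVX2 build).
# RAW_BENCH, speedup() and SPEEDUPS are illustrative names, not JtR identifiers.
RAW_BENCH = {
    "Raw-MD4":    (55972, 77658),
    "Raw-MD5":    (43813, 63387),
    "Raw-SHA1":   (23807, 41125),
    "Raw-SHA224": (10773, 20447),
    "Raw-SHA256": (10510, 19252),
    "Raw-SHA384": (4306, 8464),
    "Raw-SHA512": (4259, 8211),
}


def speedup(sse41, avx2):
    """Return the AVX2 throughput as a multiple of the SSE4.1 throughput."""
    return avx2 / sse41


# Ratio per format; a value of 2.0 would mean AVX2 doubled the throughput.
SPEEDUPS = {name: speedup(*pair) for name, pair in RAW_BENCH.items()}

if __name__ == "__main__":
    for name, ratio in SPEEDUPS.items():
        print(f"{name:<12} {ratio:.2f}x")
```

With these figures the ratio runs from about 1.4x for Raw-MD4 to just under 2x for Raw-SHA384. The same calculation applies to the iterated formats if their c/s values are added to the table.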