Message-ID: <039564f6d9caf9b5e322994e3385dba1@smtp.hushmail.com>
Date: Fri, 29 May 2015 22:52:41 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Interleaving of intrinsics

Solar,

Here's a GitHub issue where we discuss interleaving and present some
benchmarks:
https://github.com/magnumripper/JohnTheRipper/issues/1217

I think you will have some educated thoughts about this; Here's part of
our current SHA-1:

#define SHA1_PARA_DO(x) for((x)=0;(x)<SIMD_PARA_SHA1;(x)++)

#define SHA1_ROUND2(a,b,c,d,e,F,t)                          \
    SHA1_PARA_DO(i) tmp3[i] = tmpR[i*16+(t&0xF)];           \
    SHA1_EXPAND2(t+16)                                      \
    F(b,c,d)                                                \
    SHA1_PARA_DO(i) e[i] = vadd_epi32( e[i], tmp[i] );      \
    SHA1_PARA_DO(i) tmp[i] = vroti_epi32(a[i], 5);          \
    SHA1_PARA_DO(i) e[i] = vadd_epi32( e[i], tmp[i] );      \
    SHA1_PARA_DO(i) e[i] = vadd_epi32( e[i], cst );         \
    SHA1_PARA_DO(i) e[i] = vadd_epi32( e[i], tmp3[i] );     \
    SHA1_PARA_DO(i) b[i] = vroti_epi32(b[i], 30);

And here's a similar part of SHA256:

#define SHA256_STEP0(a,b,c,d,e,f,g,h,x,K)                        \
{                                                                \
    SHA256_PARA_DO(i)                                            \
    {                                                            \
        w = _w[i].w;                                             \
        tmp1[i] = vadd_epi32(h[i], S1(e[i]));                    \
        tmp1[i] = vadd_epi32(tmp1[i], Ch(e[i],f[i],g[i]));       \
        tmp1[i] = vadd_epi32(tmp1[i], vset1_epi32(K));           \
        tmp1[i] = vadd_epi32(tmp1[i], w[x]);                     \
        tmp2[i] = vadd_epi32(S0(a[i]),Maj(a[i],b[i],c[i]));      \
        d[i] = vadd_epi32(tmp1[i], d[i]);                        \
        h[i] = vadd_epi32(tmp1[i], tmp2[i]);                     \
    }                                                            \
}

This file is -O3 (from a pragma) so I guess both cases will be unrolled
but there is obviously a big difference after just unrolling. Assuming a
perfect optimizer it wouldn't matter but assuming a non-perfect one, is
the former better? I'm guessing SHA-1 was written that way for a reason?

magnum
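
To make the difference between the two macro styles concrete, here is a minimal scalar sketch in C. It is not the JtR code: plain uint32_t values stand in for the SIMD vectors, PARA is a hypothetical stand-in for SIMD_PARA_SHA1, and the names rotl32, step_op_major, step_lane_major and interleave_styles_agree are invented for illustration. The step body is a cut-down tail of a SHA-1 round, not a full round. After unrolling, the first form puts independent operations from different lanes next to each other in source order, while the second emits each lane's dependent chain back to back and leaves the interleaving to the compiler's scheduler and the CPU. Both compute the same values, which the last function checks.

```c
#include <assert.h>
#include <stdint.h>

#define PARA 3                                   /* hypothetical interleave factor */
#define PARA_DO(x) for ((x) = 0; (x) < PARA; (x)++)

/* Scalar stand-in for a vector rotate; n must be in 1..31. */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    return (v << n) | (v >> (32 - n));
}

/* Operation-major (SHA-1 macro style): each operation is issued for
 * every lane before the next operation starts. */
static void step_op_major(const uint32_t a[PARA], uint32_t b[PARA],
                          uint32_t e[PARA], const uint32_t w[PARA],
                          uint32_t cst)
{
    unsigned i;
    uint32_t tmp[PARA];

    PARA_DO(i) tmp[i] = rotl32(a[i], 5);
    PARA_DO(i) e[i] += tmp[i];
    PARA_DO(i) e[i] += cst;
    PARA_DO(i) e[i] += w[i];
    PARA_DO(i) b[i] = rotl32(b[i], 30);
}

/* Lane-major (SHA-256 macro style): the whole dependent chain for one
 * lane is written out before moving on to the next lane. */
static void step_lane_major(const uint32_t a[PARA], uint32_t b[PARA],
                            uint32_t e[PARA], const uint32_t w[PARA],
                            uint32_t cst)
{
    unsigned i;

    PARA_DO(i) {
        uint32_t tmp = rotl32(a[i], 5);
        e[i] += tmp;
        e[i] += cst;
        e[i] += w[i];
        b[i] = rotl32(b[i], 30);
    }
}

/* Returns 1 if both orderings produce identical state, 0 otherwise. */
int interleave_styles_agree(void)
{
    uint32_t a[PARA], w[PARA];
    uint32_t b1[PARA], e1[PARA], b2[PARA], e2[PARA];
    unsigned i;

    /* Arbitrary test data; any values would do. */
    PARA_DO(i) {
        a[i] = 0x67452301u * (i + 1);
        w[i] = 0x01020304u + i;
        b1[i] = b2[i] = 0xEFCDAB89u ^ i;
        e1[i] = e2[i] = 0xC3D2E1F0u + i;
    }

    step_op_major(a, b1, e1, w, 0x5A827999u);
    step_lane_major(a, b2, e2, w, 0x5A827999u);

    PARA_DO(i) {
        if (b1[i] != b2[i] || e1[i] != e2[i])
            return 0;
    }
    return 1;
}
```

Since the results are identical, any speed difference between the two forms comes only from how well a given compiler schedules the unrolled code, so it would have to be measured per compiler and per target rather than assumed.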