|
Message-ID: <20150316013529.GA27108@openwall.com> Date: Mon, 16 Mar 2015 04:35:30 +0300 From: Solar Designer <solar@...nwall.com> To: john-dev@...ts.openwall.com Subject: Re: Xeon Phi On Thu, Mar 05, 2015 at 07:47:40PM +0300, Solar Designer wrote: > [user@...er-mic0 user]$ LD_LIBRARY_PATH=. ./john -te=1 -form=descrypt > Will run 240 OpenMP threads > Benchmarking: descrypt, traditional crypt(3) [512/512]... DONE > Many salts: 80170K c/s real, 333233 c/s virtual > Only one salt: 7660K c/s real, 61684 c/s virtual > > (This is on our 5110P.) I've just improved this to: [user@...er-mic0 user]$ LD_LIBRARY_PATH=. ./john -te=1 -form=descrypt Will run 240 OpenMP threads Benchmarking: descrypt, traditional crypt(3) [512/512]... DONE Many salts: 84811K c/s real, 352128 c/s virtual Only one salt: 7710K c/s real, 62106 c/s virtual [user@...er-mic0 user]$ LD_LIBRARY_PATH=. ./john -te -form=descrypt Will run 240 OpenMP threads Benchmarking: descrypt, traditional crypt(3) [512/512]... DONE Many salts: 85040K c/s real, 354418 c/s virtual Only one salt: 7391K c/s real, 45700 c/s virtual [user@...er-mic0 user]$ LD_LIBRARY_PATH=. ./john -te -form=descrypt Will run 240 OpenMP threads Benchmarking: descrypt, traditional crypt(3) [512/512]... DONE Many salts: 85209K c/s real, 354606 c/s virtual Only one salt: 7405K c/s real, 46525 c/s virtual [user@...er-mic0 user]$ LD_LIBRARY_PATH=. ./john -te -form=descrypt Will run 240 OpenMP threads Benchmarking: descrypt, traditional crypt(3) [512/512]... DONE Many salts: 85040K c/s real, 354976 c/s virtual Only one salt: 7391K c/s real, 46985 c/s virtual by adding the -no-opt-prefetch option to CFLAGS in the linux-mic make target in the core tree. magnum, please get this into jumbo as well. The prefetch instructions appeared to hurt performance in this case. I've also tried the settings from -opt-prefetch=1 to -opt-prefetch=4, and all of them were slower than -no-opt-prefetch. Once I chose -no-opt-prefetch, I tried re-tuning DES_bs_cpt and DES_BS_EXPAND, but got inconclusive results, so I left them as-is. -no-opt-prefetch also improves bcrypt speed from: [user@...er-mic0 user]$ LD_LIBRARY_PATH=. ./john -te -form=bcrypt Will run 240 OpenMP threads Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X2]... DONE Raw: 6285 c/s real, 26.2 c/s virtual [user@...er-mic0 user]$ LD_LIBRARY_PATH=. ./john -te=1 -form=bcrypt Will run 240 OpenMP threads Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X2]... DONE Raw: 6280 c/s real, 26.2 c/s virtual to: [user@...er-mic0 user]$ LD_LIBRARY_PATH=. ./john -te -form=bcrypt Will run 240 OpenMP threads Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X2]... DONE Raw: 6330 c/s real, 26.3 c/s virtual [user@...er-mic0 user]$ LD_LIBRARY_PATH=. ./john -te=1 -form=bcrypt Will run 240 OpenMP threads Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X2]... DONE Raw: 6339 c/s real, 26.3 c/s virtual > The icc-generated assembly code for DES_bs_b.c looks sane to me (about > as good as what we're used to seeing for AVX, or maybe even better since > there are 32 ZMM registers). I noticed lots of vmovaps and vmovapd in the generated code. I tried replacing them with vmovdqa32 or vmovdqa64 using "sed -i" and building from the modified .s file. This didn't affect performance. I also tried switching from the *_epi32 to the *_epi64 macros (which actually correspond to different instructions for bitwise ops where these are supposed to produce the same results). This also made no difference for performance. Not even when I consistently made all of the loads, stores, and the bitwise instructions operate on vectors with 64-bit elements. So I reverted those changes for now. (In fact, I never had a C source code level change to produce the vmovdqa32 or vmovdqa64 instructions - I guess it'd take explicit load intrinsics, but even then I'm not sure since the compiler appears to ignore our explicit _mm512_store_epi32() and produce vmovaps anyway. Luckily, this does not matter on current hardware.) Alexander
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.