Message-ID: <552A4589.6070308@mailbox.org>
Date: Sun, 12 Apr 2015 12:14:33 +0200
From: Frank Dittrich <frank.dittrich@...lbox.org>
To: john-dev@...ts.openwall.com
Subject: 23% performance regression for brypt (Intel i5-4570 CPU)

Solar,

you included the bcrypt related changes of bleeding-jumbo commit
https://github.com/magnumripper/JohnTheRipper/commit/f64b42fee9e368cd85cf546f08b694510824fea2
into core, but decided not to allow BF_X2 = 3 for AVX systems.
(I guess this is because on well, BF_X2 = 3 causes about a 5% performance
regression.)

But for my system (64-bit Linux, i5-4570 CPU), this causes a 23%
performance regression.

With latest bleeding-jumbo, I get

Will run 4 OpenMP threads
Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X2]... (4xOMP) DONE
Speed for cost 1 (iteration count) of 32
Raw:	3888 c/s real, 967 c/s virtual

This is also what I get when I check out master and enable OMP.

With commit f64b42fee9e368cd85cf546f08b694510824fea2 or any other commit
which uses BF_X2 = 3, I get

Will run 4 OpenMP threads
Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X3]... (4xOMP) DONE
Speed for cost 1 (iteration count) of 32
Raw:	5040 c/s real, 1253 c/s virtual

Similarly, a generic build (with OMP) for the latest master commit gives

Will run 4 OpenMP threads
Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X2]... DONE
Raw:	3830 c/s real, 958 c/s virtual

When I patch best.sh to also test BF_X2 = 3 if BF_X2 = 1 is better than
BF_X2 = 0, I get

Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64]... 1992 c/s real, 498 c/s virtual
Compiling: Blowfish benchmark (scale)
Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64]... 2066 c/s real, 516 c/s virtual
Compiling: Blowfish benchmark (two hashes at a time)
Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X2]... 3830 c/s real, 958 c/s virtual
Compiling: Blowfish benchmark (three hashes at a time)
Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X3]... 4982 c/s real, 1245 c/s virtual

So, I suggest you test BF_X2 = 3 for generic builds (if BF_X2 = 1 is
better than BF_X2 = 0).
Maybe you could also reconsider allowing BF_X2 = 3 for AVX.

Is there anything I can test, or any more information you need, to decide
when BF_X2 should be 3 even for AVX, and when it shouldn't?

Here's my patch to enhance generic:

diff --git a/src/best.sh b/src/best.sh
index 5e671b1..0192183 100755
--- a/src/best.sh
+++ b/src/best.sh
@@ -122,11 +122,21 @@
 echo "Compiling: Blowfish benchmark (two hashes at a time)"
 $MAKE bench || exit 1
 RES=`./bench 3` || exit 1
 if [ $RES -gt $MAX ]; then
+	MAX=$RES
 	BF_X2=1
+	./detect $DES_BEST $DES_COPY $DES_BS $MD5_X2 $MD5_IMM $BF_SCALE 3 > arch.h
+	rm -f $BF_DEPEND bench
+	echo "Compiling: Blowfish benchmark (three hashes at a time)"
+	$MAKE bench || exit 1
+	RES=`./bench 3` || exit 1
+	if [ $RES -gt $MAX ]; then
+		BF_X2=3
+	fi
 else
 	BF_X2=0
 fi

Frank
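
For illustration only, the decision the patch adds to best.sh can be
sketched as a standalone shell snippet, together with the arithmetic behind
the 23% figure. The function names pick_bf_x2 and regression_pct are
hypothetical and do not exist in best.sh; the inputs are the c/s values
quoted in this message, and the real script obtains them by rebuilding and
running ./bench rather than taking them as arguments.

```shell
#!/bin/sh
# Hypothetical helper (not part of best.sh): choose BF_X2 from benchmark
# results in c/s, in the same order of tests as the patch.
#   $1 = best speed so far (MAX before the "two hashes" test)
#   $2 = speed with two hashes at a time
#   $3 = speed with three hashes at a time
pick_bf_x2() {
	max=$1
	if [ "$2" -gt "$max" ]; then
		max=$2
		# Three hashes at a time is only tried if two already won.
		if [ "$3" -gt "$max" ]; then
			echo 3
		else
			echo 1
		fi
	else
		echo 0
	fi
}

# Hypothetical helper: slowdown of $1 relative to $2 in whole percent,
# rounded to nearest, using integer arithmetic only.
regression_pct() {
	echo $(( (($2 - $1) * 100 + $2 / 2) / $2 ))
}

pick_bf_x2 2066 3830 4982    # generic build figures from best.sh
regression_pct 3888 5040     # X2 vs. X3 with OpenMP
```

With the figures above, the first call prints 3 and the second prints 23.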