Message-ID: <20130713192024.GA25369@openwall.com>
Date: Sat, 13 Jul 2013 23:20:24 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com, "Sc00bz64@...oo.com" <sc00bz64@...oo.com>
Subject: Re: Anyone want to benchmark AVX2 code for bcrypt

On Wed, Jun 26, 2013 at 09:09:27AM -0700, Sc00bz64@...oo.com wrote:
> Windows 8 x64 4770k at 4.1GHz
>
> Single threaded performance:
> AVX2: 868.6 h/s
> Hashcat: 1170 h/s (hashcat-cli64.exe -m 3200 -a 3 -n 1 m3200.txt -1 ?l?u?d?s ?1?1?1?1?1?1?1)

4770K at stock clocks (I think it's 3.5 GHz base, 3.9 GHz max turbo),
Ubuntu 12.04.2, your original binary:

solar@...l:~/j/bcrypt/bcryptavx2$ ./bcryptbench64.orig
Tests PASSED
Benchmarking...
AVX2: 8*256 took 1.9838 sec (1032.4 h/s)
0
Normal: 1024 took 1.6849 sec (607.8 h/s)
0

My rebuild of it:

solar@...l:~/j/bcrypt/bcryptavx2$ ./bcryptbench64
Tests PASSED
Benchmarking...
AVX2: 8*128 took 1.0204 sec (1003.5 h/s)
0
Normal: 1024 took 1.6829 sec (608.5 h/s)
0

solar@...l:~/j/bcrypt/bcryptavx2$ gcc --version
gcc (GCC) 4.9.0 20130707 (experimental)

JtR using one core:

solar@...l:~/j/john-1.8.0/run$ ./john -te -form=bcrypt
Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X2]... DONE
Raw: 1110 c/s real, 1110 c/s virtual

... and in an OpenMP-enabled build:

solar@...l:~/j/john-1.8.0/run$ OMP_NUM_THREADS=1 ./john -te -form=bcrypt
Warning: OpenMP is disabled; a non-OpenMP build may be faster
Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X2]... DONE
Raw: 1104 c/s real, 1106 c/s virtual

JtR using all four cores:

solar@...l:~/j/john-1.8.0/run$ ./john -te -form=bcrypt
Will run 8 OpenMP threads
Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X2]... DONE
Raw: 6595 c/s real, 825 c/s virtual

> So not using AVX2 is faster.

Yes, but by very little. I think we should be able to repair that by
tweaking the code. I don't know why a bigger slowdown with AVX2 was
seen on Windows. Different compiler?
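
For anyone comparing these numbers by hand: the h/s figures printed by the benchmark appear to be simply the hash count divided by the elapsed time. The sketch below is not part of the benchmark tool or of JtR. It only recomputes the rates from the counts and timings quoted above and derives a couple of ratios, and the names `rate` and `runs` are made up for illustration.

```python
# Recompute throughput from the benchmark lines quoted above.
# Names (rate, runs) are illustrative, not from the benchmark tool.

def rate(hashes, seconds):
    """Hashes per second for a run that computed `hashes` in `seconds`."""
    return hashes / seconds

# (hash count, elapsed seconds) as printed by bcryptbench64
runs = {
    "orig AVX2":      (8 * 256, 1.9838),
    "orig Normal":    (1024,    1.6849),
    "rebuild AVX2":   (8 * 128, 1.0204),
    "rebuild Normal": (1024,    1.6829),
}

for name, (hashes, seconds) in runs.items():
    print(f"{name}: {rate(hashes, seconds):.1f} h/s")

# Ratios between figures reported in this message.
jtr_one_core = 1110   # c/s, JtR non-OpenMP build, one core
jtr_all_cores = 6595  # c/s, JtR with 8 OpenMP threads
avx2_rebuild = rate(*runs["rebuild AVX2"])

print(f"JtR one core vs AVX2 rebuild: {jtr_one_core / avx2_rebuild:.2f}x")
print(f"JtR 8 threads vs one core:    {jtr_all_cores / jtr_one_core:.2f}x")
```

The ratios make the "by very little" point concrete: the gap between the AVX2 rebuild and single-core JtR on this machine is much smaller than the gap in the quoted Windows numbers.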
I also tried JtR bcrypt-opencl with both AMD's and Intel's OpenCL SDK on
this CPU. AMD's produces 5120 c/s. Intel's produces 2576 c/s (might be
using AVX2 and exceeding the cache).

Alexander