Message-ID: <20130713192024.GA25369@openwall.com>
Date: Sat, 13 Jul 2013 23:20:24 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com,
	"Sc00bz64@...oo.com" <sc00bz64@...oo.com>
Subject: Re: Anyone want to benchmark AVX2 code for bcrypt

On Wed, Jun 26, 2013 at 09:09:27AM -0700, Sc00bz64@...oo.com wrote:
> Windows 8 x64 4770k at 4.1GHz
> 
> Single threaded performance:
> AVX2: 868.6 h/s
> Hashcat: 1170 h/s (hashcat-cli64.exe -m 3200 -a 3 -n 1 m3200.txt -1 ?l?u?d?s ?1?1?1?1?1?1?1)

4770K at stock clocks (I think it's 3.5 GHz base, 3.9 GHz max turbo),
Ubuntu 12.04.2, your original binary:

solar@...l:~/j/bcrypt/bcryptavx2$ ./bcryptbench64.orig
Tests PASSED
Benchmarking...
AVX2: 8*256 took 1.9838 sec (1032.4 h/s) 0
Normal: 1024 took 1.6849 sec (607.8 h/s) 0

My rebuild of it:

solar@...l:~/j/bcrypt/bcryptavx2$ ./bcryptbench64
Tests PASSED
Benchmarking...
AVX2: 8*128 took 1.0204 sec (1003.5 h/s) 0
Normal: 1024 took 1.6829 sec (608.5 h/s) 0
solar@...l:~/j/bcrypt/bcryptavx2$ gcc --version
gcc (GCC) 4.9.0 20130707 (experimental)
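
The h/s figures in both runs above are simply the number of hashes computed divided by the elapsed time.  A minimal Python sketch that re-derives them from the printed counts and timings (the function name and the run labels are my own, not part of the benchmark program):

```python
def hashes_per_second(count, seconds):
    """Throughput as hashes computed divided by elapsed wall time."""
    return count / seconds

# (hash count, elapsed seconds) copied from the benchmark output above.
runs = {
    "orig AVX2":      (8 * 256, 1.9838),
    "orig Normal":    (1024, 1.6849),
    "rebuild AVX2":   (8 * 128, 1.0204),
    "rebuild Normal": (1024, 1.6829),
}

for name, (count, seconds) in runs.items():
    print(f"{name}: {hashes_per_second(count, seconds):.1f} h/s")
```

Note that the original binary computes 8*256 hashes in the AVX2 test and the rebuild only 8*128, so the elapsed times are not directly comparable, but the rates are.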

JtR using one core:

solar@...l:~/j/john-1.8.0/run$ ./john -te -form=bcrypt
Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X2]... DONE
Raw:    1110 c/s real, 1110 c/s virtual

... and in an OpenMP-enabled build:

solar@...l:~/j/john-1.8.0/run$ OMP_NUM_THREADS=1 ./john -te -form=bcrypt
Warning: OpenMP is disabled; a non-OpenMP build may be faster
Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X2]... DONE
Raw:    1104 c/s real, 1106 c/s virtual

JtR using all four cores:

solar@...l:~/j/john-1.8.0/run$ ./john -te -form=bcrypt
Will run 8 OpenMP threads
Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X2]... DONE
Raw:    6595 c/s real, 825 c/s virtual
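
For a rough idea of how the 8-thread run scales, here is a small Python sketch.  It assumes (my assumption, not something the output states) that the "virtual" figure is approximately the real rate divided by the number of busy threads, and the helper names are made up for illustration:

```python
def speedup(multi_cs, single_cs):
    """How many times faster the multi-threaded run is than one core."""
    return multi_cs / single_cs

def per_thread(real_cs, threads):
    """Real c/s spread evenly over the threads (assumed to approximate 'virtual')."""
    return real_cs / threads

single_core_cs = 1110   # non-OpenMP build, one core
all_threads_cs = 6595   # OpenMP build, real c/s
threads = 8             # "Will run 8 OpenMP threads"
physical_cores = 4

s = speedup(all_threads_cs, single_core_cs)
print(f"speedup over one core: {s:.2f}x")
print(f"speedup per physical core: {s / physical_cores:.2f}x")
print(f"real c/s per thread: {per_thread(all_threads_cs, threads):.1f}")
```

The per-thread figure comes out close to the 825 c/s virtual reported above, and the speedup of nearly 6x on four cores shows that running two threads per core helps this code.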

> So not using AVX2 is faster.

Yes, but by very little.  I think we should be able to repair that by
tweaking the code.

I don't know why a bigger slowdown with AVX2 was seen on Windows.
Different compiler?
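
To put numbers on "very little" versus the Windows result, here is a quick Python sketch of the relative advantage of the non-AVX2 code in each case.  The function name is hypothetical, and the comparison is only approximate: the Windows figures pit the AVX2 benchmark against Hashcat at 4.1 GHz, the Linux ones against JtR at stock clocks.

```python
def advantage_percent(faster, slower):
    """Percentage by which 'faster' exceeds 'slower'."""
    return (faster / slower - 1.0) * 100.0

# (non-AVX2 rate, AVX2 rate), single-threaded, from the figures in this thread.
cases = {
    "Windows: Hashcat vs AVX2":      (1170.0, 868.6),
    "Linux: JtR vs AVX2 (original)": (1110.0, 1032.4),
    "Linux: JtR vs AVX2 (rebuild)":  (1110.0, 1003.5),
}

for name, (non_avx2, avx2) in cases.items():
    print(f"{name}: non-AVX2 ahead by {advantage_percent(non_avx2, avx2):.1f}%")
```

That is roughly 35% on Windows against 7.5% to 11% here.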

I also tried JtR bcrypt-opencl with both AMD's and Intel's OpenCL SDKs on
this CPU.  AMD's produces 5120 c/s.  Intel's produces 2576 c/s (it might be
using AVX2 and exceeding the cache).

Alexander
