Message-ID: <20150831072643.GA10898@openwall.com>
Date: Mon, 31 Aug 2015 10:26:43 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Argon2 on GPU

On Sun, Aug 30, 2015 at 11:56:32PM +0200, Agnieszka Bielec wrote:
> [a@...er run]$ LWS=64 GWS=2348 ./john --test --format=argon2d-opencl
> --v=4 --dev=5
> Benchmarking: argon2d-opencl [Blake2 OpenCL]...
> memory per hash : 1.50 MB
> Device 5: GeForce GTX TITAN
> Options used: -I ./kernels -cl-mad-enable -cl-nv-verbose -D__GPU__
> -DDEVICE_INFO=65554 -DDEV_VER_MAJOR=352 -DDEV_VER_MINOR=21
> -D_OPENCL_COMPILER -DBINARY_SIZE=256 -DSALT_SIZE=64
> -DPLAINTEXT_LENGTH=125
> Local worksize (LWS) 64, global worksize (GWS) 2304
> using different password for benchmarking
> DONE
> Speed for cost 1 (t) of 1, cost 2 (m) of 1536, cost 3 (l) of 1
> Many salts:     9124 c/s real, 9035 c/s virtual
> Only one salt:  9124 c/s real, 9124 c/s virtual

Your specified GWS gets rounded to 2304 (as you can see above), and
besides you can do slightly better:

[solar@...er run]$ LWS=48 GWS=2304 ./john -test -format=argon2d-opencl -v=4 -dev=5
Benchmarking: argon2d-opencl [Blake2 OpenCL]... 
memory per hash : 1.50 MB
Device 5: GeForce GTX TITAN
Options used: -I ./kernels -cl-mad-enable -cl-nv-verbose -D__GPU__ -DDEVICE_INFO=65554 -DDEV_VER_MAJOR=352 -DDEV_VER_MINOR=21 -D_OPENCL_COMPILER -DBINARY_SIZE=256 -DSALT_SIZE=64 -DPLAINTEXT_LENGTH=125
Local worksize (LWS) 48, global worksize (GWS) 2304
using different password for benchmarking
DONE
Speed for cost 1 (t) of 1, cost 2 (m) of 1536, cost 3 (l) of 1
Many salts:     9680 c/s real, 9600 c/s virtual
Only one salt:  9600 c/s real, 9600 c/s virtual
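
The rounding and the gain seen in the two runs above can be checked with a few lines of Python.  This is only a sketch: it assumes the requested GWS is rounded down to a multiple of LWS (which matches 2348 becoming 2304 at LWS=64, but the actual logic in the OpenCL code may differ), and both function names are made up for illustration.

```python
def effective_gws(requested_gws, lws):
    """Round the requested GWS down to a multiple of LWS.

    Assumption: rounding is downward to a whole number of work-groups;
    this matches the benchmark output above but is not taken from the
    john source code.
    """
    return (requested_gws // lws) * lws


def speedup_percent(old_cs, new_cs):
    """Relative gain of new_cs over old_cs, in percent."""
    return 100.0 * (new_cs - old_cs) / old_cs


print(effective_gws(2348, 64))   # requested GWS from the first run
print(effective_gws(2304, 48))   # already a multiple of 48
print(round(speedup_percent(9124, 9680), 1))  # "many salts" real speeds
```

With LWS=48 the same GWS of 2304 is 48 work-groups instead of 36, and the "many salts" speed goes up by about 6%.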

> I just discovered that argon2d with coalescing is faster on both
> TITANs. It's slower with coalescing on my 960M, so I didn't test this
> on super. I should. I have been trying many things for some time,
> but these changes in results come from testing Argon2 more
> carefully. To turn on coalescing in argon2d, only 4 lines need to
> change, but I will add a #define USE_COALESCING 0/1 very soon.

It's unexpected that coalescing helps Argon2d this much.  Please
confirm these speeds by actual cracking, with a wordlist containing no
duplicate lines.
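
A wordlist without duplicate lines can be prepared with something like the following.  This is a minimal sketch, not part of John the Ripper; the function name is hypothetical, and it keeps the first occurrence of each line in its original order.

```python
def unique_lines(lines):
    """Yield each candidate once, in first-seen order.

    Line endings are stripped before comparison so that "word\n" and a
    final "word" without a newline count as the same candidate.
    """
    seen = set()
    for line in lines:
        word = line.rstrip("\r\n")
        if word not in seen:
            seen.add(word)
            yield word


sample = ["password\n", "letmein\n", "password\n", "letmein"]
print(list(unique_lines(sample)))
```

For a real wordlist, the input would be an open file object.  Opening it with encoding="latin-1" is one way to avoid decoding errors on non-UTF-8 lines, since every byte maps to a character.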

> I need to retest Lyra and yescrypt using GWS values that are not powers of 2

Looks like we need a script to test many LWS/GWS combinations.  I am
running a simple script (testing 128 combinations) against Argon2d on
TITAN X now.  We'll probably need to test many more once we have
improved the code further: the current code is obviously still far
from optimized, so it might not be worth a lot of tuning and
benchmarking effort yet.
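
A script of this kind could look roughly like the sketch below.  It is not the script referred to above: the grid values are hypothetical (chosen only so that there are 128 pairs), the helper names are made up, and it assumes LWS and GWS are passed as environment variables and that the speed is read from the "Many salts" line, as in the benchmark runs quoted in this message.

```python
import os
import re
import subprocess

# Hypothetical search grid; 8 LWS values x 16 work-group counts = 128 pairs.
LWS_VALUES = [32, 48, 64, 96, 128, 192, 256, 384]
GROUP_COUNTS = list(range(6, 54, 3))

SPEED_RE = re.compile(r"Many salts:\s+(\d+) c/s real")


def grid(lws_values, group_counts):
    """Return (LWS, GWS) pairs where GWS is always a multiple of LWS."""
    return [(lws, lws * n) for lws in lws_values for n in group_counts]


def parse_speed(output):
    """Extract the 'many salts' real c/s figure, or None if absent."""
    match = SPEED_RE.search(output)
    return int(match.group(1)) if match else None


def run_one(lws, gws, john="./john"):
    """Run one benchmark with LWS/GWS set in the environment."""
    env = dict(os.environ, LWS=str(lws), GWS=str(gws))
    cmd = [john, "--test", "--format=argon2d-opencl", "--v=4", "--dev=5"]
    result = subprocess.run(cmd, env=env, capture_output=True, text=True)
    return parse_speed(result.stdout)


def sweep(pairs):
    """Benchmark every pair; return (speed, lws, gws) sorted best first."""
    results = []
    for lws, gws in pairs:
        speed = run_one(lws, gws)
        if speed is not None:  # skip runs that failed or printed no speed
            results.append((speed, lws, gws))
    return sorted(results, reverse=True)
```

Calling sweep(grid(LWS_VALUES, GROUP_COUNTS)) from the run directory would then list the combinations from fastest to slowest.  Runs that fail (for example, when the GPU is out of memory) simply produce no entry.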

Alexander
