Message-ID: <20150814181122.GB30798@openwall.com>
Date: Fri, 14 Aug 2015 21:11:22 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Argon2 on GPU

On Fri, Aug 14, 2015 at 08:01:31PM +0200, Agnieszka Bielec wrote:
> 2015-08-14 19:06 GMT+02:00 Solar Designer <solar@...nwall.com>:
> > On Fri, Aug 14, 2015 at 07:02:39PM +0200, Agnieszka Bielec wrote:
> >> Ah, that link is for argon2d. It's faster than argon2i because t_cost
> >> is 1 for argon2d, versus 3 for argon2i.
> >
> > Sure, but IIRC on other benchmarks you posted there was only a small
> > difference in performance for 2i at t=3 and 2d at t=1.  Also, this
> > doesn't explain the ~10x worse performance we're seeing for 2i now.
> 
> where do you see ~10x better performance than now with the same costs?

Not the same, but I meant this:

http://www.openwall.com/lists/john-dev/2015/08/14/42

[a@...er run]$ ./john --test --format=argon2i-opencl --v=4
Benchmarking: argon2i-opencl [Blake2 OpenCL]...
memory per hash : 1.46 MB
Device 0: Tahiti [AMD Radeon HD 7900 Series]
Calculating best global worksize (GWS); max. 1s single kernel invocation.
gws:       256         387 c/s         387 rounds/s 659.830ms per crypt_all()!
gws:       512         720 c/s         720 rounds/s 710.817ms per crypt_all()+
gws:      1024        1305 c/s        1305 rounds/s 784.470ms per crypt_all()+
Local worksize (LWS) 64, global worksize (GWS) 1024
using different password for benchmarking
DONE
Speed for cost 1 (t) of 3, cost 2 (m) of 1500, cost 3 (l) of 1
Many salts:     389 c/s real, 102400 c/s virtual
Only one salt:  386 c/s real, 51200 c/s virtual

vs. this:

http://www.openwall.com/lists/john-dev/2015/08/12/11

[a@...er run]$ ./john --test --format=argon2d-opencl --v=4
Benchmarking: argon2d-opencl [Blake2 OpenCL]...
memory per hash : 1.46 MB
Device 0: Tahiti [AMD Radeon HD 7900 Series]
Calculating best global worksize (GWS); max. 1s single kernel invocation.
gws:       256         964 c/s         964 rounds/s 265.514ms per crypt_all()!
gws:       512        1878 c/s        1878 rounds/s 272.497ms per crypt_all()+
gws:      1024        3447 c/s        3447 rounds/s 297.022ms per crypt_all()+
Local worksize (LWS) 64, global worksize (GWS) 1024
using different password for benchmarking
DONE
Speed for cost 1 (t) of 1, cost 2 (m) of 1500, cost 3 (l) of 1
Many salts:     2925 c/s real, 307200 c/s virtual
Only one salt:  2898 c/s real, 307200 c/s virtual

It's 2i at t=3 vs. 2d at t=1.  I'd expect the former to be at most 3x
slower (because of higher t), and in practice less than that due to 2i's
predictable and coalescing-friendly access pattern.
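
For reference, the ratios can be worked out directly from the numbers quoted above. This is only a rough sketch in Python: the figures are copied from the two benchmark outputs, the dictionary keys and the slowdown() helper are names made up for illustration, and it assumes the auto-tuning c/s and the final c/s can be compared like for like, which may not hold.

```python
# Figures copied from the two benchmark runs quoted above (c/s).
# Key names are illustrative only, not anything john itself reports.
argon2i = {"t": 3, "tuning_gws1024": 1305, "many_salts": 389, "one_salt": 386}
argon2d = {"t": 1, "tuning_gws1024": 3447, "many_salts": 2925, "one_salt": 2898}


def slowdown(key):
    """Return how many times slower 2i (t=3) is than 2d (t=1) for one figure."""
    return argon2d[key] / argon2i[key]


# Upper bound expected from the difference in t alone.
expected_max = argon2i["t"] / argon2d["t"]

for key in ("tuning_gws1024", "many_salts", "one_salt"):
    print(f"{key}: {slowdown(key):.2f}x slower "
          f"(expected at most {expected_max:.0f}x)")
```

On these figures the auto-tuning lines at GWS 1024 come out below 3x, while the final "Many salts" and "Only one salt" speeds come out at roughly 7.5x, so the gap appears in the final benchmark rather than in the tuning pass.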

Alexander
