john-dev - Re: PHC: Argon2 on GPU

Message-ID: <CAKGDhHXmdHKgAU_6y-Omsa7qz4MjSHtmPMKnzp+Wiip-PoZf1Q@mail.gmail.com>
Date: Thu, 20 Aug 2015 23:29:08 +0200
From: Agnieszka Bielec <bielecagnieszka8@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Argon2 on GPU

2015-08-20 23:03 GMT+02:00 Solar Designer <solar@...nwall.com>:
> On Thu, Aug 20, 2015 at 10:40:16PM +0200, Agnieszka Bielec wrote:
>> 2015-08-20 22:34 GMT+02:00 Solar Designer <solar@...nwall.com>:
>> > On Thu, Aug 20, 2015 at 08:04:20PM +0200, Agnieszka Bielec wrote:
>> >> 2015-08-19 18:39 GMT+02:00 Solar Designer <solar@...nwall.com>:
>> >> > I think you may try working on ulong16 or ulong8 instead.  I expect
>> >> > ulong8 to match the current GPU hardware best, but OTOH ulong16 makes
>> >> > more parallelism apparent to the OpenCL compiler and allocates it to one
>> >> > work-item.  So please try both and see which works best.
>> >>
>> >> I created something using ulong8. The speedup on my laptop is barely
>> >> noticeable, and it is slower on both cards in super. I'm not sure this
>> >> is what you wanted (I think not); you can take a look at the vector8
>> >> branch.
>>
>> > You should also
>> > use the wider vector type for the global memory references and in the
>> > kernel parameter list.
>>
>> It was even slower (on super, both cards).
>
> Where is the code?  Slower now doesn't necessarily mean we're doing
> anything wrong - it might also mean we're not doing enough of it yet.

Deleted; it wasn't much effort anyway.

>
> And how much slower was it?  Did you try re-tuning LWS and GWS?

nope
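
For context on the "ulong8 for copying and xoring" variant benchmarked
further down, here is a minimal plain-C sketch of what moving and XORing
an Argon2 block in 8-lane units looks like. It is not the deleted kernel
code. The vec8 struct is a hypothetical stand-in for OpenCL's ulong8, and
the names block_copy_v8 and block_xor_v8 are made up for illustration.
The only assumed fact is that an Argon2 block is 1024 bytes (128 64-bit
words), which gives 16 such vectors per block.

```c
#include <stdint.h>
#include <stddef.h>

/* An Argon2 block is 1024 bytes, i.e. 128 64-bit words. */
#define BLOCK_WORDS 128
#define LANES       8                      /* lanes in the stand-in vector */
#define BLOCK_VECS  (BLOCK_WORDS / LANES)  /* 16 vectors per block */

/* Hypothetical stand-in for OpenCL's ulong8; not the type used in JtR. */
typedef struct { uint64_t s[LANES]; } vec8;

/* Copy one block, one 8-lane vector at a time. */
static void block_copy_v8(vec8 *dst, const vec8 *src)
{
    for (size_t i = 0; i < BLOCK_VECS; i++)
        dst[i] = src[i];
}

/* XOR src into dst, one 8-lane vector at a time. */
static void block_xor_v8(vec8 *dst, const vec8 *src)
{
    for (size_t i = 0; i < BLOCK_VECS; i++)
        for (size_t j = 0; j < LANES; j++)
            dst[i].s[j] ^= src[i].s[j];
}
```

In an actual OpenCL kernel the inner lane loop would collapse into a
single vector expression, and the pointers would refer to global memory;
whether that helps or hurts is exactly what the numbers below measure.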

vector8

[a@...er run]$ GWS=1024 ./john --test --format=argon2d-opencl
Benchmarking: argon2d-opencl [Blake2 OpenCL]...
memory per hash : 1.50 MB
Device 0: Tahiti [AMD Radeon HD 7900 Series]
using different password for benchmarking
DONE
Speed for cost 1 (t) of 1, cost 2 (m) of 1536, cost 3 (l) of 1
Many salts:     2061 c/s real, 307200 c/s virtual
Only one salt:  2104 c/s real, 307200 c/s virtual

[a@...er run]$ GWS=1024 ./john --test --format=argon2d-opencl --dev=5
Benchmarking: argon2d-opencl [Blake2 OpenCL]...
memory per hash : 1.50 MB
Device 5: GeForce GTX TITAN
using different password for benchmarking
DONE
Speed for cost 1 (t) of 1, cost 2 (m) of 1536, cost 3 (l) of 1
Many salts:     4970 c/s real, 5019 c/s virtual
Only one salt:  5019 c/s real, 4970 c/s virtual

vector8 + ulong8 for copying and XORing

[a@...er run]$ GWS=1024 ./john --test --format=argon2d-opencl
Benchmarking: argon2d-opencl [Blake2 OpenCL]...
memory per hash : 1.50 MB
Device 0: Tahiti [AMD Radeon HD 7900 Series]
using different password for benchmarking
DONE
Speed for cost 1 (t) of 1, cost 2 (m) of 1536, cost 3 (l) of 1
Many salts:     1563 c/s real, 102400 c/s virtual
Only one salt:  1563 c/s real, 204800 c/s virtual

[a@...er run]$ GWS=1024 ./john --test --format=argon2d-opencl --dev=5
Benchmarking: argon2d-opencl [Blake2 OpenCL]...
memory per hash : 1.50 MB
Device 5: GeForce GTX TITAN
using different password for benchmarking
DONE
Speed for cost 1 (t) of 1, cost 2 (m) of 1536, cost 3 (l) of 1
Many salts:     4970 c/s real, 4923 c/s virtual
Only one salt:  4923 c/s real, 4970 c/s virtual
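
To put a number on "how much slower", the relative change can be worked
out from the "c/s real" figures above. The helper below is only an
illustrative sketch (pct_change is a made-up name); the inputs in the
comments are the benchmark values from this message.

```c
/* Percent change from `before` to `after`; a negative result means slower. */
static double pct_change(double before, double after)
{
    return (after - before) / before * 100.0;
}

/*
 * Inputs taken from the runs above (c/s real, GWS=1024):
 *   Tahiti, many salts:    pct_change(2061.0, 1563.0)  -> about -24.2
 *   Tahiti, only one salt: pct_change(2104.0, 1563.0)  -> about -25.7
 *   TITAN,  many salts:    pct_change(4970.0, 4970.0)  -> 0
 *   TITAN,  only one salt: pct_change(5019.0, 4923.0)  -> about -1.9
 */
```

So with GWS fixed at 1024 and no re-tuning, the ulong8 copy/XOR variant
is roughly a quarter slower on Tahiti and close to unchanged on the TITAN.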