Message-ID: <CABob6iqU5uP8h2qctdp-2NvUG++yGUoGfoBY7Uf=9GaZP2=WsQ@mail.gmail.com>
Date: Sat, 11 Aug 2012 22:53:05 +0200
From: Lukas Odzioba <lukas.odzioba@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: pwsafe-gpu

2012/8/11 magnum <john.magnum@...hmail.com>:
> On 2012-08-11 13:00, magnum wrote:
>> On 2012-08-11 02:38, Solar Designer wrote:
>>> Lukas -
>>>
>>> As discussed on IRC, here are my changes to pwsafe-cuda.  The most
>>> important one is initialization of w[14].  This makes the problem go
>>> away for me.  Please implement similar changes to pwsafe-opencl.
>>>
>>> magnum - we need to get both sets of changes into the -fixes branch,
>>> even though my patch happens to be against magnum-jumbo.
>>
>> Aye. This fixes all problems, though I will wait for Lukas's patch before
>> committing. The same changes to pwsafe-opencl give a 40% boost on GTX570!
>> Though it's still a lot slower than the CUDA format despite being
>> identical code - I figure that's mostly caused by suboptimal LWS/GWS figures.
>
> It sure was:
>
> magnum@...l:src [1.7.9-jumbo-6-fixes]$ ../run/john -t -fo:pwsafe-cuda
> Benchmarking: Password Safe SHA-256 [CUDA]... DONE
> Raw:    109574 c/s real, 110276 c/s virtual
>
> magnum@...l:src [1.7.9-jumbo-6-fixes]$ ../run/john -t -fo:pwsafe-opencl
> OpenCL platform 0: NVIDIA CUDA, 1 device(s).
> Using device 0: GeForce GTX 570
> Benchmarking: Password Safe SHA-256 [OpenCL]... DONE
> Raw:    37948 c/s real, 37726 c/s virtual
>
> -#define KEYS_PER_CRYPT         1024
> +#define KEYS_PER_CRYPT         512*112
>
> magnum@...l:src [1.7.9-jumbo-6-fixes]$ ../run/john -t -fo:pwsafe-opencl
> OpenCL platform 0: NVIDIA CUDA, 1 device(s).
> Using device 0: GeForce GTX 570
> Benchmarking: Password Safe SHA-256 [OpenCL]... DONE
> Raw:    128862 c/s real, 128862 c/s virtual
>
> I just picked the number used in CUDA - I suppose it can be even better.
>
> magnum
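
A side note on the KEYS_PER_CRYPT define quoted above: written as a bare 512*112, the macro can bind unexpectedly when it ends up next to / or %. I have not checked whether the format uses it that way, so this is only a sketch of the pitfall. The names KPC_BARE, KPC_PAREN, batches_bare and batches_paren are made up for illustration and are not from the tree.

```c
/* Hypothetical macro names, for illustration only (not from the john tree). */
#define KPC_BARE   512*112     /* bare product, as in the quoted diff */
#define KPC_PAREN  (512*112)   /* parenthesized variant */

/* How many full batches fit into `total` candidates? */
int batches_bare(int total)
{
    /* Expands to total / 512*112, which parses as (total / 512) * 112. */
    return total / KPC_BARE;
}

int batches_paren(int total)
{
    /* Expands to total / (512*112), i.e. total / 57344. */
    return total / KPC_PAREN;
}
```

With total = 1048576, batches_bare() gives 229376 while batches_paren() gives 18, so wrapping the value in parentheses is the safer spelling.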

I'll try to make it faster later; right now we have more important formats
that need tweaking.
Faster OpenCL code is nothing new to me (the CL compiler does a better job
here, and dummy code is nearly always faster on OpenCL). After proper
optimization, the two formats should run at similar speeds.
About the 40%: did you mean the memset change or w[14]=0?
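
To make the w[14] question concrete, here is a minimal sketch of single-block SHA-256 padding. It assumes w[] is the usual 16-word big-endian message block and that the input is at most 55 bytes; pack_block is a hypothetical helper, not the actual pwsafe kernel code. In standard SHA-256 padding, w[14] and w[15] hold the 64-bit message length in bits, so for short inputs w[14] must be 0, and leaving it uninitialized feeds garbage into the hash.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not the pwsafe kernel): pack a short message
 * (len <= 55 bytes) into one padded SHA-256 block of 16 big-endian words.
 */
void pack_block(uint32_t w[16], const unsigned char *msg, uint32_t len)
{
    uint32_t i;

    /* Clear the data words; this is the part a memset could also do. */
    for (i = 0; i < 14; i++)
        w[i] = 0;

    /* Copy message bytes, most significant byte first within each word. */
    for (i = 0; i < len; i++)
        w[i >> 2] |= (uint32_t)msg[i] << (24 - 8 * (i & 3));

    /* Append the mandatory 0x80 padding byte right after the message. */
    w[len >> 2] |= (uint32_t)0x80 << (24 - 8 * (len & 3));

    /* 64-bit bit length: high word in w[14], low word in w[15]. */
    w[14] = 0;        /* must be set explicitly, never left uninitialized */
    w[15] = len << 3; /* length in bits */
}
```

Clearing the whole block with memset and assigning w[14] = 0 directly both produce the same block; the second just touches less memory.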
