john-dev - Re: Proposed optimizations to pwsafe

Message-ID: <CAJpaVcT3OfRkK_Og8C-1fiQpx_Hkt6Xe7n3LzaqJ28Nd0feZ5w@mail.gmail.com>
Date: Mon, 28 Jan 2013 16:26:04 -0500
From: Brian Wallace <nightstrike9809@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Proposed optimizations to pwsafe

I'm going to try to replace ror with rotate calls, but that seems to require
some type conversions.  I'm doing a bit of reading up on OpenCL dev to fix
any issues and hopefully get more c/s.
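
For reference, a minimal sketch of where the warning quoted below comes from
and one way to silence it. In sigma0( 256 ) the literal 256 is a signed int,
so the macro's x << (32-7) computes 256 << 25 = 0x200000000, which does not
fit in 32 bits. Casting to a 32-bit unsigned type first avoids that. The
names ROR32, SIGMA0, rotl32 and ror_via_rotl are hypothetical and not from
the pwsafe kernel; rotl32 only imitates OpenCL's rotate(), which rotates
left, so a right rotation by n becomes a left rotation by 32-n. This is
plain C so it can be checked on the host; it is not the actual patch.

```c
#include <stdint.h>

/* Hypothetical replacement for the kernel's ror macro: cast to a 32-bit
 * unsigned type before shifting, so an int literal such as 256 is never
 * shifted as a signed value. */
#define ROR32(x, n) \
    ((((uint32_t)(x)) >> (n)) | (((uint32_t)(x)) << (32 - (n))))

/* SHA-256 small sigma0, built on the unsigned rotate above. */
#define SIGMA0(x) \
    (ROR32((x), 7) ^ ROR32((x), 18) ^ (((uint32_t)(x)) >> 3))

/* Host-side stand-in for OpenCL's rotate(v, i) on a 32-bit uint:
 * a rotate LEFT by i bits (i taken modulo 32). */
static uint32_t rotl32(uint32_t v, unsigned i)
{
    i &= 31u;
    return i ? (uint32_t)((v << i) | (v >> (32u - i))) : v;
}

/* Rotate right by n (0 < n < 32), expressed as a left rotate by 32 - n. */
static uint32_t ror_via_rotl(uint32_t v, unsigned n)
{
    return rotl32(v, 32u - n);
}
```

In a kernel the equivalent would presumably be rotate(x, 25u) for ror(x,7),
with both arguments of the same unsigned type, which is likely the type
conversion mentioned above.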

On Mon, Jan 28, 2013 at 1:55 PM, magnum <john.magnum@...hmail.com> wrote:

> Brian,
>
> After your OpenCL patch I get these warnings from pwsafe-opencl:
>
> Build log: <program source>:282:36: warning: signed shift result
> (0x200000000) requires 35 bits to represent, but 'int' only has 32 bits
>                 w[14] = sigma1( w[12] ) + w[7] + sigma0( 256 );
>                                                  ^~~~~~~~~~~~~
> <program source>:21:21: note: expanded from macro 'sigma0'
> #define sigma0(x) ((ror(x,7))  ^ (ror(x,18)) ^ (x>>3))
>                     ^
> <program source>:16:33: note: expanded from macro 'ror'
> #define ror(x,n) ((x >> n) | (x << (32-n)))
>                               ~ ^  ~
> <program source>:615:35: warning: signed shift result (0x200000000)
> requires 35 bits to represent, but 'int' only has 32 bits
>         w[14] = sigma1( w[12] ) + w[7] + sigma0( 256 );
>                                          ^~~~~~~~~~~~~
> <program source>:21:21: note: expanded from macro 'sigma0'
> #define sigma0(x) ((ror(x,7))  ^ (ror(x,18)) ^ (x>>3))
>                     ^
> <program source>:16:33: note: expanded from macro 'ror'
> #define ror(x,n) ((x >> n) | (x << (32-n)))
>                               ~ ^  ~
>
>
> It passes self-test though. Even the Test Suite passes IIRC. So maybe this
> is harmless? But we should still get rid of the warnings.
>
> Note that in the bleeding branch, compiler warnings are always shown. In
> unstable, you need to -DREPORT_OPENCL_WARNINGS or -DDEBUG for them to show
> up (as long as there are only warnings).
>
> magnum
>
>
>
> On 28 Jan, 2013, at 2:09 , Brian Wallace <nightstrike9809@...il.com>
> wrote:
>
> When I applied the opencl optimization, I only saw minor improvements
> compared to the CUDA improvements.  I found that was kind of weird, because
> it was basically the same changes to the code.
>
> On Sun, Jan 27, 2013 at 7:58 PM, magnum <john.magnum@...hmail.com> wrote:
>
>> On 28 Jan, 2013, at 1:41 , Solar Designer <solar@...nwall.com> wrote:
>> > On Sun, Jan 27, 2013 at 07:22:19PM -0500, Brian Wallace wrote:
>> >> Ok, I'll do those changes.  I haven't done much cuda/ocl coding in the
>> >> past, so it might take me a short while to get up to speed on what works
>> >> best, although I have a good background in C and hash cracking
>> >> optimization.  What kind of benchmarks are we getting on pwsafe-opencl vs
>> >> hashcat?
>> >
>> > Apparently, hashcat's speed is ~500k on HD 7970.  hashkill is at ~480k:
>> >
>> > http://twitter.com/gat3way/status/294968226209726464/photo/1
>> >
>> > We're getting 355k:
>> >
>>
>> > (The match of OpenCL and CUDA speed is curious.  I did not tune THREADS
>> > and BLOCKS in cuda_pwsafe.h, and was compiling for the default of sm_10.
>> > Perhaps better speed is possible with some tuning.)
>>
>> The OpenCL format currently only auto-tunes local work-size (THREADS) so
>> it too runs at suboptimal conditions. The global work-size defaults to the
>> same figure the CUDA format uses. It does support LWS/GWS environment
>> variables though:
>>
>> $ GWS=$((256*1024)) ../run/john -t -fo:pwsafe-opencl -plat=1
>> OpenCL platform 1: AMD Accelerated Parallel Processing, 2 device(s).
>> Device 0: Tahiti (AMD Radeon HD 7900 Series)
>> Local worksize (LWS) 64, Global worksize (GWS) 262144
>> Benchmarking: Password Safe SHA-256 [OpenCL]... DONE
>> Raw:    362411 c/s real, 78643K c/s virtual
>>
>> No huge difference though.
>>
>> In bleeding, Claudio has added a shared function for tuning GWS. I
>> haven't had time to try it out yet.
>>
>> magnum
>>
>
>
>
