john-dev - Re: OpenCL optimization trick for md*

Message-ID: <20120821184543.GA10726@openwall.com>
Date: Tue, 21 Aug 2012 22:45:43 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: OpenCL optimization trick for md*

Hi Alain,

On Tue, Aug 21, 2012 at 09:51:52AM -0700, Alain Espinosa wrote:
> Hi. Here is a small OpenCL optimization I will use in the next Hash Suite.

Thank you!

> It works in nt_kernel.cl, md4_kernel.cl, md5_kernel.cl, sha1_kernel.cl and
> possibly others.

These are deliberately not optimized at this time - no point in having
their crypto code run faster when it's bottlenecked by password
generation on CPU anyway.

In myrice's branch, though, md5_kernel.cl does include optimizations of
this kind, because that branch is for experiments with password
generation on GPU.

> In nt_kernel.cl change all round 1 steps:
> a += (d ^ (b & (c ^ d)))  +  nt_buffer[4] ; a = rotate(a , 3u );
> by
> a += bitselect(d, c, b) +  nt_buffer[4] ; a = rotate(a , 3u );

Yeah.
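
For reference, here is a minimal sketch in plain C (not OpenCL) of why the two round 1 forms agree. bitselect32() is a hypothetical host-side stand-in for OpenCL's bitselect(a, b, c), which takes each result bit from b where the corresponding bit of c is set and from a otherwise. The function names are made up for illustration and are not from nt_kernel.cl.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for OpenCL bitselect(a, b, c):
 * result bit = bit of b where c has a 1, bit of a where c has a 0. */
static uint32_t bitselect32(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & ~c) | (b & c);
}

/* Round 1 function as written in the original kernel line. */
static uint32_t md4_f_orig(uint32_t b, uint32_t c, uint32_t d)
{
    return d ^ (b & (c ^ d));
}

/* Round 1 function in the proposed bitselect form. */
static uint32_t md4_f_sel(uint32_t b, uint32_t c, uint32_t d)
{
    return bitselect32(d, c, b);
}

/* Both forms are purely bitwise, so it is enough to cover all eight
 * combinations of one bit from each input.  The three patterns below
 * contain every combination within each byte. */
static void check_md4_f(void)
{
    const uint32_t b = 0x0F0F0F0Fu, c = 0x33333333u, d = 0x55555555u;
    assert(md4_f_orig(b, c, d) == md4_f_sel(b, c, d));
}
```

Calling check_md4_f() from any test driver exercises the identity. Whether the bitselect form is actually faster depends on whether the compiler and device map it to a single instruction, which this sketch says nothing about.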

> This works in nt_kernel.cl and md4_kernel.cl, but I am not sure it is an
> improvement. In nt_kernel.cl change all round 2 steps:
> a += ((b & (c | d)) | (c & d)) + nt_buffer[0] + SQRT_2; a = rotate(a , 3u );
> by
> a += bitselect(bitselect(b,c,d), bitselect(d,b,c), b) + nt_buffer[0] + SQRT_2; a = rotate(a , 3u );

Now this is curious.  It's similar to what Sayantan did for SHA-1's
round 3.  We should try it for XOP and see if it helps there.
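
A minimal sketch in plain C (not OpenCL) that checks the nested form against the usual majority expression. As an assumption for illustration, bitselect32() stands in for OpenCL's bitselect(a, b, c), and the function names are hypothetical. Where b is set, the outer select picks bitselect(d,b,c), which reduces to c | d; where b is clear, it picks bitselect(b,c,d), which reduces to c & d. Together that is the majority of b, c and d.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for OpenCL bitselect(a, b, c):
 * result bit = bit of b where c has a 1, bit of a where c has a 0. */
static uint32_t bitselect32(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & ~c) | (b & c);
}

/* Round 2 (majority) function as written in the original kernel line. */
static uint32_t md4_g_orig(uint32_t b, uint32_t c, uint32_t d)
{
    return (b & (c | d)) | (c & d);
}

/* Round 2 function in the proposed nested-bitselect form. */
static uint32_t md4_g_sel(uint32_t b, uint32_t c, uint32_t d)
{
    return bitselect32(bitselect32(b, c, d), bitselect32(d, b, c), b);
}

/* Purely bitwise functions: covering all eight single-bit input
 * combinations is sufficient, and these patterns contain them all. */
static void check_md4_g(void)
{
    const uint32_t b = 0x0F0F0F0Fu, c = 0x33333333u, d = 0x55555555u;
    assert(md4_g_orig(b, c, d) == md4_g_sel(b, c, d));
}
```

This only shows that the two expressions compute the same value. It replaces four AND/OR operations with three selects, so any gain would depend on the target having a cheap select instruction.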

Thanks again,

Alexander
