Message-ID: <20fcf4bac6343b006dd429cbf778f669@smtp.hushmail.com>
Date: Tue, 08 May 2012 01:13:37 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: Lukas - status report #3

On 05/07/2012 11:22 PM, Lukas Odzioba wrote:
> I've been working on OpenCL problems on Bull. Unfortunately I wasn't
> able to fix any of them. Magnum stated that they should be trivial,
> but somehow I couldn't make the formats work as they should.

I didn't really intend to fix your problems but I noticed you never
implemented this: http://www.openwall.com/lists/john-dev/2012/04/10/4 so
I got curious now and tried it.


OpenCL platform 0: NVIDIA CUDA, 1 device(s).
Using device 0: GeForce GTX 570
Optimal Group work Size = 256
Benchmarking: PHPASS-OPENCL [PORTABLE-MD5]... DONE
Raw:    502690 c/s real, 601043 c/s virtual

OpenCL platform 0: NVIDIA CUDA, 1 device(s).
Using device 0: GeForce GTX 570
Optimal Group work Size = 32
Benchmarking: wpapsk-opencl [GPU - OpenCL]... DONE
Raw:    24094 c/s real, 24094 c/s virtual


Simple as that ;-)

For crypt-md5, it's a matter of compilers not being very verbose (or
rather, not telling you a dang thing). When I get weird problems like
this I usually try all the compilers: nvidia, AMD and Intel. Usually one
of them (just one, and you never know which) reports the problem, but
not this time. I have yet to succeed in building clcc on bull, but on my
laptop I got this:

$ clcc opencl/cryptmd5_kernel.cl output.ptx
Building...

:150:6: error: call to 'rotate' is ambiguous
        a = ROTATE_LEFT(AC1 + x[0], S11);
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
:9:27: note: instantiated from:
#define ROTATE_LEFT(x, s) rotate(x,s)
                          ^~~~~~
<built-in>:2784:22: note: candidate function
int __OVERLOADABLE__ rotate(int, int);
                     ^
<built-in>:2785:23: note: candidate function
uint __OVERLOADABLE__ rotate(uint, uint);
                      ^
<built-in>:2780:23: note: candidate function
char __OVERLOADABLE__ rotate(char, char);
                      ^
<built-in>:2781:24: note: candidate function
uchar __OVERLOADABLE__ rotate(uchar, uchar);
                       ^

...and almost 10,000 similar lines. So armed with this knowledge it was
in fact trivial:

-#define ROTATE_LEFT(x, s) rotate(x,s)
+#define ROTATE_LEFT(x, s) rotate(x, (uint32_t)s)
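
To illustrate why the cast helps, here is a rough host-side sketch in C++
rather than OpenCL C. The two rotate() overloads are hypothetical stand-ins
for two of the built-ins listed by clcc, and it assumes the kernel passes an
unsigned value together with a plain signed constant such as S11. With mixed
(uint, int) arguments neither overload is a better match, so the call is
ambiguous; casting the count makes the unsigned overload an exact match.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for two of the built-in overloads in the clcc
// output; the real OpenCL built-ins also cover char, uchar and more.
constexpr std::int32_t rotate(std::int32_t v, std::int32_t n) {
    return static_cast<std::int32_t>(
        (static_cast<std::uint32_t>(v) << (n & 31)) |
        (static_cast<std::uint32_t>(v) >> ((32 - n) & 31)));
}

constexpr std::uint32_t rotate(std::uint32_t v, std::uint32_t n) {
    return (v << (n & 31u)) | (v >> ((32u - n) & 31u));
}

// Same shape as the fixed macro: the count is cast to the value's type.
#define ROTATE_LEFT(x, s) rotate((x), static_cast<std::uint32_t>(s))

// Assumption: the shift amount is a signed constant, as S11 would be if
// it were defined as a bare integer literal.
constexpr int S11 = 7;

inline std::uint32_t md5_like_step(std::uint32_t a) {
    // A bare rotate(a, S11) would not compile here: (uint32_t, int) needs
    // one conversion for either overload, so neither candidate wins.
    return ROTATE_LEFT(a, S11);
}

// Compile-time sanity check of the rotation itself.
static_assert(rotate(std::uint32_t{0x80000001u}, std::uint32_t{1}) == 0x00000003u,
              "rotate left by one wraps the top bit around");
```

Calling md5_like_step(1u) rotates the single set bit left by seven places.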


OpenCL platform 0: NVIDIA CUDA, 1 device(s).
Using device 0: GeForce GTX 570
Max Group Work Size 960
Optimal Group work Size = 128
Benchmarking: CRYPTMD5-OPENCL [MD5-based CRYPT]... DONE
Raw:    653872 c/s real, 646647 c/s virtual


On the 7970 we get a nice ASIC hang as usual though :-/

magnum
