john-users - Re: JtR OpenCL patch


Message-ID: <CABob6io-aeq+=rZ2yc-+6kghE0eko6qiNdAMBCH-+R++7rfmCQ@mail.gmail.com>
Date: Fri, 20 Jan 2012 14:59:48 +0100
From: Lukas Odzioba <lukas.odzioba@...il.com>
To: john-users@...ts.openwall.com
Subject: Re: JtR OpenCL patch

2012/1/20 Solar Designer <solar@...nwall.com>:
> Did you actually test on CPUs?  What OpenCL implementation did you use?
Yes, I tested it on my i3 2100. As for the implementation, I am not sure
how it works: I did not install any Intel-specific OpenCL software, just
the AMD drivers.

> My understanding is that Intel's OpenCL implementation is currently only
> capable of using CPU cores, but not yet the GPU component present in
> Sandy Bridge chips.  On the bright side, it does not actually require
> Sandy Bridge, but also supports many older CPUs from Intel.
I do not know much about Intel's OpenCL, but the code is using the CPU:

ukasz@...kstar$./john -test --format=cryptmd5-opencl -gpu=1
OpenCL Platforms: 1
OpenCL Platform: <<<AMD Accelerated Parallel Processing>>> 2 device(s), using device: <<<Intel(R) Core(TM) i3-2100 CPU @ 3.10GHz>>>
Benchmarking: CRYPTMD5-OPENCL [MD5-based CRYPT]... DONE
Raw:    15620 c/s real, 3963 c/s virtual

ukasz@...kstar$./john -test --format=cryptmd5-opencl
OpenCL Platforms: 1
OpenCL Platform: <<<AMD Accelerated Parallel Processing>>> 2 device(s), using device: <<<Cypress>>>
Benchmarking: CRYPTMD5-OPENCL [MD5-based CRYPT]... DONE
Raw:    131657 c/s real, 131657 c/s virtual
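
To put the two cryptmd5-opencl runs side by side, here is a small sketch that only does arithmetic on the figures printed above. The variable names are made up for illustration; nothing here is part of the patch. "Real" c/s is measured against wall-clock time and "virtual" c/s against processor time, so the real/virtual ratio gives a rough idea of how many CPU threads were busy.

```python
# Figures copied from the two cryptmd5-opencl benchmark runs above (c/s).
# Names are illustrative only.
cpu_real, cpu_virtual = 15620, 3963      # device: Intel Core i3-2100 CPU
gpu_real, gpu_virtual = 131657, 131657   # device: Cypress GPU

# How much faster the GPU device is than the CPU device in wall-clock terms.
gpu_vs_cpu = gpu_real / cpu_real

# Real c/s divided by virtual c/s: roughly the number of CPU threads kept busy.
cpu_real_vs_virtual = cpu_real / cpu_virtual
gpu_real_vs_virtual = gpu_real / gpu_virtual

print(f"GPU vs CPU (real c/s): {gpu_vs_cpu:.2f}x")
print(f"CPU run, real/virtual: {cpu_real_vs_virtual:.2f}")
print(f"GPU run, real/virtual: {gpu_real_vs_virtual:.2f}")
```

The GPU run comes out at roughly 8.4 times the CPU run. The CPU run's real/virtual ratio is close to 4, which would be consistent with the OpenCL runtime using all four hardware threads of the i3-2100, though that is an interpretation of the numbers, not something the benchmark states.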

>> * faster cryptmd5 and ssha formats
>
> Got any numbers you can post?  Is cryptmd5 as supported in your OpenCL
> patch still slower than it is with your CUDA patch?

ukasz@...kstar$./john -test --format=ssha-opencl
OpenCL Platforms: 1
OpenCL Platform: <<<AMD Accelerated Parallel Processing>>> 2 device(s), using device: <<<Cypress>>>
Max Group Work Size 256 Optimal Group work Size = 64
Benchmarking: Netscape LDAP SSHA OPENCL [salted SHA-1]... DONE
Many salts:     14115K c/s real, 14252K c/s virtual
Only one salt:  10573K c/s real, 10663K c/s virtual

Yes, the OpenCL code is still much slower than CUDA. I am going to do
something about this; my target for the next month is to beat CUDA. I
spent some time in December trying to vectorize the phpass code (using
vector data types: uint4), but because of a bug I could not finish
it. There is now more literature about OpenCL on the Internet than
half a year ago, so making the OpenCL code faster should be easier.
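
For readers unfamiliar with the uint4 idea, the following is a plain Python emulation of it, not OpenCL and not the phpass kernel from the patch. It assumes the usual approach of putting one candidate per vector lane, and it uses the first MD5 round function F as the example operation; the helper names `md5_f` and `md5_f_uint4` are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # keep results in 32 bits, like an OpenCL uint

def md5_f(x, y, z):
    """MD5 round function F for one 32-bit lane: bits of x select y, else z."""
    return ((x & y) | (~x & z)) & MASK32

def md5_f_uint4(x4, y4, z4):
    """Apply F lane by lane to 4-tuples, the way a uint4 expression would."""
    return tuple(md5_f(x, y, z) for x, y, z in zip(x4, y4, z4))

# Four independent candidates, one per lane (arbitrary sample values).
x4 = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)
y4 = (0x00000000, 0xFFFFFFFF, 0x12345678, 0x0F0F0F0F)
z4 = (0xDEADBEEF, 0xCAFEBABE, 0x00000000, 0xF0F0F0F0)

for lane, value in enumerate(md5_f_uint4(x4, y4, z4)):
    print(f"lane {lane}: {value:08x}")
```

In an OpenCL kernel the same expression written on uint4 operands processes all four lanes in one statement, which is where the hoped-for speedup would come from on hardware with vector ALUs.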

Lukas
