john-dev - Re: OpenCL on OSX

Message-ID: <a27698c077af6c86bf67a492262a4046@smtp.hushmail.com>
Date: Thu, 30 Aug 2012 01:37:44 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: OpenCL on OSX

On 08/28/2012 02:18 AM, magnum wrote:
> I got hold of a Macbook with Kepler GPU so I did some OpenCL testing.
> That was depressing. Every single format fails on GPU (while most or
> all work fine on CPU). There are also tons of benign but noisy
> warnings about things like comparing integers of different signs.
> I'll fix that too but it's not the real problem.
>
> The symptoms are just like a few others already have reported:
>
> OpenCL platform 0: Apple, 2 device(s).
> Using device 1: GeForce GT 650M
> Compilation log: Error building kernel. Returned build code: -11.
> DEVICE_INFO=130
> OpenCL error (CL_BUILD_PROGRAM_FAILURE) in file (common-opencl.c)
> at line (136) - (clBuildProgram failed.)

Some progress. After forcing the discrete GPU (to work around an OS X
bug), CUDA works fine; with automatic GPU switching enabled, it kernel
panics. Some OpenCL formats also work now (I committed a load of minor
fixes):


CUDA Device #0
	Name:                          GeForce GT 650M
	Compute capability:            sm_30
	Number of multiprocessors:     2
	Clock rate:                    878 Mhz
	Total global memory:           1.0 GB
	Total shared memory per block: 48.0 kB
	Total constant memory:         64.0 kB
	Kernel execution timeout:      Yes
	Concurrent copy and execution: Yes
	Warp size:                     32

Benchmarking: md5crypt [CUDA]... DONE
Raw:	157827 c/s real, 157827 c/s virtual

Benchmarking: M$ Cache Hash MD4 len(pass)=8, len(salt)=13 [CUDA]... DONE
Raw:	21665K c/s real, 21884K c/s virtual

Benchmarking: M$ Cache Hash 2 (DCC2) PBKDF2-HMAC-SHA-1 [CUDA]... DONE
Raw:	4843 c/s real, 4843 c/s virtual

Benchmarking: phpass MD5 ($P$9 lengths 1 to 15) [CUDA]... DONE
Raw:	158117 c/s real, 156582 c/s virtual

Benchmarking: Password Safe SHA-256 [CUDA]... DONE
Raw:	20928 c/s real, 20928 c/s virtual

Benchmarking: Raw SHA-224 [CUDA]... DONE
Raw:	25305K c/s real, 25305K c/s virtual

Benchmarking: Raw SHA-256 [CUDA]... DONE
Raw:	25057K c/s real, 25057K c/s virtual

Benchmarking: Raw SHA-512 [CUDA]... DONE
Raw:	9074K c/s real, 9074K c/s virtual

Benchmarking: sha256crypt (rounds=5000) [CUDA]... DONE
Raw:	2466 c/s real, 2443 c/s virtual

Benchmarking: sha512crypt (rounds=5000) [CUDA]... DONE
Raw:	2059 c/s real, 2059 c/s virtual

Benchmarking: WPA-PSK PBKDF2-HMAC-SHA-1 [CUDA]... DONE
Raw:	5973 c/s real, 5973 c/s virtual

Benchmarking: Mac OS X 10.7+ salted SHA-512 [CUDA]... DONE
Many salts:	9766K c/s real, 9671K c/s virtual
Only one salt:	7349K c/s real, 7419K c/s virtual


Platform #0 name: Apple
Platform version: OpenCL 1.2 (Jun 20 2012 14:18:19)
	Device #0 name:		Intel(R) Core(TM) i7-3615QM CPU @ 2.30GHz
	Device vendor:		Intel
	Device type:		CPU (LE)
	Device version:		OpenCL 1.2
	Driver version:		1.1
	Global Memory:		8192 MB
	Global Memory Cache:	64 bytes
	Local Memory:		32 KB (Global)
	Max clock (MHz) :	2300
	Max Work Group Size:	1024
	Parallel compute cores:	8

	Device #1 name:		GeForce GT 650M
	Device vendor:		NVIDIA
	Device type:		GPU (LE)
	Device version:		OpenCL 1.1
	Driver version:		CLH 1.0
	Global Memory:		1024 MB
	Global Memory Cache:	0 bytes
	Local Memory:		48 KB (Local)
	Max clock (MHz) :	405
	Max Work Group Size:	1024
	Parallel compute cores:	2
	Stream processors:	16  (2 x 8)

That last line is incorrect. It should be 384 (2 x 192). Claudio's code
does not work because Apple's NVIDIA framework does not export everything
that the native NVIDIA driver does. Not sure how to fix.
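
For reference, the arithmetic behind the expected figure can be sketched
as below. This is only an illustration, not Claudio's code and not
anything in the tree: cores_per_mp() and stream_processors() are
hypothetical helpers, and the sketch assumes the compute capability is
known by some other means (the CUDA listing above reports sm_30). The
per-multiprocessor counts are the usual NVIDIA figures for sm_1x, sm_2x
and sm_3x; anything else is reported as unknown (0) rather than guessed.

```c
/*
 * Hypothetical helper (not from the JtR source): number of stream
 * processors (CUDA cores) per multiprocessor for a given NVIDIA
 * compute capability. Returns 0 when the architecture is unknown.
 */
static unsigned int cores_per_mp(unsigned int major, unsigned int minor)
{
	switch (major) {
	case 1:
		return 8;			/* sm_1x (Tesla) */
	case 2:
		return minor == 0 ? 32 : 48;	/* sm_20 / sm_21 (Fermi) */
	case 3:
		return 192;			/* sm_3x (Kepler) */
	default:
		return 0;			/* unknown: do not guess */
	}
}

/*
 * Hypothetical helper: total stream processors for a device with
 * 'mps' multiprocessors. Returns 0 when the architecture is unknown.
 */
static unsigned int stream_processors(unsigned int mps,
                                      unsigned int major,
                                      unsigned int minor)
{
	return mps * cores_per_mp(major, minor);
}
```

For the GT 650M above (2 multiprocessors, sm_30),
stream_processors(2, 3, 0) gives 2 x 192 = 384.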


OpenCL platform 0: Apple, 2 device(s).
Using device 1: GeForce GT 650M
Benchmarking: md5crypt [OpenCL]... DONE
Raw:	68266 c/s real, 7372K c/s virtual

OpenCL platform 0: Apple, 2 device(s).
Using device 1: GeForce GT 650M
Local work size (LWS) 64, Global work size (GWS) 2097152
Benchmarking: MySQL 4.1 double-SHA-1 [OpenCL]... DONE
Many salts:	11870K c/s real, 83886K c/s virtual
Only one salt:	11983K c/s real, 89877K c/s virtual

OpenCL platform 0: Apple, 2 device(s).
Using device 1: GeForce GT 650M
Benchmarking: phpass MD5 ($P$9 length 8) [OpenCL]... DONE
Raw:	127746 c/s real, 4300K c/s virtual

OpenCL platform 0: Apple, 2 device(s).
Using device 1: GeForce GT 650M
Benchmarking: Password Safe SHA-256 [OpenCL]... DONE
Raw:	15753 c/s real, 5734K c/s virtual

OpenCL platform 0: Apple, 2 device(s).
Using device 1: GeForce GT 650M
Local work size (LWS) 128, Global work size (GWS) 2097152
Benchmarking: Raw MD4 [OpenCL]... DONE
Raw:	19972K c/s real, 69905K c/s virtual

OpenCL platform 0: Apple, 2 device(s).
Using device 1: GeForce GT 650M
Local work size (LWS) 128, Global work size (GWS) 2097152
Benchmarking: Raw MD5 [OpenCL]... DONE
Raw:	39064K c/s real, 99614K c/s virtual

OpenCL platform 0: Apple, 2 device(s).
Using device 1: GeForce GT 650M
Local work size (LWS) 64, Global work size (GWS) 2097152
Benchmarking: Raw SHA-1 OpenCL [OpenCL]... DONE
Raw:	23741K c/s real, 119837K c/s virtual

OpenCL platform 0: Apple, 2 device(s).
Using device 1: GeForce GT 650M
Local work size (LWS) 1024, global work size (GWS) 2048
Benchmarking: sha256crypt (rounds=5000) [OpenCL]... DONE
Raw:	2482 c/s real, 409600 c/s virtual


Not bad for a laptop. The rest of the OpenCL formats do not yet work though.

magnum