Message-ID: <20120626005645.GA11346@openwall.com>
Date: Tue, 26 Jun 2012 04:56:45 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: phpass OpenCL and CUDA

Lukas -

The reverted opencl/phpass_kernel.cl currently in magnum-jumbo looks a
bit dirtier than the newer version did.  Perhaps you can re-apply some
minor changes to it, without going all the way for a vectorized
implementation yet (so that you don't reintroduce whatever problem that
had)?  Specifically:

1. There's a function called "cuda_md5" - rename it.

2. Make use of rotate() and bitselect().

The speeds are now down to:

OpenCL platform 1: AMD Accelerated Parallel Processing, 2 device(s).
Using device 0: Tahiti
Max local work size 256
Optimal local work size = 32
Benchmarking: phpass MD5 ($P$9 length 8) [OpenCL]... DONE
Raw:    606441 c/s real, 2926K c/s virtual

OpenCL platform 0: NVIDIA CUDA, 1 device(s).
Using device 0: GeForce GTX 570
Compilation log:
ptxas info    : Compiling entry function 'phpass' for 'sm_20'
ptxas info    : Function properties for phpass
    32 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 35 registers, 44 bytes cmem[0]
Max local work size 896
Optimal local work size = 32
Benchmarking: phpass MD5 ($P$9 length 8) [OpenCL]... DONE
Raw:    302400 c/s real, 300833 c/s virtual

Previously, the speed on 7970 was about 1050K c/s.

The CUDA code on the GTX 570 achieves:

Benchmarking: phpass MD5 ($P$9 lengths 1 to 15) [CUDA]... DONE
Raw:    510171 c/s real, 507581 c/s virtual

(in a default build).  IIRC, previously, this was 600k to 730k c/s
depending on settings.  Did you have to revert anything in the CUDA
code, too?

Are we releasing with these lower speeds?

Thanks,

Alexander
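As an illustration of item 2 above, here is a minimal sketch in plain C of how MD5's rotation and its F and G round functions map onto the OpenCL built-ins rotate() and bitselect().  The helpers rotate32() and bitselect32() are hypothetical stand-ins that mimic the built-ins for 32-bit values so the identities can be checked on a CPU; the function names and the md5_step() layout are assumptions made for this sketch, not code taken from phpass_kernel.cl.

```c
#include <stdint.h>

/* Stand-in for OpenCL rotate(v, n) on uint: rotate left by n bits,
 * with n taken modulo 32. */
static uint32_t rotate32(uint32_t v, uint32_t n)
{
    n &= 31;
    return (v << n) | (v >> ((32 - n) & 31));
}

/* Stand-in for OpenCL bitselect(a, b, c): each result bit comes from b
 * where the corresponding bit of c is 1, and from a where it is 0. */
static uint32_t bitselect32(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & ~c) | (b & c);
}

/* MD5 F and G round functions written with plain logic operators. */
static uint32_t md5_f_plain(uint32_t x, uint32_t y, uint32_t z)
{
    return (x & y) | (~x & z);
}

static uint32_t md5_g_plain(uint32_t x, uint32_t y, uint32_t z)
{
    return (x & z) | (y & ~z);
}

/* The same functions expressed as a single bit-select each. */
static uint32_t md5_f_sel(uint32_t x, uint32_t y, uint32_t z)
{
    return bitselect32(z, y, x);   /* x picks between y and z */
}

static uint32_t md5_g_sel(uint32_t x, uint32_t y, uint32_t z)
{
    return bitselect32(y, x, z);   /* z picks between x and y */
}

/* One round-1 MD5 step: a = b + ((a + F(b,c,d) + m + k) <<< s),
 * where m is a message word and k the step constant. */
static uint32_t md5_step(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
                         uint32_t m, uint32_t k, uint32_t s)
{
    return b + rotate32(a + md5_f_sel(b, c, d) + m + k, s);
}
```

In an OpenCL kernel the two stand-ins would simply be replaced by the built-ins themselves.  Whether this recovers any of the lost speed depends on the device and compiler, since some compilers may already recognize the plain shift and mask patterns.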