Message-ID: <692e4c781efd06ec7db9214b5edaa3ed@smtp.hushmail.com>
Date: Tue, 21 Feb 2012 20:49:06 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: OpenCL KPC and LWS [was: Recent github patches]

On 02/20/2012 11:39 PM, Samuele Giovanni Tonon wrote:
> On 02/20/12 20:38, magnum wrote:
>> Anyway, I tried find_best_kpc and it picks very small numbers (like
>> 69632) and end up a lot slower than just going with the default 2M. I
>> also tried manually setting 4M and that worked fine and was faster than
>> 2M. Maybe the find_best could be enhanced somehow.
>
> this is quite strange, find_best_kpc should get the faster KPC no matter
> what, could you give me some details:
> format used, GPU card, LWS and some output ?

Here's some output. This is ssha on a GTX 280, but I saw similar issues
with the 9600GT as well as with other formats.

First, a default run:

$ ../run/john -test -form:ssha-opencl
OpenCL Platforms: 1
OpenCL Platform: <<<NVIDIA CUDA>>> 1 device(s), using device: <<<GeForce GTX 280>>>
Compilation log:
Max Group Work Size 512
Optimal local work size 32
(to avoid this test on next run do export LWS=32)
Local work size (LWS) 32, Keys per crypt (KPC) 2097152
Benchmarking: Netscape LDAP SSHA OPENCL [salted SHA-1]... DONE
Many salts:     54973K c/s real, 27486K c/s virtual
Only one salt:  32896K c/s real, 16448K c/s virtual

OK, so it picked LWS 32 and the default KPC is 2M. Then I ask for
auto-tuning of KPC:

$ KPC=0 ../run/john -test -form:ssha-opencl
OpenCL Platforms: 1
OpenCL Platform: <<<NVIDIA CUDA>>> 1 device(s), using device: <<<GeForce GTX 280>>>
Compilation log:
Max Group Work Size 512
Optimal local work size 32
(to avoid this test on next run do export LWS=32)
Calculating best keys per crypt, this will take a while
Optimal keys per crypt 98304
(to avoid this test on next run do export KPC=98304)
Local work size (LWS) 32, Keys per crypt (KPC) 98304
Benchmarking: Netscape LDAP SSHA OPENCL [salted SHA-1]... DONE
Many salts:     45907K c/s real, 23542K c/s virtual
Only one salt:  25657K c/s real, 12958K c/s virtual

It picks a very low number and performance drops. Now I try manually
setting KPC to 1.5M instead:

$ KPC=$((3<<19)) ../run/john -test -form:ssha-opencl
OpenCL Platforms: 1
OpenCL Platform: <<<NVIDIA CUDA>>> 1 device(s), using device: <<<GeForce GTX 280>>>
Compilation log:
Max Group Work Size 512
Optimal local work size 64
(to avoid this test on next run do export LWS=64)
Local work size (LWS) 64, Keys per crypt (KPC) 1572864
Benchmarking: Netscape LDAP SSHA OPENCL [salted SHA-1]... DONE
Many salts:     59177K c/s real, 29155K c/s virtual
Only one salt:  34603K c/s real, 17215K c/s virtual

This is roughly 5-8% faster than the 2M run above, BUT note that LWS
happened to end up as 64 this time. Running the exact same command a
couple of times, I sometimes get the following instead:

$ KPC=$((3<<19)) ../run/john -test -form:ssha-opencl
OpenCL Platforms: 1
OpenCL Platform: <<<NVIDIA CUDA>>> 1 device(s), using device: <<<GeForce GTX 280>>>
Compilation log:
Max Group Work Size 512
Optimal local work size 32
(to avoid this test on next run do export LWS=32)
Local work size (LWS) 32, Keys per crypt (KPC) 1572864
Benchmarking: Netscape LDAP SSHA OPENCL [salted SHA-1]... DONE
Many salts:     53970K c/s real, 26853K c/s virtual
Only one salt:  32955K c/s real, 16399K c/s virtual

Here, LWS was 32 and performance was no better than with KPC=2M
(slightly worse for many salts, about the same for one salt).

So the main issue is that auto KPC does not pick a good number. The
fluctuation in the auto-picked LWS might be due to normal variations
between runs. I should have recorded the figures for KPC=2M and LWS=64,
but I missed that.

magnum
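
For reference, the relative differences between the four runs can be worked out from the "real" c/s figures quoted in the output above. The snippet below is only a small arithmetic sketch in Python; the `runs` table and the `percent_change` helper are illustrative names, not part of John the Ripper, and the default run (KPC=2097152, LWS=32) is assumed as the baseline.

```python
# Figures are the "real" c/s values (in K c/s) copied from the benchmark
# output quoted in this message: (many salts, only one salt).
runs = {
    "KPC=2097152 LWS=32 (default)": (54973, 32896),
    "KPC=98304   LWS=32 (auto KPC)": (45907, 25657),
    "KPC=1572864 LWS=64": (59177, 34603),
    "KPC=1572864 LWS=32": (53970, 32955),
}


def percent_change(value, baseline):
    """Relative change of value versus baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# Assumption: compare everything against the default run.
base_many, base_one = runs["KPC=2097152 LWS=32 (default)"]

for label, (many, one) in runs.items():
    print(
        f"{label:32s}"
        f" many salts {percent_change(many, base_many):+6.1f}%"
        f"  one salt {percent_change(one, base_one):+6.1f}%"
    )
```

Against that baseline, the auto-tuned KPC run comes out about 16-22% slower, and the two KPC=1572864 runs differ from each other mainly by which LWS was picked.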