Message-ID: <4F4413BF.1030704@linuxasylum.net>
Date: Tue, 21 Feb 2012 22:59:27 +0100
From: Samuele Giovanni Tonon <samu@...uxasylum.net>
To: john-dev@...ts.openwall.com
Subject: Re: OpenCL KPC and LWS [was: Recent github patches]

On 02/21/12 20:49, magnum wrote:
> On 02/20/2012 11:39 PM, Samuele Giovanni Tonon wrote:
>> On 02/20/12 20:38, magnum wrote:
>>> Anyway, I tried find_best_kpc and it picks very small numbers (like
>>> 69632) and end up a lot slower than just going with the default 2M. I
>>> also tried manually setting 4M and that worked fine and was faster than
>>> 2M. Maybe the find_best could be enhanced somehow.
>>
>> this is quite strange, find_best_kpc should get the faster KPC no matter
>> what, could you give me some details:
>> format used, GPU card, LWS and some output ?
>
> Here's some output. This is ssha on a GTX 280 but I saw similar issues
> with the 9600GT as well as with other formats.
>
> First, a default run:
>
> $ ../run/john -test -form:ssha-opencl
> OpenCL Platforms: 1
> OpenCL Platform: <<<NVIDIA CUDA>>> 1 device(s), using device: <<<GeForce
> GTX 280>>>
> Compilation log:
>
> Max Group Work Size 512 Optimal local work size 32
> (to avoid this test on next run do export LWS=32)
> Local work size (LWS) 32, Keys per crypt (KPC) 2097152
> Benchmarking: Netscape LDAP SSHA OPENCL [salted SHA-1]... DONE
> Many salts: 54973K c/s real, 27486K c/s virtual
> Only one salt: 32896K c/s real, 16448K c/s virtual
>
> OK, so it picked LWS 32 and the default KPC is 2M. Then I ask for
> auto-tuning KPC:
>
> $ KPC=0 ../run/john -test -form:ssha-opencl
> OpenCL Platforms: 1
> OpenCL Platform: <<<NVIDIA CUDA>>> 1 device(s), using device: <<<GeForce
> GTX 280>>>
> Compilation log:
>
> Max Group Work Size 512 Optimal local work size 32
> (to avoid this test on next run do export LWS=32)
> Calculating best keys per crypt, this will take a while Optimal keys per
> crypt 98304
> (to avoid this test on next run do export KPC=98304)
> Local work size (LWS) 32, Keys per crypt (KPC) 98304
> Benchmarking: Netscape LDAP SSHA OPENCL [salted SHA-1]... DONE
> Many salts: 45907K c/s real, 23542K c/s virtual
> Only one salt: 25657K c/s real, 12958K c/s virtual
>
> It picks a very low number and performance drops. Now I try manually
> setting KPC to 1.5M instead:
>
> $ KPC=$((3<<19)) ../run/john -test -form:ssha-opencl
> OpenCL Platforms: 1
> OpenCL Platform: <<<NVIDIA CUDA>>> 1 device(s), using device: <<<GeForce
> GTX 280>>>
> Compilation log:
>
> Max Group Work Size 512 Optimal local work size 64
> (to avoid this test on next run do export LWS=64)
> Local work size (LWS) 64, Keys per crypt (KPC) 1572864
> Benchmarking: Netscape LDAP SSHA OPENCL [salted SHA-1]... DONE
> Many salts: 59177K c/s real, 29155K c/s virtual
> Only one salt: 34603K c/s real, 17215K c/s virtual
>
> This is ~10% faster than the 2M above BUT note that LWS happened to end
> up as 64 this time. Running the exact same command a couple of times, I
> sometimes get the following instead:
>
> $ KPC=$((3<<19)) ../run/john -test -form:ssha-opencl
> OpenCL Platforms: 1
> OpenCL Platform: <<<NVIDIA CUDA>>> 1 device(s), using device: <<<GeForce
> GTX 280>>>
> Compilation log:
>
> Max Group Work Size 512 Optimal local work size 32
> (to avoid this test on next run do export LWS=32)
> Local work size (LWS) 32, Keys per crypt (KPC) 1572864
> Benchmarking: Netscape LDAP SSHA OPENCL [salted SHA-1]... DONE
> Many salts: 53970K c/s real, 26853K c/s virtual
> Only one salt: 32955K c/s real, 16399K c/s virtual
>
> Here, LWS was 32 and performance was worse than with KPC=2M.
>
> So the main issue is that auto KPC does not pick a good number. The LWS
> fluctuations might be due to normal variations between runs. I should
> have recorded the figures for KPC=2M and LWS=64 but I missed that.

Looks like a chicken-and-egg problem: when LWS is tested I use the
default KPC=2M, and once LWS is settled I use the best LWS I just
detected to test KPC. luksas already reported this kind of problem, but
I thought we were safe since the best LWS is usually rather obvious.

I will make some changes to print the LWS timings during a debug session
so you can tell me the numbers behind those different LWS values.

Thanks for the report!

Cheers
Samuele
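
One way around the chicken-and-egg problem described in the reply is to stop tuning LWS and KPC one after the other and instead time every (LWS, KPC) pair, repeating each measurement so that run-to-run noise cannot pick the winner. The sketch below is only an illustration of that idea, not John the Ripper's code: `tune_jointly`, `measure` and `fake_rate` are hypothetical names, and `fake_rate` is a made-up throughput model standing in for a real benchmark run.

```python
import statistics
from itertools import product
from typing import Callable, Iterable, Tuple


def tune_jointly(
    measure: Callable[[int, int], float],
    lws_candidates: Iterable[int],
    kpc_candidates: Iterable[int],
    repeats: int = 3,
) -> Tuple[int, int, float]:
    """Return (lws, kpc, median_rate) for the fastest pair.

    measure(lws, kpc) is assumed to return a throughput figure where
    higher is better. Each pair is measured `repeats` times and the
    median is kept, so one unusually fast or slow run cannot decide.
    """
    best = None
    for lws, kpc in product(lws_candidates, kpc_candidates):
        rate = statistics.median(measure(lws, kpc) for _ in range(repeats))
        if best is None or rate > best[2]:
            best = (lws, kpc, rate)
    if best is None:
        raise ValueError("no (LWS, KPC) pair to test")
    return best


def fake_rate(lws: int, kpc: int) -> float:
    """Made-up model for illustration only; these are NOT real GPU numbers."""
    lws_factor = {32: 0.9, 64: 1.0}.get(lws, 0.5)
    kpc_factor = kpc / (kpc + 200_000)  # saturates as KPC grows
    return 60_000.0 * lws_factor * kpc_factor


# Candidate KPC values taken from the thread: 98304, 1.5M, 2M and 4M.
lws, kpc, rate = tune_jointly(fake_rate, [32, 64], [98304, 3 << 19, 1 << 21, 1 << 22])
print(f"best pair under the fake model: LWS={lws} KPC={kpc}")
```

In a real run, `measure` would have to launch the kernel with the given sizes and return the observed rate. The cost is one timing per pair per repeat, which is why the candidate lists are kept short here.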