Message-ID: <20130203065408.GA24719@openwall.com>
Date: Sun, 3 Feb 2013 10:54:08 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Proposed optimizations to pwsafe

Brian, magnum -

On Wed, Jan 30, 2013 at 05:46:01AM -0500, Brian Wallace wrote:
> Device 1: Tahiti (AMD Radeon HD 7900 Series)
> Local worksize (LWS) 64, Global worksize (GWS) 57344
> Benchmarking: Password Safe SHA-256 [OpenCL]... DONE
> Raw: 472615 c/s real, 17203K c/s virtual

Now getting:

Device 1: Tahiti (AMD Radeon HD 7900 Series)
Local worksize (LWS) 64, Global worksize (GWS) 1048576
Benchmarking: Password Safe SHA-256 [OpenCL]... DONE
Raw: 498135 c/s real, 209715K c/s virtual

> Benchmarking: Password Safe SHA-256 [CUDA]... DONE
> Raw: 129590 c/s real, 128862 c/s virtual

Device 0: GeForce GTX 570
Local worksize (LWS) 64, Global worksize (GWS) 131072
Benchmarking: Password Safe SHA-256 [OpenCL]... DONE
Raw: 131510 c/s real, 131951 c/s virtual

Benchmarking: Password Safe SHA-256 [CUDA]... DONE
Raw: 129590 c/s real, 129590 c/s virtual

We got useful test results from atom (thanks again!):

http://pastebin.com/xCaeqBKY

Most useful is the reminder that we need to use split kernel (OpenCL
only, since only relevant for AMD GPUs/drivers):

"- Had to use LWS=64 because LWS=256 created a Zombie and I was forced
to reboot :("

(I guess this could actually be a random occurrence. The problem could
also occur with LWS=64.)

magnum - Brian is going to implement split kernel, please help him by
answering any questions he might have, etc.

Brian - basically, individual kernel invocations should be taking no
more than 200ms, preferably much less. This means that with a large
GWS, you need to be computing only a fraction of the 2048 iterations per
kernel invocation. Please store intermediate results in global memory.

Thanks all!

Alexander
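
The split-kernel idea described above can be sketched on the host side roughly as follows. This is only an illustration in Python, not the actual pwsafe OpenCL code: the `state` list stands in for a global-memory buffer, `run_chunk` stands in for one kernel invocation, and the names and the chunk size of 256 are hypothetical. The hashing (an initial SHA-256 over password plus salt, followed by repeated SHA-256) is a simplified assumption about the stretching loop.

```python
import hashlib

TOTAL_ITERATIONS = 2048     # iteration count mentioned in the message
ITERATIONS_PER_CALL = 256   # hypothetical chunk size; would be tuned so that
                            # one invocation stays well under ~200 ms

def init_state(candidates, salt):
    """Assumed setup step: one initial digest per candidate password."""
    return [hashlib.sha256(pw + salt).digest() for pw in candidates]

def run_chunk(state, n):
    """Stand-in for one kernel invocation: advance every work-item by n
    iterations and write the intermediate digest back to `state`
    (the analogue of a buffer kept in global memory between calls)."""
    for i in range(len(state)):
        h = state[i]
        for _ in range(n):
            h = hashlib.sha256(h).digest()
        state[i] = h

def stretch_split(candidates, salt, total=TOTAL_ITERATIONS,
                  per_call=ITERATIONS_PER_CALL):
    """Run the loop as several short invocations instead of one long one."""
    state = init_state(candidates, salt)
    done = 0
    while done < total:
        n = min(per_call, total - done)  # last chunk may be shorter
        run_chunk(state, n)
        done += n
    return state

def stretch_monolithic(candidates, salt, total=TOTAL_ITERATIONS):
    """Reference: all iterations in a single invocation."""
    state = init_state(candidates, salt)
    run_chunk(state, total)
    return state
```

Both functions should produce identical digests; the split version only changes how much work each invocation does, which is what keeps a single call short when GWS is large.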