Message-ID: <c4380a76ebaf2d8bb8e88ca5893a6b65@smtp.hushmail.com>
Date: Tue, 26 Jun 2012 02:25:30 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: OpenCL kernel max running time vs. "ASIC hang"

On 2012-06-26 02:02, Bit Weasil wrote:
>> (...) We may invoke the kernel more than once from one crypt_all()
>> call, sequentially. For example, the 256k may be achieved by 256
>> invocations of a kernel doing 1k iterations. This would bring the
>> 9 seconds down to 35 ms per kernel invocation. Perhaps the
>> intermediate results can even stay in the GPU between those
>> invocations.
>>
>
> This is what I do with my rainbow table generation (which is, in
> many cases, functionally the same as a "slow" kernel). I take an
> initial password, hash/reduce it many times (say, 200 000 for my
> current tables), and store the end result. I do this with tunable
> kernel execution times (this is the task that was getting ASIC hangs
> until I adjusted it down).
>
> I simply store the intermediate values in the GPU global memory.
> The access (if done sanely) is coalesced, and is roughly speaking a
> "best case" memory access pattern for both the load and the store.
> I'm using a high resolution timer class to dynamically adjust the
> work done per kernel invocation. If I'm below 90% or above 110% of
> my target time, I adjust the steps per invocation for the next call.
> It seems to work nicely, and also properly handles conditions like an
> overheating GPU that throttles, or someone gaming in the background.

You make it sound very easy :)

> It shouldn't be difficult to take a single execution kernel and break
> it into multiple steps. If you would like a starting point, the
> Cryptohaze tools have this done for all the GPU kernels - feel free
> to take a look around.

Thanks, I will do that!

magnum
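
The scheme described in the quoted text (split a long iteration count into
several kernel invocations, and retune the steps per invocation whenever a
call lands below 90% or above 110% of a target time) could be sketched
roughly as follows. This is a host-side illustration only, not code from
John the Ripper or Cryptohaze. The names tune_steps, run_chain, run_kernel
and clock are hypothetical; run_kernel stands in for whatever enqueues the
GPU kernel and waits for it to finish, with the intermediate state assumed
to stay in GPU global memory between calls. For scale, 9 seconds spread
over 256 invocations is about 35 ms each.

```python
import time


def tune_steps(steps, elapsed, target, lo=0.90, hi=1.10,
               min_steps=1, max_steps=1 << 20):
    """Return the step count to use for the next kernel invocation.

    Assumption: run time scales roughly linearly with the step count,
    so a proportional correction is a reasonable first guess.
    """
    if elapsed <= 0:
        # Timer resolution too coarse to measure; try doing more work.
        return min(max_steps, steps * 2)
    ratio = elapsed / target
    if lo <= ratio <= hi:
        return steps  # inside the 90%..110% band: leave it alone
    new_steps = round(steps / ratio)
    return max(min_steps, min(max_steps, new_steps))


def run_chain(state, total_iterations, run_kernel, target_seconds,
              steps=1000, clock=time.perf_counter):
    """Run total_iterations as a sequence of shorter kernel invocations.

    run_kernel(state, n) is a hypothetical callable that performs n
    iterations and returns the new state (on a real GPU this would be
    a handle to a device buffer, not host data).
    """
    done = 0
    invocations = 0
    while done < total_iterations:
        n = min(steps, total_iterations - done)
        start = clock()
        state = run_kernel(state, n)
        elapsed = clock() - start
        done += n
        invocations += 1
        if n == steps:
            # Only retune on full-size calls; a short final call
            # would give a misleading timing.
            steps = tune_steps(steps, elapsed, target_seconds)
    return state, invocations
```

Because the step count is re-evaluated after every full-size call, the same
loop should also react when the GPU throttles or another application
competes for it, which is the behaviour described above. The 90% and 110%
bounds are taken from the quoted text; the clamping limits and the
doubling fallback are arbitrary choices for this sketch.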