Message-ID: <847cc72c0aa0c402c2399d010e00d772@smtp.hushmail.com>
Date: Thu, 04 Jun 2015 16:39:08 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Parallel in OpenCL

On 2015-06-04 16:14, Agnieszka Bielec wrote:
> 2015-06-04 13:38 GMT+02:00 magnum <john.magnum@...hmail.com>:
>> looks to me each call is 3*5*128 rounds of SHA512?
> yes, 3*5*128 for each call parallel_kernel_loop()
>
>> Note these lines (after my patch):
>>
>> opencl_init_auto_setup(SEED, 3*5*128*1, split_events,
>> warn, 4, self, create_clobj, release_clobj, BINARY_SIZE*3, 0);
>>
>> autotune_run(self, 3*5*128*1, 0, 1000);
>>
>> If you change the loop kernel to do only 128 rounds per call, you should
>> change it accordingly for opencl_init_auto_setup() but not for
>> autotune_run(). The latter is total, the former is how much you do per call.
>> If you change to a test vector with another cost, change the *1 accordingly
>> for both.
>
> is this necessary? I don't see any difference in performance and if I
> want to change *1, code will be complicated like in pomelo

The *1 is the cost of test vector #0, so you'd only need to change it if
you change the test vectors. But it'd mostly affect showing correct
output with --verb=5.

The rest is necessary because auto-tune needs to know what it tunes. If
you are in fact calling your loop kernel 15 times in crypt_all() but
only once in crypt_all_benchmark() (this is not the case yet though),
you need to inform auto-tune about it. The result would be 15 times
faster auto-tune. That's how e.g. wpapsk-opencl can auto-tune so quickly
with fair accuracy despite being a slow format.

magnum
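
PS. A rough sketch of the arithmetic above, in case it helps. This is not JtR code: the enum and both helper names are made up for illustration, and the 128-rounds-per-call kernel is the hypothetical split discussed above, not what the format does today. It only shows how the per-call figure (what opencl_init_auto_setup() is told) relates to the total (what autotune_run() is told).

```c
#include <assert.h>

/* Hypothetical names, not part of John the Ripper. */
enum {
    ROUNDS_PER_CALL_NOW   = 3 * 5 * 128, /* current loop kernel: everything in one call */
    ROUNDS_PER_CALL_SPLIT = 128,         /* assumed split kernel: 128 rounds per call */
    TEST_VECTOR_COST      = 1            /* the "*1": cost of test vector #0 */
};

/* Total SHA-512 rounds for one hash at a given cost (the "total" figure). */
static unsigned int hash_total_rounds(unsigned int cost)
{
    return 3u * 5u * 128u * cost;
}

/* How many loop-kernel calls are needed to cover `total` rounds when each
 * call performs `rounds_per_call` rounds (the "per call" figure). */
static unsigned int kernel_calls(unsigned int total, unsigned int rounds_per_call)
{
    assert(rounds_per_call > 0 && total % rounds_per_call == 0);
    return total / rounds_per_call;
}
```

With the current kernel, kernel_calls(hash_total_rounds(TEST_VECTOR_COST), ROUNDS_PER_CALL_NOW) is 1. With the split kernel it is 15, which is where the "15 times" figure comes from.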