Message-ID: <b9ba696a554e11ea3cbf732b33dff8ad@smtp.hushmail.com>
Date: Fri, 22 May 2015 02:56:18 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: autotune_run problem

On 2015-05-22 02:00, Agnieszka Bielec wrote:
> hi,
> I've fixed one bug in my parallel-opencl but I have problem with autotune_run
> I discovered that for different set of arguments is running different
> algorithm to determine when tuning for gws is stopped
> If i make
> autotune_run(self, 1000, 0, 500);

As you probably noticed already, the last argument comes in two flavors:

If 1000 or below, it's parsed as milliseconds and limits a single kernel
invocation (intended for loop kernels).

If above 1000 (but it should not be more than 10000000000UL), it's counted
as nanoseconds of *total* time for all kernels and loops (basically the
total time for crypt_all()).

This syntax is awful but that's how it is currently. For looped kernels, a
value of 200 is sane (if not, something else should be tweaked, e.g. the
number of iterations per call). For single-run do-it-all kernels,
10000000000UL is fine.

> computed gws is optimal on my laptop and --dev=1 but not in --dev=5,
> it prints exceed for the optimal value and setting highest
> duration_time doesn't work

Did you try setting it to 1000 instead of 500? If that works better you
should implement a split kernel though. A full second duration is way too
long.

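The two readings of the last autotune_run() argument can be sketched roughly like this. It only illustrates the rule as stated in this message and is not the actual John the Ripper code; the names run_limit, limit_kind, parse_max_duration and MAX_TOTAL_NS are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only (not from the JtR source). */
#define MAX_TOTAL_NS 10000000000ULL /* upper bound mentioned in the message */

enum limit_kind {
    LIMIT_PER_KERNEL, /* limits a single kernel invocation (loop kernels) */
    LIMIT_TOTAL       /* limits total time for all kernels and loops */
};

struct run_limit {
    enum limit_kind kind;
    uint64_t ns; /* the limit, normalized to nanoseconds */
};

/* Interpret the last argument: <= 1000 means milliseconds per kernel call,
 * anything above means nanoseconds of total time. */
static struct run_limit parse_max_duration(uint64_t arg)
{
    struct run_limit r;

    assert(arg <= MAX_TOTAL_NS);

    if (arg <= 1000) {
        r.kind = LIMIT_PER_KERNEL;
        r.ns = arg * 1000000ULL; /* ms -> ns */
    } else {
        r.kind = LIMIT_TOTAL;
        r.ns = arg; /* already in ns */
    }
    return r;
}
```

With this reading, 500 means half a second per kernel call, while 10000000000UL means ten seconds in total.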
> when my autotune_run call looks like:
> autotune_run(self, 1, 1000, 100000);
> the time when we stop computing is determined by:
> if (best_speed && speed < 1.8 * best_speed &&
>     max_run_time && run_time > max_run_time) {
>         if (!optimal_gws)
>                 optimal_gws = num;
>
>         if (options.verbosity > 3)
>                 fprintf(stderr, " - too slow\n");
>         break;
> }
>
> we stop computing new values for gws only when new speed isn't 1.8
> faster than the previous
>
> and 1.8 is a wrong value for parallel, a change from 1.8 to 1.1 works
> good for --dev=1 and on my laptop but for --dev=5 it stops for
> unoptimal gws=4096.

It's very hard to make these functions work well with all formats. Maybe we
should introduce another parameter for that 1.1/1.8 figure.

> it stops on 4096 because there is no difference in the speed for
> gws=4094 and 8192
> for 32768 the speed is better

Perhaps you could try bumping your starting figure. You could make it start
at e.g. 16384 or 32768 by changing the SEED macro. Setting it too high
might break running on weak devices though. Even if they are slower than
the CPU and utterly unusable, we should still behave.

> any idea how I can set optimal gws also for --dev=5 ?
> reults above might suggest that we can have some hashes not autotuned
> properly but persons with better knowledge about autotune_run should
> comment this

You should run with -verb=5, which will better illustrate what happens.
Please try two runs on -dev=5 with -verb=5, comparing 1000 and
10000000000UL for autotune_run(), and post the results.

magnum
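As a rough illustration of the idea of turning the 1.1/1.8 figure into a parameter, the quoted stop condition could be factored out along these lines. This is only a sketch under assumptions: tuning_should_stop and min_gain are hypothetical names, and how such a parameter would actually be passed to autotune_run() is not decided here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of the existing code. It mirrors the quoted
 * condition, with the hard-coded 1.8 replaced by a per-format min_gain. */
static bool tuning_should_stop(double best_speed, double speed,
                               double min_gain,
                               uint64_t max_run_time, uint64_t run_time)
{
    /* A gain factor below 1.0 would demand less than "no improvement". */
    assert(min_gain >= 1.0);

    return best_speed > 0.0 &&
           speed < min_gain * best_speed && /* gain too small */
           max_run_time != 0 &&
           run_time > max_run_time;         /* and the run was too long */
}
```

A format that scales slowly with gws could then pass 1.1 while others keep 1.8. Note that, as in the quoted code, tuning only stops when the run time limit is also exceeded.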