Message-ID: <CAKGDhHUzvUFV+WgfiLV6GUAddWZqv2v1N2KV2uSHR=e89yqSQg@mail.gmail.com>
Date: Fri, 22 May 2015 14:22:48 +0200
From: Agnieszka Bielec <bielecagnieszka8@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: autotune_run problem

2015-05-22 2:56 GMT+02:00 magnum <john.magnum@...hmail.com>:
> On 2015-05-22 02:00, Agnieszka Bielec wrote:
>>
>> hi,
>> I've fixed one bug in my parallel-opencl but I have a problem with
>> autotune_run.
>> I discovered that for different sets of arguments a different
>> algorithm is run to determine when tuning of gws stops.
>> If I use
>> autotune_run(self, 1000, 0, 500);
>> the computed gws is optimal on my laptop and with --dev=1 but not
>> with --dev=5: it prints "exceed" for the optimal value, and setting
>> a higher duration_time doesn't work.
>
>
> Did you try setting it to 1000 instead of 500? If that works better you
> should implement a split kernel though. A full second duration is way too
> long.

Thanks! I had only been testing the last argument, as in
autotune_run(self, 1, 1000, 100000);
I don't know why it works with 1000 so far for parallel, but I discovered
that pomelo at higher costs (5, which is not even that high) can be up to
10 times faster than it is now: setting the last argument to 100000 makes
it about 4x faster, and also changing 1.8 to 1.1 makes it about 10x
faster! (I must run the speed tests again :<) I tested this on my laptop
with --cost=5:5,5:5.

Yesterday I thought there were two different algorithms for deciding when
autotune_run stops, but it seems that both of them check whether we
exceeded the limits.

Is it possible to disable the time-based check?

>> When my autotune_run call looks like:
>> autotune_run(self, 1, 1000, 100000);
>> the point at which we stop computing is determined by:
>> if (best_speed && speed < 1.8 * best_speed &&
>>     max_run_time && run_time > max_run_time) {
>>         if (!optimal_gws)
>>                 optimal_gws = num;
>>
>>         if (options.verbosity > 3)
>>                 fprintf(stderr, " - too slow\n");
>>         break;
>> }
>>
>> We stop computing new values for gws only when the new speed isn't
>> 1.8 times faster than the previous one.
>>
>> 1.8 is the wrong value for parallel. Changing 1.8 to 1.1 works well
>> for --dev=1 and on my laptop, but for --dev=5 it stops at the
>> suboptimal gws=4096.
-------
>> It stops at 4096 because there is no difference in speed between
>> gws=4096 and 8192;
>> for 32768 the speed is better.

> Perhaps you could try bumping your starting figure. You could make it start
> at eg. 16384 or 32768 by changing the SEED macro. Setting it too high might
> be too hard for weak devices though. This might break running on weak
> devices though - even if they are slower than CPU and utterly unusable, we
> should still behave.

I was thinking about making one more step forward.
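
The stop rule quoted above can be modelled in isolation to see how the 1.8 factor and the time limit interact. What follows is only a minimal sketch in C, not the actual autotune code: `struct sample`, `pick_gws` and the `demo` table are hypothetical names, the numbers are invented for illustration rather than measured, and the rule used to update the best speed is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one timed benchmark run at a given gws. */
struct sample {
    size_t gws;         /* global work size that was tried */
    double speed;       /* candidates per second at this gws */
    double run_time_ms; /* kernel duration at this gws */
};

/* Invented numbers (NOT measurements): speed is flat between 4096 and
 * 8192 and only improves clearly again at larger sizes. */
static const struct sample demo[] = {
    { 1024,  1000.0,  10.0 },
    { 2048,  1900.0,  20.0 },
    { 4096,  3600.0,  40.0 },
    { 8192,  3650.0,  80.0 },
    { 16384, 5000.0, 160.0 },
    { 32768, 7000.0, 320.0 },
};

/*
 * Simplified model of the quoted check. Samples are visited in order of
 * increasing gws. The loop stops at the first sample that is over the
 * time budget AND less than `factor` times the best speed so far.
 * A max_run_time_ms of 0 disables this check, as in the quoted condition.
 * Assumption: any strictly faster sample becomes the new best.
 */
size_t pick_gws(const struct sample *s, size_t n,
                double factor, double max_run_time_ms)
{
    double best_speed = 0.0;
    size_t optimal_gws = 0;

    assert(s != NULL && n > 0);

    for (size_t i = 0; i < n; i++) {
        if (best_speed > 0.0 && s[i].speed < factor * best_speed &&
            max_run_time_ms > 0.0 && s[i].run_time_ms > max_run_time_ms) {
            if (!optimal_gws)
                optimal_gws = s[i].gws;
            break; /* "too slow": keep the best gws found so far */
        }
        if (s[i].speed > best_speed) {
            best_speed = s[i].speed;
            optimal_gws = s[i].gws;
        }
    }
    return optimal_gws;
}
```

With these invented numbers and a 100 ms limit, `pick_gws(demo, 6, 1.8, 100.0)` stops at the 16384 sample and returns 8192, while a factor of 1.1 lets the loop continue to 32768. Passing 0 as the time limit switches off this particular check, because the condition tests `max_run_time` before comparing against it. Whether other checks in the real function still apply is not something this sketch can show.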