Message-ID: <f91f1c14e344fedb90f047a0b5d620dc@smtp.hushmail.com>
Date: Tue, 06 Mar 2012 20:14:11 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: OpenCL KPC and LWS
On 02/21/2012 10:59 PM, Samuele Giovanni Tonon wrote:
>> So the main issue is that auto KPC does not pick a good number. The LWS
>> fluctuations might be due to normal variations between runs. I should
>> have recorded the figures for KPC=2M and LWS=64 but I missed that.
>
> looks like a chicken-egg problem: when lws is tested i use the default
> kpc=2M, when LWS is up i use the best LWS i just detected; luksas
> already reported this kind of problem but i thought we were safe since
> LWS usually is rather obvious.
I realise I have a lot to catch up on from you guys, but here are a
couple of things that give good and FAST results on my gear, both GPU
and CPU:
Have you tried querying clGetKernelWorkGroupInfo() for
CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE? On the CPUs and GPUs I
have tried, it seems reliable (better than the current testing) and very
fast. It was introduced in OpenCL 1.1, so I added a fallback like this:
     clEnqueueWriteBuffer(queue_prof, buffer_keys, CL_TRUE, 0,
         (PLAINTEXT_LENGTH) * SSHA_NUM_KEYS, saved_plain, 0, NULL, NULL);

+    // This is OpenCL 1.1, we catch CL_INVALID_VALUE and use a fallback
+    ret_code = clGetKernelWorkGroupInfo (crypt_kernel, devices[gpu_id],
+        CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE,
+        sizeof(best_multiple), &best_multiple, NULL);
+
+    if (ret_code == CL_INVALID_VALUE) {
+        //printf("Can't get preferred LWS multiple, using 1\n");
+        best_multiple = 1;
+    } else {
+        HANDLE_CLERROR(ret_code, "Query preferred work group multiple");
+        //printf("preferred multiple: %zu\n", best_multiple);
+    }
+
     // Find minimum time
-    for (my_work_group = 1; (int) my_work_group <= (int) max_group_size;
+    for (my_work_group = best_multiple; (int) my_work_group <= (int) max_group_size;
         my_work_group *= 2) {
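To make the effect of the changed loop start concrete, here is a minimal
sketch in plain C (no OpenCL needed) that lists the local work sizes the
patched loop would time. The helper name lws_candidates() and the output
buffer are hypothetical, not something in the tree, and treating a zero
multiple as 1 is my assumption to mirror the fallback value above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the JtR tree): list the local work sizes
 * the patched loop would time. Starts at best_multiple and doubles while
 * the value stays <= max_group_size. Returns the number of candidates
 * written to out (at most cap).
 */
size_t lws_candidates(size_t best_multiple, size_t max_group_size,
                      size_t *out, size_t cap)
{
	size_t n = 0, w;

	assert(out != NULL);

	/* Assumption: a zero multiple means "unknown", so start at 1 */
	if (best_multiple == 0)
		best_multiple = 1;

	for (w = best_multiple; w <= max_group_size && n < cap; w *= 2) {
		out[n++] = w;
		if (w > max_group_size / 2)
			break;	/* next doubling would pass the limit (or overflow) */
	}
	return n;
}
```

With a preferred multiple of 64 and a maximum group size of 256 this
yields three candidates (64, 128, 256); starting from 1 it yields nine.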
Also, I seem to get good and very fast results with this loop in KPC
enumeration:
for( num=local_work_size; num <= SSHA_NUM_KEYS ; num<<=1)
Is testing every 16K really of any use? I just see fluctuating numbers
and a super slow test.
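
For a rough sense of the difference in test time, this sketch counts how
many KPC values each strategy benchmarks. The function names are made
up for illustration, and I am assuming the fixed-step scan visits step,
2*step, and so on up to the limit; the real code may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* KPC values visited by: for (num = lws; num <= max_keys; num <<= 1) */
size_t kpc_points_doubling(size_t lws, size_t max_keys)
{
	size_t n = 0, num;

	assert(lws > 0);
	for (num = lws; num <= max_keys; num <<= 1) {
		n++;
		if (num > max_keys / 2)
			break;	/* stop before the shift could overflow */
	}
	return n;
}

/* KPC values visited by a fixed-step scan: step, 2*step, ... <= max_keys */
size_t kpc_points_linear(size_t step, size_t max_keys)
{
	assert(step > 0);
	return max_keys / step;
}
```

Assuming LWS=64 and a limit of 2M keys (2097152), doubling benchmarks 16
values while a 16K step benchmarks 128.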
magnum