Message-ID: <CA+TsHUBxF9pK3HmOh0eqq2O9P0=AV+O0XtZiBa2L552NEH5v8w@mail.gmail.com>
Date: Tue, 10 Jul 2012 20:13:22 +0530
From: Sayantan Datta <std2048@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: bf_kernel.cl
On Tue, Jul 10, 2012 at 11:41 AM, Sayantan Datta <std2048@...il.com> wrote:
>
>
> On Tue, Jul 10, 2012 at 11:32 AM, Solar Designer <solar@...nwall.com>wrote:
>
>> This also makes sense. Are you committing this change? I think it
>> makes the code simpler, although it needs 3 extra registers per bcrypt
>> instance. We should have plenty of spare registers since we're
>> under-utilizing the GPU anyway (assuming that this OpenCL code is being
>> run on a GPU).
>>
>
> Okay, I'll do it now. I will also start working on the combined global and
> local memory implementation today.
>
> Regards,
> Sayantan
>
> http://www.openwall.com/lists/john-dev/2012/05/14/1
I was looking at the IL generated on the 7970 for the LDS-only version. Each
Encrypt call comes to approximately 540 instructions at the IL level. However,
according to your previous estimates, each Encrypt call takes 16*16+4+5 = 275
instructions, resulting in an estimated speed of 52K c/s. Since the instruction
count is roughly doubled, we should expect at least half of your previous
estimate, say roughly 26K c/s. But we are nowhere near that. I guess your
previous estimate assumed that each instruction takes 1 clock cycle to
execute, is that right? It looks like not all instructions require the same
number of clock cycles on the GPU.
Sayantan
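
The scaling argument in the message above can be sketched as a small calculation. This is only an illustration under the message's own assumption that throughput scales inversely with instruction count at a fixed cost per instruction. The figures 275, 540 and 52K c/s are taken from the message as quoted; the function and variable names are made up for this sketch. Note that the expression 16*16+4+5 as written evaluates to 265, not 275; either value gives a ratio of about 2.

```python
def scaled_rate(base_rate, base_instr, actual_instr):
    """Scale a throughput estimate inversely with instruction count.

    Assumes every instruction costs the same number of clock cycles,
    which is exactly the assumption the message calls into question.
    """
    return base_rate * base_instr / actual_instr


EST_INSTR = 275    # per-Encrypt instruction count quoted in the message
IL_INSTR = 540     # approximate IL instruction count seen on the 7970
EST_RATE = 52_000  # estimated speed in c/s quoted in the message

expected = scaled_rate(EST_RATE, EST_INSTR, IL_INSTR)
print(f"expected rate at {IL_INSTR} instructions: {expected:.0f} c/s")
```

With these inputs the result is a little over 26K c/s, matching the "roughly 26K c/s" in the message. A measured rate far below that would suggest that the average cost per IL instruction is well above one clock cycle, or that something other than instruction count is the bottleneck.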