Message-ID: <CA+TsHUDJfuB=RnfsMdzpsv5Gai6GBLu6fLPOQgGKnqaUMhS2ig@mail.gmail.com>
Date: Mon, 9 Jul 2012 22:55:08 +0530
From: Sayantan Datta <std2048@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: bf_kernel.cl (was: Sayantan:Weekly Report #11)
Hi Alexander,
On Mon, Jul 9, 2012 at 1:02 PM, Sayantan Datta <std2048@...il.com> wrote:
> I'd try setting WORK_GROUP_SIZE to 12, keep the declaration of S_Buffer
>> at its current size - introduce some new macro for this, like
>> LDS_GROUP_SIZE, which we'd keep at 8 for 7970. If lid is <
>> LDS_GROUP_SIZE, then use the current code. If lid is >= LDS_GROUP_SIZE
>> (would be 8, 9, 10, or 11 under this example), then use new code that
>> would use global memory instead (just modify the supplied BF_current_S
>> directly?)
>
>
I think this is not a good idea, because all 12 work items in a work group
would be executed on a single SIMD and we might see a slowdown instead of
any speedup. Here's the reason:
1. Due to branching within the work group, the execution of the two
branches would get serialized.
2. You still cannot increase the number of SIMD units used per CU, because
each work group still eats 32KB of LDS. So we are again limited to 2 SIMD
units per CU.
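The LDS limit in point 2 can be checked with a little arithmetic. Below is
a rough sketch in Python. The 64KB of LDS per CU and the 4 SIMD units per CU
are assumed figures for the 7970 and are not stated in this thread, and the
function name is only illustrative. The 4KB per work item follows from
Blowfish's four S-boxes of 256 32-bit entries each.

```python
# Assumed hardware figures for the HD 7970; not taken from this thread.
LDS_PER_CU = 64 * 1024   # bytes of LDS available to one compute unit
SIMDS_PER_CU = 4         # SIMD units in one compute unit

# Blowfish uses four S-boxes of 256 32-bit entries each: 4KB per work item.
SBOX_BYTES_PER_ITEM = 4 * 256 * 4


def groups_per_cu(lds_group_size):
    """Number of work groups whose S-boxes fit in one CU's LDS at once."""
    lds_per_group = lds_group_size * SBOX_BYTES_PER_ITEM
    return LDS_PER_CU // lds_per_group


resident = groups_per_cu(8)            # LDS_GROUP_SIZE stays at 8
busy = min(resident, SIMDS_PER_CU)     # at most one group per SIMD here
print(resident, "work groups per CU,", busy, "of", SIMDS_PER_CU, "SIMDs busy")
```

Under these assumptions, raising WORK_GROUP_SIZE to 12 while keeping
LDS_GROUP_SIZE at 8 leaves the LDS footprint per work group unchanged, so
the number of resident work groups per CU does not go up.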
Here's my plan:
Let us assume that each work item using LDS takes T seconds.
Also assume that each work item using global memory takes xT seconds, where
x is unknown and has to be determined by experiment.
Keep the work group size at 8, as before.
Schedule the work items like this:
[(16 LDS work items + 16 GDS work items) X 32 + (x-1)(16 LDS work items) X 32]
X MULTIPLIER
In this case there is no branching within a work group, which means each
work group can execute independently on separate SIMDs. Also, all SIMD units
will be utilized, even though occupancy within the SIMDs still remains
halved. Depending on x, we would get the following speedup:
Here's the calculation:
When using only LDS:         speed = 512 hashes/T
When using both LDS and GDS: speed = {1024 + (x-1)512}/(xT)
speedup factor = (both LDS and GDS)/(only LDS)
               = (x+1)/x
So for x=5 the speedup factor is 1.2, or 20%;
for x=10 the speedup would be 10%.
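The calculation above is easy to tabulate for other values of x. Here is a
minimal sketch in Python; the function name and the lds_rate parameter are
only illustrative, and x still has to come from measurement.

```python
def speedup(x, lds_rate=512):
    """Speed of the mixed LDS + global memory schedule relative to LDS only.

    x        -- how many times slower a global memory work item is (xT vs T)
    lds_rate -- hashes per T when only LDS work items run (512 above)
    """
    only_lds = lds_rate
    # In a window of x*T the LDS items complete x rounds and the global
    # memory items complete one round: 1024 + (x-1)*512 hashes when
    # lds_rate is 512. Dividing by x gives hashes per T.
    mixed = (2 * lds_rate + (x - 1) * lds_rate) / x
    return mixed / only_lds


for x in (2, 5, 10, 20):
    print("x =", x, "speedup factor =", round(speedup(x), 3))
```

The gain shrinks towards zero as x grows, so the scheme only pays off if
global memory access turns out to be not too many times slower than LDS.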
Regards,
Sayantan