john-dev - Re: Sayantan:Weekly Report #11

Message-ID: <20120705050657.GB16484@openwall.com>
Date: Thu, 5 Jul 2012 09:06:57 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Sayantan:Weekly Report #11

On Thu, Jul 05, 2012 at 08:57:01AM +0530, SAYANTAN DATTA wrote:
> Achievements:
> 1. Successfully compiled test openCL programs from modified il.

Can you demonstrate this to us so that we better understand what exactly
you're doing and what it looks like?  Maybe create a wiki page like:

http://openwall.info/wiki/john/development/AMD-IL

(or suggest a more suitable name).  Include the tiny(?) test programs in
there, along with the command lines used to compile them and sample
output.  The style of that wiki page can be similar to:

http://openwall.info/wiki/internal/gcc-local-build

> Integration
> of the binary generator into JtR is possible but I first need your
> permission as the generator requires libelf-devel as additional dependency.

Is there already a need to integrate the support for this into JtR?
In other words, what would your next steps be after such integration?
Do you know what specific IL-level changes you'd try?

> Priorities:
> 1. More focus on optimizing the opencl kernels.

OK.  Can you try to answer these questions?

Which execution units and how many per CU does your bf_kernel.cl (as
released in 1.7.9-jumbo-6) use on 7970?

Does it use scatter/gather addressing?  Or only gather?  Or neither?

How much LDS does it actually use?  Does it use any other memory type(s)
within the inner loop?

Before you started with bf_kernel.cl, I estimated that we might be able
to use up to 1/4th of 7970's computing resources yet fit in LDS.  Is
this what your code is trying to use or do you limit it to less, and why?

I recall that you wrote somewhere that you were only able to use 32 KB
of LDS per CU, not the full 64 KB.  (However, I can't find this now.)
Do I recall correctly, and if so why is that?
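
To make the LDS arithmetic behind the two questions above concrete, here
is a rough sketch.  It is only one possible reading of the 1/4th
estimate, not necessarily how it was derived.  The figures are
assumptions not stated in this message: each Blowfish instance keeps
only its four S-boxes (4 x 256 32-bit words, 4 KB) in LDS, and a 7970 CU
has 64 lanes (4 SIMD units x 16 lanes).  The function name is
hypothetical.

```python
# Assumed figures (not stated in the message above):
S_BOX_BYTES = 4 * 256 * 4   # Blowfish: four S-boxes of 256 32-bit words = 4 KB
LANES_PER_CU = 64           # assumed: 4 SIMD units x 16 lanes per CU on 7970


def lds_bound_fraction(lds_bytes_per_cu,
                       lanes_per_cu=LANES_PER_CU,
                       state_bytes=S_BOX_BYTES):
    """Return (instances per CU, fraction of lanes kept busy) when every
    concurrent instance must hold its S-boxes in LDS."""
    instances = lds_bytes_per_cu // state_bytes
    busy_lanes = min(instances, lanes_per_cu)
    return instances, busy_lanes / lanes_per_cu


# Compare the full 64 KB per CU with the 32 KB limit mentioned above.
for kb in (64, 32):
    n, frac = lds_bound_fraction(kb * 1024)
    print(f"{kb} KB LDS per CU: {n} instances, {frac:.3f} of lanes")
```

Under these assumptions, 64 KB of LDS corresponds to 1/4 of the lanes
and 32 KB to 1/8, which is why the usable LDS size matters here.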

Besides all of the above, your idea to use local and global memory at
once is a good one.  If we don't achieve much with other optimizations,
perhaps we'll achieve something like 7000 c/s on 7970 by adding some use
of global memory in parallel with the local memory uses currently in
your kernel.

Thanks,

Alexander