Message-ID: <80e9a68f85b7c0cfa235430b20adedcc@smtp.hushmail.com>
Date: Thu, 20 Dec 2012 02:03:03 +0100
From: magnum <john.magnum@...hmail.com>
To: "john-dev@...ts.openwall.com" <john-dev@...ts.openwall.com>
Subject: Various experimental OpenCL commits
Claudio, Sayantan, all
I have committed a couple of patches that are somewhat experimental:
1. A patch that adds an "opencl_process_event()" function in common-opencl.c, which can be called from within an iterated format's split-kernel loop. This gives a swift response to key presses as well as timely session saving, as discussed in another thread. The "pseudo-code patch" is now this simple for any format:
 void crypt_all(int count)
 {
     enqueue(Transfer);
     enqueue(RarInitKernel);
     for (i=0; i<HASH_LOOPS; i++)
     {
         enqueue(RarLoopKernel);
+        clFinish();
+        opencl_process_event();
     }
     enqueue(RarFinalKernel);
 }
After some glitches that were fixed, it seems to work just fine and introduces only a very minor performance drop. All my iterated formats now use this, but I did not touch Claudio's or Sayantan's formats; I leave it up to you whether to use it.
2. A patch that makes use of our beloved "spinning wheel" progress indicator during format self-test as well. This gives a way to see when a session goes from self-test to actual cracking. Hopefully, once we make the self-tests faster, we can drop this.
3. Modifications to all my OpenCL formats so they actually use the 'count' argument passed to crypt_all() to decrease the global worksize when possible. This has several good consequences: it makes Single mode work less badly (min_keys_per_crypt can be set to the local worksize) and it speeds up self-test, often a lot! For example, the Office 2007 benchmark took 1:45 before this patch, and just 25 seconds now.
This is quite simple: you just need to take local_work_size into account so that you end up with a multiple of it. For scalar formats, I just did this:
static void crypt_all(int count)
{
+ size_t crypt_gws = ((count + (local_work_size - 1)) /
+ local_work_size) * local_work_size;
...then replace all uses of global_work_size within the function with crypt_gws. Simple as that! Don't forget to set self->params.min_keys_per_crypt to local_work_size in init(). Actually, I set it to MAX(local_work_size, 8), because some CPU drivers will use a local_work_size of 1 and we don't want it that low.
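For anyone who wants to check the rounding in isolation, here is a minimal standalone sketch of the arithmetic from item 3. The helper names round_up_gws() and min_keys_for() are hypothetical and do not exist in the tree; they only restate the expression in the patch and the MAX(local_work_size, 8) floor mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: round count up to the next multiple of lws,
 * using the same expression as crypt_gws in the patch above.
 * lws (the local work size) must be non-zero.
 */
static size_t round_up_gws(size_t count, size_t lws)
{
    assert(lws > 0);
    return ((count + (lws - 1)) / lws) * lws;
}

/*
 * Hypothetical helper: the floor used for min_keys_per_crypt,
 * i.e. MAX(local_work_size, 8), so that a driver reporting a
 * local work size of 1 does not push the minimum that low.
 */
static size_t min_keys_for(size_t lws)
{
    return lws > 8 ? lws : 8;
}
```

With a local work size of 64, a count of 65 rounds up to 128, while a count of exactly 64 stays at 64, so no extra work-group is launched when count is already a multiple.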
All and any comments are welcome.
Enjoy,
magnum