Message-ID: <80e9a68f85b7c0cfa235430b20adedcc@smtp.hushmail.com>
Date: Thu, 20 Dec 2012 02:03:03 +0100
From: magnum <john.magnum@...hmail.com>
To: "john-dev@...ts.openwall.com" <john-dev@...ts.openwall.com>
Subject: Various experimental OpenCL commits

Claudio, Sayantan, all

I have committed a couple of patches that are somewhat experimental:

1. A patch that adds an "opencl_process_event()" function in common-opencl.c that can be called from within an iterated format's split-kernel loop. This makes for swift response to key presses as well as timely session saving, as discussed in another thread. The "pseudo-code patch" is now this simple for any format:

 void crypt_all(int count)
 {
         enqueue(Transfer);
         enqueue(RarInitKernel);
         for (i=0; i<HASH_LOOPS; i++) {
                 enqueue(RarLoopKernel);
+                clFinish();
+                opencl_process_event();
         }
         enqueue(RarFinalKernel);

After some glitches that were fixed, it seems to work just fine and introduces only a very minor performance drop. All my iterated formats now use this, but I did not touch Claudio's or Sayantan's formats; I leave it up to you whether to use it.

2. A patch that makes use of our beloved "spinning wheel" progress indicator during format self-test as well. This gives a way to see when a session goes from self-test to actual cracking. Hopefully, after we make the self-tests faster, we can drop this.

3. Modifications to all my OpenCL formats so they actually use the 'count' argument passed to crypt_all() to decrease the global work size when possible. This has several good consequences: it makes Single mode work less badly (min_keys_per_crypt can be set to the local work size) and it speeds up self-test, often a lot! For example, the Office 2007 benchmark took 1:45 before this patch and takes just 25 seconds now. This is quite simple: you just need to take local_work_size into account so that the global work size you end up with is a multiple of it.
For scalar formats, I just did this:

 static void crypt_all(int count)
 {
+        size_t crypt_gws = ((count + (local_work_size - 1)) /
+                local_work_size) * local_work_size;

...then replace all uses of global_work_size within the function with crypt_gws. Simple as that! Don't forget to set self->params.min_keys_per_crypt to local_work_size in init(). Actually, I set it to MAX(local_work_size, 8) because some CPU drivers will use a local_work_size of 1 and we don't want it that low.

All and any comments are welcome.

Enjoy,
magnum
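
The round-up from item 3 can be tried in isolation with a small standalone C sketch. The helper names round_up_gws() and min_keys() are hypothetical and do not come from the John the Ripper tree; they only mirror the crypt_gws expression and the MAX(local_work_size, 8) choice described above, assuming local_work_size is nonzero.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of the JtR sources): round `count` up to
 * the next multiple of `local_work_size`, using the same integer
 * arithmetic as the crypt_gws expression in the message.
 * Assumes local_work_size > 0.
 */
static size_t round_up_gws(size_t count, size_t local_work_size)
{
    return ((count + (local_work_size - 1)) / local_work_size)
           * local_work_size;
}

/*
 * Hypothetical helper mirroring MAX(local_work_size, 8): keeps
 * min_keys_per_crypt from dropping to 1 on drivers that pick a
 * local_work_size of 1.
 */
static size_t min_keys(size_t local_work_size)
{
    return local_work_size > 8 ? local_work_size : 8;
}
```

With a local_work_size of 64, a count of 65 rounds up to 128, while a count of exactly 64 stays at 64, so the kernel is never launched with a global size that is not a multiple of the local size.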