Message-ID: <CANJ2NMNXp8rwKnUYNCnOR7TCR+gU_uKiWE7EtszE6MHOmdzhDw@mail.gmail.com>
Date: Tue, 3 Apr 2012 18:05:19 +0800
From: myrice <qqlddg@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: fast hashes on GPU
On Sat, Mar 31, 2012 at 3:08 PM, Solar Designer <solar@...nwall.com> wrote:
>
> I just took a look. You haven't yet implemented the keys_changed trick
> that I had proposed - you're sending the entire set of keys to GPU on
> every crypt_all() call, which you don't have to do. Please implement
> this one trick and re-benchmark the thing _before_ you possibly proceed
> with the salts optimization (which is a lot more complicated). We need
> to know which of the optimizations made what performance difference.
>
I have now implemented the keys_changed trick. When no key has changed,
the keys stay on the GPU and cudaMemcpy is not called for them. As the
next step, I will implement the salts optimization (the lengthy one).

Here are the benchmarks (I will put the code on my GitHub later):
---------Before keys_changed trick-----------------------
Benchmarking: Mac OS X 10.7+ salted SHA-512 CUDA [64/64]... DONE
Many salts: 1080K c/s real, 1086K c/s virtual
Only one salt: 1056K c/s real, 1059K c/s virtual
---------After keys_changed trick--------------------------
Benchmarking: Mac OS X 10.7+ salted SHA-512 CUDA [64/64]... DONE
Many salts: 1134K c/s real, 1134K c/s virtual
Only one salt: 1092K c/s real, 1092K c/s virtual
As I expected, this does not give much of a speedup: about 5% with many
salts and about 3% with only one salt. The CUDA profiler also shows that
cudaMemcpy takes only a small share of the time during cracking.
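
For readers who have not followed the earlier thread, here is a minimal
sketch in plain C of the keys_changed idea described above. It is not the
actual patch: the buffer names, the sizes, and the transfers counter are
made up for illustration, and an ordinary memcpy into a host array stands
in for the cudaMemcpy call so that the sketch runs without a GPU.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sizes, chosen only for this sketch. */
#define MAX_KEYS 4
#define PLAINTEXT_LENGTH 16

static char host_keys[MAX_KEYS][PLAINTEXT_LENGTH + 1];
/* Stand-in for the key buffer in GPU memory (assumption for the sketch). */
static char device_keys[MAX_KEYS][PLAINTEXT_LENGTH + 1];

static int keys_changed = 1; /* nothing has been uploaded yet */
static int transfers = 0;    /* counts simulated host-to-device key copies */

/* Store a candidate key on the host and mark the key set as dirty. */
static void set_key(const char *key, int index)
{
    assert(index >= 0 && index < MAX_KEYS);
    strncpy(host_keys[index], key, PLAINTEXT_LENGTH);
    host_keys[index][PLAINTEXT_LENGTH] = '\0';
    keys_changed = 1;
}

/* Upload the keys only if at least one changed since the last call. */
static void crypt_all(int count)
{
    assert(count >= 0 && count <= MAX_KEYS);
    if (keys_changed) {
        /* In a real CUDA format this would be a cudaMemcpy to the device. */
        memcpy(device_keys, host_keys, (size_t)count * sizeof(host_keys[0]));
        transfers++;
        keys_changed = 0;
    }
    /* The salt upload and the kernel launch would follow here on every call. */
}
```

With this sketch, setting two keys and then calling crypt_all(2) three
times in a row (as happens when the same keys are tried against several
salts) performs one key copy; a further set_key() followed by crypt_all(2)
performs a second one.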
P.S. I fixed the bug you mentioned. I also added #pragma unroll 64 and
modified PLAINTEXT_LENGTH. However, on my G9600M GS card, this doesn't
give me much of a performance gain. :(
Thanks!
Dongdong Li