Message-ID: <CANJ2NMMoV1EWU-sqW1u5jeFZZ4ACcMRRJT2a5dWbi06Rqvm4HQ@mail.gmail.com>
Date: Thu, 29 Mar 2012 05:16:21 +0800
From: myrice <qqlddg@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: rawsha256.cu patch(using shared memory)
Hi Lukas, Solar,
Thank you for your help!
To Lukas:
> You used shared memory after ALL time consuming computations, totaly
> not good idea:)
> First of all you must decide will you apply for slow, or fast hashes.
> Those are DIFFERENT tasks with different needs.
I went through the code and found your comment "//use shared memory". I
assumed you meant using shared memory to make the final output (i.e. the
write to global memory) coalesced. As the code shows, I first write the
result to shared memory and then write it to global memory. This contributes
to the performance gain. I was just trying this out. Now I understand that
making fast hashes efficient takes a lot more work, and I will discuss that
with Solar next.
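A minimal sketch of the staging idea described above, not the actual
rawsha256.cu patch. It assumes 128 threads per block and one 8-word digest
per thread. fake_digest_word() is a hypothetical stand-in for the real
SHA-256 computation, and stage_and_store() and count_mismatches() are names
made up for this illustration:

```cuda
#include <cstdint>
#include <cstdlib>
#include <cuda_runtime.h>

#define THREADS 128          // threads per block (assumed for this sketch)
#define WORDS   8            // a SHA-256 digest is 8 x 32-bit words
#define STRIDE  (WORDS + 1)  // one padding word per digest reduces bank conflicts

// Hypothetical stand-in for the real hash; works on host and device.
__host__ __device__ inline uint32_t fake_digest_word(uint32_t idx, uint32_t w)
{
    return idx * 2654435761u + w;
}

__global__ void stage_and_store(uint32_t *out)
{
    __shared__ uint32_t stage[THREADS * STRIDE];
    uint32_t gid = blockIdx.x * blockDim.x + threadIdx.x;

    // Step 1: each thread stores its own digest in shared memory.
    for (uint32_t w = 0; w < WORDS; w++)
        stage[threadIdx.x * STRIDE + w] = fake_digest_word(gid, w);
    __syncthreads();

    // Step 2: the whole block copies the staged digests to global memory.
    // On every pass, consecutive threads write consecutive addresses.
    uint32_t base = blockIdx.x * blockDim.x * WORDS;
    for (uint32_t j = threadIdx.x; j < blockDim.x * WORDS; j += blockDim.x)
        out[base + j] = stage[(j / WORDS) * STRIDE + (j % WORDS)];
}

// Launches the kernel and counts words that differ from the expected layout
// out[gid * WORDS + w]. Returns -1 if a CUDA call or allocation fails.
int count_mismatches(uint32_t blocks)
{
    const uint32_t n = blocks * THREADS * WORDS;
    uint32_t *d_out = NULL;
    uint32_t *h_out = (uint32_t *)malloc(n * sizeof(uint32_t));
    if (!h_out)
        return -1;
    if (cudaMalloc(&d_out, n * sizeof(uint32_t)) != cudaSuccess) {
        free(h_out);
        return -1;
    }

    stage_and_store<<<blocks, THREADS>>>(d_out);
    cudaError_t err = cudaMemcpy(h_out, d_out, n * sizeof(uint32_t),
                                 cudaMemcpyDeviceToHost);
    int bad = (err == cudaSuccess) ? 0 : -1;
    for (uint32_t i = 0; bad >= 0 && i < n; i++)
        if (h_out[i] != fake_digest_word(i / WORDS, i % WORDS))
            bad++;

    cudaFree(d_out);
    free(h_out);
    return bad;
}
```

Without the staging step, thread t would write out[t * 8 + w] directly, so
neighbouring threads would hit addresses 32 bytes apart on every store. As
Lukas points out, this only touches the final write and does nothing for the
computation before it.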
To Solar:
> That's nice, but this is still awfully slow. In fact, even the
> benchmarks we have on the wiki somehow show higher speeds, even though
> you have a faster card (GTX-580, right?)
I am sorry for leaving out my hardware details. The GTX-580 is in my lab's
server, but it has recently become unstable :(
I tested this code on my laptop with a GeForce 9600M GS card and a P8600 CPU,
so the performance is low.
> The formats interface bottleneck is somewhere above 50M c/s. Actually,
> --format=dummy shows it at around 130M c/s on Core i7-2600, which is
> what you said you use, but indeed interfacing to the GPU takes time.
> With Samuele's fast hash implementations in OpenCL and running on GPU,
> we're getting close to 50M c/s. So you also need to get close to that.
> This is a good thing for you to attempt.
> (And once you get there, you'd need to somehow demonstrate that your
> code would be even faster without the interface bottleneck - e.g., by
> starting to implement candidate password generation and hash comparison
> on GPU in whatever quick way you can for the demo.)
Okay, I will implement XSHA512 first. If I have time, I will do this.
However, I think implementing candidate password generation and comparison
on the GPU would be a lot of work. I would have to go through the existing
password generation code (I guess it is mainly in cracker.c?) and substitute
it with CUDA.