john-users - Re: GSOC - GPU for hashes

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20110322014224.GA1134@openwall.com>
Date: Tue, 22 Mar 2011 04:42:24 +0300
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: GSOC - GPU for hashes

On Mon, Mar 21, 2011 at 03:17:21AM +0100, ?ukasz Odzioba wrote:
> About fast hashes, I suppose that the only way to do it right is try
> to implement both versions (in multiple variants), make benchmarks and
> then choose the most appropriate one.

Some hashes are known to be so fast that it's obvious that having
candidate passwords generated on the CPU will create a major bottleneck,
up to the point where delegating hashing to the GPU provides little
speedup (if at all).  So for those the choice will be obvious before we
start.  Yet in terms of development and usability it does make sense to
implement things the easy way first, then proceed from there.  In fact,
folks already did it (with OpenCL) for a few fast hashes:

http://openwall.info/wiki/john/patches

see "OpenCL support for cracking NT, raw-MD4, raw-MD5 and raw-SHA1
hashes" by Alain Espinosa, Dhiru Kholia, Samuele Tonon.

> I found FPGA hash crackers on
> the Internet. From architectural point of view it might be good idea
> to use similar solutions.

Theoretically, yes, but I am unaware of a good one (from this point).

One of the advantages of JtR is that it generates candidate passwords in
a nearly optimal order (more likely passwords first, based on available
statistics).  As soon as we offload some of that to the GPU, we either
make the order in which candidate passwords are tried less optimal
(e.g., the last 2 characters always cover the entire ASCII range, even
though many uncommon character combinations would be better tried at a
later time) or we do complicated and not-so-fast processing on the GPU
(besides implementation complexity, we might introduce a new bottleneck
for hashes that are truly fast rather than just medium-speed).

"Competing" tools (and timings) to look at are:
http://hashcat.net/oclhashcat+/
http://hashcat.net/oclhashcat/

Of course, we must not reuse any code from there (nor from anywhere
else, unless appropriately licensed).

The timings for oclHashcat+ suggest that their on-GPU rule engine is
fast enough that it is not a bottleneck.  We should be able to achieve
something similar, and for JtR's incremental mode as well.

Also, they show that "md5crypt $1$" and "phpass $P$9" are slow enough
that it is acceptable not to move the rule engine onto the GPU for
those, unless there are many fast GPUs per CPU.  Of course, further
speedup is achieved with such a move, so it would be desirable to make
eventually, but it's not required for the first implementation that
would be of practical use.  Thus, we can treat these hashes as "slow".

> In Poland summer break starts in July, and before that we've got egzams.
> Q5) Will it be a problem to distribute work in 33:67 properties rather
> than 50:50?

I'd prefer 67:33 (and I think Google prefers this too), especially given
that our major opportunity to test this stuff is on July 29-31 (and we
need some users already familiar with the JtR/GPU stuff before then):
http://contest.korelogic.com

However, our mentoring capacity may be the bottleneck anyway, so we may
use this as an opportunity to dedicate more time to other students
before July.

> Q6) Is there any JtR code documentation(I saw user doc)? It would be
> helpful to get familiar with sources.

JtR's internal interfaces are documented in the .h files.  So if you
see a function in some .c file that is not "static", then it is exported
for use by other source files and there should be a comment on it in the
.h file.  Besides comments on the interfaces, there are also some
comments throughout the code - these are generally for things that I
thought were especially non-obvious.  Other than that, I'm afraid that
there's no documentation.

If/when you have specific questions, please ask on john-dev.  I might
respond by adding source code comments specifically on things that you
found non-obvious, which will help others as well.

Similarly, when you find something non-obvious but then figure it out on
your own, it'd be very nice of you to propose source code comments to
add (maybe submit these as patches: post them to john-dev).  If you're
good at creating programmers' documentation, we'd appreciate it if you
start writing some, maybe on our wiki: register for an account and
create a page like
http://openwall.info/wiki/john/development/program-structure (it does
not exist yet), or whatever would be the appropriate name for the
content you'd be adding.  Oh, there's already this wiki page:

http://openwall.info/wiki/john/development

which currently has little info (and I don't recommend using an IDE),
yet it's where you'd add link(s) to your new sub-page(s) to... or you
may start by adding your content right to the "development" page, to be
moved to sub-page(s) when there's enough content for that.  That's up
to you.

> Q7) I found yours application template and it's the only thing that
> "detailed description" must contain, can I add something to it?

Of course.  Please do add other relevant info you might have.  We'd
appreciate it!

> Q8) Today i am going to library for Bruce Schneier's "Applied
> Cryptography". Are there any good books related to the topic, which I
> should look for?

As Samuele has correctly pointed out, this task is not so much about
cryptography as it is about optimization of algorithms and code.
I think you'll "learn this" from reading of existing source code, from
your own attempts at producing efficient implementations, and from
discussions on john-dev (you'll need to bring topics up and ask
questions).

Thanks,

Alexander
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.