Message-ID: <20120401110303.GA5374@openwall.com>
Date: Sun, 1 Apr 2012 15:03:03 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Cc: Per Thorsheim <per@...rsheim.net>, Sprengers.Martijn@...g.nl
Subject: MD5-crypt on Nvidia GPUs tips

Lukas, all -

Per Thorsheim tweeted this link:

http://2012.sharcs.org/slides/sprengers.pdf

While you've probably already implemented many of the same optimizations,
this is worth taking a look at.

FWIW, @hashcat's reply was:
"@thorsheim @Bitweasil @Openwall oclHashcat is deprecated since long
time. If you want to compete hashcat you have to beat oclHashcat-lite."

Martijn - you show some weird speed number (not even a number) for John
the Ripper.  I wonder where you got that one?  Well, my guess is that
it's a non-OpenMP build of non-jumbo, and/or maybe a non-optimal make
target.  We have plenty of faster speeds here:

http://openwall.info/wiki/john/benchmarks

I haven't added that one yet, but the fastest I am getting for a single
CPU chip is 203k c/s on an FX-8120 (or 220k c/s with some
overclocking).  This is not an expensive CPU - it's about $200.
That's with 8 threads and with XOP instructions in use (the latter will
be in the next -jumbo - available on github, but not released for
end-users yet).  The fastest currently on the wiki (for already released
code) is 851k c/s for four CPU chips (very expensive ones, though).

The XOP code (trivial changes over what we previously had, actually):

http://www.openwall.com/lists/john-dev/2012/03/18/3

We also have GPU code for John the Ripper here:

http://openwall.info/wiki/john/GPU

It is slower than hashcat's, though.  (IIRC, I get around 630k c/s
without tuning on my GTX 570 1600 MHz.  To do: need to try tuning it for
the card.)

There doesn't appear to be a published MD5-crypt speed number for
oclHashcat-lite (does it even support that?), but for oclHashcat-plus it
is 1197k c/s for GTX 570 1600 MHz, 9914.3k c/s for 2x6990 (so about
2500k per GPU chip).  The latter is roughly 12.5 times faster than what
I currently get on CPU (non-overclocked) per-chip.  Granted, the 7970 will
probably be a bit faster per-chip (it's just one chip there), but then
the FX-8120 is 2-3 times cheaper than the 7970 card.  (A motherboard plus
FX-8120 cost me half as much as a 7970 did.)
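
The per-chip arithmetic above is easy to recheck with a few lines of
Python.  This is only a sketch: it uses nothing but the c/s figures quoted
in this message, and the constant and function names are illustrative (they
are not part of John the Ripper or hashcat).  It assumes that 2x6990 means
four GPU chips in total, since each 6990 is a dual-GPU card.

```python
# MD5-crypt speeds quoted in this message, in thousands of c/s.
FX8120_ONE_CHIP = 203.0         # FX-8120, 8 threads, XOP build, not overclocked
WIKI_FOUR_CPU_CHIPS = 851.0     # fastest released-code entry on the wiki
GTX570_JOHN = 630.0             # John the Ripper GPU code, untuned
GTX570_HASHCAT_PLUS = 1197.0    # oclHashcat-plus on the same card model
HD6990_X2_HASHCAT_PLUS = 9914.3 # oclHashcat-plus on two dual-GPU cards


def per_chip(total_kcs, chips):
    """Divide a total speed evenly across the chips that produced it."""
    return total_kcs / chips


def speedup(fast_kcs, slow_kcs):
    """Ratio of two speeds; a value above 1 means the first is faster."""
    return fast_kcs / slow_kcs


if __name__ == "__main__":
    # Assumption: 2x6990 = 4 GPU chips.
    gpu_chip = per_chip(HD6990_X2_HASHCAT_PLUS, 4)
    cpu_chip_wiki = per_chip(WIKI_FOUR_CPU_CHIPS, 4)

    print("6990 per GPU chip:        %.1fk c/s" % gpu_chip)
    print("wiki 4-CPU, per chip:     %.1fk c/s" % cpu_chip_wiki)
    print("6990 chip vs FX-8120:     %.1fx" % speedup(gpu_chip, FX8120_ONE_CHIP))
    print("GTX 570 (plus) vs FX-8120: %.1fx"
          % speedup(GTX570_HASHCAT_PLUS, FX8120_ONE_CHIP))
    print("GTX 570 plus vs John GPU: %.1fx"
          % speedup(GTX570_HASHCAT_PLUS, GTX570_JOHN))
```

Using the unrounded 9914.3k figure gives a per-chip ratio slightly above
12x, in line with the rough estimate above, which starts from the rounded
2500k per chip.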

Of course, GPUs are faster than CPUs at suitable tasks, but the speedup
of 25-30 times given on the slide may be an over-estimate for MD5-crypt
on current GPUs vs. current CPUs (per chip).

BTW, given that the GTX 295 is a dual-GPU card, it could be fairly compared
against _two_ CPU chips of the same age (e.g., dual quad-core Xeons).
2xE5420 2.5 GHz (a machine that is several years old now) gives 215k c/s
with john-1.7.9-jumbo-5 (released last year).  So the demonstrated
speedup over CPU is only 4x then? ;-)

Alternatively, we need to be talking about ease and costs of building
multi-GPU vs. multi-CPU systems.  Yes, GPUs have an additional advantage
here.  Systems with 8 GPUs (four dual-GPU cards) are affordable to
hobbyists, whereas servers with 8 CPUs are a lot pricier (even if the
individual chips cost the same to produce).  That's so for marketing
reasons, I think.  On the other hand, machines with mostly-idle CPUs
tend to be readily available at any given company (and in large
quantities), whereas high-end GPUs would need to be specifically
purchased for the password cracking.  So the comparison is not simple if
we try to consider these things.

Alexander
