john-users - Re: [openwall-announce] Energy-efficient bcrypt cracking (Passwords^13 slides)

Message-ID: <005a01cef06e$da12ff70$8e38fe50$@net>
Date: Tue, 3 Dec 2013 22:30:13 +0100
From: "Jeroen" <jeroen@...sman.net>
To: <john-users@...ts.openwall.com>
Subject: Re: [openwall-announce] Energy-efficient bcrypt cracking (Passwords^13 slides)

Interesting presentation!

It might be worth testing the Intel Xeon E3-1265L v2/v3 (4c/8t). It offers
about 10% less CPU performance than an Intel Core i7-4770K (according to
'normal' benchmarks) but has a TDP of only 45 W(!). It might be a winner in
the x86 arena.
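
As a rough back-of-the-envelope check of that claim, here is a minimal
sketch. It assumes that TDP is a usable stand-in for actual power draw (it
is only an upper-bound design figure), that general-purpose benchmark
scores carry over to bcrypt (they may not), and that the i7-4770K has an
84 W TDP. That last number is Intel's published spec and is not stated in
this message. The function name is made up for illustration.

```python
def perf_per_watt(relative_perf, tdp_watts):
    """Relative performance divided by TDP (a crude proxy for real power use)."""
    return relative_perf / tdp_watts

# Assumption: 84 W is Intel's published TDP for the Core i7-4770K.
I7_4770K_TDP_W = 84.0
# From the message above: 45 W TDP, roughly 10% slower in general benchmarks.
XEON_E3_1265L_TDP_W = 45.0

i7 = perf_per_watt(1.00, I7_4770K_TDP_W)
xeon = perf_per_watt(0.90, XEON_E3_1265L_TDP_W)

# How many times more performance per TDP Watt the Xeon would offer.
advantage = xeon / i7
print(f"Xeon advantage in perf per TDP Watt: {advantage:.2f}x")
```

Under these assumptions the Xeon comes out at about 1.68x the performance
per TDP Watt, which would need to be confirmed with measured power and
actual bcrypt speeds.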

Cheers,

Jeroen

-----Original Message-----
From: Solar Designer [mailto:solar@...nwall.com] 
Sent: Tuesday, 03 December, 2013 02:26
To: announce@...ts.openwall.com; john-users@...ts.openwall.com
Subject: [openwall-announce] Energy-efficient bcrypt cracking (Passwords^13
slides)

Hi,

Continuing (beyond) her GSoC project with Openwall, Katja Malvoni has
presented a comparison of bcrypt cracking efficiency achieved on chips based
on the Epiphany many-core architecture and on an FPGA vs. common CPUs, GPUs, and
Intel MIC (Xeon Phi).  Katja has delivered the talk (including a live demo)
at PasswordsCon Bergen, and we've placed the slides online:

http://www.openwall.com/presentations/Passwords13-Energy-Efficient-Cracking/

To summarize, bcrypt cracking is still very slow, but Epiphany and FPGAs
achieve much higher speeds per Watt, which enables higher density (more
chips per board, more boards per system).  While Katja's implementation of
bcrypt cracking on Epiphany is nearly optimal, there's still a lot of room
for optimization for FPGAs, so we expect to continue this project.

While this is primarily Katja's project and talk, I ended up joining as a
co-author for the presentation, to provide baseline performance and power
usage figures for CPUs, GPUs, and MIC.  While working on this, I further
optimized JtR's bcrypt code for x86-64 by implementing 3x interleaving,
which helped some CPUs achieve higher speeds.  I also produced and tried out
revisions of Steve Thomas' bcrypt/AVX2 code on Haswell and, with further
changes to move to MIC intrinsics, on Knights Corner (results: no luck
getting any speedup over optimized scalar implementations).  The 3x
interleaving will likely get into an upcoming JtR release, although it might
not be enabled by default (unfortunately, there are also CPUs where this
change hurts performance a little bit).
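
To illustrate what 3x interleaving means, here is a toy sketch. It is not
JtR's code (which is C and assembly) and not real Blowfish: the S-box
generator, the round function, and all names are made up for illustration.
The idea shown is that three independent candidates, each with its own
S-box state, advance through the rounds inside one loop, so that on a real
CPU the data-dependent lookups of one instance can overlap with the
latency of the others. Python itself gains no speed from this; the sketch
only shows the instruction ordering and checks that results are unchanged.

```python
MASK32 = 0xFFFFFFFF

def make_sbox(seed):
    # Toy 256-entry table from a simple LCG; NOT Blowfish's real S-boxes.
    x = seed
    box = []
    for _ in range(256):
        x = (x * 1664525 + 1013904223) & MASK32
        box.append(x)
    return box

def f(sbox, x):
    # Four data-dependent lookups combined with add/xor, Blowfish-like in shape only.
    a = sbox[(x >> 24) & 0xFF]
    b = sbox[(x >> 16) & 0xFF]
    c = sbox[(x >> 8) & 0xFF]
    d = sbox[x & 0xFF]
    return ((((a + b) & MASK32) ^ c) + d) & MASK32

def encrypt_one(sbox, left, right, rounds=16):
    # One instance: every round depends on the previous round's output.
    for _ in range(rounds):
        left, right = right ^ f(sbox, left), left
    return left, right

def encrypt_three(sboxes, blocks, rounds=16):
    # Three independent instances advanced round by round in the same loop.
    s0, s1, s2 = sboxes
    (l0, r0), (l1, r1), (l2, r2) = blocks
    for _ in range(rounds):
        t0 = f(s0, l0)
        t1 = f(s1, l1)
        t2 = f(s2, l2)
        l0, r0 = r0 ^ t0, l0
        l1, r1 = r1 ^ t1, l1
        l2, r2 = r2 ^ t2, l2
    return [(l0, r0), (l1, r1), (l2, r2)]

sboxes = [make_sbox(seed) for seed in (1, 2, 3)]
blocks = [(0x01234567, 0x89ABCDEF), (0, 1), (0xFFFFFFFF, 0x12345678)]

sequential = [encrypt_one(s, l, r) for s, (l, r) in zip(sboxes, blocks)]
interleaved = encrypt_three(sboxes, blocks)
print(sequential == interleaved)
```

Whether the interleaved form is actually faster depends on the CPU's
register count, cache behavior, and out-of-order resources, which is
consistent with the change helping some CPUs and hurting others.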

Alexander