Message-ID: <20051017084227.GA4280@openwall.com>
Date: Mon, 17 Oct 2005 12:42:27 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Using Hardwareaccelerators to speed up John

On Sun, Oct 16, 2005 at 06:44:09PM -0700, h1kari wrote:
> I would like to initially comment that I have more detailed information
> on my work on a website I just put up: http://www.openciphers.org
[...]
> I know that some of the statistics in my older presentations were a bit
> off. Currently right now on our (Pico E-12) LX25 boards we are able to
> clock our design at either 125MHz with one DES core cracking 128 hashes
> in parallel or the core instantiated 4 times cracking one hash at 125MHz
> * 4 per second.

Yes.  So that's up to 500M c/s at LM for a single chip, but only when
cracking one or two hashes (according to a table on your website).
That's very impressive.

However, reading it another way, the key rate becomes 4 times lower
with the number of hashes loaded for cracking increased to 128, and you
currently do not even support loading more than 128.  Is that correct?

If so, starting with a few thousand LM hashes to crack, John running
on a single modern CPU would outperform your implementation on a single
Pico E-12 LO.  Is that correct?
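
As a rough illustration of this break-even, here is a minimal sketch of the
arithmetic. It assumes that loading more than 128 hashes would mean one full
pass over the keyspace per batch of 128, and that John's key rate on the host
stays roughly constant as more hashes are loaded. The CPU rate in the usage
note is a placeholder chosen for illustration, not a measurement, and the
function names are hypothetical.

```python
import math

FPGA_RATE_BATCH = 125e6  # keys/s: one DES core at 125 MHz, up to 128 hashes
BATCH = 128              # maximum hashes loaded at once (from the thread)

def fpga_effective_rate(num_hashes):
    """Effective keys/s if every batch of 128 hashes needs its own pass."""
    passes = math.ceil(num_hashes / BATCH)
    return FPGA_RATE_BATCH / passes

def break_even_hashes(cpu_rate):
    """Smallest hash count at which a CPU running at cpu_rate keys/s wins."""
    if cpu_rate <= 0:
        raise ValueError("cpu_rate must be positive")
    passes = 1
    # Add passes until the card's effective rate drops below the CPU's rate.
    while FPGA_RATE_BATCH / passes >= cpu_rate:
        passes += 1
    # The first hash count that forces this many passes.
    return (passes - 1) * BATCH + 1
```

With an assumed host rate of 5M c/s at LM, `break_even_hashes(5e6)` gives
3201, which is in line with "a few thousand" above; a different host rate
moves the break-even point proportionally.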

> For Unix DES, it would essentially be the Lanman
> performance / 25, since Unix DES requires 25 rounds, so the max
> performance of our card is currently ~50M c/s, which is a little less
> than my projected number in the slides.

How do you derive that ~50M c/s figure?  500M c/s at LM would translate
to roughly 20M c/s at traditional crypt(3), no?
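
The ~20M figure comes from dividing the LM rate by 25, since traditional
DES-based crypt(3) applies DES 25 times per hash. A quick check of that
arithmetic, using only the numbers quoted in this thread and ignoring any
salt or key setup overhead (the function name is hypothetical):

```python
DES_ITERATIONS_CRYPT3 = 25  # traditional crypt(3) iterates DES 25 times

def crypt3_rate_from_lm(lm_rate, iterations=DES_ITERATIONS_CRYPT3):
    """Estimate the crypt(3) c/s rate from an LM c/s rate on the same hardware."""
    return lm_rate / iterations

lm_rate = 125e6 * 4                      # 4 cores at 125 MHz = 500M c/s at LM
crypt3_rate = crypt3_rate_from_lm(lm_rate)

# Conversely, the LM rate that ~50M c/s at crypt(3) would imply.
implied_lm_rate = 50e6 * DES_ITERATIONS_CRYPT3
```

Here `crypt3_rate` is 20M c/s, while a ~50M c/s crypt(3) rate would require
about 1250M c/s at LM, which is 2.5 times the quoted single-chip figure.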

That's still very impressive indeed.  However, the limitation on the
number of hashes which may be cracked at a time is really unfortunate.
Perhaps it will be easier to overcome for slower hashes such as crypt(3)
where you would not make things a lot slower by having the comparisons
actually take a few extra cycles.  (And you would probably not have too
many hashes with the same salt anyway.)

> > It's not only about c/s
> > rates, but also about the order in which candidate passwords are tried.
[...]
> > A similar problem exists with testing the computed hashes against a
> > large number of those loaded for cracking.
[...]

> Yeah. I'm sorry it ended up coming out comparing directly to the
> functionality of John. The idea I was trying to get across was that when
> most people think of password cracking, they think of john, and I was
> doing something similar.

The comparison is fine, but a few clarifications such as those I've
given would be in order. ;-)

> Ideally, I'd like to have the FPGA act as a
> hardware accelerator plugin for John and be able to directly enhance the
> speed of checking based on intelligent wordlists.

That would be great.

> Right now our only
> niche with this project is for passwords that can't be easily cracked by
> John or L0phtcrack.

Understood.  Is there much legitimate demand for that, though, when
we're talking OS login passwords?  Penetration testing?

> Anyway, I definitely didn't mean to knock the work you guys are doing. A
> lot of the benchmarking stuff is a little murky and it was hard to find
> specific benchmarks from the different open source projects. If you guys
> could provide some specs, I would really like to set up a page that
> provides accurate performance information from all of the projects
> including rainbowcrack/l0phtcrack/etc, or maybe there's a resource
> already out there for that.

I am not aware of such a resource with accurate and up-to-date
information.  For John specifically, I'd be happy to provide any
performance numbers you may be interested in.

> As far as future work. We've been doing a lot of research with the
> Virtex-4 FX cards and the onboard PowerPCs and we see a lot of potential
> for using the APU bus to provide custom instructions to software (john)
> that would allow you to accelerate your DES and other functions with
> single instruction calls.

I've checked out this webpage:

http://www.xilinx.com/products/silicon_solutions/fpgas/virtex/virtex4/capabilities/powerpc.htm

and, as far as I understand, the APU bus is internal to the FPGA chip,
so are you suggesting to run an embedded Linux and John on one or both
of the embedded PowerPCs?

> I don't know how much this would speed up john
> considering the onboard PowerPCs can only be clocked up to 450MHz, but
> it seems like it would at least be a bit of a speed improvement over
> doing the crypto in software. Your comments on this would be really
> appreciated.

Yes, this would probably result in at least some speedup for some hash
types; however, it'd be tricky to manage (e.g., no host filesystem access
from the embedded OS and vice versa), and not being able to use the
host's more powerful CPU(s) and bigger memory sounds like a waste.

I think it'd be better to continue running most of John on the host
system, but have it communicate with its low-level parts in the FPGA.
It is not obvious whether the embedded PowerPCs would allow us to simplify
or speed up such communication.  A possible use for them would be to
implement the logic of weird algorithms such as the FreeBSD-style
MD5-based crypt(3), leaving the precious logic cells to more instances
of MD5 itself.

> Also, if we were able to provide the hardware end of this to you guys,
> would you be interested in tying it into john?

I personally would definitely be interested in doing that.  However, I
wouldn't be able to dedicate a lot of time to it unless the project
would also be (expected to be) successful commercially.  (This is a
topic we can discuss off-list.)

Perhaps we could start by having the FPGA card do the bare minimum and
having John running on the host system communicate candidate passwords
and salts to the card and computed hashes back.  It won't be very fast,
but it's likely the quickest way to get us started.  We can do it for
just one hash type initially (perhaps one of the Unix hashes since we
need it to not be very fast).
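
The host/card split described above could look roughly like the sketch
below. Everything in it is hypothetical: `MockCard` stands in for the FPGA
card, `stand_in_hash` uses SHA-256 purely as a placeholder for a real Unix
hash, and no actual John or Pico interface is implied. The host groups the
loaded hashes by salt, sends each salt with a batch of candidates to the
card, and does the comparisons itself.

```python
import hashlib
from collections import defaultdict

def stand_in_hash(salt, password):
    # Placeholder only: a real setup would compute e.g. DES-based crypt(3).
    return hashlib.sha256((salt + password).encode()).hexdigest()

class MockCard:
    """Hypothetical card: takes a salt and candidates, returns hashes in order."""
    def compute(self, salt, candidates):
        return [stand_in_hash(salt, c) for c in candidates]

def crack(card, loaded, candidates):
    """loaded: iterable of (salt, hash) pairs. Returns {(salt, hash): password}."""
    by_salt = defaultdict(set)
    for salt, h in loaded:
        by_salt[salt].add(h)

    found = {}
    for salt, targets in by_salt.items():
        # Host sends salt + candidates to the card; computed hashes come back.
        computed = card.compute(salt, candidates)
        for password, h in zip(candidates, computed):
            if h in targets:  # comparison against loaded hashes stays on the host
                found[(salt, h)] = password
    return found
```

For example, loading `("ab", stand_in_hash("ab", "secret"))` and trying the
candidates `["guess", "secret"]` returns a single entry mapping that pair to
`"secret"`. Keeping the comparisons on the host avoids the card's limit on
the number of loaded hashes, at the cost of transferring every computed hash
back.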

Thanks,

-- 
Alexander Peslyak <solar at openwall.com>
GPG key ID: B35D3598  fp: 6429 0D7E F130 C13E C929  6447 73C3 A290 B35D 3598
http://www.openwall.com - bringing security into open computing environments
