john-dev - Re: --fork

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20130507202634.GB18836@openwall.com>
Date: Wed, 8 May 2013 00:26:34 +0400
From: Solar Designer <solar@...nwall.com>
To: Tavis Ormandy <taviso@...xchg8b.com>
Cc: john-dev@...ts.openwall.com
Subject: Re: --fork

Hi Tavis,

On Tue, May 07, 2013 at 08:29:48AM -0700, Tavis Ormandy wrote:
> Hey Solar, if I understand correctly this skips the inc_key_loop step on
> alternate nodes during do_inc_crack to distribute the work (similar in
> concept, but less overhead than the parallel external mode).
> 
> e.g.
> 
> for (i = 0; i < n; i++) {
>  if (this_node % i == 0)
>    inc_key_loop()
> }

You mean something like "i % number_of_nodes == this_node".  Yes, that's
what we do.  It's the same thing we did with MPI in jumbo.

> I have an algorithm to skip to an arbitrary state in constant time without
> having to increment it, would you be interested in a patch against core?

I do remember your work on this, thanks!  However, your algorithm is
probably not usable for the new incremental mode as-is (with
per-position character counts growing separately from each other) - it
will need slight changes.  I actually wanted to look into doing that for
1.8, but then decided to leave that for post-1.8 to avoid scope creep
for 1.8 (it's time to release 1.8 already!)

Unfortunately, this does mean that some future versions will need to
support both approaches for a while (the approach currently implemented
for 1.8 will still be needed for restored sessions).  I think I'll
increase "compat" from 2 to 3 along with making the change, and some
versions of JtR will need to support both 2 and 3 in restored sessions
for a while.

As to a patch to a core file like this, this immediately brings up
licensing concerns. %-)  I recall that you preferred to use GPLv2+ for
rawSHA1_ng_fmt.c, and I respect that decision of yours, but for a core
file I would not accept a GPL'ed change copyrighted by someone other
than me.  I would also not want to relax the license to inc.c for the
general public just yet (maybe later).  This means that accepting a
patch to it gets tricky: we'd need to introduce some "license with right
to sublicense" scheme (or copyright transfer, but I dislike that for a
number of reasons).  I am to think of how to do this best and with
minimal complexity for everyone; at this time, though, I merely need to
push 1.8 out with its inc.c almost as-is.

> I guess it would be more invasive, but would save a lot of cycles
> (especially on a large number of nodes).

I think you recall incorrectly.  The simpler approach that we currently
have does not waste cycles.  It does not skip individual passwords - it
skips large blocks of them, quickly.  A real problem with it is that
those blocks may be too large.  This limits scalability, yes - simply
because there are too few such blocks for a large cluster.  Another
problem is that those blocks are not of same size, so when/if endgame is
reached, some nodes terminate much sooner than others (even if they're
same speed).  Yet another problem is that with a lot of nodes the order
in which candidate passwords are tested becomes significantly
non-optimal.  All of this will definitely need to be dealt with in a way
similar to what you describe - before or when we proceed to add better
distributed processing (not just --node and MPI, but also option for
MPI-less communication between the nodes, which is planned).

In fact, as I think I had mentioned to you before, I had a similar in
concept thing implemented in an early prototype of incremental mode only
distributed cracker back in 1997, which I never brought to "release
quality".  It too would split portions of individual "entries", rather
than simply distribute the entries to nodes.  Incremental mode was a bit
simpler at the time, but it already had this order[] thing.

> I wrote about it here: http://www.openwall.com/lists/john-dev/2012/06/14/5

Yes, thanks!

Alexander
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.