john-dev - Re: Password Generation on GPU

Message-ID: <20120430105849.GA8024@openwall.com>
Date: Mon, 30 Apr 2012 14:58:49 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Password Generation on GPU

On Mon, Apr 30, 2012 at 10:57:36AM +0200, Frank Dittrich wrote:
> On 04/30/2012 06:32 AM, Solar Designer wrote:
> > Wordlist mode with rules may potentially be able to use set_mask() for
> > ruleset lines containing portions like Az"[190][0-9]".  However, that
> > would be bad in two ways: it would confuse the rule preprocessor with
> > the actual rule processor (making these things even more difficult to
> > explain than they're now) and it would swap the words vs. rules
> > processing order for the affected ruleset lines
> 
> If you have a file with words sorted by priority/popularity, this might
> even be desired.

Indeed it is desirable to have a "rules first" mode, where each input
word is passed through all rules before the next word is tried.  In
fact, it is desirable to have a mixed mode as well, where rules are
tried either in groups (identified as such in the ruleset, which may
allow for caching and reuse of intermediate results) or where JtR
advances in both the ruleset(s) and the wordlist(s) incrementally (e.g.,
it may have tried 10% of rules against 10% of input words in 1% of total
runtime, then 20% of rules against 20% of words in 4% of total runtime).
This is getting off-topic for the current thread, though. ;-)
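
A rough sketch of the three orderings described above, in plain Python
rather than JtR code.  The function names are made up for illustration,
rules are modeled as simple callables that always return a candidate,
and the step count of 10 is an arbitrary assumption:

```python
def words_first(words, rules):
    """One rule at a time: each rule is applied to every word in turn."""
    for rule in rules:
        for word in words:
            yield rule(word)


def rules_first(words, rules):
    """One word at a time: each word is passed through all rules."""
    for word in words:
        for rule in rules:
            yield rule(word)


def incremental(words, rules, steps=10):
    """Mixed order: after step k, the first k/steps of the rules have been
    tried against the first k/steps of the words.  Each (rule, word) pair
    is produced exactly once."""
    done_w = done_r = 0
    for k in range(1, steps + 1):
        nw = len(words) * k // steps
        nr = len(rules) * k // steps
        # Newly added rules against the words already covered.
        for rule in rules[done_r:nr]:
            for word in words[:done_w]:
                yield rule(word)
        # All rules covered so far against the newly added words.
        for rule in rules[:nr]:
            for word in words[done_w:nw]:
                yield rule(word)
        done_w, done_r = nw, nr


words = ["w%d" % i for i in range(10)]
rules = [lambda w, s=str(i): w + s for i in range(10)]
candidates = list(incremental(words, rules))
```

With 10 words and 10 rules, the first step yields 1 of the 100 pairs
(10% x 10% = 1%) and the first two steps together yield 4 (20% x 20% =
4%), matching the percentages in the example above.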

> A good alternative to (optionally) reversing the order of rules and
> words to be processed could be to provide a way to specify a range of
> input words, say --from=1 --to=1000.
> (For --from, 1 should be the default; the default for --to should be the
> number of input words in the file.)
> 
> I think these options are very useful, even if most users will ignore them.
> (The --from and --to values should correspond to line numbers in the
> file, so that due to skipped comments, you might end up with fewer words
> being used.
> The other alternative, where --from=1 means starting from line 12 of
> password.lst, will probably be even more confusing.)

Well, maybe these options would make some sense for very large
wordlists, where having portions of them written to disk would be a
major annoyance (if using the head/tail commands to achieve the same).
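
To make the proposed semantics concrete, here is a minimal Python sketch
of the line-number-based variant.  Note that --from and --to are only
proposed in this thread and are not existing JtR options; select_words()
and its parameters are hypothetical names, and the "#!comment" prefix
for comment lines is an assumption about the wordlist format:

```python
from itertools import islice


def select_words(lines, first=1, last=None, comment_prefix="#!comment"):
    """Yield words from the 1-based, inclusive line range [first, last].

    The range counts raw lines of the file, so skipped comment lines
    inside the range reduce the number of words actually returned.
    last=None means "up to the end of the file".
    """
    for line in islice(lines, first - 1, last):
        word = line.rstrip("\r\n")
        if word.startswith(comment_prefix):
            continue
        yield word


sample = [
    "#!comment: sample list\n",
    "123456\n",
    "password\n",
    "#!comment: more\n",
    "12345678\n",
]
first_three_lines = list(select_words(sample, first=1, last=3))
```

Here the range covers three lines but yields only two words, because the
first line is a comment.  Since islice() reads lazily, the same function
works on an open file object without loading a very large wordlist into
memory.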

> > Instead, we may consider introducing the ability for rules to produce
> > multiple candidate passwords.  Right now, each rule (as output by the
> > preprocessor, when applicable) produces at most one candidate password
> > for one input word.  There are good reasons (not limited to GPU
> > acceleration) to allow for rules to produce multiple candidate
> > passwords.
> 
> Yes, this would work as well.
> But how does the user know when a rule will produce multiple candidate
> passwords, and when there are just several rules which produce just one
> candidate password (or even no password, if the rule or the word will be
> skipped)?
> Won't this be confusing for the users?
> Do you want to introduce new rule names here (or are you running out of
> characters for new rules)?

Any additions to the rules capabilities and syntax may be confusing,
but this doesn't mean that we should never add anything.  We should just
be careful with what we add - consider the benefits vs. confusion.
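
For illustration, the difference between preprocessor expansion and a
multi-output rule can be sketched in Python using the Az"[190][0-9]"
example from earlier in the thread.  This is not JtR code, and both
function names are hypothetical:

```python
from itertools import product


def preprocess(first="190", second="0123456789"):
    """Preprocessor style: expand the character classes into 30 separate
    rules, each of which produces one candidate per input word."""
    return [lambda word, s=a + b: word + s for a, b in product(first, second)]


def multi_output_rule(word, first="190", second="0123456789"):
    """Multi-output style: a single rule that produces all 30 candidates
    for one input word."""
    for a, b in product(first, second):
        yield word + a + b


expanded = [rule("pass") for rule in preprocess()]
multi = list(multi_output_rule("pass"))
```

Both produce the same 30 candidates for a given word.  What differs is
the unit of work: the multi-output form hands over a whole batch per
word, which is the property that matters for things like GPU
acceleration.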

> > So we may add some syntax that would do just that - e.g.,
> > reuse curly braces for the purpose
> 
> OK, the curly braces for this new usage can easily be distinguished from
> the { and } rules (shift).
> But there could be some instances of '{' meaning the character '{' where
> rules that were correct in the past might become incorrect, or where
> rules change their behavior. (Fortunately, '{' and '}' are not used that
> frequently in their meaning as plain characters, so this shouldn't be
> that much of a problem.)

I thought that we would in fact make an incompatible change here, in a
major release of JtR.  An alternative to this would be to go with syntax
like \{ and \} to access the extension - much like it's done with
extended regexps.  This avoids compatibility breakage short term, but is
more confusing long term.

> If we allow a rule to generate multiple passwords, we should also think
> about enhancing --external filters in a similar way.

Definitely.  This is on my to-do, and it's mostly unrelated to possible
multi-output rules.  The only way it is related is through another idea
of mine: be able to invoke external mode functions from rules.

Alexander