Message-ID: <20101008232125.GA18931@openwall.com>
Date: Sat, 9 Oct 2010 03:21:25 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: KoreLogic rules

Minga,

Thank you for your response!  I was afraid I might have unintentionally
offended you by once again starting with edits to the rules without
actually using or integrating them yet... but I felt that I needed to
start this way.

On Fri, Oct 08, 2010 at 03:45:26PM -0500, Minga Minga wrote:
> On Sun, Sep 26, 2010 at 5:46 AM, Solar Designer <solar@...nwall.com> wrote:
> > One assumption I made when adding constraints to the rules was that if
> > we have a certain complicated rule, we also have all obviously-simpler
> > rules - e.g., if we have a rule that appends two characters of a certain
> 
> I can explain this. For our jobs, we need to crack as many passwords as
> possible. One method of doing this is to pick a single wordlist and
> then run multiple 'john' processes, each with a different --rules:
> setting.  It is assumed that _all_ of the "KoreLogicRules" will eventually
> be run. So if you are running a "complex" rule, it is assumed that you are
> also running a "simpler" rule on another machine/core.  I have shell scripts
> (that I'm about to share) that do all this work for you.

Thank you for the explanation.  Yes, it'd be great to see your shell
scripts.  Also, you mention some specific wordlist filenames along with
these rules (IIRC, you do so on the contest website and in comments in
the ruleset), yet you do not appear to have made those publicly
available - maybe you could release them now, or provide instructions on
how to generate them (for the generated ones)?  Or did I miss them?

That said, I think I was not clear enough.  Your explanation above
essentially agrees with the assumption I made and the constraints I
added to the rules.  However, while reworking your rules, I found that
in some cases you did _not_ in fact have a "simpler" rule corresponding
to some "complex" rules that you did have.  So this assumption did not
always hold true for your ruleset.  Thus, I assumed that either you had
other rules as well (unreleased, JtR's default, and/or third-party
rulesets?) that you always ran along with those you released, or you
actually inadvertently(?) missed some simpler rules while including
their more complex counterparts.

> Here is a quick example:
> 
> If you notice your client is using the seasons in a majority of their passwords
> (likely due to password rotation being every 3 months),
> I would do something like this (excuse the long line and poor use of 'cut'):
> 
> #  for rule in `grep :KoreLogicRules john.conf | cut -d: -f2 | cut -d\] -f1`; \
>        do echo Rule = ${rule}; \
>        ./john --rules:${rule} --fo:nt -w:seasons.dic pwdump.txt; done
> 
> This is not _actually_ what I do - but it gives you an idea, and it works.

Thanks.

This example, if I understood it correctly, shows that you're using
short wordlists and rules interchangeably - you have seasons append and
prepend rules, but when seasons are very common you can also have
seasons in a wordlist and run them with all other rules.
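
For instance (a quick sketch; the filenames and the ruleset name here
are illustrative, not necessarily yours):

  # put the seasons into a tiny wordlist...
  printf '%s\n' Spring Summer Fall Autumn Winter > seasons.dic
  # ...and run it through any of the other rulesets:
  ./john --rules:KoreLogicRulesAppendNumbers --fo:nt -w:seasons.dic pwdump.txt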

Maybe JtR needs a built-in capability to have rules applied twice (or N
times), recursively, although so far I've been doing it with multiple
runs (usually with "unique" in between).  Also, maybe its current
hard-coded logic to auto-reject empty "words" should be made optional -
such that e.g. your seasons append rules could in fact be applied to an
empty input string initially.
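
The multiple-run approach looks roughly like this (a sketch; the
filenames and ruleset names are illustrative):

  # pass 1: apply one ruleset, de-duplicating the candidates
  ./john -w:words.dic --rules:KoreLogicRulesPrependSeason --stdout | ./unique pass1.lst
  # pass 2: apply another ruleset on top of the first pass's output
  ./john -w:pass1.lst --rules:KoreLogicRulesAppendNumbers --stdout | ./unique pass2.lst
  # then test the twice-mangled candidates against the hashes
  ./john -w:pass2.lst --fo:nt pwdump.txt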

This made little sense with relatively slow Unix password hashes, but
things are changing now that very fast non-iterated hashes are still in
use, yet computers keep getting faster.

> What I _actually_ do is put these command lines in a queue - and send that
> one queue, one command at a time, to a large list of UNIX systems
> (each with multiple 'cores').  Using this method, I can run 48 'john' processes
> at once with (theoretically) very little wasted effort (assuming the rules
> do not overlap with each other - which some of them do on purpose).
> When I use this method, the wordlists and rules are sorted by priority level,
> and the rules that are more likely to be successful are run before less-likely
> rules (such as Append6Num, which is a dog and is slow).

Can you publish this info on ruleset priorities?  In fact, I'd prefer to
have this info for all rulesets with the same wordlist, and then separately
for {ruleset, wordlist} pairs.  Not having this is one of the things that
prevents possible inclusion of many KoreLogic-derived rules with JtR.
I could obtain such info on password hash sets I have access to (as well
as on the RockYou list), but they would be very different from "your"
corporate passwords - e.g., I have almost no data on passwords with
month or season names in them, yet you're saying that they're common in
your samples (due to password expiration policies, as you have explained).
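
In case it helps: here's roughly how I'd measure per-ruleset yield on a
sample hash set (a sketch only - note that cracks accumulate in john.pot
across runs, so either reset it between runs or read the counts as
incremental yield in the order the rulesets happen to run):

  for rule in `grep :KoreLogicRules john.conf | cut -d: -f2 | cut -d\] -f1`; do
      ./john --rules:${rule} --fo:nt -w:words.dic pwdump.txt > /dev/null 2>&1
      echo -n "${rule}: "; ./john --show --fo:nt pwdump.txt | tail -1
  done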

> Admittedly, some of the 'reworked' rules are harder to read.  But I think it's
> a good idea to use the preprocessor as much as possible.

Some of them are harder to read not because of more extensive use of the
preprocessor, but because of me adding more constraints into the rules
(to reduce duplicates, considering length-limited and case-insensitive
hashes as a possibility), which I didn't have to do.

> **EXCEPT** that
> I have started using 'hashcat' and 'oclhashcat' a lot recently, and rules that
> take advantage of the preprocessor are not understood by them.  The way around
> this is to use john.log's output to make "uglier" rules that don't use the
> preprocessor.

Maybe I should add a feature into John where it would output
preprocessed rules in a form more convenient than john.log, although I
don't want to have too many command-line options...
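
In the meantime, something like this might do for extracting the
preprocessor-expanded rules from the log (a sketch only - it assumes
log lines of the form "Rule #N: '...' accepted"; adjust the sed
pattern to whatever your john.log actually contains):

  sed -n "s/.*Rule #[0-9]*: '\(.*\)' accepted.*/\1/p" john.log > flat.rules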

What are your specific reasons to use hashcat, though?  What exactly
does it do better than JtR for you?  (I've never used it so far.  I am
aware of a few things, but perhaps not of all.)  If your answer is going
to be long, then perhaps post it in a separate message, changing the
Subject accordingly.

> One question I have (and can answer myself later) is: do these
> re-written rules run faster?

It depends.

Besides making more extensive use of the preprocessor, I added many new
constraints to the rules.  Many of those added constraints have per-word
cost (e.g., checking against the target hash type's "cut length"), and
all have per-rule cost (which matters when the number of rules is large
and the wordlist is tiny).

So when you run all rulesets at once
(such as korelogic-rules-20100801-reworked-all-2.txt) against NTLM
hashes (fast) and you have a truly tiny wordlist (e.g., just one word
"defcon", as I did in one of my tests), then there's an overall slowdown.

On the other hand, with LM hashes (case-insensitive and length-limited)
or with some Unix crypt(3) hashes (maybe length-limited, slow, and
salted), there's an overall speedup, because the added constraints help
reduce the number of effective duplicates being hashed.

With a larger or at least not-so-tiny wordlist, things should be better
for fast and "unlimited" hashes like NTLM as well, although I did no
such benchmarks.  (I did some test runs like this, but I did not compare
the timings.)

Then, a few of the rulesets should in fact have become faster in almost
all cases.  This is due to some commands replaced with simpler ones
(specifically, when the "A" command was used to append/prepend/insert
just a single character, this was better done with the "$", "^", or "i"
command as appropriate; "A" is intended for 2+ character strings).
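
To illustrate with made-up rules (not taken from the ruleset):

  A0"X"  ->  ^X    (prepend one character)
  Az"X"  ->  $X    (append one character)
  A4"X"  ->  i4X   (insert one character at position 4)
  Az"XY"           (left alone: "A" is the right tool for 2+ characters)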

Finally, I tried to stay close to your original rules for now, but I
think that we need to deviate from them a bit further, which would help
reduce the overhead of the added constraints (they would need to be
different then).  For example, in KoreLogicRulesL33t you're applying
substitutions to the input line as-is, then you either keep the result
as-is or capitalize it.  JtR's default "single crack" ruleset instead
starts by converting the input line to all-lowercase, which reduces the
runtime cost of dupe-avoidance.  I think we need to do the same here.
The "M" and "Q" commands, which I used to add the constraint to your
rules, are quite costly (so they're beneficial for slow and/or salted
hashes, but maybe not for fast and saltless ones).
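
An illustrative pair of rules (not the actual ones) showing the two
approaches:

  # memorize ("M") the word, substitute, then reject ("Q") the result
  # if the substitutions were all no-ops - a per-word cost every time
  M sa4 se3 Q
  # lowercase first instead, as the "single crack" ruleset does; with
  # the case normalized up front, less M/Q-style checking is needed
  l sa4 se3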

> Or is the main reason for using preprocessor
> rules just to limit the amount of lines in john.conf?

There were several reasons: to reduce the number of lines (as well as
the file size), to make patterns more apparent and errors/omissions
harder to make (I think I found some of the bugs due to my use of
preprocessor constructs), to make edits easier, and to allow for the
introduction of constraints for case-insensitive hash types that would
otherwise require even more lines (than you had).
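
As a quick illustration of the compression (a made-up rule, not one
from the ruleset):

  # one preprocessor line:
  Az"20[01][0-9]"
  # stands for twenty plain rules: Az"2000" Az"2001" ... Az"2019"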

> P.S. We are pretty certain that "Crack Me If You Can" will be at next
> year's DEFCON. Start planning now! ;)

Sounds great.  I have plenty of ideas on what to improve in JtR in
preparation for this, but unfortunately I have too little time... so
only a fraction of the ideas will be implemented.

Thanks again,

Alexander
