john-users - Re: all.lst in English only

Message-ID: <20070605222217.GA28155@openwall.com>
Date: Wed, 6 Jun 2007 02:22:17 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: all.lst in English only

On Tue, Jun 05, 2007 at 10:12:16AM +0100, Evo Eftimov, iSec Consulting, www.isecc.com wrote:
> ... the Wordlist CD contains an English dictionary consisting of the
> following parts/subdirectories:
> 
> Tiny, Small, Large and Extra

For others reading this:

These wordlists can also be freely downloaded from under /pub/wordlists
(or equivalent) on Openwall FTP archive mirrors listed at:

	http://www.openwall.com/mirrors/

> My question is whether they subsume each other or work independently of each
> other - for example does small contain tiny, does large contain small etc

For all languages with this kind of split wordlists (not just English):

"large" contains all of "small",
"small" contains all of "tiny",
however "extra" (if present) does NOT have entries in common with any of
"tiny", "small", or "large".

In other words, "extra" is just that - questionable entries that did not
qualify for smaller wordlists, and only those entries.
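
The containment relationships above can be checked with standard tools.
The following is only a sketch on made-up sample data: the file names and
entries are hypothetical, and real files from the collection would first
need to be decompressed and have their comment lines removed.

```shell
# Hypothetical miniature "tiny" and "small" lists (not real CD contents).
workdir=$(mktemp -d)
printf '%s\n' cat dog       > "$workdir/tiny.lst"
printf '%s\n' cat dog horse > "$workdir/small.lst"

# comm requires sorted input.
sort -u "$workdir/tiny.lst"  > "$workdir/tiny.sorted"
sort -u "$workdir/small.lst" > "$workdir/small.sorted"

# comm -23 prints lines found only in the first file.
# Empty output means every "tiny" entry also appears in "small".
only_in_tiny=$(comm -23 "$workdir/tiny.sorted" "$workdir/small.sorted")

# comm -12 prints lines common to both files.  For an "extra" list
# compared against "large", this is the output that should be empty.
shared=$(comm -12 "$workdir/tiny.sorted" "$workdir/small.sorted")

if [ -z "$only_in_tiny" ]; then
    echo "small contains all of tiny"
fi

rm -rf "$workdir"
```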

Please note that all wordlist files in this collection start with
comments that list input wordlist files that were used to generate each
larger file.

> Because if this is NOT true the following solution can be applied:
> 
> cat password.lst, English/tiny/*, English/small/*, English/large/*,
> English/extra/* > all_english.lst
> 
> john --wordlist=all_english.lst --rules --stdout | unique
> mangled_english.lst

Yes, it would work fine once the erroneous commas on the "cat" command
are dropped and "cat" is replaced with "zcat".  Because you use "unique",
it does not matter whether the input wordlists contain any duplicate
entries.

However, I recommend this:

	zcat passwords/* languages/English/3-large/* languages/English/4-extra/* | grep -v '^#!comment:' | unique English.lst

or:

	zcat passwords/* languages/English/3-large/* languages/English/4-extra/* | grep -v '^#!comment:' > English-dupes.lst
	john --wordlist=English-dupes.lst --rules --stdout | unique English-mangled.lst

For slow hash types, you might want to drop the "extra" lists.
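
To illustrate what the grep step in the commands above does, here is a
small sketch on made-up input.  The sample header text is invented for
the example; only the "#!comment:" prefix is taken from the commands.

```shell
# Sample input imitating a decompressed wordlist: comment header lines
# followed by actual entries (all of it illustrative, not real CD data).
sample='#!comment: generated from several input lists
#!comment: (illustrative header, not copied from a real file)
password
#hashtag
letmein'

# Drop only lines that begin with the exact "#!comment:" prefix.
# An entry that merely starts with "#" is kept.
filtered=$(printf '%s\n' "$sample" | grep -v '^#!comment:')

printf '%s\n' "$filtered"
```

Anchoring the pattern with "^" matters here: an unanchored pattern would
also discard any real entry that happened to contain that string.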

Another improvement may be to apply the mangling rules to "passwords"
lists first, then do it for "tiny", then for "small", and finally for
"large" and maybe "extra" - and combine all of the results with "unique".
The total number of entries will be the same, but the order might be
better, with candidates derived from the more likely inputs tried
first.  A similar approach has been used for mangled.lst included
on the CD.
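
The staged approach can be sketched as follows.  This is not the
procedure used for the CD: "mangle" is a toy stand-in for
"john --wordlist=FILE --rules --stdout", "dedupe" is an awk stand-in for
"unique" that keeps the first occurrence of each line in input order,
and the list contents are made up.

```shell
# Toy stand-in for: john --wordlist=FILE --rules --stdout
# (hypothetical "rule set": emit each word as-is, then with "1" appended)
mangle() {
    awk '{ print; print $0 "1" }' "$1"
}

# Stand-in for "unique": keep the first occurrence of each line,
# preserving input order (unlike "sort -u").
dedupe() {
    awk '!seen[$0]++'
}

workdir=$(mktemp -d)
printf '%s\n' password letmein       > "$workdir/passwords.lst"
printf '%s\n' cat dog password       > "$workdir/tiny.lst"
printf '%s\n' cat dog password zebra > "$workdir/small.lst"

# Mangle the most likely list first, then progressively larger ones,
# and de-duplicate the concatenated stream once at the end.
staged=$(
    for list in passwords tiny small; do
        mangle "$workdir/$list.lst"
    done | dedupe
)

printf '%s\n' "$staged"
rm -rf "$workdir"
```

Because duplicates are dropped in favor of their first occurrence, every
candidate ends up at the position given by the most likely list that
produced it.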

> tr A-Z a-z < mangled_english.lst | sort -u > mangled_english_sorted.lst

This is not a good idea.  The documentation recommends this command "for
use with default wordlist rule set" - that is, to be applied prior to
the rule set.  Also, it is only needed when you're not using "unique",
to reduce the number of duplicates that word mangling rules produce.
When you are in fact using "unique", you're better off not converting
your input wordlists to all-lowercase as there are some entries where
the original case of characters is valuable.
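
A small sketch of the difference, using made-up entries.  The awk
one-liner is only a stand-in for "unique" (it keeps the first occurrence
of each line and does no case folding).

```shell
# Four entries; two pairs differ only in letter case (sample data).
words='Password
password
NASA
nasa'

# The lowercase-and-sort pipeline from the quoted command.
lowered=$(printf '%s\n' "$words" | tr A-Z a-z | sort -u)

# De-duplication alone, with the original case left intact.
kept=$(printf '%s\n' "$words" | awk '!seen[$0]++')

echo "after tr + sort -u:"
printf '%s\n' "$lowered"
echo "after de-duplication only:"
printf '%s\n' "$kept"
```

The first pipeline leaves two entries and loses the capitalized forms;
the second keeps all four, since none of them are exact duplicates.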

Perhaps the documentation is confusing and needs to be corrected.

> john --wordlist=mangled_english.lst mypasswd

Right.  Just don't forget to run "single crack" and "incremental" modes
as well.

> MINLENGHT and MAXLENGHT in the default john.conf seem to be relevant only to
> Incremental mode

Correct.  In fact, they're specified within "incremental" mode sections
(oh, and they're called "MinLen" and "MaxLen").

> - what is the way to make them valid for wordlist mode.

There's no way to do exactly that.  Instead, word mangling rules may be
specified that would reject words that fall outside of the desired
length range.  If you're going to use your pre-mangled wordlist (and
thus don't need the mangling rules anymore), you can replace the
"[List.Rules:Wordlist]" section contents with just this one line:

	>3<6

With the above example, if you run John with a wordlist and "--rules",
it will only use entries that are 4 or 5 characters long.  The rule will
also apply if you set "Wordlist = ..." in john.conf to point to your
pre-mangled wordlist and run John with no options at all (letting it do
"single crack", then pre-mangled wordlist with the above length
restriction, and finally "incremental" mode).  When you do in fact have
a pre-mangled wordlist, you may as well apply the length restrictions
just once, save the resulting file, and use that - with no need for any
filtering to be done on further invocations of John.  You can even do
the filtering prior to "unique", which is likely to reduce the total
processing time (because "unique" will have less work to do).
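
One way to do such one-time filtering outside of John is sketched below,
under stated assumptions: awk is used for the length test, a second awk
command stands in for "unique", and the sample candidates are made up
and plain ASCII (so characters and bytes coincide).

```shell
# Candidate passwords of various lengths (made-up sample data).
candidates='abc
abcd
abcde
abcdef
abcd'

# Keep lines longer than 3 and shorter than 6 characters, that is, the
# same 4-to-5 range that the ">3<6" rule accepts.  The length filter
# runs first, so the de-duplication step sees fewer lines.
filtered=$(printf '%s\n' "$candidates" |
    awk 'length($0) > 3 && length($0) < 6' |
    awk '!seen[$0]++')

printf '%s\n' "$filtered"
```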

Why would you want to do that, though?  I don't recommend it.  Perhaps
you have your reason, and once you explain it I might be able to suggest
something better.

-- 
Alexander Peslyak <solar at openwall.com>
GPG key ID: 5B341F15  fp: B3FB 63F4 D7A3 BCCC 6F6E  FC55 A2FC 027C 5B34 1F15
http://www.openwall.com - bringing security into open computing environments
