Message-Id: <D32BDFD8-7E3C-483C-BF29-761991F875F9@sl-chat.de>
Date: Wed, 3 Feb 2010 14:04:25 +0100
From: SL <auditor@...chat.de>
To: john-users@...ts.openwall.com
Subject: Pre-Mangling (Wordlist cleanup)

I would like to use ./john --rules=Pre-Mangle --stdout | ./unique to  
clean up arbitrary (large) "dirty" wordlists.

In other words: I have target-specific generated wordlists (of about  
2GB size), which still contain a lot of "unusable junk" like raw MD5  
hashes, punctuation, Base64 fragments, QP-encoded fragments, falsely  
decoded UTF-8 etc.

My intention is to put together a number of word mangling rules that  
help to reduce this chaos and only let through "reasonable" candidates  
for future processing with ./john --rules and ./john --rules=Single.

(My currently running "--rules=Single" session on that 2GB list has  
an ETA of mid-November 2010; the hashes are salted raw MD5, using  
JimF's patch.)

Does such a collection of rules already exist? I couldn't find one,  
and I must admit that the complexity of  
http://www.openwall.com/john/doc/RULES.shtml is a bit too much for me  
to start from scratch.

What it should accomplish:
* obviously no no-op (:)
* include "dictionary-like" words up to a certain length (haven't seen  
any password longer than 18 chars in my samples, so length 22 should  
probably be sufficient)
* shorter alphanumeric "words" might be included as-is, maybe up to 8  
or 10 chars
* punctuation should probably be purged (or truncated?)
* words with false transcodings (lots of /(.[ÂÃ])+/) should get  
rejected
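
To make the criteria above concrete, here is a minimal sketch of a standalone pre-filter in Python. It is not a john.conf rule set and does not use john's rule syntax; it is a swapped-in external filter whose output could be piped into ./unique. The names (clean, MAX_LEN, HEX_HASH, MOJIBAKE) are hypothetical, the 22-character limit is the figure suggested above, and treating any Â or Ã as a sign of false transcoding is an assumption that would also reject legitimate words containing those letters. Base64 and QP fragments are not detected here.

```python
import re
import string
import sys

MAX_LEN = 22  # assumed upper bound, taken from the length suggested above

# Shape of a raw MD5 hash: exactly 32 hexadecimal digits.
HEX_HASH = re.compile(r"[0-9a-fA-F]{32}")

# Assumption: any Â or Ã marks falsely decoded UTF-8. U+FFFD is the
# replacement character produced when decoding invalid bytes.
MOJIBAKE = re.compile("[ÂÃ\ufffd]")

# Translation table that deletes all ASCII punctuation.
PUNCT = str.maketrans("", "", string.punctuation)


def clean(word):
    """Return a cleaned candidate, or None if the word should be rejected."""
    word = word.strip()
    if HEX_HASH.fullmatch(word):
        return None  # looks like a raw MD5 hash
    if MOJIBAKE.search(word):
        return None  # looks like a false transcoding
    word = word.translate(PUNCT)  # purge punctuation
    if not word or len(word) > MAX_LEN:
        return None  # nothing left, or too long
    return word


if __name__ == "__main__":
    # Read bytes so that invalid UTF-8 in a dirty list cannot abort the run.
    for raw in sys.stdin.buffer:
        candidate = clean(raw.decode("utf-8", errors="replace"))
        if candidate is not None:
            print(candidate)
```

Duplicates are deliberately left in the output, since removing them is the job of ./unique further down the pipe.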

Could anybody please point me to a reasonable start? I shall follow up  
with a patch to john.conf if this idea proves successful.
