john-users - .chr files (Was: automation equipped working place of hash cracker, proposal)

Message-ID: <BLU0-SMTP369AEFE28A6C18B4C1FDFC4FD3B0@phx.gbl>
Date: Fri, 13 Apr 2012 20:51:31 +0200
From: Frank Dittrich <frank_dittrich@...mail.com>
To: john-users@...ts.openwall.com
Subject: .chr files (Was: automation equipped working place of hash cracker,
 proposal)

On 04/13/2012 08:08 PM, magnum wrote:
> On 04/13/2012 04:39 PM, Aleksey Cherepanov wrote:
>> It is common to rebuild chr files to improve incremental mode having some
>> passwords cracked.
> 
> This is common and often very rewarding. What we should not forget
> though, is that this will emphasize the errors we made in the first
> case. Suppose we crack 30% of the passwords but for some reason we
> almost always miss character 'z' (in real life it may be a handful or
> more of 8-bit or UTF-8 characters) which (very) theoretically could be
> present in 50% of the total. After rebuilding chr-files we are
> amplifying this error and will try even fewer (perhaps none) candidates
> containing character 'z'. And so on.

Optimal usage of incremental mode indeed is a complex topic.

Other problems with repeatedly generating new .chr files are:

1. If you have already run incremental mode with one .chr file for a
while, then build a new .chr file and restart incremental mode with it,
you will inevitably retry some candidate passwords that have already
been tried.
This gets worse if you repeatedly recreate .chr files based on
previously cracked passwords.

A solution could be to generate a .chr file once after a reasonable
amount of passwords have been cracked.
If you generate new .chr files later on, you can start new incremental
mode sessions and run them as long as they are effective (due to newly
discovered important trigraph character sequences, compared with the
previously created .chr files).
But when the new incremental mode session gets less effective, it might
be better to continue using the older incremental mode session which
already covered a larger part of the total key space.

2. If you detect a pattern like passwords based on dates, e.g.
12/10/1989, and you try all candidate passwords of this pattern, you
should filter out all passwords of this pattern before generating a .chr
file.
Otherwise your .chr file will be biased, and password candidates of the
pattern DD/MM/YYYY or MM/DD/YYYY will become more likely, even if none
of those passwords will crack any remaining hashes.
The same applies if someone already completed incremental mode with
digits.chr.
In this case, you should filter out all passwords consisting only of
digits, to avoid an unjustified bias towards digits.
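
The filtering described in this point could be sketched roughly as
follows. This is only an illustration, not part of John itself: the
function names and the two regular expressions are made up for the
example, and it assumes each .pot line has the form hash:plaintext with
the plaintext starting after the first colon.

```python
import re

# Assumed patterns: adjust to whatever has already been searched exhaustively.
DATE_RE = re.compile(r"\d{2}/\d{2}/\d{4}")  # DD/MM/YYYY or MM/DD/YYYY
DIGITS_RE = re.compile(r"\d+")              # digits only


def keep_for_chr(plaintext):
    """Return True if this password should still feed the new .chr file."""
    if DATE_RE.fullmatch(plaintext):
        return False
    if DIGITS_RE.fullmatch(plaintext):
        return False
    return True


def filter_pot_lines(lines):
    """Yield only the .pot lines whose plaintext survives the filters."""
    for line in lines:
        line = line.rstrip("\n")
        # Assumption: the plaintext is everything after the first colon.
        _hash, sep, plaintext = line.partition(":")
        if sep and keep_for_chr(plaintext):
            yield line
```

The surviving lines could then be written to a separate .pot file and
used as input when generating the new .chr file.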

After thinking about it again, this might be at least similar to the
point magnum made:
If you do have a bias in the passwords which serve as input for
generating a .chr file, using this .chr file for incremental mode will
increase that bias. If this bias doesn't exist in the passwords which
are still uncracked, you'll have generated a less than optimal .chr file.

For the same reason, you might want to reduce the impact that frequently
used passwords have on your .chr file.
E.g., if you crack DES password hashes, and you have 4096 different
hashes for "password" in your .pot file, 8-character passwords starting
with "p" will be over-represented in the password candidates tried
first, and so on.
For that reason, I once experimented with creating a dummy .pot file
with each password appearing just
(1 + ln(password_count_in_original_pot_file)) times.
Those passwords which occur thousands of times usually will have been
tried before you start incremental mode.
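
A minimal sketch of that dampening, for illustration only. The helper
names are hypothetical, the hash:plaintext line layout is assumed as
above, and rounding to the nearest integer is my assumption here, since
1 + ln(n) is usually not a whole number.

```python
import math


def dampened_count(count):
    """1 + ln(count), rounded to the nearest integer (rounding is assumed)."""
    return round(1 + math.log(count))


def dampen_pot_lines(lines):
    """Return .pot lines with each plaintext kept dampened_count(n) times."""
    by_plain = {}
    for line in lines:
        line = line.rstrip("\n")
        # Assumption: the plaintext is everything after the first colon.
        _hash, sep, plaintext = line.partition(":")
        if sep:
            by_plain.setdefault(plaintext, []).append(line)
    out = []
    for entries in by_plain.values():
        # Keep only the first few of the n entries for this plaintext.
        out.extend(entries[:dampened_count(len(entries))])
    return out
```

With 4096 entries for "password", ln(4096) is about 8.3, so this keeps
9 of them, while a password that occurs once is still kept once.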

3. Generating a .chr file which is appropriate for different hash
algorithms is very hard.
Not just because you'll probably have tried a larger set of patterns
like DD/MM/YYYY or digit-only passwords for fast, saltless hashes, but
also because password hash algorithms have different properties, such as
maximum password length, use of 8-bit characters, and whether upper and
lower case characters are distinguished.

Frank