john-users - Re: Incremental attack properties questions

Message-ID: <BLU0-SMTP369BCD39CC7B7D7B7996F28FD270@phx.gbl>
Date: Sat, 5 Jan 2013 13:00:29 +0100
From: Frank Dittrich <frank_dittrich@...mail.com>
To: john-users@...ts.openwall.com
Subject: Re: Incremental attack properties questions

On 01/05/2013 09:58 AM, JohnyKrekan wrote:
> 2. What do I need to create a alpha.chr file for passwords derived from Slovak language?

In addition to what magnum wrote:

You should name your .chr file differently, e.g. slovakian.chr, and not
overwrite the alpha.chr file.

You can create a dummy john.pot file (just prefix every Slovakian
word in your word list with a colon), which can then be used to build a
new .chr file.
See doc/EXAMPLES for how to use ./john --make-charset=
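
As a rough sketch of the dummy john.pot step (a helper script, not part of
john itself): the Python snippet below prefixes every word with a colon, so
each line has an empty hash field followed by the plaintext. The function
name wordlist_to_pot and the sample words are made up for illustration.

```python
import io


def wordlist_to_pot(words_in, pot_out):
    """Copy words to pot_out as ':word' lines; return how many were written."""
    written = 0
    for line in words_in:
        word = line.rstrip("\r\n")
        if not word:
            continue  # skip blank lines, they carry no plaintext
        pot_out.write(":" + word + "\n")
        written += 1
    return written


# Demo on in-memory text; for real files, pass objects opened with
# open(name, encoding="utf-8") instead of io.StringIO.
sample = io.StringIO("heslo\nmama\n\nslovensko\n")
pot = io.StringIO()
print(wordlist_to_pot(sample, pot))  # 3
print(pot.getvalue(), end="")
```

The resulting file then serves as the dummy john.pot input for building the
.chr file, as described in doc/EXAMPLES.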

Then, you define a new section (either in john.conf, or in
john.local.conf, if your john version is recent enough that this file
exists):

[Incremental:Slovakian]
File = $JOHN/slovakian.chr
MinLen = 1
MaxLen = 8
CharCount = 26

Adjust MinLen to your desired minimum length.
CharCount is optional.

If your dictionary requires non-ASCII characters to be useful, you may
be out of luck with incremental mode, unless you are willing to patch john.

Check the output of
./john --list=build-info:

(unstable-jumbo)run $ ./john --list=build-info
Version: 1.7.9-jumbo-7+unstable
Build: linux-x86-native
Arch: 32-bit LE
$JOHN is ./
Format interface version: 9
Rec file version: REC3
Charset file version: CHR2
CHARSET_MIN: 32 (0x20)
CHARSET_MAX: 126 (0x7e)
CHARSET_LENGTH: 8
Max. Markov mode level: 400
Max. Markov mode password length: 30
Compiler version: 4.6.3 20120306 (Red Hat 4.6.3-2)
gcc version: 4.6.3
OpenSSL library version: 10000003
GMP library version: 4.3.2
NSS library version: 3.13.6.0	(loaded: 3.13.6.0 Extended ECC)
NSPR library version: 4.9.2
Kerberos version 5 support enabled

If UTF-8 encoding is required for your hash format, the first limiting
factor for incremental mode will be
CHARSET_LENGTH: 8
(This length is in bytes, so it translates to an even shorter length in
characters.)
The second limiting factor is
CHARSET_MAX: 126 (0x7e)
This excludes any non-ASCII characters, no matter what encoding you use.
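
To make the two limits concrete, here is a small Python sketch that checks a
word against the CHARSET_MIN, CHARSET_MAX and CHARSET_LENGTH values from the
build-info output. The helper name incremental_fit and the sample words are
only illustrative; john does not work this way internally.

```python
# Limits copied from the --list=build-info output quoted above.
CHARSET_MIN = 0x20
CHARSET_MAX = 0x7E
CHARSET_LENGTH = 8


def incremental_fit(word, encoding="utf-8"):
    """Return (chars, bytes, fits_length, all_bytes_in_range) for one word."""
    data = word.encode(encoding)
    fits_length = len(data) <= CHARSET_LENGTH
    in_range = all(CHARSET_MIN <= b <= CHARSET_MAX for b in data)
    return len(word), len(data), fits_length, in_range


# One plain ASCII word and two words with Slovak diacritics.
for w in ["heslo", "žltý", "čučoriedka"]:
    print(incremental_fit(w))

# A single-byte encoding shortens the byte length, but the accented
# characters still lie above CHARSET_MAX.
print(incremental_fit("žltý", encoding="iso-8859-2"))
```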

For cracking Slovakian passwords, you are probably better off using
Markov mode instead of incremental mode, even though Markov mode also
requires a little adjustment to work with non-ASCII characters.

Even if you got incremental mode working with non-ASCII
characters, it would sooner or later generate byte
sequences which are not valid UTF-8.
(This shouldn't happen with Markov mode, provided you generate your
custom stats file from valid input. The one exception is when the byte
sequence for a non-ASCII character at the end of a word gets cut off
by the maximum length or maximum Markov level limit.)
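
The cut-off case can be reproduced outside john. The Python sketch below
(helper names are hypothetical) truncates a UTF-8 encoded word at a byte
limit and checks whether the result is still valid UTF-8.

```python
def is_valid_utf8(data):
    """True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def cut_bytes(word, max_bytes):
    """Encode as UTF-8 and truncate at a byte limit, as a length cap would."""
    return word.encode("utf-8")[:max_bytes]


word = "\u00e4\u00f6\u00fc"  # "äöü": 3 characters, 6 bytes in UTF-8
for limit in (4, 5, 6):
    print(limit, is_valid_utf8(cut_bytes(word, limit)))
# 4 True
# 5 False  (the lead byte of the third character is left dangling)
# 6 True
```

An odd byte limit splits a two-byte character here; an even limit does not.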

Read doc/MARKOV for all the details I skipped.
I am sure it will be worth it.

From a quick glance at the code, calc_stat.c needs
#if 0
#endif
around two if ... { ... } blocks.

The first one starts at line 92:
			if(C2I(ligne[i])>127)

The second starts at line 104:
			if((i>0) && (C2I(ligne[i-1])>127))

Otherwise, non-ascii characters would be skipped when generating the
custom stats file.
But after this change, everything seems to work.
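
For context on why those two checks matter: in UTF-8, every byte of a
non-ASCII character has a value above 127, so a per-byte test of the form
"> 127" fires for each of them. The Python snippet below only illustrates
that property; it is not a translation of calc_stat.c, and the helper name
is hypothetical.

```python
def bytes_above_127(word):
    """Return the UTF-8 bytes of word whose value exceeds 127."""
    return [b for b in word.encode("utf-8") if b > 127]


print(bytes_above_127("abc"))       # []
print(bytes_above_127("\u00e4b"))   # "äb" -> [195, 164]
```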

I just used a few lines with German umlauts and a few other special
characters here. Make sure you use a reasonable word list instead.

(unstable-jumbo)run $ cat fdtest
äö
äöü
äöüß
Äää
Ööö
ÜüÄöß
§ößäüößÄÖÜ²³¼
²³
²³¼ä


(unstable-jumbo)run $ ./calc_stat fdtest fdtest-stat
zero -10*log proba2[132*256+195] (3) / 3, converted to 1 to prevent
infinite length candidates
zero -10*log proba2[150*256+195] (2) / 2, converted to 1 to prevent
infinite length candidates
zero -10*log proba2[159*256+195] (2) / 2, converted to 1 to prevent
infinite length candidates
zero -10*log proba2[164*256+195] (5) / 5, converted to 1 to prevent
infinite length candidates
zero -10*log proba2[167*256+195] (1) / 1, converted to 1 to prevent
infinite length candidates
zero -10*log proba2[178*256+194] (3) / 3, converted to 1 to prevent
infinite length candidates
zero -10*log proba2[179*256+194] (2) / 2, converted to 1 to prevent
infinite length candidates
zero -10*log proba2[182*256+195] (6) / 6, converted to 1 to prevent
infinite length candidates
zero -10*log proba2[188*256+195] (4) / 4, converted to 1 to prevent
infinite length candidates

Don't worry about these warnings.

Then I added a new section to my john.conf file immediately after the
[Markov:Default] section:

[Markov:fdtest]
.include [Markov:Default]
Statsfile = $JOHN/fdtest-stat


Adjust MkvMaxLen as well. Keep in mind that the length is in bytes.
(Non-ASCII characters in UTF-8 encoding need more than one byte.)
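
One way to pick a sensible value is to measure the word list in bytes first.
The following Python sketch assumes UTF-8 input; the function names
longest_in_bytes and share_within are invented for this example.

```python
def longest_in_bytes(words, encoding="utf-8"):
    """Return the largest encoded length, in bytes, among the given words."""
    return max((len(w.encode(encoding)) for w in words), default=0)


def share_within(words, max_bytes, encoding="utf-8"):
    """Fraction of words whose encoded form fits into max_bytes."""
    words = list(words)
    if not words:
        return 0.0
    fitting = sum(1 for w in words if len(w.encode(encoding)) <= max_bytes)
    return fitting / len(words)


# "äö" (4 bytes), "äöüß" (8 bytes), "abc" (3 bytes)
words = ["\u00e4\u00f6", "\u00e4\u00f6\u00fc\u00df", "abc"]
print(longest_in_bytes(words))   # 8
print(share_within(words, 4))    # two of the three words fit into 4 bytes
```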

(unstable-jumbo)run $ ./john --markov=fdtest --stdout
MKV start (stats=$JOHN/fdtest-stat, lvl=200 len=12 pwd=288809)
öööööö
öööööä
öööööß
öööööü
öööööÄ
öööööÖ
öööööÜ
ööööö
ööööö
ööööäö
ööööää
ööööäß
ööööäü
ööööäÄ
ööööäÖ
ööööäÜ
ööööä
...

This may look like there is still a bug somewhere, because my
input file doesn't contain a word starting with ö.
Keep in mind that Markov mode works on byte frequency information.
The UTF-8 representations of ä and ö start with the same byte.
Because ö occurred more frequently than ä in the input file, Markov mode
starts generating "words" which begin with ö.
If you use a real word list (or a list of real Slovakian passwords or
pass phrases), this shouldn't be a problem.
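
The shared lead byte is easy to verify. The Python sketch below prints the
UTF-8 bytes of both characters and counts which byte follows the lead byte
0xC3 in a tiny sample. It only counts byte pairs and is not how john stores
its stats; the function name bytes_after_lead is made up.

```python
from collections import Counter


def bytes_after_lead(words, lead=0xC3):
    """Count which byte follows `lead` across the UTF-8 encoded words."""
    counts = Counter()
    for w in words:
        data = w.encode("utf-8")
        for first, second in zip(data, data[1:]):
            if first == lead:
                counts[second] += 1
    return counts


# Both characters share the lead byte 0xC3 and differ only in the second byte.
print("\u00e4".encode("utf-8").hex())  # c3a4  (ä)
print("\u00f6".encode("utf-8").hex())  # c3b6  (ö)

counts = bytes_after_lead(["\u00e4\u00f6", "\u00f6\u00f6"])  # "äö", "öö"
print(counts[0xA4], counts[0xB6])  # 1 3
```

With byte-level statistics, the more frequent second byte wins after 0xC3,
regardless of which character actually started a word in the input.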

Frank