Message-ID: <BLU0-SMTP679C9B315A34722B4B678DFD270@phx.gbl>
Date: Sat, 5 Jan 2013 14:29:45 +0100
From: Frank Dittrich <frank_dittrich@...mail.com>
To: john-users@...ts.openwall.com
Subject: Re: Incremental attack properties questions

On 01/05/2013 01:11 PM, Frank Dittrich wrote:
> Since Markov mode generates words based on 2-byte-frequencies, and since
> it generates passwords shorter than maximum length, there will be a
> non-negligible number of words with invalid UTF-8 characters,
> especially at the end of the word. So you might need to combine --markov
> with an --external filter.

If you don't want to write a general-purpose UTF-8 validity check, but
just one which checks --markov output based on stats files which have
been generated using a word list encoded in (valid) UTF-8, then this
task is quite simple:

If the last byte is < 0x80, the word is valid.
Else if the last byte is > 0xbf, the word is invalid.
Else if the second-to-last byte is >= 0xc0 and <= 0xdf, the word is valid.
Else if the third-to-last byte is >= 0xe0 and <= 0xef, the word is valid.
Else if the fourth-to-last byte is >= 0xf0 and <= 0xf7, the word is valid.
Else the word is invalid.
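
As an illustration only, the rules above could be sketched in Python like
this. This is not an --external filter; the function name
markov_tail_is_valid is made up for the example. Two things the rules do
not spell out are assumptions here: an empty word is treated as valid,
and a word too short to have the byte being tested simply fails that
test.

```python
def markov_tail_is_valid(word: bytes) -> bool:
    """Check only the tail of a candidate word, following the rules above.

    This is not a general UTF-8 validator. It relies on the stats file
    having been built from valid UTF-8, so that only a truncated
    multi-byte sequence at the end of the word needs to be caught.
    """
    n = len(word)
    if n == 0:
        return True  # assumption: nothing to truncate in an empty word

    last = word[-1]
    if last < 0x80:
        return True   # plain ASCII at the end
    if last > 0xBF:
        return False  # word ends in a lead byte with nothing following it

    # The last byte is a continuation byte (0x80..0xBF): look for the
    # lead byte of a 2-, 3- or 4-byte sequence at the matching distance.
    if n >= 2 and 0xC0 <= word[-2] <= 0xDF:
        return True
    if n >= 3 and 0xE0 <= word[-3] <= 0xEF:
        return True
    if n >= 4 and 0xF0 <= word[-4] <= 0xF7:
        return True
    return False


if __name__ == "__main__":
    samples = [b"abc", "caf\u00e9".encode("utf-8"), "\u20ac".encode("utf-8")[:2]]
    for s in samples:
        print(s, markov_tail_is_valid(s))
```

Like the rules it follows, the sketch never inspects bytes in the middle
of the word, so it is only meaningful for --markov output and not for
arbitrary input.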

Frank
