Message-ID: <BLU0-SMTP314CB12F5E0CB39173AB326FD260@phx.gbl>
Date: Sun, 6 Jan 2013 11:10:03 +0100
From: Frank Dittrich <frank_dittrich@...mail.com>
To: john-users@...ts.openwall.com
Subject: Re: Incremental attack properties questions

On 01/06/2013 04:06 AM, magnum wrote:
> On 5 Jan, 2013, at 13:00 , Frank Dittrich <frank_dittrich@...mail.com> wrote:
>> Even if you would get incremental mode working with non-ascii
>> characters, the incremental mode would sooner or later generate byte
>> sequences which are not valid utf-8 characters.
>> (This shouldn't happen with Markov mode, provided you generate your
>> custom stats file with valid input. There's just one exception if a byte
>> sequence for a non-ascii character at the end of the word gets cut off
>> due to maximum length or maximum Markov level limits.)
>
> I really had no idea Markov is this good with UTF-8. This is cool stuff.

As long as you don't have any characters which require more than 2 bytes
for UTF-8 encoding, Markov works really well, except that the byte
sequence composing a single character can get cut off at the end of the
word.

If you add 3-byte characters into the mix, things get worse, because
then continuation bytes (range 0x80-0xbf) can follow other continuation
bytes.
As long as there are not too many 3-byte or 4-byte characters in your
input, the number of invalid UTF-8 words generated will not be too bad.

(Once you finish the UTF-8 validity check for --markov mode used
together with --encoding=utf-8, --markov mode will be an almost perfect
fit for UTF-8 passwords.)

Frank
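To make the two failure cases concrete, here is a minimal Python sketch. It is not John the Ripper code, and the helper names is_valid_utf8 and truncate are made up for illustration. It only shows which byte sequences a strict UTF-8 decoder rejects: a character cut off by a length limit, and continuation bytes that do not belong to a complete character.

```python
def is_valid_utf8(candidate: bytes) -> bool:
    """Return True if the byte string decodes as strict UTF-8."""
    try:
        candidate.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def truncate(word: str, max_bytes: int) -> bytes:
    """Cut the UTF-8 encoding of a word at a byte limit (hypothetical helper)."""
    return word.encode("utf-8")[:max_bytes]


# Case 1: a 2-byte character cut off at the end of the word.
# "passé" encodes to 6 bytes; a 5-byte limit leaves the lone lead byte 0xc3.
whole = "passé".encode("utf-8")
cut = truncate("passé", 5)

# Case 2: 3-byte characters put continuation-after-continuation
# transitions into the statistics. U+20AC (euro sign) is e2 82 ac.
euro = b"\xe2\x82\xac"          # complete 3-byte character
short_lead = b"\xc3\x82\xac"    # 2-byte lead followed by two continuation bytes
no_lead = b"\x82\xac"           # continuation bytes with no lead byte at all

for label, data in [("whole", whole), ("cut", cut), ("euro", euro),
                    ("short_lead", short_lead), ("no_lead", no_lead)]:
    print(label, data, is_valid_utf8(data))
```

With only 1-byte and 2-byte characters in the training data, a continuation byte never follows another continuation byte, so a sequence like short_lead cannot be produced from the learned transitions. Once 3-byte characters are present, that transition exists and can be reached after a 2-byte lead as well.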