Message-ID: <BLU159-W51E89A5A815762A846AC77A4930@phx.gbl>
Date: Sat, 31 Dec 2011 17:57:38 +0000
From: Alex Sicamiotis <alekshs@...mail.com>
To: <john-users@...ts.openwall.com>
Subject: RE: Rules for realistic words

Interesting... I saw that the newer versions of John include such an option,
but I never tried it or googled it... Thanks.

> Date: Sat, 31 Dec 2011 10:24:12 -0600
> From: tansey@...utexas.edu
> To: john-users@...ts.openwall.com
> Subject: Re: [john-users] Rules for realistic words
>
> Hi Alex,
>
> Have you seen Markov mode?
>
> http://openwall.info/wiki/john/markov
>
> That seems to be more or less what you are describing in the first half of
> your email.
>
> Wesley
>
> 2011/12/31 Alex Sicamiotis <alekshs@...mail.com>
>
> >
> > From an analysis I've conducted on a file of ~1500 DES passwords (max
> > length 8) containing greeklish (Greek words written in English letters)
> > and English passwords, the following came up:
> >
> > Very high frequency letters:
> > a=850 i=597 o=584 e=525
> >
> > Medium to low frequency letters:
> > s=498 r=472 n=418 t=405 l=366 m=277 p=247 c=211 d=201 k=193 g=159 u=148
> > h=144 b=113 y=97 f=87
> >
> > Very low frequency letters:
> > v=66 w=53 x=47 j=31 z=31 q=18
> >
> > Number frequency:
> > 1=448 occurrences
> > 2=293 occurrences
> > 3=249 occurrences
> > 9=219 occurrences
> > 0=203 occurrences
> > 4=185 occurrences
> > 6=175 occurrences
> > 5=174 occurrences
> > 7=156 occurrences
> > 8=132 occurrences
> >
> > What this means is that a new method of brute forcing could be used.
> >
> > Currently it's something like:
> >
> > 1) single
> > 2) dictionary
> > 3) dictionary with rules
> > 4) incremental with Digits, Alpha, LanMan, All, from fewer characters to
> > more characters.
> >
> > For the 26 letters of Alpha, it goes like 26x26x26x26x26x26x26x26 =
> > 208.8 billion combos.
> > For Alpha + Digits it goes 36x36x36x36x36x36x36x36 = 2.82 trillion combos.
> >
> > What if there were intermediate character sets of frequently used
> > letters, as a step between dictionaries with rules and incremental with
> > full character sets? For example, the top 16 letters plus the top 4
> > digits = 20 characters in total. In that case it's only 25.6 billion
> > combos for 8-character length - and with multiple hashes it's always
> > worth checking these first, cracking some early and speeding up the
> > rest. I think incremental mode already applies some sort of "more
> > frequent first" ordering, but I don't know how optimized it is in this
> > respect. If it already covers this, ignore the comment.
> >
> > Another aspect that could be improved (not in raw cracking speed, but in
> > getting the easier passwords cracked sooner) is to emulate how language
> > is constructed. For example, the Greek and Italian languages use a lot
> > of alternation between consonants and vowels. This means you can have a
> > rule which goes like this:
> >
> > (V)owel
> > (C)onsonant
> > (B)oth + numbers + symbols
> >
> > Lengths 1-4 are cracked in incremental mode; beyond 4 characters:
> >
> > VCVCV => italy
> > CVCVC => begar
> > VCVCB => nike@
> > CVCVB => epic6
> > VCVCVC
> > CVCVCV
> > VCVCVB
> > CVCVCB
> > VCVCVCV
> > CVCVCVC
> > VCVCVCB
> > CVCVCVB
> > VCVCVCVC
> > CVCVCVCV
> > VCVCVCVB
> > CVCVCVCB
> >
> > By splicing words into human-like syllables, I achieved a hefty increase
> > in effective cracking speed, because instead of 26x26x26... it goes like
> > 18x8x18x8x18 - which means far fewer combinations than non-words like
> > zzxaeseq.
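A quick sanity check of the keyspace figures above, as a short Python sketch;
the set sizes are the ones given in the message (26 lowercase letters, 36 with
digits, a reduced set of 20 characters, and the 18-consonant / 8-vowel classes
used for the CVCVC pattern):

    # Recompute the candidate counts quoted in the message above.
    full_alpha   = 26 ** 8               # lowercase a-z, length 8  -> ~208.8 billion
    alpha_digits = 36 ** 8               # a-z plus 0-9, length 8   -> ~2.82 trillion
    reduced      = 20 ** 8               # top 16 letters + top 4 digits -> 25.6 billion
    cvcvc        = 18 * 8 * 18 * 8 * 18  # length-5 consonant/vowel pattern

    for label, count in [("26^8", full_alpha), ("36^8", alpha_digits),
                         ("20^8", reduced), ("CVCVC", cvcvc)]:
        print(label, format(count, ","))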
> >
> > (The following is a greeklish example - you may see some letters treated
> > as vowels which are consonants in English, but in greeklish, for
> > example, w is used phonetically as o... it's the omega letter.)
> >
> > [bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy]"
> >
> > [aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz]"
> >
> > [bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz]"
> >
> > [aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy]"
> >
> > In some cases it needs tweaking to account for two consonants or two
> > vowels in some part of the word (for example peNTagon, aCRopolis,
> > bicyCLe, AErodynamic), so a few variations of the above are necessary to
> > cover a large percentage of words.
> >
> > An analysis of the English language and its linguistic patterns might
> > give a significant increase in human-like words or composite words that
> > dictionaries do not contain - like name&surname. Ideally, we could have
> > a statistics program or an AI program extract rules covering 95%+ of the
> > words in a given language, so that combinations could be based on this
> > structure (with possible twists like appending stuff at the end).
> > English is a bit more difficult to handle in a letter-by-letter format
> > compared to Greek/Italian, but ultimately it's just more variations. A
> > syllable approach (i.e. combos of one-, two- and three-letter sequences)
> > might also be appropriate for English or other languages. For example,
> > instead of combining words, we could combine ready-made syllables... The
> > syllable MO + the syllable RE = the word MORE. The number of
> > combinations, compared to 26^8, drops dramatically.
> >
> > Have a great 2012...
> >
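The pattern and syllable ideas in the quoted message can be prototyped in a
few lines of Python. This is a rough sketch only: the character classes are
the ones quoted above, the "(B)oth" class is simplified to letters plus
digits, the syllable list is invented for illustration, and none of this is
John the Ripper syntax:

    from itertools import product

    vowels     = "aehiouwy"            # 8 symbols (greeklish treats w as omega)
    consonants = "bcdfgjklmnpqrstvxz"  # 18 symbols
    # "(B)oth" is simplified here to letters + digits; the message also
    # mentions symbols.
    classes = {"V": vowels, "C": consonants,
               "B": vowels + consonants + "0123456789"}

    def candidates(pattern):
        """Yield every candidate matching a pattern such as 'CVCVC' or 'VCVCB'."""
        for combo in product(*(classes[c] for c in pattern)):
            yield "".join(combo)

    # The length-5 CVCVC pattern gives 18*8*18*8*18 = 373,248 candidates.
    print(sum(1 for _ in candidates("CVCVC")))

    # Syllable combination: e.g. MO + RE = MORE (hypothetical syllable list).
    syllables = ["mo", "re", "ni", "ke", "pa", "lo"]
    for a, b in product(syllables, repeat=2):
        print(a + b)

One could, for instance, pipe such candidates into John via its --stdin
option, but the enumeration above is only meant to show how small these
pattern-based keyspaces are compared to 26^8.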