Message-ID: <CAJ9ii1GijmknTkRMWBNXLAPWE+xC4uiwZZKecoKRg=ayaqvEVA@mail.gmail.com>
Date: Sun, 17 Nov 2024 18:20:51 -0500
From: Matt Weir <cweir@...edu>
To: john-users@...ts.openwall.com
Subject: Re: Markov phrases in john

I just published a blog post comparing Tokenizer against other attack
types. Link:

https://reusablesec.blogspot.com/2024/11/analyzing-jtrs-tokenizer-attack-round-1.html

As a disclaimer: due to falling down a number of "non-tokenizer related"
rabbit holes, as well as only being able to work on this in short
bursts, I started writing this blog entry a couple of weeks ago, and I
didn't want to pivot and lose even more time. So all the tests utilize
the original version of tokenizer and don't include the improvements
discussed since then. Still, I hope this research is helpful!

The short summary of the results is:

- Tokenizer performs better than Incremental mode in the first 5
billion guesses.

- OMEN performs better than Tokenizer in the first 5 billion guesses.
But OMEN has a number of implementation challenges where an
Incremental based attack can still be more practical.

- When trying to simulate multi-stage cracking attacks, I really need
a better way to record much longer cracking sessions (aka trillions of
guesses). While Tokenizer appears to be a respectable attack to run
after a large rules/wordlist attack, the fact that I only ran it for 5
billion guesses didn't make the results trustworthy/statistically
significant. Basically, the result is that the test needs to be
redesigned vs. learning much about the actual attacks ;p

Cheers,
Matt/Lakiw

On Thu, Oct 31, 2024 at 10:15 PM Solar Designer <solar@...nwall.com> wrote:
> On Fri, Nov 01, 2024 at 12:19:00AM +0100, Solar Designer wrote:
> > On Thu, Oct 31, 2024 at 11:36:07PM +0100, Solar Designer wrote:
> > > What's more interesting, though, is that it's a way to get different
> > > passwords cracked.
> > > For example, with token length forced to 4 (for all
> > > 158 tokens, many of which are full words or years), training on RockYou
> > > without dupes, at 1 billion candidates I got 1770275 or +670876.
> > > Combining this with the above result of "1870645 or +771246" (which was
> > > for token lengths 2 to 4), I get 2123847 or +1024448. That's for 1+1=2
> > > billion candidates total. Simply continuing the first (token length 2
> > > to 4) run to 2 billion instead gives merely 2016222 or +916823.
> > >
> > > So we get 12% more combined incremental mode cracks by splitting the 2
> > > billion candidate budget into two differently tokenized 1 billion runs.
> >
> > I was also interested in how wasteful or not such split is in terms of
> > duplicate candidates.
> >
> > For the token length 2 to 4 run, we have 997250925 unique (99.7%).
> > For the token length 4 run, we have 998700856 unique (99.9%).
> > For these two combined, we have 1885325771 unique (94.3%).
> >
> > So it's only moderately wasteful (and for such counts it's practical to
> > deduplicate when hashes are slow), but could get worse for longer runs.
>
> Upon a closer look, I realize that the token length 4 run is actually a
> mix of lots of token-less passwords and also many with tokens. So it's
> an interesting and useful result, but it's not what it seemed at first -
> not so much of a focus on longer passwords in the second billion.
>
> To actually focus on longer passwords, I just processed the length 4
> token fake pot file through:
>
> sed -n '/[^ -~]/p'
>
> This leaves only lines with non-ASCII characters, which is what we use
> for tokens. Then the corresponding 1 billion run cracks only +378031,
> but the ratio of longer passwords increases (359 are length 13+, up from
> 124 before the above sed). Combined with the token length 2 to 4 run,
> it's 2018976 or +919577, which is still slightly higher than a 2 billion
> run for token length 2 to 4.
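The duplicate-candidate accounting quoted above can be sketched in a few lines (a toy illustration with made-up candidate streams, not the actual measurement):

```python
# Sketch of the duplicate-candidate accounting: given the candidate
# streams of two runs, count how many candidates are unique across the
# combination. The tiny in-memory lists are stand-ins for the real
# 1-billion-candidate runs.
def unique_count(*streams):
    seen = set()
    total = 0
    for stream in streams:
        for cand in stream:
            total += 1
            seen.add(cand)
    return len(seen), total

run_a = ["sweet1", "love123", "jo1234"]      # token length 2 to 4 (toy)
run_b = ["love123", "ma1234", "sweetgirl1"]  # token length 4 (toy)

uniq, total = unique_count(run_a, run_b)
print(uniq, total, f"{100 * uniq / total:.1f}% unique")  # 5 6 83.3% unique
```

With the real runs this is the computation behind the 99.7%, 99.9%, and 94.3% unique figures; in practice one would stream and deduplicate on disk rather than hold a set of two billion candidates in memory.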
>
> To fully exclude token-less passwords from this second run, I modified
> the external mode:
>
> - word[k] = 0;
> +
> + if (i == k)
> + word = 0;
> + else
> + word[k] = 0;
>
> (This filters out candidate passwords for which the length was left
> unchanged by token substitution, which means they had no tokens.)
>
> Then it cracks only +156803, which obviously leaves it behind a simple 2
> billion run for token length 2 to 4. The number of cracked length 13+
> passwords increases only a bit further (387, up from 359 above). First
> 25 candidates from this run are:
>
> master1
> malove
> minnie1
> melove
> jameslove
> jolove
> samanda
> sweetygirl
> sweets1
> ming1234
> ma1234
> me1234
> james1234
> jo1234
> masters
> mara123
> minnie2
> sweety1
> sweets3
> may1234
> miamor1
> miamore
> sara123
> sweetgirl1
> sweetgirl9
>
> Length 16+ cracked are:
>
> mariannamarianna
> ilovemyfamily123
> lovelovelove1994
> angelinaangelina
> alexalexalex2007
> bellababygirl2007
> sexygurl4eva1992
> cherryberry2cute
> 1989198919891989
> danceamandadance
> bearbearbearbear
> moneyoverbitches1
> 2005200520052005
> 0000000000002008
> ilovestephanie11
>
> (Those mostly with repetitions should of course also be crackable with
> wordlist+rules.)
>
> Modifying the external mode to insist on at least 2 tokens (length
> increase greater than 4) results in the below first 25 candidates:
>
> jameslove
> sweetygirl
> james1234
> sweetgirl1
> sweetgirl9
> amberlove
> amber1234
> jamesbaby
> amberbaby
> moneylove
> money1234
> moneybaby
> jerry1234
> jerrybaby
> jerrygirl
> sweetgirl2006
> sweetlove1
> sweetlove4
> sweetlover
> moneygirl
> jamesgirl
> jerryange
> ambergirl
> sweetlove
> sweet1234
>
> This gets closer to "Markov phrases", although words longer than 4 are
> formed from the tokens plus individual letters. Unfortunately, this
> cracks only +14391 in 1 billion, out of which 427 are length 13+ (still
> an increase compared to previous runs).
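The two filter variants described above can be modeled outside of external mode. This is a rough Python sketch: the token table and function names are invented for illustration; in the real setup each token is a single non-ASCII byte that the external mode expands, and the filter compares the candidate's length before and after token substitution.

```python
# Toy model of the external mode filter: a raw candidate is a string of
# single-byte tokens and literal characters; expansion replaces each
# token byte with its multi-character string. If expansion leaves the
# length unchanged, the candidate contained no tokens.
TOKENS = {"\x80": "love", "\x81": "1234", "\x82": "sweet"}  # hypothetical

def expand(raw):
    return "".join(TOKENS.get(ch, ch) for ch in raw)

def keep(raw, min_growth=1):
    # min_growth=1: reject token-less candidates (the first patch).
    # min_growth=5: require a length increase greater than 4, i.e. at
    # least 2 tokens when every token expands to 4 characters (the
    # "insist on at least 2 tokens" variant).
    return len(expand(raw)) - len(raw) >= min_growth

print(keep("master1"))                 # False: no tokens
print(keep("jo\x81"))                  # True: one token ("jo1234")
print(keep("jo\x81", min_growth=5))    # False: only one token
print(keep("\x82\x80", min_growth=5))  # True: two tokens ("sweetlove")
```

Note this models only the length comparison; the actual patch does the equivalent check in external mode's filter() by comparing the pre-substitution index against the final length.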
> It may be worth retesting this kind of filtering with shorter tokens,
> as I guess at length 4 the low number of available tokens becomes too
> much of a limiting factor for which passwords may be formed.
>
> Besides longer passwords still being relatively rare and these attacks
> not being as effective as wordlist+rules at cracking them, yet another
> factor is that longer passwords - and especially non-wordlist-crackable
> ones - may be under-represented in HIBP compared to real-world usage.
> That's because HIBP is compiled largely from previously-cracked
> passwords (many of them from long ago), not only from plaintext leaks.
> So whatever passwords others couldn't crack before are simply not in
> there, unless the specific leak was plaintext. In this context, Matt's
> suggested testing "against a site specific password dump" makes even
> more sense.
>
> Alexander