john-users - Re: Markov phrases in john


Message-ID: <20241118050424.GA10334@openwall.com>
Date: Mon, 18 Nov 2024 06:04:24 +0100
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Markov phrases in john

On Sun, Nov 17, 2024 at 06:20:51PM -0500, Matt Weir wrote:
> I just published a blog post comparing Tokenizer against other attack
> types. Link:
> https://reusablesec.blogspot.com/2024/11/analyzing-jtrs-tokenizer-attack-round-1.html
> 
> As a disclaimer, due to falling down a number of "non-tokenizer related"
> rabbit holes as well as only being able to work on this work in short
> bursts, I started writing this blog entry a couple of weeks ago, and I
> didn't want to pivot and lose even more time. So all the tests utilize the
> original version of tokenizer and don't include the improvements
> discussed since then. Still I hope this research is helpful!

Thank you for running these tests!

I only skimmed.  One thing that surprised me is that your top 25 for
training on RockYou Full (including dupes, right?) differs from what I
had posted in here, even if it is similar.  Why would that be, if you
say you used the original version of the script, which seems to still
have been the latest when I posted my top 25 on October 30?

> The short summary of the results are:
> - Tokenizer performs better than Incremental mode in the first 5 billion
> guesses
> - OMEN performs better than Tokenizer in the first 5 billion guesses. But
> OMEN has a number of implementation challenges where an Incremental based
> attack can still be more practical.

That's interesting.  I thought you'd also try the tokenizer along with
OMEN - is that somehow difficult to do?

> - When trying to simulate multi-stage cracking attacks I really need a
> better way to record much longer cracking sessions (aka trillions of
> guesses). While Tokenizer appears to be a respectable attack to run after
> running a large rules/wordlist attack, the fact that I only ran it for 5
> billion guesses didn't make the results
> trustworthy/statistically-significant. Basically the result is the test
> needs to be redesigned vs. learning much about the actual attacks ;p

As I understand, the reason you say 5 billion is not enough in this test
is that it cracked only a tiny fraction of total passwords after your
prior wordlist+rules run had cracked a much larger fraction and probably
produced many more candidates.  And by "recording" you mean keeping
track of the number of passwords cracked at different candidate counts.

If so, maybe these options would help:

--max-candidates=[-]N      Gracefully exit after this many candidates tried.
--[no-]crack-status        Emit a status line whenever a password is cracked
--progress-every=N         Emit a status line every N seconds

along with --verbosity=1 and cracking of hashes with JtR itself (not by
your external tool).  Also see the AutoStatus external mode, although to
use it in this case you'd need to roll its logic into Untokenize.
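
To make the "recording" above concrete, here is a minimal sketch of
tallying how many target passwords are cracked at given candidate
counts.  It is not JtR code and does not parse JtR output: the function
name, the in-memory candidate stream, and the choice to count each
unique password once (ignoring dupes) are all assumptions made for
illustration.

```python
from typing import Iterable, List, Set, Tuple


def cracks_at_checkpoints(
    candidates: Iterable[str],
    targets: Set[str],
    checkpoints: Iterable[int],
) -> List[Tuple[int, int]]:
    """Return (candidates_tried, unique_passwords_cracked) at each checkpoint.

    Hypothetical helper: 'candidates' stands in for whatever stream of
    guesses an attack produces, 'targets' for the plaintexts being tested.
    """
    remaining = set(targets)  # copy, so the caller's set is untouched
    marks = sorted({c for c in checkpoints if c > 0})
    results: List[Tuple[int, int]] = []
    if not marks:
        return results

    cracked = 0
    next_mark = 0
    for tried, candidate in enumerate(candidates, start=1):
        if candidate in remaining:
            # Count each target once, even if it is guessed again later.
            remaining.discard(candidate)
            cracked += 1
        if tried == marks[next_mark]:
            results.append((tried, cracked))
            next_mark += 1
            if next_mark == len(marks):
                break  # no more checkpoints to record
    # Checkpoints beyond the end of the stream are simply not reported.
    return results


if __name__ == "__main__":
    guesses = ["123456", "password", "abc", "qwerty", "letmein"]
    wanted = {"password", "letmein", "zzz"}
    for tried, cracked in cracks_at_checkpoints(guesses, wanted, [1, 2, 4, 5]):
        print(f"{tried} candidates: {cracked} cracked")
```

Because the candidates are consumed as a stream and only the checkpoint
rows are kept, the same bookkeeping would in principle scale to very
long sessions, as long as the set of targets fits in memory.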

Alexander
