john-users - Re: Cracking passphrases

Message-ID: <20210727184634.GA16846@openwall.com>
Date: Tue, 27 Jul 2021 20:46:34 +0200
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Cracking passphrases

Hi David,

I'm sorry I didn't get around to replying much sooner.  That said, I agree
with magnum that PRINCE mode is most relevant here.  I'll point out some
other approaches below:

On Sun, Jun 27, 2021 at 12:29:51AM -0700, David Sontheimer wrote:
> I am curious how you would use John to crack the following password
> generation heuristic:
> 
> A passphrase, limited to combinations of words from a wordlist of
> four-letter words. A passphrase may contain one to four words.
> 
> Optionally, each line of the wordlist contains one word, and the wordlist
> is limited to 1000 English words.

PRINCE mode works great for combining those words without separator
characters.  We should probably enhance it to also support separators,
but until we do (or introduce some other passphrase mode), we have to
use various hacks to combine words with separators, such as the Perl
scripts I posted here:

https://www.openwall.com/lists/john-users/2006/10/19/4

Then you can use the recently added PhrasePreprocess and Phrase rulesets
to try those word combinations with varying separators, including first
without separators (as that appears to be most common).  Here they are:

# A special ruleset intended for stacking before other Phrase* rules below,
# such that you have the option to run its output through "unique" first
[List.Rules:PhrasePreprocess]
/[ ] :
-c /[ ] l Q
/[ ] @' Q
-c /[ ] @' Q M l Q

# The main optimized Phrase ruleset, almost no duplicates with proper input
[List.Rules:Phrase]
# This one rule cracks ~1050 HIBP v7 passwords per million with sequences of
# 2 to 6 words occurring 2+ times across Project Gutenberg Australia books
# when our sequence list includes them in both their original case and
# all-lowercase, as well as both with apostrophes intact and removed (these
# variations are not implemented in this ruleset not to produce duplicates)
@?w Q
# Sorted separator characters: 1_24 -.3785690@,&+*!'$/?:=#~^%;`>"[)<]|({}\
# (the apostrophe is probably overrated since it also occurs inside words)
# Each character in 1_24 cracks ~82 to ~61 passwords per million
s[ ][1_24] Q
# Leaving the space separators intact cracks ~59 passwords per million
/[ ]
# Each character in -.3785690@ cracks ~53 to ~12 passwords per million
s[ ][\-.3785690@] Q
# Each character in ,&+*!'$/?:=#~ cracks ~10 to ~1 passwords per million
s[ ][,&+*!'$/?:=#~] Q
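
To make the effect of the Phrase ruleset above more concrete, here is a
rough Python emulation of what it emits for a single space-separated input
line.  This is only a sketch: the function name is made up for illustration,
it ignores John's length limits, and it processes one phrase at a time,
whereas John applies each rule to the whole wordlist before moving on to the
next rule.

```python
# Hypothetical emulation of [List.Rules:Phrase] for one input phrase.
# Not part of John the Ripper; for illustration only.

SEPARATOR_GROUPS = ["1_24", "-.3785690@", ",&+*!'$/?:=#~"]


def phrase_candidates(phrase):
    """Return the candidates the Phrase rules would derive from `phrase`."""
    out = []

    # "@?w Q": purge whitespace (space and tab), reject if nothing changed.
    stripped = "".join(ch for ch in phrase if ch not in " \t")
    if stripped != phrase:
        out.append(stripped)

    def substitute(separators):
        # "s[ ][...] Q": replace every space with one separator character,
        # reject if the word did not change (i.e. it had no spaces).
        for sep in separators:
            candidate = phrase.replace(" ", sep)
            if candidate != phrase:
                out.append(candidate)

    substitute(SEPARATOR_GROUPS[0])

    # "/[ ]": pass the phrase through unchanged, but only if it has a space.
    if " " in phrase:
        out.append(phrase)

    substitute(SEPARATOR_GROUPS[1])
    substitute(SEPARATOR_GROUPS[2])
    return out


if __name__ == "__main__":
    for candidate in phrase_candidates("able baby"):
        print(candidate)
```

An input line without any spaces yields no candidates at all here, which
matches the intent of the "Q" and "/[ ]" rejections: single words are
expected to be covered by other rulesets.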

> I'm comfortable writing an external script for generating these candidates
> and using John's --stdin option, but I'm curious if John can generate these
> hashword candidates internally with a wordlist and appropriate rules. If
> so, while all candidates would be generated either way, I'm curious if
> cracking via internal generation will be more efficient.

Yes, it will be more efficient.  As mentioned, PRINCE mode can do that,
as long as you don't need word separators.  When you do, you can also
use a hack: generate a "ruleset" from your words, so that you would have
one instance of your words in the wordlist and another in rules (each
rule appending a word).  For four words, you can e.g. have a wordlist
with groups of two words and a ruleset with groups of two words as well.
For a set of 1000 words, that's 1000^2 = 1 million wordlist entries and
1 million rules.
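
The wordlist-plus-generated-ruleset hack described above could be sketched
in Python roughly as follows.  The section name "AppendPair", the function
names, and the choice of a space as the placeholder separator are
assumptions made for illustration.  Each generated rule uses the
Az"STRING" command to append a string, and the sketch assumes the words
consist of letters only, so that nothing needs escaping for the rule
preprocessor.

```python
# Hypothetical generator for the "words in the wordlist, words in the rules"
# hack.  Names are illustrative and not part of John the Ripper.
from itertools import product


def build_pairs(words, sep=" "):
    """All ordered two-word groups, joined by `sep` (N words give N^2 lines)."""
    return [a + sep + b for a, b in product(words, repeat=2)]


def build_append_rules(words, sep=" ", section="AppendPair"):
    """A john.conf ruleset with one rule per two-word group.

    Each rule appends `sep` plus a two-word group to the current candidate.
    Assumes the words contain no double quotes, brackets, or backslashes.
    """
    lines = [f"[List.Rules:{section}]"]
    for pair in build_pairs(words, sep):
        lines.append(f'Az"{sep}{pair}"')
    return lines


if __name__ == "__main__":
    demo_words = ["able", "baby", "cake"]  # stand-in for the real 1000 words
    with open("pairs.txt", "w") as f:
        f.write("\n".join(build_pairs(demo_words)) + "\n")
    with open("append-pair.conf", "w") as f:
        f.write("\n".join(build_append_rules(demo_words)) + "\n")
```

With the generated section added to the configuration, a run along the
lines of "john --wordlist=pairs.txt --rules=AppendPair hashes.txt" would
then try the four-word phrases with spaces between the words.  Phrases of
one to three words would need separate, much smaller runs.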

Also note that you can have up to two rulesets active at once, one with
"--rules" and the other with "--rules-stack", which can help with hacks
like this (e.g., to also have real rules, like the Phrase ones above).

> If using stdin, I'm curious if I need to somehow batch the candidate
> generation to parallelize the work efficiently.

I don't get your question here.

Alexander