Message-ID: <20250331000317.GA2293@openwall.com>
Date: Mon, 31 Mar 2025 02:03:17 +0200
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Issue Applying Rules to Tokenized in John the Ripper

On Thu, Mar 27, 2025 at 04:07:42AM +0100, Solar Designer wrote:
> On Thu, Mar 27, 2025 at 03:30:48AM +0100, Solar Designer wrote:
> > The generated password candidates are different and their number is also
> > different (152 original vs. 124 when rules are applied to tokenized
> > wordlist prior to --external=Untokenize).  That's the point of my idea
> > number 13, so thank you for making me try it out.
>
> To more fully test my idea, we need to see whether and how many
> different candidate passwords the rules+Untokenize run adds on top of a
> simple rules run.
>
> In the above tests, the simple run produces 152 unique candidates.
> They're unique due to our dupe suppressor, as otherwise Best64 would
> tend to produce lots of dupes.  The rules+Untokenize run produces 124,
> but the output from this run has 125 lines out of which 123 are unique.
> There are 3 instances of the empty line.  I'm actually puzzled by that
> (we could want to investigate it in case it's a bug).
>
> Anyway, combining those 152 and 123, I get 165 unique.  So, yes, this
> weird trick does add 13 unique candidate passwords.  They are:
>
> TAB
> 123123123
> 123123123123
> 123123123123123
> 123123123123123123
> 123123123123123123123123
> 23
> abcabcabc
> abcabcabcabc
> abcabcabcabcabc
> abcabcabcabcabcabc
> abcabcabcabcabcabcabcabc
> bc
>
> where TAB is the control character (which puzzles me a bit).

I investigated the puzzling 3 instances of the empty line and TAB.  No
bug there.  It's just how the best64 rules work, especially hashcat's
"+" command, which increments the ASCII code.  (This ruleset was meant
for hashcat, and we run it in our hashcat compatibility mode.)
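
As a rough illustration of the "+" command mentioned above, here is a minimal Python sketch. It is not hashcat's or John's actual code: the function name `increment_at` is hypothetical, and both the out-of-range behavior (word left unchanged) and the wrap-around at 0xFF are assumptions made for this example.

```python
def increment_at(word: bytes, n: int) -> bytes:
    """Hypothetical model of hashcat's "+N" rule: add 1 to the byte at position n."""
    if n >= len(word):
        # Assumption: an out-of-range position leaves the word unchanged.
        return word
    out = bytearray(word)
    # Assumption: the byte value wraps around at 0xFF, as an 8-bit char would.
    out[n] = (out[n] + 1) & 0xFF
    return bytes(out)


# On ordinary text the result stays printable:
shifted = increment_at(b"abc", 0)   # b"bbc"

# On a low byte value the result is another control character:
w = b"\x06"
for _ in range(3):
    w = increment_at(w, 0)          # 6 + 3 = 9, which is TAB
```

The second case shows why applying such a rule to a byte that is not printable text to begin with can yield TAB or LF.
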
When applied to tokens, which are themselves non-printable characters,
this may produce other non-printable characters, including controls.  In
this tiny test case, we only have token codes 1 to 6:

    mod[1] = 0x333231;    // "123" 3
    mod[2] = 0x636261;    // "abc" 3
    mod[3] = 0x3332;      // "23"  2
    mod[4] = 0x6362;      // "bc"  2
    mod[5] = 0x3231;      // "12"  2
    mod[6] = 0x6261;      // "ab"  2

A few increments of these bring them to TAB (ASCII 9) and LF (ASCII 10).
Since these are higher than 6, they're not further modified by
--external=Untokenize - there's no string to replace them "back" to.

When the LF character is printed, it becomes two LFs at once - one is LF
itself and the other is LF added after this line - so two empty lines.
Some other rules result in a proper empty string, which the suppressor
includes only once, but it's distinct from the LF string.  So we get 3
empty lines in total.  Also, one of them is correctly not counted towards
the number of candidate passwords since it's inside a candidate.

Alexander
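
The line-count arithmetic above can be reproduced with a small Python model. This is only a sketch under stated assumptions: the real Untokenize is a John the Ripper external mode, not Python, and the names `TOKENS`, `untokenize` and `printed_lines` are hypothetical. The token table is copied from the `mod[]` values in the message, and bytes without a table entry are passed through unchanged, as the message describes for TAB and LF.

```python
# Token table from the message: code -> replacement string.
TOKENS = {1: b"123", 2: b"abc", 3: b"23", 4: b"bc", 5: b"12", 6: b"ab"}


def untokenize(candidate: bytes) -> bytes:
    """Expand token codes to their strings; pass all other bytes through."""
    return b"".join(TOKENS.get(b, bytes([b])) for b in candidate)


def printed_lines(candidates):
    """Lines seen in the output when each candidate is written followed by LF."""
    output = b"".join(untokenize(c) + b"\n" for c in candidates)
    return output.split(b"\n")[:-1]  # drop the empty piece after the final LF


# Four distinct candidates: the empty string, a lone LF, a lone TAB,
# and three copies of token 1.
candidates = [b"", b"\n", b"\t", b"\x01\x01\x01"]
lines = printed_lines(candidates)
# lines == [b"", b"", b"", b"\t", b"123123123"]
```

Four candidates give five output lines, three of them empty: one from the empty string and two from the LF candidate. That is the same one-line surplus as the 124 candidates versus 125 lines reported in the test run.
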