Message-ID: <CAGSLPCapWcC+Zq6+gFX_C2P4gT3UDeSuqu632Y4S7=VFM5ZE-A@mail.gmail.com>
Date: Thu, 27 Mar 2025 14:53:08 +0530
From: Pentester LAB <pentesterlab3@...il.com>
To: john-users@...ts.openwall.com
Subject: Re: Issue Applying Rules to Tokenized in John the Ripper

Thank you for your detailed response and for clarifying the correct approach
to applying rules to a tokenized wordlist in JtR. Your explanation helped me
understand how the tokenizer is intended to be used and why my previous
attempt was incorrect.

I appreciate the step-by-step breakdown and the examples you provided. I'll
go through them carefully and experiment with the tokenizer along with the
suggested approaches. If I have any clarifications in the future or further
doubts, I will reach out and ask.

Thanks again for your guidance and for taking the time to explain this in
detail!

On Thu, Mar 27, 2025 at 8:38 AM Solar Designer <solar@...nwall.com> wrote:

> A correction inline, and addition below:
>
> On Thu, Mar 27, 2025 at 03:30:48AM +0100, Solar Designer wrote:
> > Trying to repair your weird attempts above using unmodified tokenize.pl:
> >
> > $ cat test.txt
> > abc
> > @
> > 123
> > $ perl tokenize.pl test.txt > john-local.conf
> > $ sed '/[^ -~]/d; s/123/\x1/g; s/abc/\x2/g; s/ab/\x3/g; s/23/\x4/g; s/bc/\x5/g; s/12/\x6/g' test.txt > test-tokenized.txt
> > $ ./john --wordlist=test-tokenized.txt --external=Untokenize --stdout
> > Using default input encoding: UTF-8
> > abc
> > @
> > 123
> > 3p 0:00:00:00 100.00% (2025-03-27 03:01) 60.00p/s 123
> > $ ./john --wordlist=test-tokenized.txt --rules=Best64 --external=Untokenize --stdout | head
> > Using default input encoding: UTF-8
> > Press 'q' or Ctrl-C to abort, 'h' for help, almost any other key for status
> > Enabling duplicate candidate password suppressor using 256 MiB
> > 124p 0:00:00:00 100.00% (2025-03-27 03:01) 1033p/s 123123123123123123
> > abc
> > @
> > 123
> > abc0
> > @0
> > 1230
> > abc1
> > @1
> > 1231
> > abc2
> > $ wc test.txt test-tokenized.txt
> > 3 3 10 test.txt
> > 3 1 6 test-tokenized.txt
> >
> > Where I took the "sed" command from the generated john-local.conf, but
> > removed the final part where it had "; s/^/:/" as that part was there
> > for producing fake pot files (for incremental mode training) rather than
> > wordlists.
> >
> > As you can see, --external=Untokenize was able to correctly restore the
> > wordlist from its tokenized or compressed form (original test.txt was 10
> > bytes, but tokenized test-tokenized.txt only 6 bytes). And the rules
> > are applied if you request them.
> >
> > Moreover, you can see that they're applied differently and their effect
> > is different than if you used the same rules on the original wordlist:
> >
> > $ ./john --wordlist=test.txt --rules=Best64 --external=Untokenize --stdout | head
> > Using default input encoding: UTF-8
> > Press 'q' or Ctrl-C to abort, 'h' for help, almost any other key for status
> > Enabling duplicate candidate password suppressor using 256 MiB
> > 152p 0:00:00:00 100.00% (2025-03-27 03:07) 1013p/s 123121
> > abc
> > @
> > 123
> > cba
> > 321
> > ABC
> > Abc
> > abc0
> > @0
> > 1230
>
> Oops, I forgot to remove --external=Untokenize from the above command
> line. Luckily, it didn't affect anything this time because test.txt has
> no token codes in it. But the correct command here would be simply:
>
> $ ./john --wordlist=test.txt --rules=Best64 --stdout | head
> Using default input encoding: UTF-8
> Press 'q' or Ctrl-C to abort, 'h' for help, almost any other key for status
> Enabling duplicate candidate password suppressor using 256 MiB
> 152p 0:00:00:00 100.00% (2025-03-27 03:46) 1688p/s 123121
> abc
> @
> 123
> cba
> 321
> ABC
> Abc
> abc0
> @0
> 1230
>
> > The generated password candidates are different and their number is also
> > different (152 original vs. 124 when rules are applied to tokenized
> > wordlist prior to --external=Untokenize). That's the point of my idea
> > number 13, so thank you for making me try it out.
>
> To more fully test my idea, we need to see whether and how many
> different candidate passwords the rules+Untokenize run adds on top of a
> simple rules run.
>
> In the above tests, the simple run produces 152 unique candidates.
> They're unique due to our dupe suppressor, as otherwise Best64 would
> tend to produce lots of dupes. The rules+Untokenize run produces 124,
> but the output from this run has 125 lines out of which 123 are unique.
> There are 3 instances of the empty line. I'm actually puzzled by that
> (we could want to investigate it in case it's a bug).
>
> Anyway, combining those 152 and 123, I get 165 unique. So, yes, this
> weird trick does add 13 unique candidate passwords. They are:
>
> TAB
> 123123123
> 123123123123
> 123123123123123
> 123123123123123123
> 123123123123123123123123
> 23
> abcabcabc
> abcabcabcabc
> abcabcabcabcabc
> abcabcabcabcabcabc
> abcabcabcabcabcabcabcabc
> bc
>
> where TAB is the control character (which puzzles me a bit).
>
> Alexander
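
The quoted message compares two candidate lists and reports which candidates only the rules+Untokenize run produces, but it does not show the commands used for that step. One way to do such a comparison is sketched below. This is only an illustration under stated assumptions, not the method used in the thread: the function name `new_candidates` and the file names mentioned afterwards are hypothetical, and only the standard `sort`, `comm`, and `mktemp` utilities are used.

```shell
# new_candidates PLAIN TOKENIZED   (hypothetical helper, not part of JtR)
# Print each line that occurs in TOKENIZED but not in PLAIN, once.
new_candidates() {
    tmpdir=$(mktemp -d) || return 1
    # comm needs both inputs sorted with the same collation, so force the
    # C locale; sort -u also removes duplicate lines within each file.
    LC_ALL=C sort -u "$1" > "$tmpdir/plain"
    LC_ALL=C sort -u "$2" > "$tmpdir/tokenized"
    # -1 drops lines unique to the first file, -3 drops lines common to
    # both, leaving only lines unique to the second file.
    LC_ALL=C comm -13 "$tmpdir/plain" "$tmpdir/tokenized"
    rm -rf "$tmpdir"
}
```

Assuming the full output of the two `--stdout` runs from the message (without `| head`) was saved as, say, `plain.txt` and `tokenized.txt`, running `new_candidates plain.txt tokenized.txt` would list the candidates added by the tokenized run; the message reports 13 of them for this test.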