Message-ID: <CAGSLPCYHn53NPajh9dk+c=Ag+TTDdDmVsMkN0TJqCJric_+tJQ@mail.gmail.com>
Date: Wed, 2 Apr 2025 00:26:06 +0530
From: Pentester LAB <pentesterlab3@...il.com>
To: john-users@...ts.openwall.com
Subject: Re: Issue Applying Rules to Tokenized in John the Ripper

I followed the article Running JtR's Tokenizer Attack
<https://reusablesec.blogspot.com/2024/10/running-jtrs-tokenizer-attack.html>
and tried to generate a modified wordlist using sed and ./tokenize.pl.

My original wordlist (TRAINING_PASSWORDS.txt):

tamil
@
king
2005

I first ran the following command:

./tokenize.pl TRAINING_PASSWORDS.txt
# sed '/[^ -~]/d; s/tami/\x1/g; s/king/\x2/g; s/2005/\x3/g; s/amil/\x4/g; s/kin/\x5/g; s/ing/\x6/g; s/200/\x7/g; s/ami/\x8/g; s/005/\x9/g; s/mil/\xb/g; s/tam/\xc/g; s/in/\xe/g; s/05/\xf/g; s/ki/\x10/g; s/am/\x11/g; s/ng/\x12/g; s/mi/\x13/g; s/ta/\x14/g; s/il/\x15/g; s/20/\x16/g; s/00/\x17/g; s/^/:/'

After getting the output, I used this command:

cat TRAINING_PASSWORDS.txt | sed '/[^ -~]/d; s/tami/\x1/g; s/king/\x2/g; s/2005/\x3/g; s/amil/\x4/g; s/kin/\x5/g; s/ing/\x6/g; s/200/\x7/g; s/ami/\x8/g; s/005/\x9/g; s/mil/\xb/g; s/tam/\xc/g; s/in/\xe/g; s/05/\xf/g; s/ki/\x10/g; s/am/\x11/g; s/ng/\x12/g; s/mi/\x13/g; s/ta/\x14/g; s/il/\x15/g; s/20/\x16/g; s/00/\x17/g; s/^/:/' > new_training.txt

However, when I checked new_training.txt, the output was incorrect:

:@

Why is my sed command producing an incorrect output, and how can I fix it?

On Mon, Mar 31, 2025 at 5:34 AM Solar Designer <solar@...nwall.com> wrote:

> On Thu, Mar 27, 2025 at 04:07:42AM +0100, Solar Designer wrote:
> > On Thu, Mar 27, 2025 at 03:30:48AM +0100, Solar Designer wrote:
> > > The generated password candidates are different and their number is also
> > > different (152 original vs. 124 when rules are applied to tokenized
> > > wordlist prior to --external=Untokenize). That's the point of my idea
> > > number 13, so thank you for making me try it out.
> > >
> > To more fully test my idea, we need to see whether and how many
> > different candidate passwords the rules+Untokenize run adds on top of a
> > simple rules run.
> >
> > In the above tests, the simple run produces 152 unique candidates.
> > They're unique due to our dupe suppressor, as otherwise Best64 would
> > tend to produce lots of dupes. The rules+Untokenize run produces 124,
> > but the output from this run has 125 lines out of which 123 are unique.
> > There are 3 instances of the empty line. I'm actually puzzled by that
> > (we could want to investigate it in case it's a bug).
> >
> > Anyway, combining those 152 and 123, I get 165 unique. So, yes, this
> > weird trick does add 13 unique candidate passwords. They are:
> >
> > TAB
> > 123123123
> > 123123123123
> > 123123123123123
> > 123123123123123123
> > 123123123123123123123123
> > 23
> > abcabcabc
> > abcabcabcabc
> > abcabcabcabcabc
> > abcabcabcabcabcabc
> > abcabcabcabcabcabcabcabc
> > bc
> >
> > where TAB is the control character (which puzzles me a bit).
>
> I investigated the puzzling 3 instances of the empty line and TAB. No
> bug there. It's just how the best64 rules work, especially hashcat's
> "+" command, which increments the ASCII code. (This ruleset was meant
> for hashcat, and we run it in our hashcat compatibility mode.) When
> applied to tokens, which are themselves non-printable characters, this
> may produce other non-printable characters, including controls. In this
> tiny test case, we only have token codes 1 to 6:
>
> mod[1] = 0x333231; // "123" 3
> mod[2] = 0x636261; // "abc" 3
> mod[3] = 0x3332;   // "23" 2
> mod[4] = 0x6362;   // "bc" 2
> mod[5] = 0x3231;   // "12" 2
> mod[6] = 0x6261;   // "ab" 2
>
> A few increments of these bring them to TAB (ASCII 9) and LF (ASCII 10).
> Since these are higher than 6, they're not further modified by
> --external=Untokenize - there's no string to replace them "back" to.
>
> When the LF character is printed, it becomes two LFs at once - one is LF
> itself and the other is LF added after this line - so two empty lines.
>
> Some other rules result in a proper empty string, which the suppressor
> includes only once, but it's distinct from the LF string. So we get 3
> empty lines in total. Also, one of them is correctly not counted
> towards the number of candidate passwords since it's inside a candidate.
>
> Alexander
>
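
To make it easier to see what the sed command above is expected to write, here is a minimal Python sketch of the same substitutions. It is an illustration only, not part of tokenize.pl or John the Ripper. It assumes the wordlist holds one word per line (tamil, @, king, 2005) and that sed turns each \xHH escape into the byte with that hex value, as GNU sed does. The names SED_RULES and tokenize_line are hypothetical. Under these assumptions every line except "@" ends up containing a control byte, which many terminals and mail clients do not display, so the file may be fine even though only ":@" looks readable.

```python
# Sketch of the sed tokenization from the message above (illustration only).
# Assumption: each \xHH escape in the sed replacement becomes one raw byte.
# SED_RULES and tokenize_line are hypothetical names, not part of JtR.

# (substring, token code) pairs copied from the sed command, in the same order.
SED_RULES = [
    ("tami", 0x01), ("king", 0x02), ("2005", 0x03), ("amil", 0x04),
    ("kin", 0x05), ("ing", 0x06), ("200", 0x07), ("ami", 0x08),
    ("005", 0x09), ("mil", 0x0B), ("tam", 0x0C), ("in", 0x0E),
    ("05", 0x0F), ("ki", 0x10), ("am", 0x11), ("ng", 0x12),
    ("mi", 0x13), ("ta", 0x14), ("il", 0x15), ("20", 0x16), ("00", 0x17),
]


def tokenize_line(line):
    """Apply the substitutions to one line; return None if the line is dropped."""
    # Mirrors '/[^ -~]/d': drop lines with any character outside ' '..'~'.
    if any(not (" " <= ch <= "~") for ch in line):
        return None
    # Mirrors each 's/text/\xHH/g', applied in the order given.
    for text, code in SED_RULES:
        line = line.replace(text, chr(code))
    # Mirrors 's/^/:/': prefix every surviving line with a colon.
    return ":" + line


if __name__ == "__main__":
    # Assumed wordlist contents, one word per line.
    for word in ["tamil", "@", "king", "2005"]:
        # repr() makes the otherwise invisible control bytes visible.
        print(repr(tokenize_line(word)))
```

If this matches what happened, dumping the file byte by byte, for example with od -c new_training.txt, should show the token bytes that a plain cat hides.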