Message-ID: <20201110220621.GA14993@openwall.com>
Date: Tue, 10 Nov 2020 23:06:21 +0100
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Rules characters unicode support.

On Tue, Nov 10, 2020 at 05:01:24PM +0100, François wrote:
> I've just finished writing the john.conf using your micro-optimization
> trick.
>
> Three last questions before creating a pull request:

Great!

> 1- On my experimental file I'm working on, this rule is surprisingly
> effective (hundreds of pass cracked), however, I specifically does
> not have uppercase in my sample, so my john.conf change just
> contains lowercase utf-8, do you want me to add uppercase?

It will be most flexible to have lowercase and uppercase as two separate
sections, then a section .include'ing both of those, and then have the
latter .include'd from [List.Rules:Jumbo]. That way, lowercase-only can
also be run by requesting just the corresponding ruleset.

> 2- Correct me if I'm wrong but there are no obvious search and
> replace strategy for any pattern of more than one letter in john rules
> engine; I'm thinking two-letter substitution to one unicode,
> specifically:
> # Latin small letter thorn (th) -> þ
> # Latin small letter ae -> æ

There's no way to search for a two-character substring, but you can
search for the first character and then check the second:

/a Dp =pe

Unfortunately, if the very first "a" isn't followed by an "e", this will
reject the word instead of searching further. You can partially
compensate for that by also having:

%2a Dp =pe

and so on. Of course, you'll need to follow these with commands that
introduce the UTF-8 characters at position "p".
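
To make the rejection behavior concrete, here is a rough Python model of
the two rules above. It is only a sketch under stated assumptions: the
function name search_then_check is made up for illustration, it works on
plain ASCII strings rather than the bytes the rule engine sees, and it
stops before the step that would introduce the UTF-8 character at "p".

```python
def search_then_check(word, first, second, nth=1):
    """Rough model of '/a Dp =pe' (nth=1) and '%2a Dp =pe' (nth=2).

    Returns (word_after_delete, p), or None if the word is rejected.
    Hypothetical helper for illustration, not part of John the Ripper.
    """
    p = -1
    # '/X' needs one occurrence of X, '%NX' needs at least N of them;
    # either way "p" ends up at the position of the last one located.
    for _ in range(nth):
        p = word.find(first, p + 1)
        if p < 0:
            return None  # not enough occurrences: word rejected
    # 'Dp' deletes the character at p, so the next one slides into p.
    rest = word[:p] + word[p + 1:]
    # '=pX' rejects unless the character now at p is X.
    if p >= len(rest) or rest[p] != second:
        return None
    return rest, p


if __name__ == "__main__":
    print(search_then_check("paella", "a", "e"))
    print(search_then_check("archaeology", "a", "e"))
    print(search_then_check("archaeology", "a", "e", nth=2))
```

In this model "archaeology" is rejected when only the first "a" is
considered, since that "a" is followed by "r", and it is accepted only
by the nth=2 variant.
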

Instead of the "D" command, you can have the rule calculate p+1 and
check the character there, or search for the second character and then
check the first at p-1 (fits the rule commands better, since adding 1
requires putting -1 into a variable first):

/e vap1 =aa
%2e vap1 =aa

This is likely quicker when the remaining portion of the word is long.
It's also better if your UTF-8 character is 2 bytes: so you just do two
overstrikes.

I didn't test any of these now, but they should work.

> 3- Do you want me to provide the rules in a best-match order,
> it might get a bit confusing, I can group by best unicode substitution
> order.

I have no preference, and I don't know what you mean by "best unicode
substitution". I suspect these rules will usually be used as part of
the jumbo ruleset, in which case their number will be relatively small
and thus their order won't matter much. However, I think "best-match
order" is valuable if you have that data.

Alexander