john-users - Re: Rules characters unicode support.



Message-ID: <CAMGgT5ABV1QD+C0WDA5G4f0G=Z8a-OKff9EnngzHYQr5BCQZvQ@mail.gmail.com>
Date: Tue, 10 Nov 2020 17:01:24 +0100
From: François <francois.pesce@...il.com>
To: "john-users@...ts.openwall.com" <john-users@...ts.openwall.com>
Subject: Re: Rules characters unicode support.

Alexander,

I've just finished writing the john.conf using your micro-optimization
trick.

Three last questions before creating a pull request:
1- On the experimental file I'm working on, this rule is surprisingly
effective (hundreds of passwords cracked). However, my sample
specifically does not contain uppercase, so my john.conf change just
contains lowercase UTF-8. Do you want me to add uppercase?

2- Correct me if I'm wrong, but there is no obvious search-and-replace
strategy for a pattern of more than one letter in john's rules
engine. I'm thinking of substituting two letters with one Unicode
character, specifically:
# Latin small letter thorn (th) -> þ
# Latin small letter ae -> æ
# Latin small letter oe -> œ
# Latin small letter eszett (ss) -> ß
I think those would be efficient additions too.
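
To make the intent concrete, here is a minimal Python sketch of the
substitution I have in mind. It is not john rule syntax, and the names
DIGRAPHS and substitute_first are hypothetical. Like the /e rule, it
replaces only the first match and rejects words without one.

```python
# Hypothetical illustration only: models a two-letter -> one-character
# substitution outside of john's rule engine.
DIGRAPHS = {
    "th": "þ",  # Latin small letter thorn
    "ae": "æ",  # Latin small letter ae
    "oe": "œ",  # Latin small ligature oe
    "ss": "ß",  # Latin small letter sharp s (eszett)
}


def substitute_first(word, digraph):
    """Replace the first occurrence of `digraph`; return None if absent."""
    p = word.find(digraph)
    if p < 0:
        return None  # comparable to a rule rejecting the word
    return word[:p] + DIGRAPHS[digraph] + word[p + len(digraph):]


if __name__ == "__main__":
    for word, digraph in [("thor", "th"), ("strasse", "ss"), ("coeur", "oe")]:
        candidate = substitute_first(word, digraph)
        # Show the UTF-8 bytes, since that is what the rules would emit.
        print(word, "->", candidate, candidate.encode("utf-8"))
```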

3- Do you want me to provide the rules in best-match order? That
might get a bit confusing; alternatively, I can group them by best
Unicode substitution.

Francois Pesce

On Sun, Nov 8, 2020 at 8:28 PM Solar Designer <solar@...nwall.com> wrote:

> On Tue, Nov 03, 2020 at 08:16:42PM +0100, Solar Designer wrote:
> > /e Dp Ap"é"
> >
> > This is three commands: search for one character, delete the found
> > character, insert a possibly multi-character string (in our case, just
> > a multi-byte character) in the former character's place.
> >
> > You can also specify the multi-byte character via its hex codes, which
> > makes the .conf file format character set agnostic (so you can have any
> > character set active in your text editor, and it won't matter):
> >
> > /e Dp Ap"\xc3\xa9"
> >
> > However, the rules are indeed not character set agnostic - as written
> > above, the rule produces UTF-8.
> >
> > A difference from the "s" command is that the above rule will find and
> > replace only the first match, whereas "s" would find and replace all.
> >
> > You can reduce this difference by writing multiple rules like this:
> >
> > /e Dp Ap"\xc3\xa9"
> > /e Dp Ap"\xc3\xa9" /e Dp Ap"\xc3\xa9"
> > /e Dp Ap"\xc3\xa9" /e Dp Ap"\xc3\xa9" /e Dp Ap"\xc3\xa9"
> >
> > You can also choose which instances of the character you replace, e.g.
> > to replace only the second:
> >
> > %2e Dp Ap"\xc3\xa9"
>
> You can also micro-optimize these, e.g.:
>
> /e op\xa9 ip\xc3
> %2e op\xa9 ip\xc3
>
> This is overstrike and insert, which is quicker than delete and insert
> (since deleting involves shifting the rest of the string to the left).
>
> François, would you possibly create a reusable ruleset implementing
> these various substitutes and submit it via a GitHub pull request for
> inclusion in jumbo's default john.conf?  Also have it .include'ed from
> [List.Rules:Jumbo].  Thanks!
>
> Alexander
>
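
For reference, the quoted rules can be modelled at the byte level with a
short Python sketch. This is only an approximation of what the commands
do, assuming "p" is the position found by "/" or "%"; the function names
are hypothetical and this is not how john implements them.

```python
E_ACUTE = b"\xc3\xa9"  # UTF-8 encoding of "é"


def find_nth(word, ch, n=1):
    """Index of the n-th occurrence of the single byte `ch`, or None."""
    p = -1
    for _ in range(n):
        p = word.find(ch, p + 1)
        if p < 0:
            return None  # the rule would reject this word
    return p


def delete_insert(word, ch, repl, n=1):
    """Models: /e Dp Ap"\\xc3\\xa9" (n=1) or %2e Dp Ap"\\xc3\\xa9" (n=2)."""
    p = find_nth(word, ch, n)
    if p is None:
        return None
    buf = bytearray(word)
    del buf[p]        # Dp: delete, shifting the tail left
    buf[p:p] = repl   # Ap"...": insert the string at position p
    return bytes(buf)


def overstrike_insert(word, ch, repl, n=1):
    """Models: /e op\\xa9 ip\\xc3 (two-byte replacement only)."""
    p = find_nth(word, ch, n)
    if p is None:
        return None
    buf = bytearray(word)
    buf[p] = repl[1]         # op\xa9: overwrite in place with the 2nd byte
    buf.insert(p, repl[0])   # ip\xc3: insert the 1st byte before it
    return bytes(buf)


if __name__ == "__main__":
    print(delete_insert(b"eleve", b"e", E_ACUTE))
    print(overstrike_insert(b"eleve", b"e", E_ACUTE, n=2))
```

Both variants yield the same bytes. Applying the first one twice in a
row replaces the first two instances, because the inserted bytes do not
contain "e" themselves.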
