Message-ID: <4E296A7C.1020905@bredband.net>
Date: Fri, 22 Jul 2011 14:18:04 +0200
From: magnum <rawsmooth@...dband.net>
To: john-dev@...ts.openwall.com
Subject: Re: ISO-8859-1 casing (experimental)

On 2011-07-19 23:30, JimF wrote:
>> On 18.07.2011 23:39, jfoug wrote:
>>> I created a new option (--iso-8859-1). This is just a boolean flag
>>> (like the --utf8 switch).
>>
>> At this point, shouldn't this be converted into an enum,
>> allowing future enhancements?
>>
>> Frank
>
> This is a VERY good suggestion. There is a very limited set of flag
> bits, but we may want to expand this to additional charsets at some
> time. That is why this is only a 'john-dev' posting, as I was hoping for
> others' input such as this.
>
> We may still want a flag bit, telling john that there 'is' a charset
> change. That way, we have a simple, quick flag check to know that we are
> in default '7-bit' mode, and do not slow anything down. However, if the
> flag is set, we can take the additional step to see what charset we need
> (when we are in the proper section of code).

I'm checking this patch out now.

+       {"iso-8859-1", FLG_ISO5998_1, FLG_ISO5998_1},//, FLG_WORDLIST_CHK | FLG_TEST_CHK},

Why is it called ISO5998_1?


+       if (options.flags & FLG_ISO5998_1) {
+               conv_tolower = rules_init_conv(CHARS_UPPER CHARS_UPPER_8859_1, CHARS_LOWER CHARS_LOWER_8859_1);
+               conv_toupper = rules_init_conv(CHARS_LOWER CHARS_LOWER_8859_1, CHARS_UPPER CHARS_UPPER_8859_1);
+       } else {
+               conv_tolower = rules_init_conv(CHARS_UPPER, CHARS_LOWER);
+               conv_toupper = rules_init_conv(CHARS_LOWER, CHARS_UPPER);
+       }

If using an enum instead, things like the above would be a switch 
statement, right?
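
A minimal sketch of what that switch could look like, for illustration only. The enum and its members, init_conv() (a stand-in for rules_init_conv(), not its real signature) and the contents of the CHARS_*_8859_1 strings are all assumptions here, not taken from the patch:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical enum replacing the boolean flag; names are assumptions. */
enum charset { CHARSET_ASCII = 0, CHARSET_ISO_8859_1 };

#define CHARS_LOWER "abcdefghijklmnopqrstuvwxyz"
#define CHARS_UPPER "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
/* Assumed contents: the ISO-8859-1 letters that have a case pair
 * (0xC0-0xDE without 0xD7, and 0xE0-0xFE without 0xF7). */
#define CHARS_UPPER_8859_1 \
    "\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF" \
    "\xD0\xD1\xD2\xD3\xD4\xD5\xD6\xD8\xD9\xDA\xDB\xDC\xDD\xDE"
#define CHARS_LOWER_8859_1 \
    "\xE0\xE1\xE2\xE3\xE4\xE5\xE6\xE7\xE8\xE9\xEA\xEB\xEC\xED\xEE\xEF" \
    "\xF0\xF1\xF2\xF3\xF4\xF5\xF6\xF8\xF9\xFA\xFB\xFC\xFD\xFE"

static unsigned char conv_tolower[256];
static unsigned char conv_toupper[256];

/* Stand-in for rules_init_conv(): start from an identity table,
 * then map each src[i] to dst[i]. */
static void init_conv(unsigned char *table, const char *src, const char *dst)
{
    size_t i, n = strlen(src);

    assert(n == strlen(dst));
    for (i = 0; i < 256; i++)
        table[i] = (unsigned char)i;
    for (i = 0; i < n; i++)
        table[(unsigned char)src[i]] = (unsigned char)dst[i];
}

/* One case per supported charset; anything unknown falls back to 7-bit. */
static void init_case_tables(enum charset cs)
{
    switch (cs) {
    case CHARSET_ISO_8859_1:
        init_conv(conv_tolower, CHARS_UPPER CHARS_UPPER_8859_1,
                  CHARS_LOWER CHARS_LOWER_8859_1);
        init_conv(conv_toupper, CHARS_LOWER CHARS_LOWER_8859_1,
                  CHARS_UPPER CHARS_UPPER_8859_1);
        break;
    case CHARSET_ASCII:
    default:
        init_conv(conv_tolower, CHARS_UPPER, CHARS_LOWER);
        init_conv(conv_toupper, CHARS_LOWER, CHARS_UPPER);
        break;
    }
}
```

Adding another codepage would then mean one more enum member and one more case, with no further flag bits used up.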

The whole thing looks pleasantly simple. I like it! It could be improved 
to affect all uc/lc stuff within formats (I think LM, NETLM, NETHALFLM 
and sapB are affected). These could use a shared --charset aware uc (or 
lc) function. Once they do, adding a codepage to that shared function 
would affect all applicable formats. But we'd have to beware of 
performance drops.
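
Roughly, such a shared function could be a per-byte table lookup, with the table built once at startup for the selected codepage. This is only a sketch under that assumption: enum codepage, enc_uc_init() and enc_uc() are made-up names, not existing john functions, and nothing here has been measured.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical codepage selector; names are assumptions. */
enum codepage { CP_ASCII = 0, CP_ISO_8859_1 };

static unsigned char enc_uc_table[256];

/* Build the uppercase table once, e.g. after option parsing.
 * Assumes an ASCII-compatible execution character set. */
static void enc_uc_init(enum codepage cp)
{
    int c;

    for (c = 0; c < 256; c++)
        enc_uc_table[c] = (unsigned char)c;
    for (c = 'a'; c <= 'z'; c++)
        enc_uc_table[c] = (unsigned char)(c - 'a' + 'A');
    if (cp == CP_ISO_8859_1) {
        /* 0xE0-0xFE map to 0xC0-0xDE, except 0xF7 (division sign). */
        for (c = 0xE0; c <= 0xFE; c++)
            if (c != 0xF7)
                enc_uc_table[c] = (unsigned char)(c - 0x20);
    }
}

/* Uppercase len bytes from src into dst. The loop is the same for
 * every codepage; only the table contents differ. */
static void enc_uc(unsigned char *dst, const unsigned char *src, size_t len)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < len; i++)
        dst[i] = enc_uc_table[src[i]];
}
```

With this layout the charset is only looked at in enc_uc_init(), so the per-candidate cost in a format should not depend on which codepage is active. Whether a table lookup is as fast as what each format does today would still need benchmarking.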

magnum
