Message-ID: <20110723232325.GA16187@openwall.com>
Date: Sun, 24 Jul 2011 03:23:25 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: ISO-8859-1 casing (experimental)

Jim -

On Sat, Jul 23, 2011 at 06:08:18PM -0500, JFoug wrote:
> ----- Original Message ----- 
> From: "Solar Designer" <solar@...nwall.com>
> >
> >So I shouldn't merge your john-1.7.8-jumbo-2-iso_8859_1_tst-1, right?
> 
> Not yet.   I am getting close.  I have made numerous other fixes, and am 
> working on the last one (getting SSE working again for x86-64 builds in 
> md5-gen).   I did get a 'work around' for the md5-gen, but it was simply 
> turning off SSE for x86-64 builds.  Thus it becomes hard for someone to 
> 'know' that on that platform, they have to run certain formats with a flag 
> turning off SSE.  It makes doing md5-gen in john.conf hard to do.

How is this related to john-1.7.8-jumbo-2-iso_8859_1_tst-1?

Do you mean I should also not merge john-1.7.8-jumbo-2-md5_gen_fixes-1?
Well, it's already in the tree I'm testing now, and I am unlikely to
revert it.

> I meant that I changed the documentation, when there was mention of a 
> 'charset file', I changed it to 'incremental file'.  Often, when 
> documenting the create, you used 'charset' or 'charset file', but then 
> talked about the incremental run when documenting the --incr= mode.  I 
> simply changed most of the usage (may have missed some, but I tried to get 
> them), of the charset, to try to avoid confusion.

I don't like this change.  Maybe it's better to keep charset at its
current JtR meaning (the set of characters used by incremental mode or
the like), but use encoding to refer to encodings such as iso-8859-1.

> Is koi8-r a fixed 8 bit charset?  I think so, but am not sure.  I found the 
> wiki on it, it is fixed.  Also, it looks like D7<->F7 are up/low casing. 
> For iso-8859-1 those chars are not up/low case. Also koi8-r has a case pair 
> of A3<->B3.  So, there are only 3 'differences' in casing between 
> iso-8859-1 and koi8-r.   koi8-r cases A3<->B3, D7<->F7, DF<->FF (while 
> iso-8859-1 does not).

Yes, this is precisely what I meant.
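
To make the difference concrete, here is a minimal sketch of the two
upper-casing tables.  All names (up_latin1, up_koi8r, init_upper_tables,
to_upper) are hypothetical and are not taken from the JtR tree.  Note one
assumption beyond the three pairs listed above: in koi8-r the lower-case
Cyrillic letters sit in 0xC0..0xDF and the upper-case ones in 0xE0..0xFF,
so the direction of the 0x20 offset is the reverse of iso-8859-1.

```c
#include <assert.h>

/* Hypothetical names for illustration; not taken from the JtR source. */
unsigned char up_latin1[256];
unsigned char up_koi8r[256];

/* Start from "every byte maps to itself", then add plain ASCII a-z. */
static void fill_identity_ascii(unsigned char *t)
{
    int c;

    for (c = 0; c < 256; c++)
        t[c] = (unsigned char)c;
    for (c = 'a'; c <= 'z'; c++)
        t[c] = (unsigned char)(c - 0x20);
}

void init_upper_tables(void)
{
    int c;

    fill_identity_ascii(up_latin1);
    fill_identity_ascii(up_koi8r);

    /* iso-8859-1: 0xE0..0xFE are lower case, except 0xF7 (division sign).
     * 0xDF and 0xFF have no upper-case partner inside the encoding. */
    for (c = 0xE0; c <= 0xFE; c++)
        if (c != 0xF7)
            up_latin1[c] = (unsigned char)(c - 0x20);

    /* koi8-r: lower case is 0xC0..0xDF, upper case is 0xE0..0xFF,
     * so 0xD7 -> 0xF7 and 0xDF -> 0xFF are real case pairs here. */
    for (c = 0xC0; c <= 0xDF; c++)
        up_koi8r[c] = (unsigned char)(c + 0x20);

    /* koi8-r also pairs 0xA3 (lower) with 0xB3 (upper). */
    up_koi8r[0xA3] = 0xB3;
}

/* Look up one byte in a table built by init_upper_tables(). */
unsigned char to_upper(const unsigned char *table, int c)
{
    assert(c >= 0 && c < 256);
    return table[c];
}
```

After init_upper_tables(), to_upper(up_koi8r, 0xD7) gives 0xF7, while
to_upper(up_latin1, 0xF7) leaves the byte unchanged.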

> So, getting koi8-r into rules (for casing) would be trivial.  Simply
> add another enum value to the --charset=X processing, add another var to
> the options (the enums get converted and set one of the values), then in
> the rules initialization code, have a switch that loads the data properly.
> Very easy to do.

Yes.  While we're at it, let's also add cp1251.
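
Roughly, the enum-plus-switch approach could look like the sketch below.
Every identifier here (enum encoding, parse_encoding, rules_init_casing,
rules_upper, rules_lower) is made up for illustration and does not reflect
the actual options or rules code.  The cp1251 case is deliberately partial:
it covers only the main Cyrillic block and the 0xA8/0xB8 pair, and leaves
out the other cased letters that encoding has.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names throughout; a sketch, not JtR's actual code. */
enum encoding { ENC_ASCII, ENC_ISO_8859_1, ENC_KOI8_R, ENC_CP1251 };

unsigned char rules_upper[256];
unsigned char rules_lower[256];

/* Record one lower/upper case pair in both tables. */
static void set_pair(int lower, int upper)
{
    assert(lower >= 0 && lower < 256 && upper >= 0 && upper < 256);
    rules_upper[lower] = (unsigned char)upper;
    rules_lower[upper] = (unsigned char)lower;
}

/* Map an encoding name to the enum; unknown names fall back to ASCII. */
enum encoding parse_encoding(const char *name)
{
    if (!strcmp(name, "iso-8859-1")) return ENC_ISO_8859_1;
    if (!strcmp(name, "koi8-r")) return ENC_KOI8_R;
    if (!strcmp(name, "cp1251")) return ENC_CP1251;
    return ENC_ASCII;
}

/* Build the casing tables once, at rules initialization time. */
void rules_init_casing(enum encoding enc)
{
    int c;

    for (c = 0; c < 256; c++)
        rules_upper[c] = rules_lower[c] = (unsigned char)c;
    for (c = 'a'; c <= 'z'; c++)
        set_pair(c, c - 0x20);

    switch (enc) {
    case ENC_ISO_8859_1:
        /* 0xF7 is the division sign; 0xFF has no partner in the set. */
        for (c = 0xE0; c <= 0xFE; c++)
            if (c != 0xF7)
                set_pair(c, c - 0x20);
        break;
    case ENC_KOI8_R:
        /* Lower case sits below upper case in this encoding. */
        for (c = 0xC0; c <= 0xDF; c++)
            set_pair(c, c + 0x20);
        set_pair(0xA3, 0xB3);
        break;
    case ENC_CP1251:
        /* Main Cyrillic block plus one extra pair only (partial). */
        for (c = 0xE0; c <= 0xFF; c++)
            set_pair(c, c - 0x20);
        set_pair(0xB8, 0xA8);
        break;
    case ENC_ASCII:
    default:
        break;
    }
}
```

With this layout, supporting one more fixed 8-bit encoding means one new
enum value, one new name in parse_encoding(), and one new case in the
switch.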

> Likely there will be many more where we can perform 'simple' 8 bit casing 
> within the rules, if the user is presenting john with a dict file made in a 
> specific charset.

Yes, but I think not that many of them actually matter.

Thanks,

Alexander
