Message-ID: <e4fd48bd84f7c45907c327c942e41899@smtp.hushmail.com>
Date: Sun, 4 Nov 2012 08:31:01 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Croatian charset support (was: KoreLogic rules on Openwall)

On 3 Nov, 2012, at 11:23 , Vlatko Kosturjak <kost@...ux.hr> wrote:

> On 11/02/2012 11:49 AM, magnum wrote:
>>>> I have also made localized/Croatian version of the rules (only parts
>>>> which are relevant):
>>>> https://github.com/kost/jtr-stuff/blob/master/rules/rules-kosthrv.txt
>>> Cool. On that subject, would you like me to add some codepage support? It's extremely easy to do so with the toolchain we made when adding the codepage support. What would be right for Croatian? CP852 and ISO-8859-2 perhaps?
>> I found this: http://luki.sdf-eu.org/txt/cs-encodings-faq.html
>> So I suppose CP852, CP1250 and ISO-8859-2? Having these in place will make the rules engine able to, for example, upper/lowercase Croatian non-ascii letters. The UTF-8 support does not include the Rules engine when it comes to single letter manipulations.
> 
> Thanks magnum!
> 
> Yes, it's CP852, CP1250, ISO-8859-2 and of course UTF-8.
> Also, does it support stripping of special characters, so for example š
> becomes s, č becomes c, etc..?
> Also, I'm interested - in what charset I should write rules? So, they
> can be automatically converted?

That kind of stripping is not implemented (we could possibly add it though - do you think it is needed?). Support for a codepage means that JtR will recognize the 8-bit characters as vowels, consonants or specials, and will case-shift them correctly, not only in rules but also automatically for fixed-case formats like LM. Most importantly, it will convert them correctly to UTF-16 when needed, most notably in the NT format.
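
To make the distinction concrete, here is a rough Python sketch (not JtR code). Python's built-in codecs stand in for JtR's codepage tables, and the function names are made up for illustration. It shows the two things codepage support buys (case shifting of 8-bit characters and conversion to UTF-16) next to the kind of stripping asked about, assuming the wordlist is stored in CP852.

```python
import unicodedata

# Assumption for this sketch: the wordlist is stored in CP852.
CODEPAGE = "cp852"


def upper_in_codepage(raw: bytes, codepage: str = CODEPAGE) -> bytes:
    """Uppercase 8-bit text; only possible if we know which codepage it is."""
    return raw.decode(codepage).upper().encode(codepage)


def to_utf16le(raw: bytes, codepage: str = CODEPAGE) -> bytes:
    """Convert 8-bit text to UTF-16LE, the form an NT-style format hashes."""
    return raw.decode(codepage).encode("utf-16-le")


def strip_diacritics(text: str) -> str:
    """Hypothetical stripping step: š -> s, č -> c, and so on."""
    # đ/Đ have no Unicode decomposition, so they are mapped by hand.
    text = text.replace("đ", "d").replace("Đ", "D")
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks left behind by the decomposition.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


raw = "šećer".encode(CODEPAGE)
print(upper_in_codepage(raw).decode(CODEPAGE))
print(to_utf16le(raw).hex())
print(strip_diacritics("šećer čđž"))
```

Note that đ needs special handling in the stripping step, since it does not decompose into a base letter plus a combining mark.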

If you use 8-bit substrings in rules, you need to store the rules file in the same codepage as your wordlists. For UTF-16 formats you can use wordlists in any supported encoding, provided you tell JtR which one it is. For 8-bit formats no conversion is performed, so you need to use a wordlist in the same encoding that the target hashes were made from: if you attack Croatian LM hashes you will need to use CP852.
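
The reason the encodings have to match can be seen with a small Python sketch (again only an illustration, using Python's codecs rather than anything from JtR; the sample word is arbitrary). The same Croatian word becomes a different byte string in each of the three codepages, so an 8-bit format fed the wrong one hashes different bytes than the target did.

```python
# Arbitrary sample word containing a non-ASCII Croatian letter.
WORD = "šišmiš"
CODEPAGES = ["cp852", "cp1250", "iso8859_2"]

# Encode the same word in each codepage and show the raw bytes.
encoded = {cp: WORD.encode(cp) for cp in CODEPAGES}
for cp, raw in encoded.items():
    print(f"{cp:10} {raw.hex()}")

# Bytes written as CP1250 but interpreted as CP852 no longer spell the word.
misread = encoded["cp1250"].decode("cp852", errors="replace")
print(misread == WORD)
```

The ASCII letters come out identical everywhere; only the bytes for š differ, which is exactly why a mismatch is easy to miss with mostly-ASCII wordlists.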

magnum
