Message-ID: <55AFBBF8.3010905@gmail.com>
Date: Wed, 22 Jul 2015 17:51:20 +0200
From: Marek Wrzosek <marek.wrzosek@...il.com>
To: john-users@...ts.openwall.com
Subject: Re: Bleeding jumbo now defaults to UTF-8

On 22.07.2015 at 16:34, Marek Wrzosek wrote:
> On 01.06.2015 at 22:33, Marek Wrzosek wrote:
>> On 01.06.2015 at 18:04, magnum wrote:
>>> On 2015-06-01 16:47, magnum wrote:
>>>> You can do a try-catch in Perl (actual command is 'eval' iirc).
>>>> Pseudo-code:
>>>>
>>>> For each UTF-8 line of input {
>>>>      skip any pure ASCII
>>>>      try encoding to CP1234
>>>>      if it worked, print it
>>>> }
>>>>
>>>> Unless you need this a lot you shouldn't create new files (they only add
>>>> a burden of maintenance). Just write this as a simple filter where the
>>>> actual encoding would be a command-line option, and feed it to john.
>>>>
>>>> Example:
>>>> $ ./john -w:all.utf8.lst -rules:whatever hashfile
>>>> $ codepage.pl <all.utf8.lst -t cp1234 | ./john -pipe -enc:cp1234 -rules:whatever hashfile
>>>> $ codepage.pl <all.utf8.lst -t cp1235 | ./john -pipe -enc:cp1235 -rules:whatever hashfile
>>>> ...
>>>>
>>>> Let me see if I can whip up an actual implementation of that filter in
>>>> Perl. I'll be back.
>>>
>>> Attached is a quick hack implementing this.
>>>
>>> magnum
>>>
>> WOW, that was fast! Thanks, magnum!
>>
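For readers without the attachment, the pseudo-code quoted above can be sketched roughly as follows. This is a minimal Python sketch, not magnum's Perl script: the function name filter_lines and the script name codepage.py are made up for illustration, and it assumes one candidate per line of UTF-8 input.

```python
import sys


def filter_lines(lines, target):
    """Yield non-ASCII input lines re-encoded to `target`, as bytes.

    `lines` is an iterable of byte strings, one word per line.
    Pure-ASCII lines and lines that cannot be represented in the
    target codepage are skipped.
    """
    for raw in lines:
        word = raw.rstrip(b"\r\n")
        try:
            text = word.decode("utf-8")
        except UnicodeDecodeError:
            continue  # input line is not valid UTF-8, skip it
        if text.isascii():
            continue  # skip any pure ASCII (also skips empty lines)
        try:
            yield text.encode(target)  # "try encoding to CP1234"
        except UnicodeEncodeError:
            continue  # not representable in the target codepage


def main(target):
    # Work on bytes so nothing gets re-encoded behind our back.
    out = sys.stdout.buffer
    for encoded in filter_lines(sys.stdin.buffer, target):
        out.write(encoded + b"\n")


if __name__ == "__main__":
    # Hypothetical usage: codepage.py cp1250 < all.utf8.lst
    if len(sys.argv) == 2:
        main(sys.argv[1])
    else:
        sys.stderr.write("usage: codepage.py TARGET_ENCODING < wordlist\n")
```

The output would then be piped into john in the same way as in the quoted example, with the -enc: value matching the target encoding given to the filter.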
> 
> And last but not least: what is the proper way to use --inc=utf8 in the
> new bleeding-jumbo? I mean, which encoding option should we use:
> --input-encoding=utf-8, --target-encoding=utf-8,
> --internal-encoding=utf-8, or just --encoding=utf-8? None of them seems
> to work with --inc=utf8. For --inc=latin1, --target-encoding=cp1252
> is mandatory for the pot file to be UTF-8 only and not mixed with other
> encodings.
> 
> Best Regards
> 
PS. Without any encoding options, I get characters that are not valid
UTF-8. The same happens with --enc=raw. Is there a bug in the utf8
incremental mode since the default changed to UTF-8?

-- 
Marek Wrzosek
marek.wrzosek@...il.com
