Message-ID: <472408d6a850a7d84c2a08e0c3d87bdb@smtp.hushmail.com>
Date: Mon, 25 May 2015 10:32:07 +0200
From: magnum <john.magnum@...hmail.com>
To: john-users@...ts.openwall.com
Subject: Re: Bleeding jumbo now defaults to UTF-8

On 2015-05-24 06:17, Solar Designer wrote:
> On Fri, May 22, 2015 at 06:33:42PM +0200, magnum wrote:
>> On 2015-05-22 16:48, Marek Wrzosek wrote:
>>> That's a great news! What is the simplest way to "repair" all.lst from
>>> Openwall?
>>
>> I bet it's a mix of encodings so can't simply be converted.
>
> Yes. And maybe it should stay as a mix of encodings despite of magnum's
> change, because quite often multiple encodings may possibly have been
> used in target passwords.

Yes, it might be relevant to keep one copy like that. Rockyou shows a
real-world case where most of the hashes were UTF-8 but some were
ISO-8859-1/CP1252 and a few were something else.

> I am worried that some lines are not valid UTF-8, though.

If used with the new defaults, a warning will be emitted and conversion
will be truncated whenever 8-bit non-UTF8 is seen ("Möller" in 8859-1
will become "M").

> How do we ensure those are tested against the hashes
> verbatim, like core (non-jumbo) JtR would test them? Will this just
> happen that way despite of the recent change of default in jumbo?

If running with --enc=raw, the warnings will not be emitted and it will
behave just like non-jumbo (at least in this regard). This is actually
just an alias for --enc=ascii but the latter name might be confusing for
this use.

> magnum, what do you suggest we do? Simply assuming that e.g. md5crypt
> hashes are likely of UTF-8 plaintexts won't do. Some of them might be,
> but some older ones might be iso-8859-1 or koi8-r or windows-1251 as
> well. That's why current all.lst mixes all of these encodings together.

You would either run a mixed-codepage wordlist with --enc=raw (but just
like core john, you won't get e.g. case-flipping of 8-bit characters.
Also, note that while this may be sensible for md5crypt, it isn't for NT,
or any other hash that uses UTF-16 internally).

Or you'd use UTF-8 wordlist(s) (perhaps some of the non-"all" ones) and
specify a target encoding. This will work for NT et al too.

magnum
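
To illustrate the truncation described above, here is a minimal Python
sketch. It is not JtR's code and only mimics the described behaviour under
the assumption that decoding stops at the first byte that is not valid
UTF-8. The function names utf8_prefix, is_valid_utf8 and partition_lines
are made up for this example. The last helper shows one possible way to
split a mixed-encoding wordlist into lines that are valid UTF-8 and lines
that are not, so the latter could be run separately with --enc=raw.

```python
def utf8_prefix(raw: bytes) -> str:
    """Decode raw as UTF-8, keeping only the part before the first invalid byte."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that failed to decode.
        return raw[:err.start].decode("utf-8")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8 (pure ASCII counts as valid)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def partition_lines(lines):
    """Split an iterable of byte strings into (valid_utf8, other) lists."""
    valid, other = [], []
    for line in lines:
        (valid if is_valid_utf8(line) else other).append(line)
    return valid, other


word = "M\u00f6ller"                    # "Möller"
latin1 = word.encode("iso-8859-1")      # b'M\xf6ller'
utf8 = word.encode("utf-8")             # b'M\xc3\xb6ller'

print(utf8_prefix(latin1))              # the 8859-1 bytes are cut at 0xf6
print(is_valid_utf8(latin1), is_valid_utf8(utf8))

valid, other = partition_lines([b"password", latin1, utf8])
print(len(valid), len(other))
```

Note that a line passing the UTF-8 check is not proof that it was meant
as UTF-8, and the "other" lines still need their real codepage identified
before they can be converted for hashes like NT.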