Message-ID: <4E1F3EA5.9030709@bredband.net>
Date: Thu, 14 Jul 2011 21:08:21 +0200
From: magnum <rawsmooth@...dband.net>
To: john-dev@...ts.openwall.com
Subject: Re: Upper casing (and lower casing) in john

On 2011-07-14 17:18, JimF wrote:
> 1. rules: l u c C ?l ?u t TN (p P I may also be impacted). S V are also
> likely candidates.

Sometimes you really only want a-z (see 2. below), so for ANSI mode I suggest we keep all the existing ones as-is and add alternate versions for some or all of them that use the new functions.

In UTF-8 mode, we could add support for (fully) case-shifting whole words, but as soon as we try to say "third character" or some such, rules are not UTF-8 aware. I have some vague thoughts about how to add future UTF-8 awareness in rules (counting multibyte characters as one) but that is probably far away, and it will be much slower than today, so it must be separated so it doesn't hit non-UTF8 mode.

> 2. Formats (but these are one by one issues which need to be addressed
> directly). Oracle/mssql have been handled. LM has not, but by my
> understanding, what we have done already is the 'correct' method.

LM is special because it does not use iso-8859-x but the "OEM codepage", often cp437. We could easily add a special, complete uc() function for that, but then again the hashes may come from a Greek or French or Russian Windows, using cp737, cp852, cp866 or something else instead of cp437, and then our uc() for cp437 would just mess things up.

By the way, this also applies to iso-8859-1, which we are supporting now. Unix hashes may well be made from iso-8859-2 or something else, and your new uc() for ANSI would just make a mess instead of case-shifting correctly. To handle this, we could add support for a number of codepages, but I think we can just leave LM and other formats that don't use either of these encodings as-is for now.

Anyway, I think the new full case-shifting support for iso-8859-1 and Unicode is great. It was something that nagged me but I never looked into it closely enough to realise how tiny the case-significant part of the Unicode space is.

magnum
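
To illustrate the codepage problem described above, here is a minimal Python sketch (not John's code, and not the uc() function discussed in this thread). The helper name `upper_in_codepage` is hypothetical; it relies on Python's built-in single-byte codecs to show that the "upper-case" form of one and the same byte depends on which codepage the hash came from.

```python
def upper_in_codepage(raw: bytes, codepage: str) -> bytes:
    """Upper-case each byte of `raw` as interpreted in a single-byte codepage.

    Hypothetical helper for illustration only; not part of John the Ripper.
    """
    out = bytearray()
    for b in raw:
        try:
            # Interpret the byte in the codepage, upper-case it as Unicode,
            # then map the result back into the same codepage.
            enc = bytes([b]).decode(codepage).upper().encode(codepage)
        except UnicodeError:
            # Byte undefined in this codepage, or the upper-case form
            # has no representation in it.
            enc = b""
        # Keep the original byte unless the result is exactly one byte
        # (e.g. the German sharp s upper-cases to two letters).
        out.append(enc[0] if len(enc) == 1 else b)
    return bytes(out)


if __name__ == "__main__":
    for cp in ("cp437", "cp866"):
        print(cp, upper_in_codepage(b"\x84", cp).hex())
```

Byte 0x84 is a lower-case a-umlaut in cp437 and shifts to 0x8e there, while in cp866 the same byte is already a capital Cyrillic letter and must stay untouched. Applying the cp437 mapping to a hash created on a cp866 system would therefore corrupt the candidate rather than case-shift it.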