Message-ID: <20050623005340.GA15805@openwall.com>
Date: Thu, 23 Jun 2005 04:53:40 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Re: Dupes recognition based on internal representation of ciphertext?

I wrote:
> >Arguably, the loader should be enhanced to also use internal
> >representations when it avoids loading dupes(*) for cracking and when
> >it displays cracked passwords.

On Wed, Jun 15, 2005 at 08:38:30AM +0200, Frank Dittrich wrote:
> there were no dupes in my sample password file.
> Even if the internal hash representation or the canonical form
> is the same, the "user names" differ.

I was not referring to your specific problem.  I was pointing out that
dupes handling in John needs to be made more consistent.

> Can you make the conversion of external hash representations into a
> canonical form a (compile time) option?

No, -- the John "core" operates on internal representation of hashes
anyway.  While I could make the dupes detection optionally use the
external (non-canonicalized) representation in all places (loader,
logger, etc.), I do not think that having this as an option is a good
idea.  It would be very confusing to most John users.  Even those few
who would think they know what this stuff is about would in fact likely
be wrong about some aspects of it.

This is one of the reasons why I've suggested a different long-term fix
in my previous response to you.  The idea is to have "john --show"
compare internal representations, and only fall back to simple string
comparisons for hash types that are not supported by the version of
John being used.

Nevertheless, I will comment on your proposed approach:

> Or, even better, a setting which depends on the ciphertext format?
> This could be done by adding a new function pointer to fmt_main.

We've got split() already.
This function may do more than just split LM-like hashes into their
halves, it may also canonicalize ASCII representations of any hash
types for their storage in john.pot.

> BTW, theoretically, the problem of multiple external representations
> for an internal representation can also occur for the salts.

It does occur in practice.  Even worse, for the traditional DES-based
crypt(3) hashes, the way invalid salt characters are treated is
implementation-specific.  I am aware of two kinds of implementations
(that is, two different mappings of invalid salt characters onto the
6-bit values); John supports only one of those (and enhancing it to
support both is not trivial).

John does not canonicalize the salts it stores into john.pot in any
way, but it does use internal representation when determining whether
two salts are different or the same.  So it will never waste CPU cycles
hashing candidate passwords against two different representations of
the same salt, and it will never load more than 4096 salts for
traditional DES-based crypt(3) hashes.  But the loader (and "john
--show") might not recognize that a password hash has already been
cracked if the instance stored in john.pot has another representation
of the same salt.

Of course, these occurrences (two hashes of the same plaintext
password, with the same salt, but with different representations of the
salt) are very rare.

> To take care of this possibility, the newly introduced function
> should convert salt+hash into a canonical form.

It'd be tricky to implement, but split() could do it.  In fact, it
could even produce multiple canonicalized hashes to cover the
implementation differences I've mentioned above.

> This could be a representation which corresponds to the rules
> checked by the ciphertext-specific valid() function

FWIW, right now, both pre-split() and post-split() strings are supposed
to be valid().
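
The salt issue described above can be illustrated with a minimal sketch.
This is Python rather than John's C, and it is not John's code: the
function names are hypothetical, and mapping every invalid salt
character to 0 is an assumption made purely for illustration (as noted
above, real crypt(3) implementations differ on this point).  It shows
how two different salt strings can share one 12-bit internal value
(64 * 64 = 4096 possible salts) and how re-encoding that value yields a
single canonical text form.

```python
# The 64-character alphabet used by traditional DES-based crypt(3).
ITOA64 = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def salt_char_value(ch):
    """Map one salt character to a 6-bit value.

    ASSUMPTION: invalid characters map to 0.  This is an illustrative
    choice only, not necessarily what John or any crypt(3) does.
    """
    idx = ITOA64.find(ch)
    return idx if idx >= 0 else 0


def internal_salt(ciphertext):
    """Return the 12-bit internal salt for a 13-character DES hash.

    In this sketch the first character supplies the low 6 bits and the
    second character the high 6 bits.
    """
    lo = salt_char_value(ciphertext[0])
    hi = salt_char_value(ciphertext[1])
    return lo | (hi << 6)


def canonical_salt(ciphertext):
    """Re-encode the internal salt using only valid characters."""
    value = internal_salt(ciphertext)
    return ITOA64[value & 0x3F] + ITOA64[value >> 6]
```

Under the assumed mapping, the salts "!b" and ".b" compare equal
internally and both canonicalize to ".b", while a salt made of valid
characters, such as "ab", is returned unchanged.
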
> Or, one which inserts a format-specific marker, to reduce the risk of
> collisions among different ciphertext formats (and allow for easier
> grepping john.pot).

Yes, split() does just that for LM hashes already.

> No matter how the conversion into a canonical representation works,
> older john.pot entries for hash algorithms with multiple external
> representations probably need to be converted into their canonical
> representation,

I disagree.

> otherwise this conversion would have to be done
> each time you load john.pot to check for cracked password hashes.

Old entries could continue to be compared in the old-fashioned way.
Also, the conversion should be cheap enough even if we would do it for
all entries on the fly.

> Another symbolic link to john?
> Some options to restrict the conversion to a particular format and
> some sanity checks (call valid(), and double-check by re-computing the
> password hash before converting the john.pot entry) would be good.

That's way too much complexity for the user.

> >Alternatively, the split() method for affected hash types should be
> >enhanced to canonicalize the text representations.
>
> I'm not sure whether this can be easily done for each password hash
> algorithm.

There might exist some for which it would not be easy, but then it
would also be not easy to do within a new function.

> >>Of course, for raw MD5 the problem can be avoided by just
> >>translating all hashes to lower case.
> >
> >BTW, this is best done in split().
>
> I thought of converting the password file instead.

I had guessed that, -- I've just suggested a better way to do it.

> >Meanwhile, you can use the trivial fix to cracker.c if you like: in
> >the log_guess() function call, remove the "dupe ? NULL : ".
>
> That's exactly what I did, but I wasn't sure whether this change
> would affect any other hash algorithm.

It does.  But the only impact is the potential for duplicate entries in
john.pot, so that's OK for a private use hack.
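
For the raw MD5 case discussed above, the idea of canonicalizing in a
split()-like step and converting pot entries on the fly might look
roughly like this.  It is a sketch in Python, not John's C code:
canonical_raw_md5(), load_pot() and lookup() are hypothetical names,
and the only thing assumed about john.pot is its "ciphertext:plaintext"
line layout.

```python
def canonical_raw_md5(ciphertext):
    """Hypothetical stand-in for a raw-MD5 split(): lower-case the hex."""
    return ciphertext.lower()


def load_pot(lines):
    """Build a lookup table from pot-style "ciphertext:plaintext" lines.

    Each stored ciphertext is canonicalized as it is read, so entries
    written in upper case do not need to be rewritten in the pot file.
    """
    cracked = {}
    for line in lines:
        ciphertext, sep, plaintext = line.rstrip("\n").partition(":")
        if sep:  # skip lines that have no ':' separator
            cracked[canonical_raw_md5(ciphertext)] = plaintext
    return cracked


def lookup(cracked, ciphertext):
    """Return the plaintext for a hash if already cracked, else None."""
    return cracked.get(canonical_raw_md5(ciphertext))
```

With this, a pot entry stored as
"5F4DCC3B5AA765D61D8327DEB882CF99:password" is found when the password
file contains the same hash in lower case, without converting either
file beforehand.
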
-- 
Alexander Peslyak <solar at openwall.com>
GPG key ID: B35D3598  fp: 6429 0D7E F130 C13E  C929 6447 73C3 A290  B35D 3598
http://www.openwall.com - bringing security into open computing environments

Was I helpful?  Please give your feedback here: http://rate.affero.net/solar