Message-ID: <025201cc4c91$8a280830$9e781890$@net>
Date: Wed, 27 Jul 2011 14:15:25 -0500
From: "jfoug" <jfoug@....net>
To: <john-dev@...ts.openwall.com>
Subject: RE: Found cp1251 issue (and likely 8850-1) or many code pages.

>From: Frank Dittrich [mailto:frank_dittrich@...mail.com]
>
>On 27.07.2011 19:02, jfoug wrote:
>> The behavior I am working towards, is that when we upcase a string
>> with B5 in it (for cp1251/8859-1), that there will be a xB5 left in
>> the upcased string in the end.
>
>Hi Jim,
>
>I think this behavior is correct for all hash algorithms that have been
>"invented" prior to unicode.

So it is likely that we will have to add some logic/data to the fmt_main structure (such as flags) that would give hints to the code running in Unicode.c on exactly how to proceed. Unicode.c could then quickly check and know what to do. If the format expects to see xDF -> SS (the current Unicode logic, as far as I could find), then that would be done. Other formats may have other specifics that would change the behavior of john's Unicode.c.

I am glad to hear that leaving alone lower case chars that do not have a matching upper case in the code page (even if Unicode DOES have one) is the right choice. I built the uc() / lc() to work that way, so we should be good.

Once a string is converted into Unicode in john, and then acted upon (with Unicode functions), it does not use code page logic. The only way it 'would' is if we then convert back into the code page. However, the invalid characters would not convert properly (just like in perl), so it is simply something that people need to keep in mind when modifying john. All manipulation logic will need to be done in CP; then, if the string needs to be in Unicode to proceed, the very last step is CP -> Unicode. Rules would already require this, since they are 8 bit. I have 8 bit casing working properly for each CP, so formats that want up/down casing can do this in CP prior to converting to UTF16, if the format needs that.
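To make the ordering above concrete, here is a minimal sketch in C, assuming ISO-8859-1 as the code page (where each byte equals its Unicode code point). It is not code from john's Unicode.c: the names upper_8859_1, upcase_then_utf16 and HINT_SHARP_S_TO_SS are hypothetical, and the flag merely stands in for whatever per-format hint might end up in fmt_main.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-format hint; not an actual fmt_main field. */
enum { HINT_SHARP_S_TO_SS = 1 };

/* Upper-case one ISO-8859-1 byte without leaving the code page. */
static unsigned char upper_8859_1(unsigned char c)
{
    if (c >= 'a' && c <= 'z')
        return (unsigned char)(c - 0x20);
    /* 0xE0..0xFE are lower-case letters, except 0xF7 (division sign). */
    if (c >= 0xE0 && c <= 0xFE && c != 0xF7)
        return (unsigned char)(c - 0x20);
    /* 0xB5, 0xDF and 0xFF have no upper-case form inside 8859-1,
     * so they are left alone even though Unicode defines one. */
    return c;
}

/* Case in the code page first; widening to UTF-16 is the last step.
 * Returns the number of UTF-16 units written, excluding the terminator.
 * Output is truncated if dst is too small. */
size_t upcase_then_utf16(const unsigned char *src, uint16_t *dst,
                         size_t dst_len, unsigned flags)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL && dst_len > 0);

    for (; *src != 0; src++) {
        /* Only formats that ask for it get the xDF -> SS expansion. */
        if (*src == 0xDF && (flags & HINT_SHARP_S_TO_SS)) {
            if (n + 2 >= dst_len)
                break;
            dst[n++] = 'S';
            dst[n++] = 'S';
            continue;
        }
        if (n + 1 >= dst_len)
            break;
        /* For 8859-1 the byte value is the Unicode code point. */
        dst[n++] = upper_8859_1(*src);
    }
    dst[n] = 0;
    return n;
}
```

With flags set to 0, a byte xB5 or xDF in the input comes out as the same value in the UTF-16 result; with the hypothetical hint set, xDF becomes two 'S' units instead. A code page such as cp1251 would need a lookup table in place of the identity mapping in the last step.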
Very good information. We just have to be careful and implement it that way (and document the proper way). Then, when we DO find anomalies (the xDF 'may' be one of these), we find out exactly WHAT the original password hash code did, and make sure we do the same.

Even after this first release, I am SURE there will be changes required, because assumptions which have been made turn out to be wrong. However, the groundwork is solidly done, and it will be easy to modify behavior once it is fully known that an assumption was not correct for the actual hash in the wild.

Jim.