Message-ID: <20100227185412.GA21345@openwall.com>
Date: Sat, 27 Feb 2010 21:54:12 +0300
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Encoding UTF-8 .pot fails

On Sat, Feb 27, 2010 at 01:07:10PM +0100, websiteaccess@...il.com wrote:
> My system is OS X (latest), I run JTR 1.7.5 in my terminal (setting
> with UTF-8 encoding), my wordlist is UTF-8 encoding too. ALL is UTF-8
>
> At the begining of the crack session JTR store passwords found in the
> .pot file encoded UTF-8.
> After 1 hour of cracking (-rules), my .pot is no more UTF-8 !

You must be wrong about some of the statements you made above, but I
can't guess which one(s) are wrong. Since JtR is not aware of different
character encodings, it can't possibly switch from one encoding to
another.

To make matters worse (as it relates to the list members helping you),
when you post your sample passwords to the list they might be getting
recoded from one character encoding to another, maybe even more than
once. It depends on many programs that you use to get the passwords
into an e-mail message draft, to edit it, and to create and send the
message. All of this might be transparent to you (like a copy & paste),
yet many programs are involved.

> --- passwords found at the begining of my session --
> bären
[...]

As far as I can tell, these display correctly when interpreted as
iso-8859-1 (not UTF-8), but that might be an effect of the way you
placed them into an e-mail message and sent it. Indeed, your e-mail
message was sent in iso-8859-1 (according to its headers), so you
couldn't correctly include UTF-8 characters in it (the recipients' mail
readers would misinterpret those because the characters would be
inconsistent with your message headers).

> --- after 1 hour, all same passwords cracked before, are now unreadable --
> bären

These look like UTF-8. They won't display correctly in your e-mail
message for the reason I mentioned above.

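
To make the iso-8859-1 versus UTF-8 mismatch described above concrete, here is a minimal Python sketch. It is an illustration only and not part of the original exchange; the sample word is the one from the quoted message, and the variable names are arbitrary. It shows that the same word yields different bytes in the two encodings, and what happens when bytes in one encoding are interpreted as the other.

```python
# Sample word from the quoted message, written with an escape so that the
# encoding of this source file does not matter.
word = "b\u00e4ren"

latin1_bytes = word.encode("iso-8859-1")  # one byte for the umlaut: e4
utf8_bytes = word.encode("utf-8")         # two bytes for the umlaut: c3 a4

print("iso-8859-1:", latin1_bytes.hex(" "))
print("UTF-8:     ", utf8_bytes.hex(" "))

# UTF-8 bytes read as iso-8859-1 always decode, but to the wrong characters.
misread = utf8_bytes.decode("iso-8859-1")
print("UTF-8 read as iso-8859-1:", ascii(misread))

# iso-8859-1 bytes read as UTF-8 are usually not valid UTF-8 at all.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False
print("iso-8859-1 bytes valid as UTF-8:", latin1_is_valid_utf8)
```

The bytes stored in a file stay the same in both cases; only the interpretation applied by the viewer (terminal, editor, mail reader) differs.
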
> I have to reencode my .pot in UTF-8 to restore all passwords correctly.

What do you mean by "reencoding" your .pot? What exactly are you doing?

> This problem was also with previous version of JTR.

The problem has nothing to do with JtR itself.

> What is the problem ?

The problem is that there are too many issues involved. Character
encodings are a complicated topic. I realize that this is not how you
intended your question to be interpreted, but at least it's a correct
answer and one that I think can actually help (albeit not directly).

To debug the actual problem, I suggest that you try viewing hex dumps of
your files - the wordlist, the .pot file. One thing this will tell you
is that the encoding of existing entries in the .pot file does not
change while JtR is running (so your guess/statement that it did was
wrong). It might also help you figure out where/what the problem is.
You may try commands like:

hexdump -C john.pot | less
xxd john.pot | less
od -tx1 john.pot | less

(press "q" to quit the "less" viewer).

You may post relevant excerpts from the hex dumps. This will avoid the
uncertainty associated with possible recoding of characters when you
place them in an e-mail message.

One thing you might want to check is whether your terminal is still set
to UTF-8 when it stops displaying john.pot contents "correctly" (the way
you want). Maybe there's something that makes it switch to a different
character encoding - e.g., a terminal control sequence, or a sequence of
bytes that is not valid per UTF-8. Just a guess (maybe a wrong one).

Alexander
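
As a complement to the hex dump commands above, the same byte-level check can be sketched in Python. This is a rough illustration under stated assumptions, not a JtR feature: the function names `classify_line` and `scan` and the script name `check_encoding.py` are hypothetical. It reads a file in binary mode, so nothing gets recoded along the way, and reports each non-ASCII line as either valid UTF-8 or not, together with its hex bytes.

```python
import sys


def classify_line(raw: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'not utf-8' for one line of raw bytes."""
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "not utf-8"


def scan(path: str) -> None:
    """Print every non-ASCII line with its classification and hex bytes."""
    with open(path, "rb") as f:  # binary mode: Python does no decoding
        for lineno, raw in enumerate(f, start=1):
            raw = raw.rstrip(b"\r\n")
            kind = classify_line(raw)
            if kind != "ascii":
                print(f"{lineno}: {kind}: {raw.hex(' ')}")


# Hypothetical usage: python3 check_encoding.py john.pot
if __name__ == "__main__" and len(sys.argv) > 1:
    scan(sys.argv[1])
```

A result of "not utf-8" does not say which encoding the line actually uses; iso-8859-1 is only one candidate. Lines reported as "not utf-8" are also the kind of byte sequences that a UTF-8 terminal cannot display as intended.
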