Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140403011821.Q5MOP.180540.imail@fed1rmwml303>
Date: Thu, 3 Apr 2014 1:18:21 -0400
From:  <jfoug@....net>
To: john-dev@...ts.openwall.com
Subject: Long time bug found (I hope found), in dyna

There has been a long standing bug, that I knew about, but did not know exactly why, or how to replicated. What happens, is you would run a jtr run, and find some passwords, but then run it again, and find others.  

Well, today Rich started a query with magnum and myself, about a VBulletin leak, where he only had the hashes, asking how to process.  Well I told him the regen-lost-salts logic, and took the sample hashes he had, and cracked some. Then I looked at the email, and he listed the actual db dump with the hashes.  Once I had a couple cracked, and the SQL dump, I was able to see that the hashes were stored in a user table, AND the salt were there also.  So I set out to make a quick unix script to grep/sed/awk this data into proper format for jtr.  Well, I then started seeing all sorts of problems running this data.  What it boils down to, was there were 2 length salts, a 3 byte, and a 30 byte length salt.  Well, the SSE code can only handle up to a 23 byte salt, so I thought that was the problem.  Built in x86 mode, and problem was actually worse.

Well, after digging in a bit, I found the problem.  It has to do with optimizations within dyna, and with salts of varying lengths.

Ok, in jtr, we do a salted hash like this.

1. load password
2. for each salt
2a load the salt
2b call crypt all, and check for any found passwords
2c goto step 2a until all salts done
3. get next word and goto step 1.

It is not exactly like that, but close enough for this email.

Now, within dyna, I 'use' this information.  For many hash types (dyna_6 is md5(md5($p).$s))  I perform the hash of the password 1 time.  I then re-use that hex output of that hashing, over and over, simply appending the salt, and rehashing.  Doing this cuts down the number of md5 primative calls in half (from 2 to 1 per each password/salt try).

Well to do this, within dyna, when the crypt_all is called (step 2b of the list above), I check to see if passwords were just loaded. If so, I fully clear out input field 1, then crypt the passwords, and store their hex strings at the start of the input 1 field.  Then the functions called WITHIN the format are:

set length 32
append salt
crypt

What this does is to simply place the salt after the 32 bytes of hex hash of the $p value.  Then crypt is called. 

Well, the bug would be that a longer salt, followed by a shorter salt, will leave junk data AFTER the hex + salt for this shorter salt. Now, if we were using CTX model of doing MD5, then this would never be a problem.  The CTX model will copy over the proper amount of data into the CTX's own memory space.  However, for many crypts in john (all MMX types, and the MD5_body format, which dyna uses in many builds), it actually USES the input data buffer in 64 byte blocks, expecting them to be setup properly.  Well in this case, any subsquent shorter hashes would not be setup properly.

The first fix, was to simply save the length of the last salt, and we setting length to 32, I would check to see if salt length was less than last salt length. If so, I would fix up the buffer (after the 32 byte hex value).  The problem went away.  I had to do the same fix for the MMX part of set length 32.   Ok, a good 'test' fix.

The problem is that there are many set length functions. Each would exhibit the same issue of stale data.  I did NOT want to fix dyna this way.  I made an off comment to magnum, about being able to sort the salts, based upon length, and that it would be great, and the problem should go away, but I did not think a format could do this.  Magnum thankfully remembered that we added salt sorting a couple years back for wpapsk, to try to group many common router names, which often there are 1000's, since that was the key item in the PBKDF2, and could speed up the format quite a bit.   I had forgotten about that code in loader.c. 

Well, I was able to enhance that salt sort, to keep the wpapsk code, and simply add the same code to do it for salted dyna's.  Very easy fix, and again, the problem goes away.  This code still needs a little more testing, etc, but I hope to have it checked in soon.

One thing that still bothers me is this.  The self test code did not catch this problem.  Dyna6 has salts of different length. However, the test bench code does not behave like the 'normal' run of JtR, and thus would not catch this type problem.  I think we need to look into that.

Jim.

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.