Message-ID: <00eb01cc5d20$1ac0ee90$5042cbb0$@net>
Date: Wed, 17 Aug 2011 15:56:15 -0500
From: "jfoug" <jfoug@....net>
To: <john-dev@...ts.openwall.com>
Subject: RE: pkzip encr

>> Raw: 6594M c/s

That should read 6594K. Here is a test I just ran:

$ ./john -test=3 -form=pkzip
Benchmarking: pkzip [N/A]... DONE
Raw: 6553K c/s

There are two hashing steps within the pkzip format. The first is a quick checksum. It is simply a loop over the password and the first 12 bytes of the file. It uses a couple of calls to crcupdate and a lookup in a precomputed multiply table (and some shifting, etc). Once that checksum is computed, it is matched against 1 or 2 bytes of either the CRC32 or the DOSDATE field. The Unix zips are less secure, and checksum 2 bytes. PKZip 2.0+ and WinZip only checksum 1 byte. Thus, for PKZip/WinZip, 255 out of 256 candidate passwords are tossed very quickly. For Unix, only one out of 64K needs to be looked at further.

I have added the ability to put multiple 'checksum' blocks into a single hash (up to 8). This relies on the 'assumption' that all files in a single zip would have the same password. Thus, if there are 4 of them, and it is a Unix zip, then the checksum is actually 8 bytes, which is pretty good all on its own. There can be up to 8 files in a john input line.

In the call to crypt, all of the hashes are computed, and if they ALL match properly (1 or 2 bytes), then crypt sets the checksum value to succeed. If any of them fail, then crypt sets the value to ~checksum. That way cmp_all works and cmp_one works.

Then in cmp_exact, I perform other tests. These require doing the same decryption, not just on the first 12 bytes (which are the IV), but on part or all of the file. I have built in another assumption: that someone would likely know that a file is ASCII. If that is the case, then only a small part of the encrypted data needs to be there.
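For readers following along, the quick 12-byte check described above can be sketched roughly as follows. This is an illustrative Python sketch of the traditional PKZIP key schedule, not the C code in the john format. It computes the keystream byte directly instead of using a precomputed multiply table, and the names `Keys`, `quick_check`, `candidate_passes` and `tail_from_crc` are hypothetical. It also assumes the check bytes come from the CRC32; archives that use the DOS time field instead would supply different expected bytes.

```python
# Standard CRC-32 table (reflected polynomial 0xEDB88320), as used by ZIP.
CRC_TABLE = []
for n in range(256):
    c = n
    for _ in range(8):
        c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
    CRC_TABLE.append(c)


def crc32_byte(crc, byte):
    """One-byte CRC-32 update, without the usual pre/post inversion."""
    return (crc >> 8) ^ CRC_TABLE[(crc ^ byte) & 0xFF]


class Keys:
    """The three 32-bit keys of traditional PKZIP encryption."""

    def __init__(self, password):
        self.k0, self.k1, self.k2 = 0x12345678, 0x23456789, 0x34567890
        for b in password:
            self.update(b)

    def update(self, plain_byte):
        self.k0 = crc32_byte(self.k0, plain_byte)
        self.k1 = (self.k1 + (self.k0 & 0xFF)) & 0xFFFFFFFF
        self.k1 = (self.k1 * 134775813 + 1) & 0xFFFFFFFF
        self.k2 = crc32_byte(self.k2, self.k1 >> 24)

    def stream_byte(self):
        t = (self.k2 | 2) & 0xFFFF
        return ((t * (t ^ 1)) >> 8) & 0xFF


def decrypt(password, data):
    keys, out = Keys(password), bytearray()
    for c in data:
        p = c ^ keys.stream_byte()
        keys.update(p)  # the keys always advance on the plaintext byte
        out.append(p)
    return bytes(out)


def encrypt(password, data):
    """Only needed here to build sample input."""
    keys, out = Keys(password), bytearray()
    for p in data:
        out.append(p ^ keys.stream_byte())
        keys.update(p)
    return bytes(out)


def tail_from_crc(crc, nbytes):
    """Expected last header bytes (assumption: taken from the CRC32 high bytes)."""
    return bytes([(crc >> 16) & 0xFF, (crc >> 24) & 0xFF])[-nbytes:]


def quick_check(password, enc_header, expected_tail):
    """Decrypt the 12-byte header and compare its last 1 or 2 bytes."""
    plain = decrypt(password, enc_header[:12])
    return plain[-len(expected_tail):] == expected_tail


def candidate_passes(password, blocks):
    """blocks: up to 8 (enc_header, expected_tail) pairs from one archive."""
    return all(quick_check(password, h, t) for h, t in blocks)
```

A candidate only goes on to the slower tests when `candidate_passes` returns True for every file's block, which is what makes several 1 or 2 byte checks behave like one longer checksum.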
I then call inflate on the data provided, and make sure that it inflated a 'normal' amount of bytes (or more), and that all of them are ASCII. If not, then cmp_exact returns 0. In the testing I did, the false positives all came from cases where this ASCII test succeeds.

I also have to do a 'full' test, which is the 'binary' test. For the binary test on a small file, I place all of the data into the hash line, along with the lengths and the CRC32. I then decrypt the blob with the found password, uncompress it, and compute the CRC32. If the blob uncompresses to the proper size and passes the CRC32, then I 'assume' the password is correct.

I will still have to address handling files that are too large to jam into a hash line. In that case, the original .zip file will have to be present, and the hash line will provide all of the information needed to quickly load that data. The data will be loaded one time only and, from that point on, used to decrypt/inflate/crc each time a possible candidate is found. This last part of the format is not yet done. That is why I am getting false positives.

However, in 'real' runtime, I am getting 1.5MB/s or so. That is on the 2-byte checksum Unix hashes. I am sure that the 1-byte checksum hashes will be much slower.

As a side note, FCrackZip is able to test about 600/s (for challenge 4) if you change it to ONLY test whether the 2-byte checksums match. By default it checks whether the 1-byte checksum matches, which slows it down to a crawl. I was able to test a 1.4GB dictionary file (against challenge 4) in about 8 hours. With the new john pkzip format (again, NOT doing the full inflating yet), it took about 2 minutes to run through this same file (130 million lines).

There should be no appreciable slowdown for 'ASCII' optimized files, even though they do have to fully decrypt/inflate/crc a file. That is ONLY done if the ASCII test says it is a possible password (probably 1 out of a couple hundred million tests, for the 2-byte checksum).
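The two follow-up tests described above (the ASCII test on a partial stream and the full size-plus-CRC32 test) might look something like this in Python. It is only a sketch under stated assumptions: the input is taken to be already-decrypted raw deflate data with the 12-byte header removed, the function names are hypothetical, and `min_bytes=16` is an arbitrary stand-in for the 'normal' amount of inflated bytes, which the post does not specify.

```python
import zlib


def inflate_prefix(deflated_prefix):
    """Inflate as much as possible from a possibly truncated raw deflate stream."""
    d = zlib.decompressobj(-15)  # negative wbits: raw deflate, no zlib header
    try:
        return d.decompress(deflated_prefix)
    except zlib.error:
        return None  # invalid deflate data: the candidate password is wrong


def ascii_check(deflated_prefix, min_bytes=16):
    """ASCII test: enough bytes inflated, and every one of them is 7-bit."""
    out = inflate_prefix(deflated_prefix)
    if out is None or len(out) < min_bytes:
        return False
    return all(b < 0x80 for b in out)


def full_check(deflated, expected_size, expected_crc):
    """Binary test: inflate everything, then compare size and CRC32."""
    try:
        out = zlib.decompress(deflated, -15)
    except zlib.error:
        return False  # corrupt or truncated stream
    return len(out) == expected_size and zlib.crc32(out) == expected_crc


def raw_deflate(data):
    """Helper for building sample input only."""
    c = zlib.compressobj(9, zlib.DEFLATED, -15)
    return c.compress(data) + c.flush()
```

The ASCII test can accept a wrong password whose garbage happens to inflate to 7-bit bytes, which matches the false positives mentioned above; only the full test ties the result to the stored length and CRC32.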
Well, I hope to have this working at least to an alpha release level shortly

Jim.

>From: Solar Designer [mailto:solar@...nwall.com]
>
>Jim -
>
>This is a very welcome addition, thank you!
>
>On Wed, Aug 17, 2011 at 02:34:52PM -0500, jfoug wrote:
>> Benchmarking: pkzip [N/A]... DONE
>> Raw: 6594M c/s
>
>How's that speed even possible with the current formats interface and on
>current CPUs? "dummy" only achieves up to approx. 130M c/s on "--test"
>(which is reported more like 130000K c/s, indeed).
>
>Are you somehow skipping impossible keys, yet counting them? Just a
>guess. But even in that case you'd need to hack code outside of the new
>format definition in order for the speedup to be seen on benchmarks.
>So this got me curious.
>
>Thanks again,
>
>Alexander