john-dev - Re: get_source() and bitmaps boost



Message-ID: <3b9ed5406eed319155f4d7724f5540ec@smtp.hushmail.com>
Date: Thu, 07 Jun 2012 10:08:52 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: get_source() and bitmaps boost

On 06/07/2012 09:17 AM, Frank Dittrich wrote:
> On 06/07/2012 01:44 AM, magnum wrote:
>> I got the idea to use the 6.5 million leaked hashes for some speed &
>> memory tests. All tests consist of running --incremental=digits to
>> completion, with all those candidates already cracked (so printing
>> cracks does not slow things down).
>
> Which hardware and which build target did you use?
> (I assume you are using a recent x86_64 ubuntu. Am I right?)
> How much memory does the system have?

Just my good ol' core2duo laptop. 4 GB, 3 MB cache.

>> Memory usage (RSS peak):
>> magnum-jumbo: 875 MB
>> magnum-jumbo w/ get_source added: 530 MB
>> bleeding-jumbo: 674 MB
>> bleeding-jumbo w/ get_source reverted: 1 GB
>>
>> Bleeding is 43% faster than magnum-jumbo because of these two changes,
>> mostly because of the bitmaps. One way to put it is that the get_source
>> patch regains all memory the bitmaps use, and much more. And it boosts
>> bleeding-jumbo by another 6%.
>
> This also means that you probably could use a password list that is
> about 1.6 times the size that can be used with magnum-jumbo, if you
> apply the get_source patch.
> (Memory usage is about 1.65 times as high without the patch, but for the
> attack (wordlist / incremental / ...) you'll also need some memory.)

Maybe, but this depends on the format. For raw SHA-1 I think we save 53
bytes of memory per loaded hash. For raw MD5 it's 44 bytes (i.e. the
ASCII hex hash + tag + null byte). But then I suppose we need to
subtract sizeof(char*) from that figure, or something like that.
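
To make that arithmetic concrete, here is a rough sketch in Python. It is only a back-of-the-envelope check, not code from the tree. Assumptions: the stored source string is tag + ASCII hex digest + null byte; the tag lengths (12 and 11 bytes) are inferred so that the totals match the 53 and 44 bytes above; the pointer is 8 bytes as on amd64; and the 6.5 million test hashes are raw SHA-1, which the numbers suggest but which is not stated outright.

```python
# Back-of-the-envelope accounting for the per-hash figures quoted above.
# All constants below are assumptions for illustration, not values read
# from the John the Ripper source.

def source_string_bytes(digest_bytes, tag_len):
    """Bytes used by a stored source string: tag + hex digest + NUL."""
    hex_len = 2 * digest_bytes          # two hex characters per digest byte
    return tag_len + hex_len + 1        # +1 for the terminating null byte

def net_saving(digest_bytes, tag_len, pointer_size=8):
    """Saving per hash if a pointer-sized cost still has to be subtracted."""
    return source_string_bytes(digest_bytes, tag_len) - pointer_size

sha1_saved = source_string_bytes(20, 12)   # raw SHA-1: 20-byte digest, assumed 12-byte tag
md5_saved = source_string_bytes(16, 11)    # raw MD5: 16-byte digest, assumed 11-byte tag

hashes = 6_500_000                         # size of the leaked set used in the test
total_mb = hashes * sha1_saved / 1e6       # estimated saving in decimal megabytes
observed_mb = 875 - 530                    # magnum-jumbo RSS peak without / with get_source

print(sha1_saved, md5_saved)               # per-hash savings before the pointer correction
print(net_saving(20, 12), net_saving(16, 11))
print(total_mb, observed_mb)
```

Under these assumptions the estimate (344.5 MB) lands very close to the measured 345 MB difference in RSS peak, so the 53-byte figure looks about right for this hash set, before whatever sizeof(char*) correction turns out to apply.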

> On which platforms did you run the test suite for magnum-jumbo with the
> get_source patch applied and for bleeding-jumbo?
> Which platforms need to be tested?

I run amd64 Linux platforms. Jim runs Win32. In general (not specific to 
get_source) we need testing on Mac and Sparc at least.

> What other tests could help finding any hidden bugs?
> (Running real cracking sessions against "real" hashes, and compare the
> results? What else?)

I do not expect get_source to be any more buggy than any other part of
the tree now. It's straightforward code and we ironed it out in a
couple of days. The bugs that were mentioned on the list a couple of
weeks ago were found *during* development and we haven't had to change
a thing since Jim nailed the salt stuff. But the more testing in various
ways, the better. We need to test everything now anyway, for the
upcoming Jumbo.

Formats that have get_source() are currently: crc32, nt, nt2, raw-md5,
raw-sha1, raw-sha1_li and sapg, plus some or all dynamic formats (I
presume all of them). Other formats are even more unlikely to be affected
by any problems from the new code.

magnum
