Message-ID: <20150628034628.GA19043@openwall.com>
Date: Sun, 28 Jun 2015 06:46:28 +0300
From: Aleksey Cherepanov <lyosha@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: precomputed attacks for john: rainbow tables and other ways

On Thu, Jun 25, 2015 at 07:59:47PM +0300, Aleksey Cherepanov wrote:
> Precomputed attacks

> Broken implementation
> 
> My initial broken implementation is attached too (t_old.py).
> 
> My original implementation missed the idea of "colors", so it did
> not produce real _rainbow_ tables. The difference: rainbow tables use
> chains of candidates/hashes where both the hash and the position in
> the chain are used to produce the next candidate, while I used only
> the hash.
> 
> Using only the hash to produce the next candidate, any collision
> means that the ends of the chains are identical; I cut such chains
> and ended up with approximately N/2 chains (where N is the size of
> the attack's key space). Most chains were very short, but they are
> almost unavoidable.
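
For what it's worth, the difference is easy to show in code. A toy
sketch with MD5 and a made-up reduction function (not the code from
t_old.py):

import hashlib

KEYSPACE = [b'%04d' % i for i in range(10000)]  # toy key space
N = len(KEYSPACE)
CHAIN_LEN = 100

def reduce_plain(digest):
    # hash-only reduction: a collision anywhere merges two chains for good
    return KEYSPACE[int.from_bytes(digest[:4], 'little') % N]

def reduce_rainbow(digest, pos):
    # rainbow-style reduction: the chain position (the "color") enters
    # the mapping, so the same digest at different positions yields
    # different candidates and the chains do not merge
    return KEYSPACE[(int.from_bytes(digest[:4], 'little') + pos) % N]

def chain_end(start, rainbow=True):
    c = start
    for pos in range(CHAIN_LEN):
        d = hashlib.md5(c).digest()
        c = reduce_rainbow(d, pos) if rainbow else reduce_plain(d)
    return c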
> 
> Alexander Cherepanov pointed out to me that with single-color tables
> all candidates form a graph with exactly 1 outgoing arrow per node.
> The connections are random and depend on the key space, the hashing
> function, and the function that maps a hash to the next candidate.
> The number of chains can't be lower than the number of leaves (nodes
> with 0 incoming connections).
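
That lower bound is easy to measure. A rough sketch (same toy key
space as above): the mapping candidate -> reduce(hash(candidate)) is a
functional graph, and a leaf can only be covered by a chain that
starts at it:

import hashlib

KEYSPACE = [b'%04d' % i for i in range(10000)]
N = len(KEYSPACE)

def next_candidate(c):
    d = hashlib.md5(c).digest()
    return KEYSPACE[int.from_bytes(d[:4], 'little') % N]

# candidates that never occur as a target have no incoming arrow
targets = {next_candidate(c) for c in KEYSPACE}
leaves = [c for c in KEYSPACE if c not in targets]
print(len(leaves), 'leaves out of', N, 'candidates')

For a uniformly random mapping a node has no preimage with probability
(1 - 1/N)^N ~= 1/e, so roughly 37% of the candidates are leaves no
matter how the chains are cut.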
> 
> Idea 1: I can tweak the function that maps a hash to the next
> candidate: compute the hashes of all candidates in a row and remember
> them, then try various mapping functions and choose the best one. The
> mapping function can be flexible (like Pearson hashing) and we can
> try hill climbing (or another metaheuristic algorithm) to improve it
> gradually. Due to the randomness of the connections, it should not be
> possible to improve them much without storing a lot of data, but it
> may be interesting to research these limits.
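
One possible shape of that experiment, just to make it concrete (a
hypothetical sketch, not existing john code: hill climbing over a
Pearson-style byte permutation, scored by the number of leaves):

import hashlib, random

KEYSPACE = [b'%04d' % i for i in range(10000)]
N = len(KEYSPACE)
DIGESTS = [hashlib.md5(c).digest() for c in KEYSPACE]  # computed once

def leaf_count(table):
    # run each digest through a Pearson-style table to get the next
    # candidate, then count candidates that are never reached
    targets = set()
    for d in DIGESTS:
        x = 0
        for byte in d[:4]:
            x = table[x ^ byte]
        targets.add((x * 256 + d[4]) % N)  # widen the 8-bit output a bit
    return N - len(targets)

table = list(range(256))
random.shuffle(table)
best = leaf_count(table)
for step in range(200):
    # hill climbing: swap two table entries, keep the swap only if the
    # number of leaves does not get worse
    i, j = random.randrange(256), random.randrange(256)
    table[i], table[j] = table[j], table[i]
    score = leaf_count(table)
    if score <= best:
        best = score
    else:
        table[i], table[j] = table[j], table[i]

Whether such local moves can beat a random table by more than a few
percent is exactly the question about limits above.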
> 
> Idea 2: if we have computed all hashes and remembered them, then we
> can just construct a minimal perfect hash function to map hashes back
> to candidates. We need an order-preserving MPHF. While MPHFs are not
> a new topic, our particular task may have a better solution than the
> existing ones.
> 
> Time-memory trade-off: we don't really need 1 function that maps a
> hash to its candidate. We can map 10% of the hashes to their
> candidates using 1 function, another 10% with another function and so
> on; then we need to check 10 functions (i.e. perform 10 hashings:
> given hash -> candidate -> computed hash -> comparison with the given
> hash). We can reduce the total space used because building each
> function can pick the more convenient candidates that need less
> space. An experiment is needed.
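
The lookup side of that trade-off is cheap to write down. A sketch
under the assumption that the 10 partial mapping functions already
exist and are passed in as callables returning a candidate or None:

import hashlib

def lookup(target_digest, mapping_functions):
    # try each partial function: digest -> candidate guess, then verify
    # the guess by hashing it and comparing with the given hash
    for f in mapping_functions:
        candidate = f(target_digest)
        if candidate is not None and \
           hashlib.md5(candidate).digest() == target_digest:
            return candidate
    return None  # not covered by this table

With 10 functions covering 10% of the hashes each, a lookup costs at
most 10 extra hash computations, as described above.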

Another idea: split the attack into blocks of 256 candidates each and
for each block build a 256-byte lookup table: first byte of hash ->
index of the candidate within the block. So with N candidates you need
N bytes of space, and looking up 1 hash takes N/256 hashing
operations.

But it won't be efficient when you have more than 256 hashes to
crack, because then you have hashes with almost every possible value
of the first byte. In that case a bigger block and table size can be
used: blocks of 65536 candidates mean 2*N bytes to store and N/65536
hashing operations to check 1 hash.
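
Roughly what I mean, for the 256-candidate case (a sketch; it assumes
the number of candidates is a multiple of the block size and relies on
verification by re-hashing to absorb first-byte collisions within a
block):

import hashlib

BLOCK = 256  # 65536 with 2-byte prefixes/entries gives the 2*N variant

def build_tables(candidates):
    # one 256-byte table per block: table[first byte of hash] = index
    # of some candidate in the block (collisions simply overwrite; the
    # verification step below catches wrong guesses)
    tables = []
    for start in range(0, len(candidates), BLOCK):
        table = bytearray(BLOCK)
        for i, cand in enumerate(candidates[start:start + BLOCK]):
            table[hashlib.md5(cand).digest()[0]] = i
        tables.append(table)
    return tables

def lookup(target_digest, candidates, tables):
    # one hash computation per block, i.e. N/256 hashing operations
    for block_no, table in enumerate(tables):
        cand = candidates[block_no * BLOCK + table[target_digest[0]]]
        if hashlib.md5(cand).digest() == target_digest:
            return cand
    return None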

Thanks!

-- 
Regards,
Aleksey Cherepanov
