Message-ID: <20051121093430.GA7685@openwall.com>
Date: Mon, 21 Nov 2005 12:34:30 +0300
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Speed up John

On Thu, Nov 17, 2005 at 09:57:24AM +1100, David Luyer wrote:
> Considering long runs of john (where you're running for weeks to years
> over the one set of passwords), john can be run with a large wordlist in a
> reasonable amount of time (weeks at most).  No need to parallelize.  The
> same with running incremental:digits or using an external to do other
> sets like 9 or 10 digits.

That's true -- except for the slowest hash types such as bcrypt.

> The one thing which is slow is incremental:all.
>
> The way incremental:all works is in passes, as you can see in the .rec
> files.  In about a month running on about 100 crypts you'd probably do
> around the first 2000 passes on a fast system.

Yes.  You might be surprised, but whether you have 1 or 100,000 hashes
loaded for cracking, you would be at around 2000 "passes" after a month,
too.  The reason for this is that each subsequent pass takes longer to
process.  The first thousand passes are done quickly, the second
thousand takes a while, and the third thousand is almost never
completed.

The meaning of these passes is revealed by this comment in charset.h:

/*
 * Cracking order.
 *
 * This is a list of current {length, fixed position, character count}.
 * There are CHARSET_LENGTH different lengths, and fixed position is up
 * to the current length, which means we have exactly (CHARSET_LENGTH *
 * (CHARSET_LENGTH + 1) / 2) different {length, fixed position} pairs;
 * for each such pair we need to try all charsets from 1 character and
 * up to CHARSET_SIZE characters large.
 */
	unsigned char order
		[CHARSET_LENGTH * (CHARSET_LENGTH + 1) / 2 * CHARSET_SIZE * 3]
		CC_PACKED;

With the default settings this gives us a total of:

	8 * 9 / 2 * 95 = 3420 passes
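To make the arithmetic explicit, here's a tiny standalone program that
counts the passes the same way (an illustrative sketch of mine, assuming
the default CHARSET_LENGTH of 8 and CHARSET_SIZE of 95):

	#include <stdio.h>

	/* Assumed defaults; in John these come from params.h. */
	#define CHARSET_LENGTH 8
	#define CHARSET_SIZE 95

	int main(void)
	{
		int length, fixed, pairs = 0;

		/* One {length, fixed position} pair for each fixed position
		 * up to the current length; each pair contributes
		 * CHARSET_SIZE passes, one per charset size from 1 to
		 * CHARSET_SIZE characters. */
		for (length = 1; length <= CHARSET_LENGTH; length++)
			for (fixed = 0; fixed < length; fixed++)
				pairs++;

		printf("%d pairs, %d passes\n", pairs, pairs * CHARSET_SIZE);
		/* Prints "36 pairs, 3420 passes" with the defaults above. */

		return 0;
	}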
> A way to distribute the load sensibly would seem to be to have a server
> keep track of the passes as they are done and communicate the hashes
> as they are cracked;
>
> server data set:
>   [crypt string, plaintext string, source client id, crypt id]
>   [client id, last sent crypt id]
>   [sequence number, client id, start time, end time]
>
> client starts, requests sequence number, works on that pass of
> incremental:all; server stores sequence number, client id, start
> time
>
> client cracks a password, sends it to server, server sends it
> to all online clients to add to their john.pot and updates the
> data set to know it has been sent (any offline clients get it as
> they connect)
>
> client ends pass and sends notice to server, server stores end
> time and allocates first non-started pass to client
>
> If a client fails the server should be able to have an entry deleted
> so that a new client will be allocated the same sequence number.
>
> Comments?  Would this be seen as a reasonable distribution of workload
> and not a particularly bad cracking sequence?

Yes, this approach certainly works better than the one implemented in
dJohn and presumably in JohnNet.

This is essentially what Ryan Lim's jtr-mpi does, although it does not
handle any failures and I don't think it would be able to correctly
recover an interrupted session either.

A non-public John hack I've seen also does this, in a way more similar
to what you describe above.
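For concreteness, your server data set could be represented with
records like these (a purely illustrative sketch; the field names are
mine, not taken from jtr-mpi or any other implementation):

	#include <time.h>

	struct cracked {
		char ciphertext[128];	/* crypt string as loaded */
		char plaintext[32];	/* the cracked password */
		int source_client;	/* client id that found it */
		int crypt_id;		/* monotonic id, for catching
					   offline clients up later */
	};

	struct client {
		int id;
		int last_sent_crypt_id; /* cracks above this are pending */
	};

	struct pass {
		int sequence;		/* index into the cracking order */
		int client;		/* -1 while not yet allocated */
		time_t start, end;	/* end == 0 while in progress */
	};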
A problem with this approach is that chunks of keyspace assigned to
slave nodes become larger and larger.  A single such chunk may take
days to process.  There are also not that many passes in total, which
will limit scalability, albeit not too severely (this won't work well
with more than a few hundred nodes).

These problems are relatively easily resolved, although I have not seen
that done other than in a non-public hack of my own from 1997.  It is
sufficient to further split each pass into as many ranges of candidate
password indices as there are slave nodes, then assign those ranges to
the nodes.  Whenever a node is done with its range, another node's
range is shrunk by removing half of its uncompleted part (and that node
is notified) in order to put the idle node back to work.  Note that
this approach does not cause increasing keyspace fragmentation (which
would result in growing memory consumption on the master).  Similarly,
if the master determines that a node went offline, it reassigns that
node's entire range to whatever other node becomes idle first (there's
room for further optimizations to handle transient network errors
here).  The master advances to the next pass only after all online
nodes are done with their ranges.
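The master's bookkeeping can be as simple as one range per node,
indexed by node id, with idle nodes put back to work by a linear scan
(again an illustrative sketch with names of my own invention, not code
from that hack):

	struct range {
		unsigned long start, next, end;	/* [start, end); next is the
						   first index not yet tried */
		int online;			/* 0 once the node is gone */
	};

	/* Put an idle node back to work.  The number of ranges never
	 * grows, so there is no keyspace fragmentation and no growing
	 * memory consumption on the master. */
	void rebalance(struct range *r, int nnodes, int idle)
	{
		int i, victim = -1;
		unsigned long left, best = 0;

		/* An offline node's entire range goes to the idle node. */
		for (i = 0; i < nnodes; i++)
			if (!r[i].online && r[i].next < r[i].end) {
				r[idle] = r[i];
				r[idle].online = 1;
				r[i].next = r[i].end; /* mark as taken */
				return;
			}

		/* Otherwise find the node with the most uncompleted work. */
		for (i = 0; i < nnodes; i++) {
			if (i == idle || !r[i].online)
				continue;
			left = r[i].end - r[i].next;
			if (left > best) {
				best = left;
				victim = i;
			}
		}

		if (victim < 0 || best < 2)
			return;	/* nothing left worth splitting */

		/* The idle node takes the upper half of the victim's
		 * remaining part; the victim is notified of its new,
		 * shorter end. */
		r[idle].start = r[idle].next = r[victim].next + best / 2;
		r[idle].end = r[victim].end;
		r[victim].end = r[idle].start;
	}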
-- 
Alexander Peslyak <solar at openwall.com>
GPG key ID: B35D3598  fp: 6429 0D7E F130 C13E C929  6447 73C3 A290 B35D 3598
http://www.openwall.com - bringing security into open computing environments

Was I helpful?  Please give your feedback here: http://rate.affero.net/solar