Message-ID: <20111121165227.GA26729@openwall.com>
Date: Mon, 21 Nov 2011 20:52:27 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: best way to get ciphertext

Hi Samuele,

Thank you for bringing this topic up - I mean not just "getting the
ciphertext", but scalability issues for fast hashes in general.

On Mon, Nov 21, 2011 at 02:59:23PM +0100, Samuele Giovanni Tonon wrote:
> i'm trying to add key comparison inside the opencl kernel code
> trying to see if this add more speed to the process.

Yes, but this is difficult to do.

> at the moment all the job is done inside crypt_all() in which
> i set the salt, the list of cleartext password to hash, the output
> buffer .
>
> i tried also to pass to opencl kernel ciphertext password by calling
> "binary", however with my great disappoint i'm not getting the password
> but some random data.

What "ciphertext password"?  John normally has many hashes loaded at
once, including often many per salt.  Having just one hash to crack
(per salt, if applicable) is only a special case.

> i tried to print inside crypt_all and cmp_all binary value with a simple:
>
> printf("cry %x %x %x %x %x \n ", ((ARCH_WORD_32 *)binary)[0],
> ((ARCH_WORD_32 *)binary)[1], ((ARCH_WORD_32 *)binary)[2], ((ARCH_WORD_32
> *)binary)[3], ((ARCH_WORD_32 *)binary)[4]);

No idea what binary value you're trying to print here.  There might not
even be a symbol called "binary" and available inside crypt_all().
Well, maybe you happen to have a function called binary(), like many
formats do, and you print portions of its code here? ;-)

> however while on cmp_all i get the right "numbers", on crypt_all
> i get nothing valuable.
>
> since it looks like binary is not available inside crypt_all
> (because not yet setted?)

Yes, not available there.  No, for more fundamental reasons.

> i'm wondering which is best to do to solve
> the problem which in the end is quite simple:
> is there a good way to crypt and compare at the same time using the same
> function or shall i go with some nasty hacks ?
> Has anyone found similar problem on other formats ?

This is not simple at all, and it applies to all formats indeed.

There's currently no interface to communicate hashes loaded for cracking
into the format code.  You may introduce a dirty hack where binary()
would record the hashes and then cmp_all() would compare against those,
but another hurdle is that cmp_all() is not always called - when there
are a lot of hashes for a given salt, the cracker.c code will use
get_hash*() instead and do comparisons on its own.  So you'd also need
to introduce an FMT_* flag, maybe to disable that or to have it enabled
only when a much higher threshold of hashes per salt is reached.  Better
yet, we'd actually need to enhance the formats interface.

Another difficulty is that your format would need to duplicate
cracker.c's removal of already cracked hashes from your own data
structures.  If you don't do that, and you happen to have an easily
cracked hash loaded initially, and you happen to try the corresponding
password multiple times (e.g., as a result of wordlist rules producing
some duplicates, which is normally acceptable when attacking fast
hashes), you might end up having cmp_all() return true too often (and
then you hit slow code paths).

Besides comparison of computed hashes, another bottleneck is set_key(),
which is currently called by just one thread in the main program.  (This
is a problem for fast hashes with OpenMP builds of John as well - e.g.,
this is why LM does not scale beyond 100M c/s or so on CPU currently.)
I think this one can be dealt with in two ways:

1. Have an FMT_* flag that would indicate that set_key() may be called
   by multiple threads at once, for different key indices indeed.
   Have cracking mode specific code parallelized with OpenMP as well
   (the difficulty of actually doing this will vary by cracking mode).

2. Have an extra per-format method to specify that crypt_all() should
   produce and try multiple candidate passwords for every key that was
   set with set_key().  For example, it could specify character
   positions and charsets.  Then we'd actually have a hybrid of smart
   high-level cracking modes with dumb exhaustive search for a few
   character positions.  It could be made a bit smarter e.g. by
   incremental mode altering the charset for those positions to match
   what it currently tries for the rest.  And then we also need a way
   for a format to say that it computed more hashes than it was supplied
   candidate passwords, so get_hash*(), get_key(), etc. would need to be
   called for larger index values as well (up to the actual number of
   hashes computed) - and the hash comparison optimizations mentioned
   above would preferably need to be used.

Overall, this is complicated stuff involving some trade-offs.

Alexander
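
For illustration, the "dirty hack" described above (binary() records the
loaded hashes, cmp_all() compares against them, and the format removes
cracked hashes from its own table) might look roughly like the sketch
below.  Everything in it is an assumption made for the example:
record_binary(), find_recorded(), remove_recorded(), cmp_all_recorded(),
BINARY_WORDS and MAX_LOADED are hypothetical names, not part of John's
formats interface, and a plain linear scan stands in for whatever lookup
a real format (or an OpenCL kernel) would use.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BINARY_WORDS 5     /* assumed: a 160-bit hash as five 32-bit words */
#define MAX_LOADED   1024  /* assumed capacity of the format's private table */

/* The format's own copy of the loaded hashes (hypothetical). */
static uint32_t loaded[MAX_LOADED][BINARY_WORDS];
static int n_loaded;

/* Would be called from binary(): remember each hash as it gets loaded. */
static void record_binary(const uint32_t *bin)
{
    assert(n_loaded < MAX_LOADED);
    memcpy(loaded[n_loaded++], bin, sizeof(loaded[0]));
}

/* Return the table index of a recorded hash equal to `computed`, or -1. */
static int find_recorded(const uint32_t *computed)
{
    for (int i = 0; i < n_loaded; i++)
        if (!memcmp(loaded[i], computed, sizeof(loaded[0])))
            return i;
    return -1;
}

/*
 * Mirror cracker.c's removal of a cracked hash: drop entry `idx` by
 * moving the last entry into its slot.  Without this, a repeated
 * candidate password would keep matching an already cracked hash.
 */
static void remove_recorded(int idx)
{
    assert(idx >= 0 && idx < n_loaded);
    n_loaded--;
    if (idx != n_loaded)
        memcpy(loaded[idx], loaded[n_loaded], sizeof(loaded[0]));
}

/*
 * cmp_all()-like check: `computed` holds `count` hashes stored back to
 * back, BINARY_WORDS words each.  Returns 1 if any of them is still in
 * the table, 0 otherwise.
 */
static int cmp_all_recorded(const uint32_t *computed, int count)
{
    for (int i = 0; i < count; i++)
        if (find_recorded(computed + (size_t)i * BINARY_WORDS) >= 0)
            return 1;
    return 0;
}
```

Note that this covers only the comparison and removal part.  It does not
address cracker.c switching to get_hash*() when many hashes share a
salt, which would still need the FMT_* flag or interface change
discussed above.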