|
Message-ID: <20120430043248.GB6977@openwall.com> Date: Mon, 30 Apr 2012 08:32:48 +0400 From: Solar Designer <solar@...nwall.com> To: john-dev@...ts.openwall.com Subject: Re: Password Generation on GPU myrice, Samuele, magnum, all - On Tue, Apr 17, 2012 at 03:19:18PM +0800, myrice wrote: > For term of "password generation", I just want to make it clear. Do you > mean that in real password generation, the single, wordlist and incremental > mode? I mentioned real because we split benchmark with real password > generation. In benchmark, we just use bench_set_keys(set key from test_fmt > one by one). Yes, at least for "fast hashes", we need to implement real (not just benchmark-specific) candidate password generation on GPU. However, it doesn't necessarily have to be the same as the current JtR cracking modes - it just needs to be something usable during a real attack on real hashes. While in the long run we could want to have JtR's usual cracking modes on GPU, I think we could start with the following: 1. Enhance the formats interface by adding an optional method like: void set_mask(int count, int *positions, char **charsets) which would tell the format to substitute all possible characters from the supplied charsets in the supplied character positions. For example, if called with (1, [5], ["abcdef"]), it would for every key set with set_key() additionally substitute every one of the six characters in character position number 5 (zero-based). So crypt_all() would produce 6 times more password hashes than we had candidate passwords provided via set_key(). cmp_all() would operate on all of the actually computed hashes. In order for the caller to know how many times to call cmp_one(), cmp_exact(), and get_hash*() - when these are needed - crypt_all() would be changed to return the number of hashes actually computed. (I am considering certain other changes as well, which may result in these interfaces being revised some further or differently, but I am not including those other changes here to avoid confusion.) 2. Add a "mask mode" to JtR. (This is desirable anyway. Other crackers have it, but JtR makes this unnecessarily difficult for users by only providing things like the KnownForce external mode instead.) In this mode, the user will specify a mask consisting of per character position charsets (or just charset identifiers such as "lowercase letter"). Indeed, this cracking mode will make use of the set_mask() feature above (when the current format provides it) for some of the character positions (iterating over the rest on its own such that the program doesn't stay inside crypt_all() forever and run out of memory for the computed hashes, but is responsive to any-key, prints guessed passwords, updated the pot/log/rec files, etc.) 3. Enhance --test to run an extra benchmark and output an extra line for formats that offer set_mask(). For salted fast hashes, we'll need to be outputting three lines: many salts (mask or not should not matter when the number of salts is large enough), one salt and mask (simulate mask-using cracking modes), one salt and no mask (simulate other cracking modes). For saltless fast hashes, it'd be two lines (no "many salts", obviously). 4. We may try to make some other cracking modes set_mask()-aware as well. Incremental mode is potentially capable of using this interface for the last character position, except when it has the last character index fixed (and thus alters character indices in other positions only). For example, when trying passwords of length 8, incremental mode would thus be able to use set_mask() 87.5% of the time in a long-running session. (The growing and reducing c/s rate may be confusing, though.) Wordlist mode with rules may potentially be able to use set_mask() for ruleset lines containing portions like Az"[190][0-9]". However, that would be bad in two ways: it would confuse the rule preprocessor with the actual rule processor (making these things even more difficult to explain than they're now) and it would swap the words vs. rules processing order for the affected ruleset lines (different behavior for different formats). So we shouldn't implement this in that exact way. Instead, we may consider introducing the ability for rules to produce multiple candidate passwords. Right now, each rule (as output by the preprocessor, when applicable) produces at most one candidate password for one input word. There are good reasons (not limited to GPU acceleration) to allow for rules to produce multiple candidate passwords. So we may add some syntax that would do just that - e.g., reuse curly braces for the purpose - and then have it use set_mask() when available (tricky to do given the current separation between rules.c, wordlist.c, and cracker.c - so we'll need to revise that, even though it may make the source code structure more complicated). BTW, the set_mask() thing is good not only for GPUs, but also for OpenMP-parallelized (or otherwise parallelized) formats running on CPU and achieving high c/s rates. Specifically, I'd use it for LM hashes, which currently don't scale beyond the performance of approximately 1.5 threads on CPU. (Currently, we have to use other approaches to parallelize these - such as jumbo's MPI support or running multiple separate instances of John - where each separate process generates its own stream of candidate passwords.) set_mask() is also good in that it may be closely coupled with the format's crypto code when that is helpful. For example, for LM hashes the already-transformed key bits matrix may be modified. So theoretically it might provide greater speed than what we'd achieve by having each thread generate its entirely independent stream of candidate passwords. As usual, I'd appreciate any comments. Alexander
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.