Message-ID: <20140919170635.GA28734@openwall.com>
Date: Fri, 19 Sep 2014 21:06:35 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Restart work on mask mode

Hi Sayantan,

On Fri, Sep 19, 2014 at 08:16:58PM +0530, Sayantan Datta wrote:
> I am glad to inform that I'm resuming my work on mask mode and to begin
> with, I'm trying to find the bottleneck on cpu portion of mask mode which
> is quite slower compared to incremental mode.

There isn't really a single bottleneck there.  It's just slower code.

> To my surprise I can't get anything faster than 7.2 Mp/s(compared to 15Mp/s
> on inc mode)

I'm puzzled as to why the speeds are this low for you.  Here's what I
am getting on one core in FX-8120:

$ cat pw-dummy
$dummy$
$ ./john -inc -min-len=8 -max-len=8 pw-dummy
Loaded 1 password hash (dummy [N/A])
Warning: no OpenMP support for this hash type, consider --fork=8
Press 'q' or Ctrl-C to abort, almost any other key for status
0g 0:00:00:07 0.00% (ETA: 2017-08-24 19:32) 0g/s 62951Kp/s 62951Kc/s 62951KC/s kameto01..kamets99
0g 0:00:00:13 0.00% (ETA: 2018-01-11 20:55) 0g/s 63118Kp/s 63118Kc/s 63118KC/s pwektu10..pwekhrd2
Session aborted
$ ./john -mask='?a?a?a?a?a?a?a?a' pw-dummy
Loaded 1 password hash (dummy [N/A])
Warning: no OpenMP support for this hash type, consider --fork=8
Press 'q' or Ctrl-C to abort, almost any other key for status
0g 0:00:00:06 0.00% (ETA: 2022-01-06 13:12) 0g/s 24760Kp/s 24760Kc/s 24760KC/s "+W'a.. "+W)L
0g 0:00:00:14 0.00% (ETA: 2023-02-13 10:53) 0g/s 25004Kp/s 25004Kc/s 25004KC/s $<XYC.. $<X[.
Session aborted

As you can see, it's 63M vs. 25M, so still a substantial difference, but
both are much higher speeds than yours.

> even with this simple password generation loop:
>
> while(!crk_process_key("a"));

Are you saying this gives you only 7.2 Mp/s?  That's puzzling.  What
hash type?  Or are you using --stdout (if so, that's the bottleneck)?

> replacing the original code:
>
>         while ((word = rpp_next(&ctx))) {
>                 if (options.node_count) {
>                         seq++;
>                         if (their_words) {
>                                 their_words--;
>                                 continue;
>                         }
>                         if (--my_words == 0) {
>                                 my_words =
>                                     options.node_max - options.node_min + 1;
>                                 their_words = options.node_count - my_words;
>                         }
>                 }
>                 if (ext_filter(word))
>                 if (crk_process_key(word))
>                         break;
>         }

As you can see, this calls rpp_next() to obtain each new candidate
password.  This differs from incremental mode's code, which updates its
own local buffer directly.  Further efficiency differences can be easily
seen if you compare what's inside rpp_next() vs. what happens when
incremental mode only updates the last character's index.

Rather than look for "bottlenecks" here, we simply need to write mask
mode code that is at least as efficient as incremental mode's code.  It
will need to stop using rpp.  My reuse of rpp for early mask mode code
was to demonstrate how it could be implemented.  rpp was never meant to
be used in performance-critical code, so it wasn't optimized to that
extent (nor do I recommend optimizing it, as that would complicate it;
instead, I recommend writing new code specific to mask mode).

> Manually, for now, I'm unable to find the bottleneck. So does anyone knows
> if there is any good cpu profiler for linux64 machines ?

To determine which functions most CPU time is spent in, gprof would work
fine.  To determine CPU instruction sequences taking excessive amounts
of time, you could use something like Intel's VTune.  However, I really
don't think this is what you need to optimize mask mode.  It is obvious
that rpp_next() is slower than inc.c's typical internal loop - there's
no need for a profiler for that.

Alexander
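
To illustrate the point about updating a local buffer directly and
usually touching only the last character's index, here is a minimal
sketch in C.  It is not code from John the Ripper: struct mask_gen,
mask_gen_init(), mask_gen_next() and MASK_MAX_LEN are hypothetical names
made up for this example, and the sketch assumes a fixed-length mask
whose per-position character sets have already been expanded into
plain strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MASK_MAX_LEN 16 /* arbitrary limit chosen for this sketch */

/* Hypothetical generator state; these names are not from John the Ripper. */
struct mask_gen {
    const char *charset[MASK_MAX_LEN]; /* candidate characters per position */
    size_t size[MASK_MAX_LEN];         /* cached strlen() of each charset */
    size_t idx[MASK_MAX_LEN];          /* current index into each charset */
    size_t len;                        /* number of positions in the mask */
    char word[MASK_MAX_LEN + 1];       /* current candidate, NUL-terminated */
};

/* Set up the generator and place the first candidate in g->word. */
static void mask_gen_init(struct mask_gen *g,
                          const char *const *charsets, size_t len)
{
    assert(len > 0 && len <= MASK_MAX_LEN);
    g->len = len;
    for (size_t i = 0; i < len; i++) {
        g->charset[i] = charsets[i];
        g->size[i] = strlen(charsets[i]);
        assert(g->size[i] > 0);
        g->idx[i] = 0;
        g->word[i] = charsets[i][0];
    }
    g->word[len] = '\0';
}

/*
 * Advance g->word to the next candidate in place.  In the common case only
 * the last position changes; a carry into earlier positions happens once
 * per full pass over a charset.  Returns 1 if a new candidate is ready,
 * or 0 when the keyspace has wrapped around to the first candidate.
 */
static int mask_gen_next(struct mask_gen *g)
{
    size_t pos = g->len;

    while (pos-- > 0) {
        if (++g->idx[pos] < g->size[pos]) {
            g->word[pos] = g->charset[pos][g->idx[pos]];
            return 1;
        }
        /* This position overflowed: reset it and carry to the left. */
        g->idx[pos] = 0;
        g->word[pos] = g->charset[pos][0];
    }
    return 0;
}
```

A driver loop built on this would pass g.word to crk_process_key()
after mask_gen_init() and again after each successful mask_gen_next(),
stopping when either returns the terminating value.  The sketch leaves
out the node distribution logic and the ext_filter() call from the
quoted loop, so it only shows the candidate update itself.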