Message-ID: <5531326A.50802@openwall.com>
Date: Fri, 17 Apr 2015 19:18:50 +0300
From: Alexander Cherepanov <ch3root@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Advice on proposal: John the Ripper jumbo robustness

On 17.04.2015 12:01, Kai Zhao wrote:
> Note: compile without asan and afl
>
> $ ./configure
> $ make
> $ echo garbage > test.pw
> $ time ../john --format=7z test.pw
> No password hashes loaded (see FAQ)
>
> real    0m0.041s
> user    0m0.038s
> sys     0m0.004s
>
> Calculate the invoked times and execution time of each function by gprof,
> attachment is the output file.
>
> The cfg_get_section() function occupies the most of time.

Thanks, that's much, much better.

> This is why
> it will get 7x speed-up when the john.conf is simple, such as "[Options]".
>
> It is interesting why the cfg_get_section() is called 16080 times. Most of
> the call is from the dynamic_IS_VALID() which is called 10000 times.
>
> We can optimize the dynamic_Register_formats() function which invokes
> 10000 times of dynamic_IS_VALID(). Below is part of the code:
>
> int dynamic_Register_formats(struct fmt_main **ptr)
> {
>     ...
>     for (count = i = 0; i < 5000; ++i) {
>         if (dynamic_IS_VALID(i, 1) == 1)
>             ++count;
>     }
>     // Ok, now we know how many formats we have. Load them
>     pFmts = mem_alloc_tiny(sizeof(pFmts[0])*count, MEM_ALIGN_WORD);
>     for (idx = i = 0; i < 5000; ++i) {
>         if (dynamic_IS_VALID(i, 1) == 1) {
>             if (LoadOneFormat(i, &pFmts[idx]) == 0)
>                 --count;
>             else
>                 ++idx;
>         }
>     }
>     ...
> }
>
> The dynamic_Register_formats() function invokes 10000 times of
> cfg_get_section(), and every time cfg_get_section() tries to find the
> section from begin to the end which has lots of sections in current
> john.conf.

I see. Indeed, this approach is quite slow.

> An way to optimize the dynamic_Register_formats() function is to
> traverse all the sections and generates the result (whether valid) for
> every dynamic section. In this way, we will use little more memory but
> we reduce the 10000 times call to 1 time call. I think it speeds the john
> without change the config file and it is not only for fuzz testing.
>
> Do you agree with me? I am going to implement this change.

Yes, it would be nice to speed it up, but it's not required. In the past,
I've tried to sort hashes by running john against every hash in a loop.
It was slow. But this is quite an exotic workflow.

For fuzzing, we can bypass these numerous cfg_get_section() calls: either
as Frank described, or by making an empty config file, or by defining
DYNAMIC_DISABLED.

So... If you can easily optimize it, then go ahead. But don't spend much
time on it.

Then please go further and find where the next bottleneck is (with
dynamics registering bypassed in any way). We are still at ~100 exec/s.
It would be nice to get 1000-2000 exec/s.

--
Alexander Cherepanov