Message-ID: <BLU0-SMTP176E4B576F71CED2D39D6B6FD0C0@phx.gbl>
Date: Thu, 24 Oct 2013 14:01:44 +0200
From: Frank Dittrich <frank_dittrich@...mail.com>
To: john-dev@...ts.openwall.com
Subject: benchmark-unify and relbench adjustments

magnum, Solar, all,

changed format names and changed --test output require some adjustments
to increase the number of benchmarks which can be compared (between
bleeding and earlier versions, e.g., unstable).

Output for dynamic formats has been changed: commit 197f067 moved the
expression from the format name into the algorithm section, i.e., after
the '['. The expressions for the new formats dynamic_1400 and
dynamic_1401 contain "[...]", so that the "Benchmarking: " output now
contains nested "[...[...]...]".

Some formats have been renamed.

Several of the issues mentioned above are addressed by the attached
patches (intended for bleeding, not for unstable). Because I'd like
these patches to be discussed here, I didn't create pull requests.

0001-benchmark-unify-sort-format-name-mappings.patch just sorts the
format name mappings (the __DATA__ section) alphabetically (ignoring
case), so that it is easier to see which mappings had/have to be added,
adjusted, or removed (the dynamic changes allowed a few of the dynamic
mappings to be removed). So, this patch contains no functional changes.
(A similar patch for unstable might be useful if we need to adjust
mappings for unstable as well; so far I didn't check this.)

0002-benchmark-unify-adjustments-for-dynamic-formats.patch deals with
the changes required for dynamic formats. Because commit 197f067 moved
the expression out of the format name and into the algorithm section,
it was easy to even convert the md5_gen(n) format names from older
jumbo versions (I tested 1.7.6-jumbo-12) to dynamic_n.

0003-minor-relbench-improvements-and-adjustments.patch contains some
relbench adjustments, also described in the commit message.

I suppressed some "Could not parse" errors:
-ignore "Benchmarking: .*\.\.\. FAILED"
-older john versions had "All \d+ tests have passed self-tests", now
 it is "All \d+ formats passed self-tests" or "\d+ out of \d+ tests
 have FAILED"
 (If at least 1 test failed in one of the input files, I'll print that
 information before printing the "Only in File [1|2]: ..." information.)

Instead of printing the individual benchmarks (those that exist only in
1 file, or the ratio in case -v has been used) in "random" order, they
are now sorted.

The previous relbench version suggested running benchmark-unify even if
there was no way this would increase the number of matching benchmarks
(e.g., the old john version just had a "Raw" benchmark for a particular
format, while the new version has "Many salts" and "Only one salt" for
the same format). This has now been fixed.

Format names that changed between unstable and bleeding have not yet
been addressed. The main reasons for not dealing with this right now:
-there are still several formats in unused (which should really be in
 a separate subdirectory, e.g., broken) because they fail self-test
-openssl-enc still fails self-test (promiscuous valid ($openssl$))
-some formats might still need to be renamed; I'd suggest renaming
 "Office ..." to "MS Office ...", "M$ Cache Hash (DCC)" to
 "MS Cache Hash (DCC)", and "M$ Cache Hash 2 (DCC2)" to
 "MS Cache Hash 2 (DCC2)"

Finally, I don't know yet how to handle this change: in addition to
the format name, the format label is now included in the --test
output, at least if format label and format name differ.
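For illustration, the new output has this shape (a made-up sample
line; the algorithm string in brackets is elided here, only the
"<label>, <name>" part in front of it matters):

Benchmarking: Fortigate, FortiOS [...]... DONE

Older versions printed only the format name in that position.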
This change has 2 major consequences:
-while relbench used to recognize different implementations of the
 same hash algorithm in the same input file (and picked the fastest
 one for comparison), this no longer works
-almost all format names of older john --test runs would need to be
 mapped to the new output (<label>, <name>); this is neither an
 elegant solution, nor does it work for formats with more than one
 implementation

I was experimenting with another solution (this patch just checks an
environment variable instead of introducing a new benchmark-unify
option, but I can provide a patch which introduces a --drop-labels
option):

-----------------------------------------------------------------
diff --git a/run/benchmark-unify b/run/benchmark-unify
index febd4ea..d266b38 100755
--- a/run/benchmark-unify
+++ b/run/benchmark-unify
@@ -56,6 +56,9 @@ sub parse
 		if (defined($renamed{$name})) {
 			$name = $renamed{$name};
 		}
+		if ($drop_labels == 1) {
+			$name =~ s/^[^\s]*, (.*)/$1/;
+		}
 		print "Benchmarking: $name $end\n";
 	}
 	else {
@@ -65,6 +68,15 @@
 
 $_ = '';
 
+# For now, just an environment variable, I might add a
+# ./benchmark-unify option --drop-labels[=0|=1] later
+if(defined $ENV{'JOHN_BENCHMARK_UNIFY_DROP_LABELS'}) {
+	$drop_labels = 1;
+}
+else {
+	$drop_labels = 0;
+}
+
 while(<DATA>) {
 	chomp;
 	($old_format, $new_format) = /^(.*) (.*)$/;
-----------------------------------------------------------------

If JOHN_BENCHMARK_UNIFY_DROP_LABELS is defined, e.g., by using

$ JOHN_BENCHMARK_UNIFY_DROP_LABELS=1 ./benchmark-unify ...

then everything that looks like a format label will be dropped (see
the small standalone sketch further below).

This will increase the number of benchmarks in bleeding that can be
compared with benchmarks in unstable, but it has unwanted side effects:
-the user would need to run benchmark-unify even on the --test output
 of the latest john version (so far I always aimed at converting older
 output to the newest, and keeping the newest output unchanged)
-in some cases, the format name will become (completely) meaningless
 and/or ambiguous:

Fortigate, FortiOS  ->  FortiOS
MSCHAPv2, C/R  ->  C/R
mschapv2-naive, MSCHAPv2 C/R  ->  MSCHAPv2 C/R
(so, these two implementations aren't recognized as the same hash
format, and just "C/R" is completely meaningless)
OpenVMS, Purdy  ->  Purdy
WoWSRP, Battlenet  ->  Battlenet
Clipperz, SRP  ->  SRP
Drupal7, $S$ (x16385)  ->  $S$ (x16385)
IKE, PSK  ->  PSK
MongoDB, system / network  ->  system / network
Office, 2007/2010 (SHA-1) / 2013 (SHA-512), with AES  ->  2007/2010 (SHA-1) / 2013 (SHA-512), with AES
PBKDF2-HMAC-SHA256, rounds=12000  ->  rounds=12000
PBKDF2-HMAC-SHA512, GRUB2 / OS X 10.8  ->  GRUB2 / OS X 10.8
PST, custom CRC-32  ->  custom CRC-32
RAdmin, v2.x  ->  v2.x
LastPass, sniffed sessions  ->  sniffed sessions
STRIP, Password Manager  ->  Password Manager
Raw-SHA, "SHA-0"  ->  "SHA-0"
Raw-SHA1-ng, (pwlen <= 15)  ->  (pwlen <= 15)
Mozilla, key3.db  ->  key3.db

Some of this has been discussed in the past:
http://thread.gmane.org/gmane.comp.security.openwall.john.devel/9522/focus=9688
http://www.openwall.com/lists/john-dev/2013/08/19/
But so far, no solution has been found.

I think the best option is to (optionally) remove the format labels in
benchmark-unify, even if this requires the user to run benchmark-unify
on the newest john --test output (which wasn't necessary for older
john versions), and to adjust the few format names which become
meaningless if we do so.

IMHO, we should try to have the same format name output (possibly
after removing the format label) for different implementations of the
same algorithm. That includes OpenCL and CUDA implementations.
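To make the label-dropping substitution easy to try out, here is the
small standalone sketch mentioned above (not part of any of the
attached patches; the sample names are taken from the list above):

-----------------------------------------------------------------
#!/usr/bin/perl
use strict;
use warnings;

# Apply the same substitution as the experimental patch above to a
# few of the new "<label>, <name>" format names:
foreach my $name ('Fortigate, FortiOS',
		  'MSCHAPv2, C/R',
		  'mschapv2-naive, MSCHAPv2 C/R') {
	# Copy the name, then drop the leading "<label>, " part:
	(my $stripped = $name) =~ s/^[^\s]*, (.*)/$1/;
	print "$name  ->  $stripped\n";
}
-----------------------------------------------------------------

For "MSCHAPv2, C/R" this prints just "C/R", which is exactly the
ambiguity described above.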
If someone wants to compare the performance of CPU formats and GPU
formats separately, this can be done by running

./john --test --format=cpu
./john --test --format=dynamic
./john --test --format=gpu
./john --test --format=opencl
./john --test --format=cuda

We need to come up with a solution before we can release bleeding as
the next john-1.8 jumbo version.

Frank

View attachment "0001-benchmark-unify-sort-format-name-mappings.patch" of type "text/x-patch" (2882 bytes)
View attachment "0002-benchmark-unify-adjustments-for-dynamic-formats.patch" of type "text/x-patch" (1698 bytes)
View attachment "0003-minor-relbench-improvements-and-adjustments.patch" of type "text/x-patch" (5863 bytes)