Message-ID: <20110604164231.GA4371@openwall.com>
Date: Sat, 4 Jun 2011 20:42:31 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: unique

Jim,

You probably expected that, but JFYI: unique in jumbo-5 became about 3% slower (with default settings) than it was in 1.7.7 official and -jumbo-1. I attribute this to the changes in line_hash(). Perhaps the constant shift counts and masks were faster than the variable ones are.

As to optimizing this, you can easily pre-compute "vUNIQUE_HASH_SIZE - 1" and "vUNIQUE_HASH_LOG / 2". With constants, these were computed at compile-time, but now you actually have those extra operations performed at runtime, right inside the hash function. My guesstimate is that this will help a little bit, but not let us fully regain the lost 3%, because the variable shift counts will remain.

So maybe we need a specialized version of line_hash() for the default settings. Then we'd need to call it via a function pointer, though, unless we also introduce specialized versions of the caller functions' loops, which feels like too much code duplication to be worth it. Hopefully, the function pointer overhead will be under 1%.

...Oh, you also have cut_len and LM checks inside the per-line loops in read_buffer() and clean_buffer(), which is probably responsible for part of the slowdown. And the checks against "vUNIQUE_BUFFER_SIZE - sizeof(line) - 8" are probably slower than checks against a compile-time computed constant were.

Here's a marketing workaround: double the default memory usage by unique when the jumbo patch is applied (change UNIQUE_HASH_LOG from 20 to 21, UNIQUE_BUFFER_SIZE from 0x4000000 to 0x8000000). This will compensate for slower code when running on large files. In fact, -mem=21 results in a 6% speedup over -jumbo-1 when running on all.lst (44 MB), presumably due to the larger hash table (fewer collisions).

I am not complaining. I am actually grateful for all your work. I am just documenting my findings and thoughts on the matter.

Thanks,

Alexander
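
A minimal sketch of the two ideas from the message: derive the mask and the half-shift once at setup time, and pick a variant specialized for the default table size via a function pointer. This is not the actual line_hash() from unique.c. The mixing steps, and every name here (hash_core, hash_default, hash_generic, hash_setup, line_hash_fn, DEFAULT_HASH_LOG), are hypothetical and chosen only for illustration. Only the default of 20 is taken from the text above.

```c
#include <assert.h>

/* Hypothetical default, matching the UNIQUE_HASH_LOG value of 20 named above. */
#define DEFAULT_HASH_LOG 20

/* Runtime settings, derived once in hash_setup() rather than per call. */
static unsigned int hash_log;      /* plays the role of vUNIQUE_HASH_LOG */
static unsigned int hash_mask;     /* precomputed "table size - 1" */
static unsigned int hash_half_log; /* precomputed "log / 2" */

typedef unsigned int (*line_hash_fn)(const char *line);

/*
 * Shared body. The mixing is illustrative only, not the real line_hash().
 * When the compiler inlines this into a caller that passes constants,
 * the shift counts and the mask can become immediates.
 */
static inline unsigned int hash_core(const char *line, unsigned int log,
    unsigned int mask, unsigned int half_log)
{
	unsigned int hash = 0, extra = 0;

	while (*line) {
		hash = (hash << 3) + (unsigned char)*line++;
		extra = (extra << 2) + hash;
		hash ^= hash >> log;
		hash &= mask;
	}

	hash ^= extra << half_log;
	hash ^= hash >> log;
	return hash & mask;
}

/* Specialized for the default size: every parameter is a compile-time constant. */
static unsigned int hash_default(const char *line)
{
	return hash_core(line, DEFAULT_HASH_LOG,
	    (1U << DEFAULT_HASH_LOG) - 1, DEFAULT_HASH_LOG / 2);
}

/* Generic: uses the precomputed runtime values, so shift counts stay variable. */
static unsigned int hash_generic(const char *line)
{
	return hash_core(line, hash_log, hash_mask, hash_half_log);
}

/* Precompute the derived values once and choose which variant to call. */
static line_hash_fn hash_setup(unsigned int log)
{
	assert(log >= 2 && log <= 31);
	hash_log = log;
	hash_mask = (1U << log) - 1;
	hash_half_log = log / 2;
	return log == DEFAULT_HASH_LOG ? hash_default : hash_generic;
}
```

A caller would do something like "line_hash_fn line_hash = hash_setup(log);" once after parsing the options and then call line_hash(line) inside the per-line loops. Whether the indirect call costs less than the constant operands save would have to be measured. Inlining of hash_core() is expected with optimization enabled but is not guaranteed.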