Message-ID: <20160121110811.GA19059@openwall.com>
Date: Thu, 21 Jan 2016 14:08:11 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: loader and cracker (prefetch) optimizations from September 2015

Hi magnum -

FYI, I've just committed the loader and cracker optimizations from September 2015 to the core tree.

There are a few differences from what went into jumbo back then, but almost all of those are on purpose, and should remain as differences. For example, show_uid_in_cracks is jumbo-specific, which affected these changes in a few places. So when merging these, you should mostly keep the code currently in jumbo as-is, even if the new core code does those things differently.

One exception to that is SSE2 vs. SSE checks for the prefetching. It turns out those prefetch instructions and intrinsics are available with plain SSE (Pentium 3) rather than require SSE2 (Pentium 4), so let's in fact be checking __SSE__ and #include'ing <xmmintrin.h>, rather than checking __SSE2__ and #include'ing <emmintrin.h>.

More importantly, our use of the NTA hint probably results in performance regressions for some hash counts (neither very small nor very large), as it reduces use of L2+ caches. I've actually seen at least one such regression, where replacing the hint with T0 helped. Unfortunately, NTA is in fact better than T0 for huge password hash files, like the 29M test case.

Maybe we need to move this portion of code (the whole prefetching cracker) into an inline-able function, and have the compiler specialize it in two different ways. And we'd need some threshold parameter to choose one or the other per-salt.

Thanks,

Alexander
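
For illustration only, here is a minimal sketch of the two ideas above: checking __SSE__ and #include'ing <xmmintrin.h> for the prefetch intrinsic, and having the compiler specialize one inline-able lookup routine for the NTA and T0 hints, with a threshold choosing between them. This is not the code committed to core or jumbo. The toy hash table, the names prefetch_hint, lookup_batch, lookup_nta, lookup_t0 and lookup, and the NTA_THRESHOLD value are all hypothetical; the message gives no threshold, and the table size here merely stands in for a per-salt hash count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain SSE is enough for _mm_prefetch() and the _MM_HINT_* constants */
#if defined(__SSE__)
#include <xmmintrin.h>
#define HAVE_SSE_PREFETCH 1
#else
#define HAVE_SSE_PREFETCH 0
#endif

/* Hypothetical cutoff (in table slots); a real value would need benchmarking */
#define NTA_THRESHOLD ((size_t)1 << 20)

/* The hint must be a compile-time constant, hence the branch on "nta" */
static inline void prefetch_hint(const void *p, int nta)
{
#if HAVE_SSE_PREFETCH
	if (nta)
		_mm_prefetch((const char *)p, _MM_HINT_NTA);
	else
		_mm_prefetch((const char *)p, _MM_HINT_T0);
#else
	(void)p;
	(void)nta; /* no-op fallback on builds without SSE */
#endif
}

/*
 * Toy lookup: count how many keys are present in a direct-mapped table.
 * Slot value 0 means "empty", so key 0 is not supported.
 * Pass 1 issues prefetches for the whole batch, pass 2 does the compares.
 */
static inline size_t lookup_batch(const uint32_t *table, size_t mask,
    const uint32_t *keys, size_t n, int nta)
{
	size_t i, found = 0;

	for (i = 0; i < n; i++)
		prefetch_hint(&table[keys[i] & mask], nta);
	for (i = 0; i < n; i++)
		found += (table[keys[i] & mask] == keys[i]);
	return found;
}

/* Two specializations; "nta" is constant in each, so the branch can fold away */
static size_t lookup_nta(const uint32_t *table, size_t mask,
    const uint32_t *keys, size_t n)
{
	return lookup_batch(table, mask, keys, n, 1);
}

static size_t lookup_t0(const uint32_t *table, size_t mask,
    const uint32_t *keys, size_t n)
{
	return lookup_batch(table, mask, keys, n, 0);
}

/* Choose the specialization per table; size must be a power of two */
size_t lookup(const uint32_t *table, size_t size,
    const uint32_t *keys, size_t n)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	if (size >= NTA_THRESHOLD)
		return lookup_nta(table, size - 1, keys, n);
	return lookup_t0(table, size - 1, keys, n);
}
```

Both specializations return the same counts; only the cache hint differs. Whether the cutoff should depend on hash count, table size, or something else per salt is left open here, as it is in the message.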