Message-ID: <fc1789db37f9de0fc806ed7aba40b315@smtp.hushmail.com>
Date: Thu, 25 Dec 2014 12:46:35 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: bcrypt BF_X2=3 is not always best

On 2014-12-25 03:14, Solar Designer wrote:
> The 3x interleaving works significantly better than 2x for Intel
> x86-64 CPUs without Hyperthreading (such as Core 2 Duo/Quad), but is
> usually of little help or sometimes even hurts speeds on CPUs that
> are capable of running 2 threads/core.
>
> I don't know how/whether we can reasonably detect which BF_X2 setting is
> best. Running benchmarks at build- or run-time is unstable or slow,
> given the variance seen under light unrelated load. And these would have
> to be full OpenMP benchmarks, because relative speeds are different when
> running only 1 thread.

I think we should use a shared cpu_detect() function for x86, so we can
detect HT, XOP/AVX and other things at run time. Another thing that can
differ a lot between different CPU types is what we usually call
OMP_SCALE - for the nt2 format I believe 1M is best on Bull while just
4K is best on core i7. The current selection is the __XOP__ and __AVX__
macros at build time.

Looking at x86.S we already have this function... is it usable as-is?

magnum
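
For illustration only, here is a rough sketch in C of what run-time detection of HT, AVX and XOP could look like. It is not the existing routine in x86.S. The names cpu_features, decode_features and cpu_detect_sketch are hypothetical, and the code assumes a GCC- or Clang-style <cpuid.h>. On non-x86 builds it simply reports no features.

```c
#include <stdint.h>
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper header providing __get_cpuid() */
#endif

/* Hypothetical result structure; not taken from the John the Ripper tree. */
struct cpu_features {
    int htt;  /* CPUID.1:EDX bit 28 (HTT flag) */
    int avx;  /* CPUID.1:ECX bit 28 (AVX) */
    int xop;  /* CPUID.80000001h:ECX bit 11 (XOP, AMD) */
};

/* Decode the three feature bits from raw CPUID register values. */
struct cpu_features decode_features(uint32_t leaf1_ecx, uint32_t leaf1_edx,
                                    uint32_t ext1_ecx)
{
    struct cpu_features f;
    f.htt = (int)((leaf1_edx >> 28) & 1u);
    f.avx = (int)((leaf1_ecx >> 28) & 1u);
    f.xop = (int)((ext1_ecx >> 11) & 1u);
    return f;
}

/* Query CPUID at run time; all fields stay 0 if a leaf is unavailable. */
struct cpu_features cpu_detect_sketch(void)
{
    uint32_t ecx1 = 0, edx1 = 0, ecx_ext = 0;
#if defined(__i386__) || defined(__x86_64__)
    unsigned int a, b, c, d;
    if (__get_cpuid(1, &a, &b, &c, &d)) {
        ecx1 = c;
        edx1 = d;
    }
    if (__get_cpuid(0x80000001u, &a, &b, &c, &d))
        ecx_ext = c;
#endif
    return decode_features(ecx1, edx1, ecx_ext);
}
```

Two caveats apply to this sketch. The HTT flag only says the package can expose more than one logical processor, so it does not prove that 2 threads/core are actually enabled. The AVX and XOP bits report what the CPU advertises, and a complete check would also confirm OS support for the extended register state. A caller could use the result to pick a BF_X2 or OMP_SCALE default at startup instead of relying on the __XOP__ and __AVX__ macros at build time.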