Message-Id: <9D1CB184-0982-492D-B137-76A1E327995C@gmail.com>
Date: Sat, 6 Jun 2015 23:06:08 +0800
From: Lei Zhang <zhanglei.april@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Interleaving of intrinsics

> On Jun 6, 2015, at 7:47 PM, Solar Designer <solar@...nwall.com> wrote:
>
> Your use of VTune appears to be similar to use of gprof. If you use
> VTune at all, I'd expect you to profile things such as cache misses and
> pipeline stalls, as well as utilization of the CPU's execution units.
> Things that only the CPU vendor's profiler is capable of.

I'll take a further look.

> On Sat, Jun 06, 2015 at 07:38:18PM +0800, Lei Zhang wrote:
>> Same settings as the previous, except for longer run time (--test=20):
>
> Are the benchmark results significantly affected by your use of
> profiling, vs. a non-profiled run? This is very important. In some
> cases, profiling may change performance by an order of magnitude or even
> worse, which means that its results would be of questionable relevance.

From the results of self-test, there's no noticeable penalty in profiling.

>> Use of intrinsics is counted as function calls
>
> That's weird. You need to make sure they haven't, in fact, been turned
> into function calls or the like in this profiling build. If they have,
> performance is probably at a level much worse than what we normally see,
> and if so this is an instance of the problem I mentioned above.

This is indeed weird. The previous experiments were done with gcc. I just profiled an icc build, and the result is vastly different:

Function             CPU Time
SSESHA256body        21.589s
pbkdf2_sha256_sse     0.287s
cfg_get_section       0.062s
SHA256_Final          0.012s
SHA256_Update         0.008s

This time intrinsics aren't counted as function calls. However, the self-test results of the gcc build and icc build show no big difference. This really confuses me.

Lei