Message-ID: <55E64301.10303@cox.net>
Date: Tue, 1 Sep 2015 19:29:53 -0500
From: JimF <jfoug@....net>
To: john-dev@...ts.openwall.com
Subject: Re: 23% performance regression for bcrypt (Intel i5-4570
 CPU)


On 9/1/2015 7:00 PM, magnum wrote:
> If only well had a regression, perhaps decrease to X2 only for 
> __AVX2__? Or maybe this is also about gcc version?

Umm, no. The improvement seems to be even better on AVX2.  I really 
think this

#if __AVX__
#define BF_X2 1
#else
#define BF_X2 3
#endif

seems to be almost universally wrong.  Possibly there is some system 
which regressed to a slower speed (or, like you mention, a gcc version), 
but I have tested Intel-AVX, AMD-XOP and Intel-AVX2 systems. All get 
good gains going from BF_X2=1 to BF_X2=3.  A 25-30% gain is NOT 
something trivial.  Losing that 30% gain because there is some system 
somewhere which lost 5% seems truly wasteful.

I still say that if we cannot get this statically set value changed to a 
better one, we should simply ignore it within arch.h and probe the 
system at ./configure time to find out what IS the best option for the 
machine.  This certainly would be a 'fat' probe, but it would get the 
value correct for the build, and that value MAY change based upon things 
like gOMP, compiler version, CPU, and even memory speed.  If we put it 
into configure, then we will probably want to gate it behind an 
--enable-bcrypt-probe option, since finding the best value may be pretty 
darn time consuming.  We may have to build a minimal JtR 2 (or 3?) times, 
running -test each time to find the best speed.  The compile could be 
made minimal by ripping out all *_plug.c files, BUT it is still a 
compile that would have to be done.
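
A rough sketch of the selection step such a probe would need, in Python 
only for brevity.  Everything named here is made up for illustration: 
pick_best_bf_x2 and the bench callable are hypothetical, and the real 
work (building a minimal binary with a given BF_X2 and benchmarking 
bcrypt) is assumed to be hidden behind that callable.  None of this is 
existing JtR or configure code.

```python
from typing import Callable, Dict, Iterable, Tuple


def pick_best_bf_x2(
    candidates: Iterable[int],
    bench: Callable[[int], float],
    runs: int = 3,
) -> Tuple[int, Dict[int, float]]:
    """Return (best BF_X2 value, {value: best observed c/s}).

    bench(value) is a hypothetical hook: it is assumed to build a
    minimal binary with BF_X2 set to value, benchmark bcrypt, and
    return the measured speed in c/s.
    """
    speeds: Dict[int, float] = {}
    for value in candidates:
        # Keep the fastest of several runs to damp timing noise.
        speeds[value] = max(bench(value) for _ in range(runs))
    # Highest speed wins; ties go to the smaller interleave factor.
    best = max(speeds, key=lambda v: (speeds[v], -v))
    return best, speeds


if __name__ == "__main__":
    # Made-up speeds, purely to exercise the selection logic.
    fake = {1: 4000.0, 2: 4900.0, 3: 5100.0}
    best, _ = pick_best_bf_x2(fake, fake.__getitem__)
    print(f"#define BF_X2 {best}")
```

Taking the best of a few runs per candidate is a guess at how to keep a 
noisy benchmark from picking the wrong value; the cost is that the 
already 'fat' probe gets a bit fatter.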
