Message-ID: <20170930230554.GA21014@openwall.com>
Date: Sun, 1 Oct 2017 01:05:54 +0200
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: OMP vs. OpenCL performance

On Sun, Oct 01, 2017 at 12:38:17AM +0200, magnum wrote:
> One (or some) of the format's test vectors have an iteration count of 
> 10000. You can benchmark it like this:
> 
> $ ../run/john -test -form:PBKDF2-HMAC-SHA512 -cost:10000
> Will run 8 OpenMP threads
> Benchmarking: PBKDF2-HMAC-SHA512, GRUB2 / OS X 10.8+ [PBKDF2-SHA512 
> 128/128 AVX 2x]... (8xOMP) DONE
> Speed for cost 1 (iteration count) of 10000
> Raw:	721 c/s real, 96.2 c/s virtual
> 
> The figure above is from a 5 yo laptop w/ 4 cores 8 threads and clocked 
> at a relaxed 2.3 GHz. Unless I'm totally senile right now, that should 
> mean a figure of about 148 c/s for 48583 iterations and you only
> get a tenth of that? I have no idea why (unless your gear is also 
> occupied with computing other things).
> 
> Try that exact benchmark and report your outcome. The system should be 
> idle when benchmarking, of course.
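
For reference, the 148 c/s estimate quoted above follows from assuming
that PBKDF2 speed is inversely proportional to the iteration count.  A
minimal sketch of that arithmetic (scale_speed is a made-up helper name
for illustration, not anything that ships with JtR):

```shell
# Estimate c/s at another iteration count, assuming speed scales as
# 1/iterations.  scale_speed is a hypothetical helper, not a JtR tool.
scale_speed() {
    # $1 = measured c/s, $2 = iterations it was measured at,
    # $3 = iterations to estimate for
    LC_ALL=C awk -v s="$1" -v a="$2" -v b="$3" \
        'BEGIN { printf "%.0f\n", s * a / b }'
}

scale_speed 721 10000 48583   # prints 148
```

A measured figure far below such an estimate on comparable hardware
suggests that something other than the hash cost is limiting speed.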

FWIW, with OpenMP even slight other load can have great impact on JtR's
overall performance, because the threads are sync'ed often.  So all
threads wait for the slowest one.  And the slowest one might become e.g.
twice as slow if there's some heavy JavaScript running in a web browser
tab, competing with JtR for one logical CPU.  Ten times slower is
surprising, but not entirely unrealistic.

Our OpenMP support should be used on otherwise completely idle systems,
or else the number of threads should be reduced e.g. with:

OMP_NUM_THREADS=7 ./john ...

There are other OpenMP settings that could be tuned for better (or
rather not as bad) behavior under load, such as setting
GOMP_SPINCOUNT=10000 (to reduce busy waits, which hurt other logical
CPUs in the same cores; the default is usually higher than 10000) or
enabling dynamic scheduling.
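
As a rough sketch of combining the thread count and spin count settings
mentioned above, something like the following could be put in a wrapper
script.  It assumes a build linked against GCC's libgomp (GOMP_SPINCOUNT
is specific to it) and a system where "getconf _NPROCESSORS_ONLN"
reports the number of logical CPUs; omp_threads_leaving is a made-up
helper name.

```shell
# Pick an OpenMP thread count that leaves some logical CPUs free.
# omp_threads_leaving is a hypothetical helper for illustration.
omp_threads_leaving() {
    # $1 = logical CPUs, $2 = CPUs to leave free; never go below 1
    n=$(( $1 - $2 ))
    [ "$n" -lt 1 ] && n=1
    echo "$n"
}

# Assumption: getconf reports the online logical CPU count here.
cpus=$(getconf _NPROCESSORS_ONLN)

export OMP_NUM_THREADS="$(omp_threads_leaving "$cpus" 1)"
export GOMP_SPINCOUNT=10000   # libgomp only: spin less before sleeping

echo "OMP_NUM_THREADS=$OMP_NUM_THREADS GOMP_SPINCOUNT=$GOMP_SPINCOUNT"
# ...then run ./john with the desired options from this same shell
```

On an 8-thread machine this gives OMP_NUM_THREADS=7, the same as in the
example command above.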

Or indeed "--fork=8" may be used - it is somewhat cumbersome, but unlike
OpenMP it's not excessively impacted by other system load.
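
To illustrate the difference, here is a deliberately simplified model
(an assumption for the sake of the example, not a measurement): n
workers, one of which runs at a fraction f of full speed.  With frequent
synchronization every worker proceeds at the slowest one's pace, whereas
independent processes lose only the slow worker's share.  The function
names are made up.

```shell
# Simplified throughput model, in units of "one full-speed worker".
# $1 = number of workers, $2 = speed fraction of the one slowed worker

synced_throughput() {
    # barrier-synced threads: all n proceed at the slowest pace
    LC_ALL=C awk -v n="$1" -v f="$2" 'BEGIN { printf "%.2f\n", n * f }'
}

independent_throughput() {
    # independent processes: only one of the n workers is slowed
    LC_ALL=C awk -v n="$1" -v f="$2" \
        'BEGIN { printf "%.2f\n", (n - 1) + f }'
}

synced_throughput 8 0.5        # prints 4.00
independent_throughput 8 0.5   # prints 7.50
```

In this model a single thread slowed to half speed halves the total for
the synchronized case, and a tenfold overall slowdown would correspond
to f = 0.1.  Real behavior also depends on how often the threads
synchronize and on what the OS scheduler does.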

Alexander
