Message-ID: <c68e4d38ee9f5a2dddbd7cc5cd069cad@smtp.hushmail.com>
Date: Mon, 11 May 2015 23:06:33 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: Adding OpenMP support to SunMD5

On 2015-05-11 19:58, Frank Dittrich wrote:
> On 05/11/2015 06:42 PM, Solar Designer wrote:
>> On Mon, May 11, 2015 at 05:36:31PM +0200, Frank Dittrich wrote:
>>> Will run 512 OpenMP threads
>>> Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (512xOMP) DONE
>>> Speed for cost 1 (iteration count) of 5000
>>> Raw: 9309 c/s real, 300 c/s virtual
>>
>> Great speed.
>>
>>> This (higher c/s rate for OMP_NUM_THREADS >> number of cores) matches my
>>> experience for sunmd5 on my hardware.
>>
>> This suggests that "the problem" is false sharing or something like it.
>> When you increase OMP_NUM_THREADS above the number of logical CPUs, you
>> have the threads that are actually run on the CPUs concurrently (with
>> the rest waiting to be scheduled by the kernel's scheduler) work on
>> memory regions that are farther away from each other.
>
> $ git diff
> diff --git a/src/sunmd5_fmt_plug.c b/src/sunmd5_fmt_plug.c
> index 059ef4c..8829858 100644
> --- a/src/sunmd5_fmt_plug.c
> +++ b/src/sunmd5_fmt_plug.c
> @@ -32,7 +32,7 @@ john_register_one(&fmt_sunmd5);
>
> #ifdef _OPENMP
> #include <omp.h>
> -#define OMP_SCALE 1
> +#define OMP_SCALE 8
> #endif
>
> #include "arch.h"
>
>
> This change alone results in
>
> $ ../run/john --test=10 --format=sunmd5
> Will run 32 OpenMP threads
> Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (32xOMP) DONE
> Speed for cost 1 (iteration count) of 5000
> Raw: 9990 c/s real, 312 c/s virtual

On my core i7 laptop, OMP_SCALE 4 is best, HT or not. Bumping to 8
slightly degrades HT but does not change non-HT at all. This is with 4:

$ OMP_NUM_THREADS=4 ../run/john -test -form:sunmd5 && ../run/john -test -form:sunmd5
Will run 4 OpenMP threads
Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (4xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw: 2497 c/s real, 629 c/s virtual
Will run 8 OpenMP threads
Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (8xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw: 2671 c/s real, 345 c/s virtual

magnum
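
A rough sketch of why a larger OMP_SCALE could help if the false-sharing guess in the quoted text is right. It assumes, without checking sunmd5_fmt_plug.c, that the batch of candidates grows as base keys x threads x OMP_SCALE and that the batch is split into one contiguous chunk per thread. The helper names, the 64-byte cache line and the 16-byte item size are made up for illustration and are not taken from the John the Ripper source.

```c
#include <stddef.h>

/* Assumed cache line size in bytes; typical for x86, but hardware dependent. */
#define CACHE_LINE 64

/*
 * Hypothetical helper: number of candidates handled per batch when the
 * batch is scaled by the thread count and by OMP_SCALE (an assumption
 * about how the format sizes its work, not a quote of its code).
 */
size_t batch_size(size_t base_keys, size_t threads, size_t omp_scale)
{
    return base_keys * threads * omp_scale;
}

/*
 * Hypothetical helper: distance in bytes between the first items of two
 * neighbouring threads' chunks, assuming an even, contiguous split of
 * the batch across the threads.
 */
size_t chunk_stride_bytes(size_t batch, size_t threads, size_t item_size)
{
    return (batch / threads) * item_size;
}

/*
 * Returns 1 if neighbouring chunks start at least one cache line apart,
 * 0 if the start of one chunk can sit in the same line as its neighbour.
 */
int chunk_starts_separated(size_t stride_bytes)
{
    return stride_bytes >= CACHE_LINE;
}
```

With 8 threads and 16-byte items, OMP_SCALE 1 gives a stride of 16 bytes, so four threads' items can land in one cache line, while OMP_SCALE 4 gives 64 bytes. These figures only illustrate the arithmetic and say nothing about the real per-candidate buffer sizes in the format.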