Message-ID: <20150511164233.GA2148@openwall.com>
Date: Mon, 11 May 2015 19:42:34 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Adding OpenMP support to SunMD5

On Mon, May 11, 2015 at 05:36:31PM +0200, Frank Dittrich wrote:
> Will run 512 OpenMP threads
> Benchmarking: SunMD5 [MD5 128/128 AVX 4x3]... (512xOMP) DONE
> Speed for cost 1 (iteration count) of 5000
> Raw:	9309 c/s real, 300 c/s virtual

Great speed.

> This (higher c/s rate for OMP_NUM_THREADS >> number of cores) matches my
> experience for sunmd5 on my hardware.

This suggests that "the problem" is false sharing or something like it.
When you increase OMP_NUM_THREADS above the number of logical CPUs, the
threads that actually run on the CPUs concurrently (with the rest
waiting to be scheduled by the kernel's scheduler) work on memory
regions that are farther away from each other.
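
To illustrate the false sharing hypothesis, here is a minimal sketch.
It is not the SunMD5 code: the 64-byte cache line size, the slot count,
and all names below are assumptions made up for this example.  It
contrasts per-thread state packed 4 bytes apart with the same state
padded so that each thread's slot gets a cache line of its own.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64  /* assumed cache line size, typical for x86 */
#define N_SLOTS 8      /* hypothetical number of per-thread slots */

/* Packed layout: 16 neighbouring slots fit in one assumed cache line,
 * so threads updating adjacent slots may keep invalidating each
 * other's copy of that line (false sharing). */
struct packed_slot {
    volatile uint32_t counter;
};

/* Padded layout: each slot fills a whole assumed cache line. */
struct padded_slot {
    volatile uint32_t counter;
    char pad[CACHE_LINE - sizeof(uint32_t)];
};

static struct packed_slot packed[N_SLOTS];
static _Alignas(CACHE_LINE) struct padded_slot padded[N_SLOTS];

/* Distance in bytes between the slots of two neighbouring threads. */
size_t packed_stride(void) { return sizeof(struct packed_slot); }
size_t padded_stride(void) { return sizeof(struct padded_slot); }

/* Each loop iteration stands in for one thread's work on its own slot.
 * Without -fopenmp the pragma is ignored and the loop runs serially,
 * which gives the same result. */
uint64_t run_packed(unsigned iterations)
{
    uint64_t sum = 0;
#pragma omp parallel for
    for (int t = 0; t < N_SLOTS; t++) {
        packed[t].counter = 0;
        for (unsigned i = 0; i < iterations; i++)
            packed[t].counter++;
    }
    for (int t = 0; t < N_SLOTS; t++)
        sum += packed[t].counter;
    return sum;
}

uint64_t run_padded(unsigned iterations)
{
    uint64_t sum = 0;
#pragma omp parallel for
    for (int t = 0; t < N_SLOTS; t++) {
        padded[t].counter = 0;
        for (unsigned i = 0; i < iterations; i++)
            padded[t].counter++;
    }
    for (int t = 0; t < N_SLOTS; t++)
        sum += padded[t].counter;
    return sum;
}
```

Both functions compute the same sums; only the distance between the
memory that neighbouring threads write to differs.  If false sharing
is involved, timing run_packed() against run_padded() in a build with
-fopenmp on a multi-core machine would be one way to see whether the
layout matters there.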

I put "the problem" in quotes, because it's almost certainly not the
only one.  There's also a performance hit for just 1 thread, compared to
a non-OpenMP build.

For a slow hash like this, we should be able to achieve almost no
performance hit with OpenMP as long as the system is otherwise idle.

Alexander
