Message-ID: <20150427105145.GA31597@openwall.com>
Date: Mon, 27 Apr 2015 13:51:45 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: [GSoC] JtR SIMD support enhancements

On Mon, Apr 27, 2015 at 07:57:44AM +0300, Solar Designer wrote:
> After "scl enable devtoolset-2 bash" to enable gcc 4.8.1, I get:
> 
> Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 AVX 4x3]... (32xOMP) DONE
> Raw:    635904 c/s real, 19872 c/s virtual
> 
> and per my records, with some older versions of jumbo it was 638976 c/s.
[...]
> So our target speed for Xeon Phi 5110P might be:
> 
> 638976 * 60*512*1.053 / (16*128*2*3.0) = 1682104 c/s

I was wrong to assume that our CPUs would run only two SIMD instructions
of the types we need (bitwise ops) per cycle.  It turns out they happily
run three of them at once (ports 0, 1, 5).  Tested on Sandy Bridge-EP
(super) and Haswell (well).  And on Haswell this applies to AVX2 256-bit
bitwise ops as well.

So maybe our target md5crypt speed for 5110P is only:

638976 * 60*512*1.053 / (16*128*3*3.0) = 1121403 c/s

... or even a bit less, if max turbo isn't reached when running this
code on all cores on super's CPUs.
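
For anyone who wants to redo the arithmetic with different assumptions,
here is a minimal sketch of the scaling estimate in Python.  The function
and parameter names are hypothetical, and the reading of the factors is an
assumption on my part: 60 cores, 512-bit vectors and 1.053 GHz for the
5110P, versus 16 cores, 128-bit vectors, 2 or 3 bitwise SIMD ops per cycle
and 3.0 GHz for the reference machine.  It is only a rough throughput
ratio, not a performance model.

```python
def scaled_target(ref_cps, ref_issue,
                  phi_cores=60, phi_bits=512, phi_ghz=1.053,
                  ref_cores=16, ref_bits=128, ref_ghz=3.0):
    """Scale a measured c/s figure by the ratio of peak bitwise SIMD throughput.

    ref_issue is the assumed number of SIMD bitwise ops the reference CPU
    issues per cycle; the Phi side is taken as one 512-bit op per cycle.
    """
    phi_throughput = phi_cores * phi_bits * phi_ghz
    ref_throughput = ref_cores * ref_bits * ref_issue * ref_ghz
    return ref_cps * phi_throughput / ref_throughput


measured = 638976  # md5crypt c/s recorded with older jumbo versions

print(round(scaled_target(measured, ref_issue=2)))  # 1682104
print(round(scaled_target(measured, ref_issue=3)))  # 1121403
```

Lowering ref_ghz below 3.0 (if max turbo is not reached on all cores)
makes ref_throughput smaller and so raises the result; the "bit less"
case instead corresponds to the measured c/s having been achieved at a
lower clock than assumed, which this sketch does not try to model.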

Alexander
