Message-ID: <20101217051651.GB6958@openwall.com>
Date: Fri, 17 Dec 2010 08:16:51 +0300
From: Solar Designer <solar@...nwall.com>
To: Kees Cook <kees@...flux.net>
Cc: john-users@...ts.openwall.com
Subject: Re: benchmarks for 64-way

Hi Kees,

On Wed, Dec 15, 2010 at 02:19:51PM -0800, Kees Cook wrote:
> At Jon and Solar's urging[1], I added some benchmarks[2] too.

Thank you!  51M c/s is impressive.  The previous best speed was 21M.

I've re-sorted the table by decreasing c/s rate for "DES crypt() many
salts".

Would it be possible for you to fill out the bcrypt and LM columns as
well, though?

> The GOMP_SPINCOUNT has weird effects. :)

Indeed.

Can you try 300000, which is going to be the new default in gcc 4.5.3?

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43706

Also, one benchmark of -omp-des-4 would be nice, perhaps with 300000.
It should provide better speed for "many salts" (my guess is 55M c/s).

For LM hashes, I'd try fewer than 64 threads (32, 16, 8, 4, 2, 1).
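
A minimal sketch of such a sweep, assuming the john binary sits in the
current directory and accepts "--test --format=LM" (both are assumptions
to adjust for the actual build).  OMP_NUM_THREADS and GOMP_SPINCOUNT are
the standard OpenMP and libgomp environment variables; the helper names
are made up for illustration:

```python
import os

def omp_env(threads, spincount=300000, base=None):
    """Return a copy of the environment with the OpenMP settings applied."""
    env = dict(os.environ if base is None else base)
    env["OMP_NUM_THREADS"] = str(threads)    # number of OpenMP threads
    env["GOMP_SPINCOUNT"] = str(spincount)   # libgomp busy-wait iterations
    return env

def lm_sweep(thread_counts=(64, 32, 16, 8, 4, 2, 1)):
    """Build one (command, environment) pair per thread count."""
    # Assumption: binary path and option spelling match the local build.
    cmd = ["./john", "--test", "--format=LM"]
    return [(cmd, omp_env(n)) for n in thread_counts]
```

Each pair can then be passed to subprocess.run(cmd, env=env) and the
reported c/s rates compared across thread counts.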

Also, there's no such column in the table, but I'd be curious about your
results for SHA-crypt - perhaps actually run "john -e=double passwd" on
a file containing a SHA-crypt hash (should be easy for you since Ubuntu
uses those by default now).  JtR 1.7.6 (even with no patches) is capable
of OpenMP-parallelizing those by using glibc's thread-safe crypt_r(3).

Finally, it'd be great if you could do a build of clean 1.7.6 without
OpenMP and add its benchmark to the second table:

http://openwall.info/wiki/john/benchmarks#Collected-john-test-benchmarks-for-one-CPU-core

This will show us roughly how much overhead is incurred with the
thread-safe code required for OpenMP-enabled builds, with thread
synchronization, and with partial serial execution.  I say "roughly"
because of Turbo Boost, which will be activated when just one CPU core
is in use (2.4 GHz instead of 2.0 GHz, if I understand correctly), and
because of SMT (it's unknown what percentage of the core's execution
units we're making use of when we run just one process).
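
The comparison can be sketched as a rough efficiency estimate.  The
function name is hypothetical, the 2.4 and 2.0 GHz defaults are the Turbo
Boost and all-cores clocks mentioned above, and the single-process speed
has to come from the non-OpenMP build.  SMT is not corrected for, so the
result is only as good as the "roughly" above:

```python
def omp_efficiency(omp_cs, single_cs, workers,
                   single_ghz=2.4, loaded_ghz=2.0):
    """Fraction of ideal scaling achieved by the OpenMP build.

    omp_cs    -- c/s of the OpenMP build using `workers` threads
    single_cs -- c/s of one process of the non-OpenMP build
    1.0 would mean no overhead; smaller values mean more overhead.
    """
    if min(omp_cs, single_cs, workers, single_ghz, loaded_ghz) <= 0:
        raise ValueError("all inputs must be positive")
    # Scale the single-process speed down to the clock rate the CPU
    # runs at when all cores are busy (no Turbo Boost).
    clock_adjusted = single_cs * loaded_ghz / single_ghz
    # Ideal speed is the adjusted single-process speed times the number
    # of workers; SMT sharing of execution units is ignored here.
    return omp_cs / (clock_adjusted * workers)
```

For example, omp_efficiency(51e6, single_cs, 64) would use the 51M c/s
figure above together with whatever the clean 1.7.6 build reports.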

Another curious test would be to run 64 independent processes
simultaneously, then add up their speeds (or multiply the average by 64,
which may be quicker to do if the speeds are roughly the same).
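
Both ways of combining the per-process speeds can be sketched as follows.
The function names are hypothetical, and the speeds are assumed to have
been collected by hand from each process's output as plain c/s numbers:

```python
def total_cs(speeds):
    """Exact combined rate: the sum of all per-process c/s figures."""
    return sum(speeds)

def estimate_total_cs(sample, n_procs=64):
    """Shortcut: average a few observed speeds, multiply by n_procs."""
    if not sample:
        raise ValueError("need at least one observed speed")
    return n_procs * sum(sample) / len(sample)

def relative_spread(speeds):
    """(max - min) / mean; a small value suggests the shortcut is safe."""
    mean = sum(speeds) / len(speeds)
    return (max(speeds) - min(speeds)) / mean
```

If relative_spread() over a handful of processes is small, the shortcut
estimate should be close to the full sum.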

I understand that this might be taking too much of your time, which you
have plenty of other uses for, so feel free to disregard these suggestions.

> [1] http://www.openwall.com/lists/john-users/2010/12/15/5
> [2] http://openwall.info/wiki/john/benchmarks

Thanks,

Alexander
