Message-ID: <BLU159-W13FE617A67F63A2F90D3DEA4730@phx.gbl>
Date: Wed, 1 Feb 2012 02:06:00 +0000
From: Alex Sicamiotis <alekshs@...mail.com>
To: <john-users@...ts.openwall.com>
Subject: RE: DES with OpenMP


> Maybe the code should assume that if there are 4 threads or less, that's
> probably just one CPU chip - and use DES_bs_cpt=4 or 8 in that case.
> This assumption will fail if the number of threads is deliberately
> lowered to use only some cores in a multi-socket system, though.  And it
> will fail differently for bigger than quad-core CPU chips.  Not great.
> 

Hmm... yes, I can see the dilemma: no single value of this parameter suits all systems.
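
The heuristic quoted above can be written down as a small sketch. This is Python for illustration only, not John the Ripper code: `pick_des_bs_cpt` and its `max_single_chip_threads` argument are hypothetical names, and the values 4 and 32 are simply the suggested small value and the default mentioned in this thread.

```python
DEFAULT_CPT = 32      # default value mentioned in this thread
SINGLE_CHIP_CPT = 4   # the quoted suggestion was "4 or 8"; 4 is an arbitrary pick here


def pick_des_bs_cpt(num_threads, max_single_chip_threads=4):
    """Guess a DES_bs_cpt value from the thread count alone (hypothetical helper)."""
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    # Assumption taken from the quoted text: few threads means one CPU chip.
    if num_threads <= max_single_chip_threads:
        return SINGLE_CHIP_CPT
    return DEFAULT_CPT


if __name__ == "__main__":
    for n in (1, 2, 4, 6, 16):
        print(n, "threads ->", pick_des_bs_cpt(n))
```

Because the function sees only the thread count, it would give the small value to 4 threads spread over a multi-socket system and the default to a single chip with more than four cores, which are the two failure cases named in the quote.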

> I suspect that it's nothing fundamental, but merely icc happening to do
> register allocation or whatever better in one version of code vs. the
> other.  It might be the other way around in a slightly different build.
> 
> These differences of a few percent are hard/unrealistic to turn in our
> favor reliably without explicit assembly code and focus on a specific CPU.
> 
> Alexander

I managed to build several versions with icc. I ran DES_bs_cpt values of 1, 4, 8, 16, 32, 64, 128 and 256. Values of 32 and above slow cracking down significantly, so I'm leaving out 64 and above, and I include 32 only because it is the default.

I did two series of builds, one with -march=core2 and one without, to see how much of the difference comes down to CPU-specific code. The core2 build was about 16 KB larger than the generic one and performed similarly, if not slower. I tested with nice --10 and -test=20 from a plain shell: no desktop, no applications running, most services shut down, CPU at 4 GHz.

The pattern I saw with gcc, performance declining as the value goes up, repeated itself with icc. I reached 9.36M c/s with DES_bs_cpt values of 1 and 4, a gain of ~400K c/s with many salts. Interestingly, the single-salt gain was ~1.3M c/s (8.5M c/s rather than the 7.2M c/s of the default value of 32).

Running a single thread, I got 4750K c/s for many salts and 4473K c/s for one salt with values 1 to 4. The gain over the default value of 32 was +100K c/s for many salts and +200K c/s for one salt, which is smaller than the gains in the dual-thread runs.
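
To put the absolute gains in proportion, here is a rough Python sketch that turns the figures from this message into percentages. Where only a gain was reported, the baseline for the default value of 32 is derived as "best result minus gain", so those baselines are approximate; the names `percent_gain`, `RESULTS` and `summarize` are just for this illustration.

```python
def percent_gain(best, baseline):
    """Relative improvement of `best` over `baseline`, in percent."""
    return (best - baseline) / baseline * 100.0


# Figures from this message, in thousands of c/s: (best with small value, default 32).
# Baselines written as "best - gain" are derived, not measured values quoted directly.
RESULTS = {
    "2 threads, many salts": (9360, 9360 - 400),
    "2 threads, one salt": (8500, 7200),
    "1 thread, many salts": (4750, 4750 - 100),
    "1 thread, one salt": (4473, 4473 - 200),
}


def summarize(results):
    """Map each configuration to its percentage gain, rounded to one decimal."""
    return {
        name: round(percent_gain(best, base), 1)
        for name, (best, base) in results.items()
    }


if __name__ == "__main__":
    for name, gain in summarize(RESULTS).items():
        print(f"{name}: +{gain}%")
```

With these inputs the single-salt, two-thread case stands out at roughly 18%, while the other three configurations stay below 5%.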

http://imageshack.us/f/408/resultsarein.png/
 		 	   		  
