Message-ID: <CAC6_mQMBEhVVDEdynbSffd5UKrC2TJ8ocOn2HW9uQ-hPo06V9g@mail.gmail.com>
Date: Sat, 19 Nov 2011 19:55:50 -0500
From: Stephen Reese <rsreese@...il.com>
To: john-users@...ts.openwall.com
Subject: Re: OpenMP not using all threads

On Sat, Nov 19, 2011 at 5:58 PM, Solar Designer <solar@...nwall.com> wrote:
> On Sat, Nov 19, 2011 at 04:29:12PM -0500, Stephen Reese wrote:
>> I have patched john-1.7.8.tar.gz with john-1.7.8-omp-des-7.diff.gz in
>
> -omp-des-7 is good if you want to attack just one salt or very few
> salts.  For many salts, -omp-des-4 provides better performance.  (This
> is mentioned on the wiki.)
>
> Alternatively, if you feel adventurous, you may do a CVS checkout for
> even newer code (currently known as development version 1.7.8.7), which
> combines the best properties of these two patches into one source code
> tree (no patches are needed).  The CVS checkout instructions are here:
>
> http://www.openwall.com/Owl/DOWNLOAD.shtml
>
>> order to utilize four threads from a E5520 on a Debian system but it
>
> Actually, you should be running 8 threads on this CPU unless you have
> Hyperthreading disabled.  But you don't need to worry about that - gcc's
> libgomp will run as many threads as you have logical CPUs by default
> (most likely 8).
>
>> instead seem like it is only using two. When testing DES I see around
>> 2500K c/s and when patched about 5000K. I was hoping for closer to
>> 8000K to 10000K.
>
> Yes, you should get about 9000K (for 8 threads combined).
>
> The increase may be less than 4x because the thread-safe code is slower,
> because the CPU clock rate is lower when all cores are in use (E5520 has
> Turbo Boost), and for certain other reasons.  Yet you should in fact get
> 9000K or so.
>
>> I also edited the Makefile as follows:
>>
>> # gcc with OpenMP
>> OMPFLAGS = -fopenmp -msse2
>>
>> Another strangeness is when testing is I am not seeing the -16
>> appending to the following:
>>
>> Benchmarking: Traditional DES [128/128 BS SSE2]... DONE
>>
>> Is this normal or did something go wrong.
>
> It looks like you made a 32-bit build.  That is, you probably used the
> linux-x86-sse2 make target instead of linux-x86-64.  When you build
> without OpenMP, the -sse2 target uses assembly code supplied with JtR,
> however when you go for OpenMP, gcc has to generate thread-safe code
> instead.  It does this well for x86-64, but not for 32-bit x86 (there
> are too few registers on 32-bit x86).
>
> To get decent performance at DES with OpenMP builds on your machine, you
> ought to make 64-bit builds.  And indeed your install of Debian should
> be 64-bit, too.
>
> I hope this helps.
>
> Alexander
>
> P.S. The Subject is almost certainly wrong - there's no indication that
> the build doesn't use all threads.  Rather, the threads are slow.
>

Alexander,

Thanks for the great information, and point taken about the Subject
line. The tests were run on a Linode, which is shared Xen hosting.

I had a feeling that the 32-bit architecture might be an issue, as I
noticed that the "OpenMP example" from the wiki was only twice as fast
with a 32-bit OpenMP build, instead of four times as fast with a 64-bit
one:
http://openwall.info/wiki/internal/gcc-local-build#OpenMP-example
Although the OpenMP example is four times as fast on 64-bit, neither
the CVS nor the stable/patched version of John gives the 4x speed-up I
was hoping for, even on 64-bit. Maybe Xen and the other hosts sharing
the hardware behind the Linodes I am testing are causing a roughly
45-60% slowdown compared to a bare-metal instance, without affecting
the "OpenMP example".

root@:~# time ./loop2
615e5600
real    0m2.229s
user    0m2.226s
sys     0m0.002s
root@:~# time ./loop
615e5600
real    0m0.333s
user    0m1.313s
sys     0m0.003s
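
To put numbers on the two runs above, here is a small sketch that derives the wall-clock speedup and the average number of busy threads from the `time` output. The function names are made up for illustration; only the three timings come from the transcript. It assumes that user/real is a fair estimate of how many threads were busy, which ignores sys time and any scheduling noise from the shared host.

```python
# Timings copied from the transcript above, in seconds.
loop2_real = 2.229                     # ./loop2: user time is about equal to real time
loop_real, loop_user = 0.333, 1.313    # ./loop: user time is about 4x real time


def speedup(slow_real, fast_real):
    """Wall-clock speedup of the fast run over the slow run."""
    return slow_real / fast_real


def avg_busy_threads(user, real):
    """Rough estimate of how many threads were busy on average."""
    return user / real


print(f"wall-clock speedup: {speedup(loop2_real, loop_real):.2f}x")
print(f"average busy threads: {avg_busy_threads(loop_user, loop_real):.2f}")
```

The busy-thread estimate comes out close to four, which matches the four threads expected here.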

What I am trying to achieve: I have 42 DES password hashes and three
Linodes. The password list is currently split up so that each host has
12 entries, and each host is running in incremental mode. Is there a
better way, such as assigning one thread per instance on a single host?

Is there a performance/time benefit in splitting up the password list
among multiple hosts, or is one host going to achieve the same results
as the three?
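
One rough way to reason about the splitting question is a back-of-the-envelope model. It is a sketch and not a measurement. It assumes that every hash has a distinct salt, that a salted format such as traditional DES crypt needs one hash computation per candidate per distinct salt loaded, and that every host sustains the same rate. The function name, the candidate count, and the 5000K c/s rate are illustrative placeholders, and the even 14-per-host split is simply 42 divided by three.

```python
def seconds_to_exhaust(candidates, crypts_per_second, salts_on_host):
    """Time for one host to try every candidate against every salt it holds.

    Assumes one hash computation per (candidate, salt) pair.
    """
    return candidates * salts_on_host / crypts_per_second


CANDIDATES = 10**9     # hypothetical number of candidate passwords
RATE = 5_000_000       # hypothetical c/s per host (5000K)

# One host loaded with all 42 hashes, assumed to be 42 distinct salts.
one_host = seconds_to_exhaust(CANDIDATES, RATE, 42)

# Three hosts in parallel, 14 hashes each, all trying the same candidates.
three_hosts = seconds_to_exhaust(CANDIDATES, RATE, 14)

print(f"one host, 42 salts:    {one_host:.0f} s")
print(f"three hosts, 14 salts: {three_hosts:.0f} s")
```

Under these assumptions the three hosts finish the same candidate set in a third of the time, so in this model the gain comes from the extra hardware, and splitting by hash is one way to use it.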
