Message-ID: <20080608172029.GA23746@openwall.com>
Date: Sun, 8 Jun 2008 21:20:29 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: How to chose computer for John?

On Sun, Jun 08, 2008 at 06:40:09PM +0200, sebastian.rother@...erlin.de wrote:
> Quad core? And what do the other 3 cores do in the meantime, or did JtR
> gain a good SMP-capable implementation that I may have missed?

You know what I mean.  I had mentioned it explicitly in the message
you're replying to.  Run one instance per core.  Same directory and
shared john.pot is OK - this is in the FAQ.

> You claim a Q6600 is 2.5 times faster if all cores are used so I'd like to
> know how you do use ALL cores for a single session with JtR.

Not for a single session indeed.  Four sessions.  Somehow I don't have a
major difficulty doing that myself.  I am a bit surprised that some
others find this prohibitively difficult, although I understand that
built-in multi-CPU/multi-core support is indeed highly desirable.

> But running 4 instances does not mean it is "faster" at all.

If you do it correctly, the cumulative c/s rate is 4 times higher, and
the order in which candidate passwords are tried is almost as good (yes,
maybe just a little bit worse - so the overall speedup is less than 4
times, but is close to it).

> The only way I can imagine such a method is by splitting
> the password file into 4 parts and running each part in a separate instance.

This is by far not the only method, and not the most efficient one most
of the time (although in some special cases it's fine).  Other methods
to split the work have been mentioned on this mailing list before.  The
one that I use most of the time is to split by length of candidate
passwords.  One core does lengths 0-5, and the remaining three do 6, 7,
and 8 respectively.  For large numbers of salted hashes this works
fine as a replacement for having the "incremental" mode try lengths 0-8
in one session.  For very few salts (or no salts) and for fast hashes,
the "0-5" and "6" sessions actually terminate after a while, so they may
be replaced with something - the options here are a huge wordlist with a
large number of word mangling rules (beyond the default set of rules),
"--external=Keyboard", etc.  This is what I do.  And of course, one
needs to run "single crack" and at least a small wordlist with rules
first, which also occupies 1-2 cores initially.  In some cases, for
convenience, I started 5-6 sessions in this way initially, knowing that
the "extra" 1-2 sessions will terminate in a few hours leaving exactly 4
processes running on the quad-core.
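The length split described above can be sketched as follows.  This is only
an illustration, not necessarily the exact configuration I use: the section
names (Len0to5, Len6, ...), the session names, and the password file name
"passwd" are hypothetical, and the File and CharCount values assume the
stock [Incremental:All] section of john.conf.  The script prints four
john.conf sections, each limiting "incremental" mode to one length range,
followed by one command line per core.

```python
# Sketch: plan one John the Ripper session per core, split by candidate length.
# Assumptions (not from the original message): the section and session names,
# the password file name "passwd", and the all.chr / CharCount = 95 values,
# which mirror the stock [Incremental:All] section and may differ locally.

LENGTH_SPLIT = [(0, 5), (6, 6), (7, 7), (8, 8)]  # one (min, max) pair per core


def section_name(min_len, max_len):
    """Build a hypothetical incremental mode name for a length range."""
    if min_len == max_len:
        return f"Len{min_len}"
    return f"Len{min_len}to{max_len}"


def conf_section(min_len, max_len):
    """Return john.conf text for an incremental mode limited to one range."""
    return "\n".join([
        f"[Incremental:{section_name(min_len, max_len)}]",
        "File = $JOHN/all.chr",
        f"MinLen = {min_len}",
        f"MaxLen = {max_len}",
        "CharCount = 95",
    ])


def command(min_len, max_len, password_file="passwd"):
    """Return the argument list for one session covering one length range."""
    name = section_name(min_len, max_len)
    return ["john", f"--incremental={name}", f"--session={name.lower()}",
            password_file]


if __name__ == "__main__":
    # Sections to append to john.conf.
    for lo, hi in LENGTH_SPLIT:
        print(conf_section(lo, hi))
        print()
    # Commands to start, one per core, from the same directory.
    for lo, hi in LENGTH_SPLIT:
        print(" ".join(command(lo, hi)))
```

Each command gets its own --session name so that the sessions keep separate
recovery files while sharing the directory and john.pot.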

> Wouldn't it be possible to split JtR into a "daemon" and several
> clients which communicate via IPC? Then you could probably use all cores.
> One "JtR"-Client for each core. I am unsure if this works for john but it
> came into my mind right now. :)

I think this topic has been beaten to death already.

Yes, this is possible.  There are third-party parallel processing
patches for JtR, and many of those are available here:

	ftp://ftp.openwall.com/pub/projects/john/contrib/parallel/

So far, my own preference has been to use the manual approach described above.

Also, I am indeed supposed to implement this enhancement to JtR in an
"official" fashion, but I haven't gotten around to doing it for 11 years
now (it was first planned and I had draft code working in 1997).

> Also I'd like to know if SSE3/4/5 may improve something compared to SSE2.

SSE3 - as far as I know, no, not for the currently supported hashes at
least.  I have no idea what SSE4/5 are.

> > No, the version of Windows (except for Windows 9x) makes no difference
> > for JtR performance.  However, you may consider 64-bit versions of both
> > Linux and Windows, and making a 64-bit build of John.  This results in a
> > 10% performance improvement for DES-based hashes supported by John
> > natively.  (Not because of the 64-bitness itself - John uses 128-bit
> > SSE2 vectors either way - but because of availability of 16 registers in
> > 64-bit mode as opposed to just 8 in 32-bit mode.)
> 
> That affected AMD64 CPUs some years ago.
> Is that statement still true Solar?

You've quoted quite a lot of text, so I'm not sure what exactly you're
referring to.

Maybe 8 vs. 16 registers?  It is a property of the instruction set
architectures (the i386 architecture that is used by 32-bit x86 CPUs and
in 32-bit mode vs. the x86-64/AMD64/EM64T architecture that is used by
x86-64 CPUs in 64-bit mode).  It is not a property of specific CPUs.  So
nothing can change here, unless/until another architecture extension is
introduced and supported by operating systems (to save and restore those
extra registers on context switches).

> > Also, if you go for a
> > quad-core CPU, you'll want a suitable Windows license that will let you
> > make use of all "CPUs".  I'm not familiar with Windows licensing, but I
> > think that XP used to be licensed for 1-2 CPUs only "by default", and
> > you needed a more expensive license for more CPUs.  I don't know about
> > Vista.  Perhaps someone else will comment on this.
> 
> As far as I know Vista supports just up to 8 cores, which may hit the market
> this year already. But I could be wrong.
> To be "sure" you should use the Premium-foo.

I think the technical limitation of Windows was 32 CPUs, and now it is
up to 64 CPUs (cores).  So we're talking about licensing here.  I am
just unsure if the cheaper Windows licenses allow for the use of more
than 2 CPUs or cores.

Alexander
