Message-ID: <20151007073834.GA15348@openwall.com>
Date: Wed, 7 Oct 2015 10:38:34 +0300
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: nVidia Maxwell support (especially descrypt)?

Sayantan,

Your e-mail quoting is slightly broken.  I hear one needs to set Gmail
to plain-text mode or some such to get proper quoting from it these
days.

On Wed, Oct 07, 2015 at 12:46:31PM +0530, Sayantan Datta wrote:
> We now have auto-selection of kernels enabled for specific devices. Both
> Titan X and Tahiti are configured to use the hardcoded-full-unrolled kernel.

Oh.  Are the speeds I reported for Titan X normal - that is, are you
seeing the same?  (Clearly, they should be much better once optimized.)

> CPU devices use the basic kernel, previously selected by setting
> HARDCODE_SALT to 0 and FULL_UNROLL to 0. If you'd like to manually change
> the config, set the flag OVERRIDE_AUTO_CONFIG to 1 and change the other
> parameters as before.

I think there should be a way to override this at runtime, such as from
john.conf or from the command-line.

Sometimes the need to wait for all kernels to be built, even if from
previously built binaries, is too painful.

BTW, what does building from binaries mean here?  Why does it still take
a moment per kernel?  Any way to speed this up?  For 4096 kernels, even
100ms per kernel adds up to a startup time of 410 seconds - that's nasty.

> PARALLEL_BUILD, which I thought would be faster by a large margin, isn't.
> It seems the OpenCL compilers are still sequential. Otherwise, on our
> side this bottleneck has been eliminated.

But we can still do more to have parallel builds actually work and run
faster now: when running with --fork, mark kernels as "being built now",
e.g. by fcntl-locking their source files under kernels/, so that other
child processes skip them (upon trying to fcntl-lock them too and
learning of the active advisory lock).  This probably introduces the
need for an extra pass: when done building kernels from source (or when
visiting this salt again, if we were lazy-building), revisit any kernels
that were skipped because of the locks (presumably because another child
process was building them) and build them from the binaries - or from
the source if the other child somehow has not built them yet, or maybe
just skip them again if lazy-building.
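
Roughly what I have in mind for the locking part, as an untested sketch
(the function name is made up; note that a write lock requires the fd to
be open for writing, so if the kernels/ tree may be read-only we'd lock
a separate lock file instead):

#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

/* Untested sketch of the advisory locking idea: each forked child tries
 * to take an exclusive non-blocking fcntl() lock on the kernel source
 * file before building it.  If another child already holds the lock,
 * this kernel is skipped for now and revisited in a later pass.
 * Returns 1 = we hold the lock, go build; 0 = another child is building
 * it, skip for now; -1 = error. */
static int try_lock_kernel_source(const char *path, int *fd_out)
{
	struct flock fl;
	/* O_RDWR because an exclusive (write) lock needs a writable fd */
	int fd = open(path, O_RDWR);

	if (fd < 0)
		return -1;

	fl.l_type = F_WRLCK;
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;		/* whole file */

	if (fcntl(fd, F_SETLK, &fl) == -1) {
		int e = errno;
		close(fd);
		if (e == EACCES || e == EAGAIN)
			return 0;	/* locked by another child - skip */
		return -1;
	}

	*fd_out = fd;	/* keep it open; the lock goes away on close() */
	return 1;
}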

> > is it still possible to lazy-build during cracking?
> 
> No, it is not. But do we really need it ?

This would be nice to have for when the user wants to see some results
early, yet does not want to give up on the speedup for later.

Also, why leave the GPU idle during kernel building when we can already
run the (suboptimal, non-specialized) code on it?
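
Something like this per-salt fallback is what I mean (sketch only -
these names are hypothetical, not actual JtR internals):

#include <CL/cl.h>

/* Per salt: if the specialized hardcoded-salt kernel is ready, use it;
 * otherwise kick off its build and crack with the generic kernel in the
 * meantime, so the GPU is never left idle. */
enum kernel_state { KS_NOT_BUILT, KS_BUILDING, KS_READY };

struct salt_kernel {
	enum kernel_state state;
	cl_kernel specialized;	/* hardcoded-salt, fully unrolled */
};

/* Stub only, to keep the sketch self-contained; real code would hand
 * the build off to a builder, e.g. between crypt_all() calls. */
static void request_background_build(struct salt_kernel *sk)
{
	(void)sk;
}

static cl_kernel pick_kernel(struct salt_kernel *sk, cl_kernel generic)
{
	if (sk->state == KS_READY)
		return sk->specialized;

	if (sk->state == KS_NOT_BUILT) {
		sk->state = KS_BUILDING;
		request_background_build(sk);
	}

	/* Keep cracking with the generic (non-specialized) kernel; switch
	 * over on a later visit to this salt, once the build is done. */
	return generic;
}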

Thanks,

Alexander
