john-users - Re: nVidia Maxwell support (especially descrypt)?

Message-ID: <12adbc895485811d88b8f927fac3b90d@smtp.hushmail.com>
Date: Thu, 8 Oct 2015 00:33:44 +0200
From: magnum <john.magnum@...hmail.com>
To: john-users@...ts.openwall.com
Cc: Roman Rusakov <rusakovster@...il.com>, deeplearningjohndoe@...il.com
Subject: Re: nVidia Maxwell support (especially descrypt)?

On 2015-10-07 23:37, Solar Designer wrote:
> DeepLearningJohnDoe - thank you for your work in this area, and we'd
> appreciate any comments you might have on the below.
>
> On Wed, Oct 07, 2015 at 06:54:20PM +0200, magnum wrote:
>>> On Wed, Oct 7, 2015 at 8:44 AM, Solar Designer <solar@...nwall.com> wrote:
>>>> And of course we'll also need to include some LOP3.LUT S-boxes.
>>>> If Roman's are still unreleased (except for S4), then Janet's.
> [...]
>> I implemented this in 9c82bcc, using DeepLearningJohnDoe's (a.k.a.
>> Janet's) S-boxes except for S4.
>
> Are you getting better speeds with Roman's S4?

I couldn't spot any difference - I used it just on principle, for the 
lower gate count :-)
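
For readers unfamiliar with why gate count is the metric here: LOP3.LUT evaluates any three-input boolean function in one instruction, selected by an 8-bit truth table. The following is only a minimal host-side sketch in plain C of that behaviour, not code from the JtR tree; `lop3_model`, `small_circuit` and the other names are hypothetical. It also shows the usual way to derive the truth-table immediate, by evaluating the expression on the constants 0xF0, 0xCC and 0xAA.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Software model of a LOP3.LUT-style operation (hypothetical helper,
 * not a real API). For every bit position, the three input bits form
 * an index 0..7, and the result bit is that bit of the 8-bit table.
 */
static uint32_t lop3_model(uint32_t a, uint32_t b, uint32_t c, uint8_t lut)
{
    uint32_t r = 0;

    for (int i = 0; i < 32; i++) {
        unsigned idx = (((a >> i) & 1u) << 2) |
                       (((b >> i) & 1u) << 1) |
                        ((c >> i) & 1u);
        r |= (uint32_t)((lut >> idx) & 1u) << i;
    }
    return r;
}

/* An arbitrary example expression: AND, NOT, OR and XOR on three inputs. */
static uint32_t small_circuit(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & b) ^ (c | ~a);
}

/*
 * The low byte of the expression evaluated on 0xF0, 0xCC, 0xAA is its
 * truth table, because those constants enumerate all 8 input combinations.
 */
static uint8_t lut_for_small_circuit(void)
{
    return (uint8_t)small_circuit(0xF0u, 0xCCu, 0xAAu);
}

/* The four ordinary operations collapse into one table lookup. */
static void lop3_self_check(void)
{
    uint32_t a = 0xDEADBEEFu, b = 0x01234567u, c = 0x89ABCDEFu;

    assert(lop3_model(a, b, c, lut_for_small_circuit()) ==
           small_circuit(a, b, c));
}
```

Under this model every "gate" in a LOP3-based S-box is one such lookup, so an S-box expressed in fewer gates issues fewer instructions, even if the difference turns out not to be measurable.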

> BTW, our current opencl_sboxes.h defaults to using nonstd.c derived
> expressions when !HAVE_LUT3. Maybe it should also have an option for
> using sboxes-s.c derived expressions, which are supposed to be faster on
> AMD GPUs.

Those files also need a general clean-up. They are hard to follow. I 
don't want to mess too much with them since Sayantan "owns" them but 
they contain large sections of commented-out stuff, CPU-specific stuff 
and so on.

>> BTW we now also use LOP3.LUT for many MD4, MD5 and SHA-2 OpenCL formats.
>> Some driver bug prevented me from using it in SHA-1 with nvidia 352.39
>> (the code is there, just disabled) and md5crypt disables it because of
>> a performance regression (still to be investigated). Some formats show
>> a fine boost but none as much as DEScrypt.
>
> ... with our guess as to why the boost is lower being that LOP3.LUT
> was often used anyway, introduced in the PTX to ISA translation.

Right - actually I did not expect any boost, I just did it because I
could. But I'm puzzled by the md5crypt regression, and the SHA-1 problem
(fails self-test). Perhaps the latter should be reported to nvidia; I
think they would actually look into it.

magnum
