Message-ID: <12adbc895485811d88b8f927fac3b90d@smtp.hushmail.com>
Date: Thu, 8 Oct 2015 00:33:44 +0200
From: magnum <john.magnum@...hmail.com>
To: john-users@...ts.openwall.com
Cc: Roman Rusakov <rusakovster@...il.com>, deeplearningjohndoe@...il.com
Subject: Re: nVidia Maxwell support (especially descrypt)?

On 2015-10-07 23:37, Solar Designer wrote:
> DeepLearningJohnDoe - thank you for your work in this area, and we'd
> appreciate any comments you might have on the below.
>
> On Wed, Oct 07, 2015 at 06:54:20PM +0200, magnum wrote:
>>> On Wed, Oct 7, 2015 at 8:44 AM, Solar Designer <solar@...nwall.com> wrote:
>>>> And of course we'll also need to include some LOP3.LUT S-boxes.
>>>> If Roman's are still unreleased (except for S4), then Janet's.
> [...]
>> I implemented this in 9c82bcc, using DeepLearningJohnDoe's (a.k.a.
>> Janet's) S-boxes except for S4.
>
> Are you getting better speeds with Roman's S4?

I couldn't spot any difference - I used it just on principle, for the 
lower gate count :-)

> BTW, our current opencl_sboxes.h defaults to using nonstd.c derived
> expressions when !HAVE_LUT3. Maybe it should also have an option for
> using sboxes-s.c derived expressions, which are supposed to be faster on
> AMD GPUs.

Those files also need a general clean-up. They are hard to follow. I 
don't want to mess too much with them since Sayantan "owns" them but 
they contain large sections of commented-out stuff, CPU-specific stuff 
and so on.

>> BTW we now also use LOP3.LUT for many MD4, MD5 and SHA-2 OpenCL formats.
>> Some driver bug prevented me from using it in SHA-1 with nvidia 352.39
>> (the code is there, just disabled) and md5crypt disables it because of
>> a performance regression (still to be investigated). Some formats show a
>> fine boost but none as much as DEScrypt.
>
> ... with our guess as to why the boost is lower being that LOP3.LUT was
> often used anyway, introduced in the PTX to ISA translation.

Right - actually I did not expect any boost, I just did it because I 
could. But I'm puzzled by the md5crypt regression, and the SHA-1 problem 
(fails self-test). Perhaps the latter should be reported to nvidia; I 
think they would actually look into it.

magnum
