Message-ID: <CANJ2NMNO+KLkqszzDk967KuzDmTZ62cr_nUevxf4Vz1wfxCDfA@mail.gmail.com>
Date: Tue, 26 Jun 2012 20:00:17 +0800
From: myrice <qqlddg@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: async key transfers to GPU

On Tue, Jun 26, 2012 at 5:13 PM, myrice <qqlddg@...il.com> wrote:
> On Mon, Jun 25, 2012 at 3:07 AM, Solar Designer <solar@...nwall.com> wrote:
>> myrice -
>> Yes, I did not suggest to use multiple streams.  I am not familiar with
>> this, but Lukas was able to have data transfers to GPU overlap with
>> computation on GPU by interleaving these inside crypt_all().  I am
>> suggesting an improvement upon this where you'd only need two chunks for
>> (potentially) full efficiency, whereas Lukas' inside-crypt_all()
>> approach would need more chunks to get close to full efficiency (but not
>> reach it).
>>
>> Please do try this out and post your results.
>>
>
> I split the memcpyH2D into 2. One in set_key(), one in crypt_all().
> Others remain the same.
> With 1/4 long password, there are ~4M and ~3M improvement in many
> salts and one salt.
> ======Before=============
> [12:35:02 myrice] run $ ./john -te=1 -fo=xsha512-cuda
> Benchmarking: Mac OS X 10.7+ salted SHA-512 [CUDA]... DONE
> Many salts:     61086K c/s real, 61652K c/s virtual
> Only one salt:  17476K c/s real, 17096K c/s virtual
> ======After===============
> [12:35:53 myrice] run $ ./john -te=1 -fo=xsha512-cuda
> Benchmarking: Mac OS X 10.7+ salted SHA-512 [CUDA]... DONE
> Many salts:     65278K c/s real, 65925K c/s virtual
> Only one salt:  20695K c/s real, 21254K c/s virtual
>

Sorry, I think the numbers I just posted are wrong. After re-running the
benchmark, many salts shows only a ~500K c/s improvement, and one salt
shows a ~500K c/s reduction. I will try splitting the GPU call in
crypt_all() and using multiple streams later.
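
To make the split concrete, here is a minimal host-only sketch in C of the idea, not the actual xsha512-cuda code. The first half of the key buffer is sent from set_key() as soon as it is full, and crypt_all() sends only what is left. The names MAX_KEYS, KEY_LEN, copy_chunk() and the counters are hypothetical, and a plain memcpy() stands in for the host-to-device copy (in a real CUDA format this would presumably be cudaMemcpyAsync() with cudaMemcpyHostToDevice).

```c
#include <assert.h>
#include <string.h>

#define MAX_KEYS 8              /* hypothetical batch size */
#define KEY_LEN  16             /* hypothetical fixed key slot size */
#define HALF     (MAX_KEYS / 2)

static char host_keys[MAX_KEYS][KEY_LEN]; /* host staging buffer */
static char dev_keys[MAX_KEYS][KEY_LEN];  /* stand-in for GPU memory */
static int sent;                 /* keys already transferred this batch */
static int copies_in_set_key;    /* transfers issued from set_key() */
static int copies_in_crypt_all;  /* transfers issued from crypt_all() */

/* Stand-in for an asynchronous host-to-device copy of a key range. */
static void copy_chunk(int first, int count)
{
    assert(first >= 0 && count >= 0 && first + count <= MAX_KEYS);
    memcpy(dev_keys[first], host_keys[first], (size_t)count * KEY_LEN);
}

/* Store one key; when the first half is full, start its transfer early. */
static void set_key(const char *key, int index)
{
    assert(index >= 0 && index < MAX_KEYS);
    strncpy(host_keys[index], key, KEY_LEN - 1);
    host_keys[index][KEY_LEN - 1] = '\0';
    if (index == HALF - 1) {
        copy_chunk(0, HALF);
        sent = HALF;
        copies_in_set_key++;
    }
}

/* Transfer only the keys not yet sent; the kernel launch would follow. */
static void crypt_all(int count)
{
    assert(count >= 0 && count <= MAX_KEYS);
    if (count > sent) {
        copy_chunk(sent, count - sent);
        copies_in_crypt_all++;
    }
    sent = 0; /* the next batch starts from scratch */
}
```

With a real asynchronous copy, the first chunk could be in flight while the host is still filling the second half via set_key(), which is where any gain would come from.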

=======Before=============
Benchmarking: Mac OS X 10.7+ salted SHA-512 [CUDA]... DONE
Many salts:     61086K c/s real, 61086K c/s virtual
Only one salt:  20695K c/s real, 20695K c/s virtual
=======After===============
Benchmarking: Mac OS X 10.7+ salted SHA-512 [CUDA]... DONE
Many salts:     61652K c/s real, 61652K c/s virtual
Only one salt:  20164K c/s real, 20695K c/s virtual
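
For reference, the "real" figures in the corrected run above can be compared with a small C sketch like the following. The helper names delta_kcs() and delta_pct() are made up for illustration; the inputs are the K c/s values copied from the benchmark output.

```c
#include <assert.h>
#include <stdio.h>

/* Absolute change in K c/s between two benchmark runs. */
static long delta_kcs(long before, long after)
{
    return after - before;
}

/* Relative change in percent, measured against the "before" run. */
static double delta_pct(long before, long after)
{
    assert(before > 0);
    return 100.0 * (double)(after - before) / (double)before;
}

/* Print the deltas for the corrected "real" numbers. */
static void print_deltas(void)
{
    printf("Many salts:    %+ldK c/s (%+.2f%%)\n",
           delta_kcs(61086, 61652), delta_pct(61086, 61652));
    printf("Only one salt: %+ldK c/s (%+.2f%%)\n",
           delta_kcs(20695, 20164), delta_pct(20695, 20164));
}
```

This gives +566K c/s for many salts (a bit under 1%) and -531K c/s for one salt (roughly 2.6%), which matches the "~500K" figures quoted above.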

Thanks
myrice
