Message-ID: <20120403160927.GB16518@openwall.com>
Date: Tue, 3 Apr 2012 20:09:27 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: fast hashes on GPU

myrice -

On Tue, Apr 03, 2012 at 10:34:48PM +0800, myrice wrote:
> 1) In a real crack, it seems that the second time we see a previous salt,
> we have received all salts in the file (just as in a shadow file). Is this right?

Yes.

> 2) In the benchmark, we only use two salts from the fmt_test struct. With many
> salts, we repeatedly use the first two (or one) salts 0x100 (256) times.

Yes.  We use two unless there's only one provided in the tests.

> So if we see
> the same salt a second time, we cannot do bulk hashing. I am still thinking
> about it. Any ideas?

Yes, this is something I had overlooked when suggesting this approach to
you.  Well, actually you can do the bulk hashing there and you'll even
have correct results - but you'll only do it for two salts, which is not
exactly "many", so the benchmark result won't be very meaningful.

You may revise the condition to be keys_changed && saw_same_salt_again.
(keys_changed is set in set_key() as we discussed before.)  This
condition will be true only after bench_set_keys() has been called for a
second time, so by that point you'll have the correct number of salts
recorded (including duplicate salts, which are present in benchmarking
only).
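
To illustrate, here is a minimal CPU-only sketch in C of the bookkeeping
behind that condition.  It is not code from the tree: the sketch_* function
names, SALT_SIZE, MAX_SALTS, recording_done and bulk_calls are all made up
for this example, and the GPU launch is replaced by a counter.  Only
keys_changed and saw_same_salt_again correspond to the names used above.

```c
#include <assert.h>
#include <string.h>

#define SALT_SIZE 4	/* hypothetical salt length */
#define MAX_SALTS 1024	/* hypothetical capacity of the salt list */

static unsigned char salts[MAX_SALTS][SALT_SIZE];
static int salt_count;		/* recorded salts, duplicates included */
static int keys_changed;	/* set in set_key(), cleared in crypt_all() */
static int saw_same_salt_again;	/* an already recorded salt came in again */
static int recording_done;	/* the salt list is final */
static int bulk_calls;		/* stands in for the bulk GPU launch */

static void sketch_set_key(void)
{
	keys_changed = 1;
}

static void sketch_set_salt(const unsigned char *salt)
{
	int i, known = 0;

	if (recording_done)
		return;
	for (i = 0; i < salt_count; i++)
		if (!memcmp(salts[i], salt, SALT_SIZE))
			known = 1;
	if (known) {
		saw_same_salt_again = 1;
		/* A known salt right after new keys: the salt loop restarted */
		if (keys_changed) {
			recording_done = 1;
			return;
		}
	}
	/* Duplicates seen within one key batch (benchmark) are kept */
	assert(salt_count < MAX_SALTS);
	memcpy(salts[salt_count++], salt, SALT_SIZE);
}

static void sketch_crypt_all(void)
{
	if (keys_changed && saw_same_salt_again)
		bulk_calls++;	/* here: hash all keys for all recorded salts */
	/* otherwise: per-salt hashing, or reuse of existing bulk results */
	keys_changed = 0;
}
```

With two alternating salts and N set_salt() calls per key batch, this
records N entries during the first batch and takes the bulk path only on
the first crypt_all() after the keys are set for the second time.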

Anyway, as I wrote to you earlier today, I don't expect any of this to
result in much speedup (assuming correctly tuned max_keys_per_crypt in
either case).

Instead, you may focus on offloading hash comparisons to GPU.

> I tested new code on GTX580(THREADS 480, BLOCKS 1024):

You could try THREADS 512 there.

> Benchmarking: Mac OS X 10.7+ salted SHA-512 CUDA [64/32]... DONE

I suggest that you put "CUDA" in place of "64/32", and remove " CUDA"
from "Mac OS X 10.7+ salted SHA-512 CUDA".

The bits_used/bits_total notation was never meant to include, let's say,
bits_desired in the bits_used portion. ;-)

> Many salts: 34065K c/s real, 34065K c/s virtual
> Only one salt: 23359K c/s real, 23130K c/s virtual

That's a bit slower than what I had on GTX 570 o/c (which should be of
similar speed to a GTX 580 at stock clocks).  I think some of the other
optimizations/hacks I was making are not yet in your code.  I'll test
your newer code and see.

Thanks,

Alexander
