Message-ID: <20120501002912.GA9336@openwall.com>
Date: Tue, 1 May 2012 04:29:12 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Myrice: Weekly Report #3

myrice -

On Mon, Apr 30, 2012 at 10:24:23PM +0800, myrice wrote:
> Priorities:
> 1. Support longer passwords on XSHA512 format without performance loss
> 2. Analyze performance of storing salts in GPU.
> 3. Keep discussing password generation
>  - I have seen Solar and Frank's posts. I will join the discussion
> ASAP.

Thank you!  Can you please add the following tasks to the above list? -

4. Get xsha512-opencl to work on the HD 7970 in bull.  Currently, the
version in magnum-jumbo fails as follows:

user@...l:~/john/magnum-jumbo/src$ ../run/john -te -fo=xsha512-opencl -pla=1
OpenCL platform 1: AMD Accelerated Parallel Processing, 2 device(s).
Using device 0: Tahiti
Compilation log: LOOP UNROLL: pragma unroll (line 174)
    Unrolled as requested!
LOOP UNROLL: pragma unroll (line 179)
    Unrolled as requested!
LOOP UNROLL: pragma unroll (line 194)
    Unrolled as requested!

Local work size = 256
Global work size = 2097152
Benchmarking: Mac OS X 10.7+ salted SHA-512 [OpenCL]... FAILED (get_hash[0](0))

It works on GTX 570 and on CPU fine. :-)  In fact, on CPU it is only
slightly slower than the OpenSSL/OpenMP code, which is impressive.

5. Figure out why xsha512-cuda is significantly slower on the GTX 570 in
bull (1600 MHz) than it is on your GTX 580, and/or correct this.  The
difference should be about 5% in favor of the GTX 580 (which magnum
achieved in his RAR code):

512*1544 = 790528
480*1600 = 768000
768000/790528 = 0.9715

Well, maybe even 3%, although there's also the memory bus width
difference (384 vs. 320 bits) and the PCIe speed difference (the slot in
bull runs at x8 because I also have the HD 7970 installed and this
motherboard can only do x8 to each in such config).  The bandwidth
differences should not affect the "many salts" case significantly, though.
So there's probably something else.  Currently, xsha512-cuda from
magnum-jumbo with PLAINTEXT_LENGTH changed to 12 does only:

user@...l:~/john/magnum-jumbo/src$ ../run/john -te -fo=xsha512-cuda
Benchmarking: Mac OS X 10.7+ salted SHA-512 [CUDA]... DONE
Many salts:     36854K c/s real, 37058K c/s virtual
Only one salt:  17625K c/s real, 17625K c/s virtual

whereas you reported getting 70M c/s, and I was previously getting over
50M c/s with my own hacks of your older code.

Thanks,

Alexander
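
The estimate in task 5 (core count times shader clock) and the size of the
shortfall in the benchmark can be reproduced with a few lines of Python.
This is only a rough sketch: the dictionary and function names are made up
for illustration, "K" and "M" are assumed to mean 10^3 and 10^6, and the
70M c/s figure is assumed to be comparable to the "many salts" number,
which the message does not state explicitly.

```python
# Figures quoted in the message; all names here are illustrative only.
GTX_580 = {"cores": 512, "shader_mhz": 1544}
GTX_570 = {"cores": 480, "shader_mhz": 1600}  # clock as given for the card in bull


def raw_throughput(gpu):
    """Cores times shader clock: a crude proxy for compute-bound speed."""
    return gpu["cores"] * gpu["shader_mhz"]


# Expected GTX 570 speed relative to the GTX 580, ignoring memory and PCIe.
expected_ratio = raw_throughput(GTX_570) / raw_throughput(GTX_580)

reported_580_cps = 70_000_000  # "70M c/s" (assumed comparable to many salts)
measured_570_cps = 36_854_000  # "Many salts: 36854K c/s real"

# What the GTX 570 would reach if it scaled with cores times clock.
expected_570_cps = reported_580_cps * expected_ratio
fraction_achieved = measured_570_cps / expected_570_cps

print(f"expected GTX 570 / GTX 580 ratio: {expected_ratio:.4f}")
print(f"expected GTX 570 speed: {expected_570_cps / 1e6:.1f}M c/s")
print(f"measured / expected: {fraction_achieved:.2f}")
```

Under these assumptions the measured speed comes out at a little over half
of what cores times clock alone would predict, which is far more than the
3% to 5% gap estimated in the message.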