Message-ID: <20150704095420.GA22777@openwall.com>
Date: Sat, 4 Jul 2015 12:54:20 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Lyra2 on GPU

On Sat, Jul 04, 2015 at 02:04:26AM +0200, Agnieszka Bielec wrote:
> I received results:
>
> [a@...er run]$ ./john --test --format=lyra2-opencl --dev=5
> Benchmarking: Lyra2-opencl, Lyra2 [Lyra2 Sponge OpenCL (inefficient,
> development use only)]... Device 5: GeForce GTX TITAN
> Local worksize (LWS) 64, global worksize (GWS) 2048
> DONE
> Speed for cost 1 (t) of 8, cost 2 (m) of 8, cost 3 (c) of 256, cost 4 (p) of 2
> Raw: 6023 c/s real, 5965 c/s virtual
>
> [a@...er run]$ ./john --test --format=lyra2-opencl
> Benchmarking: Lyra2-opencl, Lyra2 [Lyra2 Sponge OpenCL (inefficient,
> development use only)]... Device 0: Tahiti [AMD Radeon HD 7900 Series]
> Local worksize (LWS) 64, global worksize (GWS) 2048
> DONE
> Speed for cost 1 (t) of 8, cost 2 (m) of 8, cost 3 (c) of 256, cost 4 (p) of 2
> Raw: 7447 c/s real, 51200 c/s virtual
>
> before optimizations speed was equal to 1k

Cool. And these are much better than what you were getting with Lyra2
authors' CUDA code, right?

Are these higher speeds reproducible on actual cracking runs? Please
test.

> my optimizations are based on transfer one table to local memory and
> copying small portions of global memory into local buffers, I didn't
> saw any sense i coalescing and I didn't tried it

OK. Is the "copying small portions of global memory into local buffers"
like prefetching? Or are those small portions more frequently accessed
than the rest? In other words, why is this optimization effective for
Lyra2?

Alexander