Message-ID: <20150427015007.GB27289@openwall.com>
Date: Mon, 27 Apr 2015 04:50:07 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: [GSoC] John the Ripper support for PHC finalists

On Sat, Apr 25, 2015 at 11:53:15PM +0200, Agnieszka Bielec wrote:
> 2015-04-25 23:33 GMT+02:00 Solar Designer <solar@...nwall.com>:
> > Oh, maybe it happens to work precisely because those 4 work-items tend
> > to be part of the same SIMD vector in hardware, so with current OpenCL
> > drivers they happen to be "guaranteed" to be ready at the same time?
>
> Did you mean CPU, not GPU??

No, I meant GPU.  Why?  Your code looks broken regardless of target
device, even if it happens to work on some devices currently.

> Sorry I forgot to mention that it works on
> GPU because instructions are executed at the same time but this fails
> on CPU.

... and this confirms that the code is broken.  You're relying on
things that are not guaranteed, not even on GPU.

> I was focused only on GPU because coalescing decreased the speed on
> CPU significantly

Why did it?

I think you might be misinterpreting the reasons for speedup and
slowdown with these changes.  While coalescing is relevant, it is not
the only thing that changes.  When you use 4-way SIMD, you take
advantage of the parallelism available within one instance of POMELO.
This might make a lower GWS optimal (now that you've tried using the
proper vector type), and along with it lower GPU global memory usage.
Unless and until you actually bump into the total GPU global memory size
with a would-be-optimal GWS, this aspect might not matter (except that
caching is slightly more effective when the total size isn't as much
larger than what can be cached), but you should keep it in mind anyway.

BTW, bumping into total GPU global memory size may be realistic with
these memory-hard hashes.  Our TITAN's 6 GB was the performance limiting
factor in some of the benchmarks here:

http://www.openwall.com/lists/crypt-dev/2014/03/13/1

You could want to list GPU memory usage along with your benchmark
results, so that we can assess how close or not we're getting to there.

> and if somebody want opencl on CPU they should use a old version

OK, but we're also speaking code correctness here.  Except for today's
research purposes, we're not interested in code that just happens to
work today and is expected to break any time.

Alexander
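
The suggestion to list GPU memory usage next to benchmark results can be
made concrete with a rough estimate.  The sketch below is illustrative
only and is not taken from the JtR source: the function names are
hypothetical, and the 8 MiB per-instance figure is an assumed
placeholder, not POMELO's actual state size at any particular setting.
It assumes each concurrently computed hash instance owns one buffer of a
fixed size in GPU global memory.

```python
GIB = 2 ** 30
MIB = 2 ** 20


def total_global_memory(instances: int, bytes_per_instance: int) -> int:
    """Global memory needed if every concurrent hash instance owns
    bytes_per_instance of state (assumed model, see lead-in)."""
    return instances * bytes_per_instance


def max_instances(device_bytes: int, bytes_per_instance: int) -> int:
    """Largest number of concurrent instances that fits in device memory,
    ignoring any other allocations the kernel or driver may need."""
    if bytes_per_instance <= 0:
        raise ValueError("bytes_per_instance must be positive")
    return device_bytes // bytes_per_instance


if __name__ == "__main__":
    device = 6 * GIB          # the 6 GB TITAN mentioned above
    per_instance = 8 * MIB    # hypothetical per-instance state size
    for instances in (128, 512, 1024):
        need = total_global_memory(instances, per_instance)
        fits = "fits" if need <= device else "does NOT fit"
        print(f"instances={instances:5d}  memory={need / GIB:5.2f} GiB  {fits}")
    print("max instances:", max_instances(device, per_instance))
```

Reporting a figure like this alongside each benchmark shows how far a
given GWS is from the device's total global memory.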