Message-ID: <20150406133654.GA12722@openwall.com>
Date: Mon, 6 Apr 2015 16:36:54 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: [GSoC] John the Ripper support for PHC finalists

Hi Agnieszka,

Your e-mail quoting is still really weird. It's mostly correct, but
just weird. It looks like you're doing it manually. Is this so?
Normally, the mail client (MUA) you use would take care of the (initial)
quoting for you (and then you just delete the portions that you don't
need quoted in a particular reply). For example, Mutt does it for me.

On Mon, Apr 06, 2015 at 02:40:45PM +0200, Agnieszka Bielec wrote:
> I'm including tests for __global memory and __private
>
> I've added some printfs to know how many memory is used
>
> http://pastebin.com/Rqe5yKsH

Please avoid using pastebin in your mailing list postings. In this
case, the test output is small enough that you could attach it as a
text file to your e-mail instead.

> I'm wondering why on --dev=2 opencl using
> global memory was fast, ~ 150k

That's because --dev=2 (and =3) is the CPUs. There's no easy way (nor
do we want it, most of the time) for an OpenCL driver to bypass the CPU
caches. So when you run an OpenCL kernel on CPUs, there is not supposed
to be a (significant, if any at all) speed difference between local and
global memory (it's the same memory subsystem anyway, consisting of
caches and RAM). Usually, this results in the same CPU instructions,
possibly with (unimportant) differences in specific memory addresses
(but the addresses are virtual anyway, and the memory is cacheable
anyway), in (non-)use of prefetch instructions, and in instruction
scheduling (in case the compiler has different expectations for
latencies depending on whether you specified something as being local
or global). Chances are that those differences have a small or
negligible effect on performance (and it is unclear in which
direction).
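To illustrate the point above, here is a minimal sketch (not code from the JtR tree). The kernel names, SCRATCH_WORDS, MAX_ITEMS, the mixing step, and the host-side shim are all hypothetical. The two kernels differ only in the address space of their scratch area. Built as plain host C, the qualifiers expand to nothing and both functions become the same code operating on ordinary cacheable memory, which is roughly the situation an OpenCL CPU device is in. It says nothing about actual timings.

```c
/* Hypothetical sketch: kernel names, SCRATCH_WORDS and the mixing step
 * are made up for illustration and are not taken from the JtR tree. */
#ifdef __OPENCL_VERSION__
/* Built by an OpenCL compiler: use the real address-space qualifiers. */
#define KERNEL     __kernel
#define GLOBAL_MEM __global
#define LOCAL_MEM  __local
typedef uint u32;
#else
/* Built as plain host C: a CPU has one memory subsystem (caches + RAM),
 * so both qualifiers expand to nothing. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#define KERNEL
#define GLOBAL_MEM
#define LOCAL_MEM
typedef uint32_t u32;
static size_t shim_id;               /* stand-in for the work-item index */
#define get_global_id(dim) (shim_id)
#define get_local_id(dim)  (shim_id)
#endif

#define SCRATCH_WORDS 64u

/* Variant 1: per-work-item scratch area in global memory. */
KERNEL void mix_global(GLOBAL_MEM u32 *scratch, GLOBAL_MEM u32 *out)
{
    size_t gid = get_global_id(0);
    GLOBAL_MEM u32 *m = scratch + gid * SCRATCH_WORDS;
    u32 x = (u32)gid, i;

    for (i = 0; i < SCRATCH_WORDS; i++)     /* fill the scratch area */
        m[i] = x = x * 1664525u + 1013904223u;
    for (i = 0; i < SCRATCH_WORDS; i++)     /* data-dependent reads */
        x ^= m[x % SCRATCH_WORDS] + i;
    out[gid] = x;
}

/* Variant 2: same work, scratch area in local memory (one slice per
 * work-item within the work-group). */
KERNEL void mix_local(LOCAL_MEM u32 *scratch, GLOBAL_MEM u32 *out)
{
    size_t gid = get_global_id(0);
    LOCAL_MEM u32 *m = scratch + get_local_id(0) * SCRATCH_WORDS;
    u32 x = (u32)gid, i;

    for (i = 0; i < SCRATCH_WORDS; i++)
        m[i] = x = x * 1664525u + 1013904223u;
    for (i = 0; i < SCRATCH_WORDS; i++)
        x ^= m[x % SCRATCH_WORDS] + i;
    out[gid] = x;
}

#ifndef __OPENCL_VERSION__
#define MAX_ITEMS 8u

/* Host-only check: run both variants for n emulated work-items and
 * return 1 if every result matches, 0 otherwise. */
static int variants_agree(size_t n)
{
    static u32 g_scratch[MAX_ITEMS * SCRATCH_WORDS];
    static u32 l_scratch[MAX_ITEMS * SCRATCH_WORDS];
    u32 out_g[MAX_ITEMS], out_l[MAX_ITEMS];

    assert(n <= MAX_ITEMS);
    for (shim_id = 0; shim_id < n; shim_id++) {
        mix_global(g_scratch, out_g);
        mix_local(l_scratch, out_l);
        if (out_g[shim_id] != out_l[shim_id])
            return 0;
    }
    return 1;
}
#endif
```

On the host, calling variants_agree(8) from main() exercises both variants. Under a real OpenCL runtime, the __local argument of mix_local would be sized with clSetKernelArg() using a NULL pointer, and only on a GPU would it map to a physically separate on-chip memory.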
Similarly, when you access global memory on GPUs, some limited in-GPU
caching may nevertheless go on. It's just that on GPUs those caches
are separate from local memory, and GPUs' local memory may only be
addressed explicitly (so you need to explicitly use it from your OpenCL
kernels), whereas CPUs don't actually have (explicitly addressable)
local memory (they only have caches).

I expect that others (magnum, Frank?) will reply to the rest of your
message.

Alexander