Message-ID: <42590d8718c9be0ded34b13ec257ba88@smtp.hushmail.com>
Date: Sat, 24 Mar 2012 23:24:40 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: OpenCL vector tactics

Could someone brave tell me why/how/if to use vector types like uint4 in OpenCL kernels? I do understand sse2 intrinsics, like sse-intrinsics.c in john, but as far as I understand, that does not translate well to OpenCL.

I don't quite get how to apply vectors to OpenCL. How would I attack it? Would each kernel do four inputs and produce four outputs? If I have an existing kernel that does one input and ends up with one output, should I just convert it to do four things at a time for every move it makes, just like sse2 intrinsics? Just like that? And if so, what would I set as local workgroup size? The real size divided by four?

And what would this accomplish? Did I understand right that this would benefit CPUs and perhaps AMD GPUs but likely not nvidia? Would it be detrimental for nvidia?

magnum
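
For concreteness, the "four things at a time" conversion asked about above could be sketched roughly as follows. This is plain C rather than OpenCL so that it compiles anywhere, and it is only an illustration under stated assumptions: `uint4_sim`, `toy_round`, `run_scalar`, `run_vec4` and `work_items_vec4` are made-up names, the round function is a toy and not a real hash, and the sketch simply assumes the 4-wide version launches a quarter as many work-items. It says nothing about whether this is faster on any given device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for OpenCL's built-in uint4 type. */
typedef struct { uint32_t x, y, z, w; } uint4_sim;

/* Toy round function for illustration only: rotate left by 3, then XOR. */
uint32_t toy_round(uint32_t v) {
    return ((v << 3) | (v >> 29)) ^ 0x5a827999u;
}

/* Scalar form: each loop iteration stands in for one work-item,
   which reads one input and writes one output. */
void run_scalar(const uint32_t *in, uint32_t *out, size_t n) {
    for (size_t gid = 0; gid < n; gid++)
        out[gid] = toy_round(in[gid]);
}

/* Assumption: the 4-wide form needs a quarter of the scalar work-items. */
size_t work_items_vec4(size_t n) {
    assert(n % 4 == 0); /* inputs must fill whole vectors */
    return n / 4;
}

/* 4-wide form: each loop iteration stands in for one work-item,
   which reads four inputs, applies every step to all four lanes,
   and writes four outputs. */
void run_vec4(const uint32_t *in, uint32_t *out, size_t n) {
    size_t items = work_items_vec4(n);
    for (size_t gid = 0; gid < items; gid++) {
        /* Load four consecutive inputs into one vector. */
        uint4_sim v = { in[4 * gid], in[4 * gid + 1],
                        in[4 * gid + 2], in[4 * gid + 3] };
        /* In OpenCL C this would be a single expression on a uint4. */
        v.x = toy_round(v.x);
        v.y = toy_round(v.y);
        v.z = toy_round(v.z);
        v.w = toy_round(v.w);
        /* Store the four results back. */
        out[4 * gid]     = v.x;
        out[4 * gid + 1] = v.y;
        out[4 * gid + 2] = v.z;
        out[4 * gid + 3] = v.w;
    }
}
```

Calling `run_scalar` and `run_vec4` on the same `n` inputs (with `n` a multiple of four) fills the output array with identical values; only the number of loop iterations, standing in for work-items, differs.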