Message-ID: <20120710023536.GA5020@openwall.com>
Date: Tue, 10 Jul 2012 06:35:36 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Rotate and bitselect investigation

Sayantan, magnum -

On Tue, Jul 10, 2012 at 07:19:38AM +0530, Sayantan Datta wrote:
> On Tue, Jul 10, 2012 at 7:00 AM, Solar Designer <solar@...nwall.com> wrote:
> > Maybe we should use the same approach that magnum uses in rar_kernel.cl:
> >
> > #ifdef cl_nv_pragma_unroll
> > #define NVIDIA
> > #endif
> > [...]
> > #ifdef NVIDIA
> > #define F(x,y,z) (z ^ (x & (y ^ z)))
> > #else
> > #define F(x,y,z) bitselect(z, y, x)
> > #endif
> >
> > This won't detect CPUs, though - where we also don't want to use
> > bitselect() most of the time (the instruction is only available with XOP
> > and is probably not used by current OpenCL SDKs since I think only
> > Intel's does vectorization) - but this code is mostly just for AMD and
> > NVIDIA GPUs now.  We have faster MSCash2 code on CPU anyway.
>
> So we should use manual bitselect by default.

But we don't have a trick similar to cl_nv_pragma_unroll that would let
us detect AMD GPUs.  So I am fine with us using bitselect() by default
and only disabling it on NVIDIA, unless/until we learn of a trick to
detect AMD GPU in OpenCL (or introduce such way by passing the info from
our C code).

Alexander
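
For illustration only, here is a rough sketch of the "passing the info from
our C code" idea mentioned above.  It is not code from the JtR tree.  The
helper name vendor_build_option() and the macro names DEVICE_IS_NVIDIA_GPU
and DEVICE_IS_AMD_GPU are hypothetical, and the vendor substrings are an
assumption about what drivers typically report.  In real host code the
vendor string and device type would presumably come from clGetDeviceInfo()
with CL_DEVICE_VENDOR and CL_DEVICE_TYPE, and the returned option would be
passed in the options string of clBuildProgram().

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: pick a -D build option for the kernel from the
 * device vendor string and a "this is a GPU" flag.
 *
 * Assumption: NVIDIA drivers report a vendor string containing "NVIDIA"
 * and AMD drivers report "Advanced Micro Devices, Inc."; other drivers
 * may differ, so unknown vendors get no option at all.
 */
static const char *vendor_build_option(const char *vendor, int is_gpu)
{
    assert(vendor != NULL);

    /* CPUs are deliberately left without a vendor macro, so the kernel
     * falls through to whatever its default is. */
    if (!is_gpu)
        return "";

    if (strstr(vendor, "NVIDIA"))
        return "-DDEVICE_IS_NVIDIA_GPU";

    if (strstr(vendor, "Advanced Micro Devices"))
        return "-DDEVICE_IS_AMD_GPU";

    return "";
}

/*
 * The kernel side could then key off the macro, with the manual form as
 * the default (macro name is hypothetical):
 *
 *   #ifdef DEVICE_IS_AMD_GPU
 *   #define F(x,y,z) bitselect(z, y, x)
 *   #else
 *   #define F(x,y,z) (z ^ (x & (y ^ z)))
 *   #endif
 */
```

With something like this, bitselect() would be enabled only where the host
code positively identifies an AMD GPU, and CPUs and unknown devices would
get the manual form.  Whether the substring match is reliable across driver
versions is untested here.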