Message-ID: <25a875cd326b609d2c903229d21e1ae0@smtp.hushmail.com>
Date: Sat, 21 Apr 2012 00:45:56 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: cl_khr_byte_addressable_store

Then I'm afraid you lost me. Just how should I approach this? Should I do
two separate kernels or should I try some kind of bit-flipping madness
that just might work on both AMD and nvidia?

magnum

On 04/21/2012 12:23 AM, Milen Rangelov wrote:
> No. accessing uchar4 arrays would generate compiler error if you're not
> using the extension, eg __local uchar4 arr[4];arr[1]=(1,2,3,4) would not
> compile without the extension. Otherwise I believe you can have __private
> uchar4 non-array variables and access them. But for RAR kernel you'd have
> to use an ucharN array anyway.
>
> On Sat, Apr 21, 2012 at 12:34 AM, magnum <john.magnum@...hmail.com> wrote:
>
>> On 04/20/2012 09:59 PM, Milen Rangelov wrote:
>>> Well especially for RAR on AMD, I had several attempts around that idea
>>> and they ended much slower than the vectorized, bitwise magic version.
>>> But you should leave it just because 4xxx is not supported. I know
>>> sometimes it's hard and it could get VERY UGLY (my rar kernel is
>>> frightening). Nvidia may have no problems with it, but AMD is not the
>>> case..
>>
>> Just to get things straight in my sore head: If I vectorize the lot and
>> use uchar4, I do not need byte_addressable_store, is that right?
>>
>> magnum
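
For illustration only, here is a minimal sketch of what the "bit-flipping" route asked about above could look like: bytes are written into a buffer that is only ever touched as 32-bit words, so no byte-sized store is issued. It is plain C rather than OpenCL C, the helper names put_byte and get_byte are hypothetical and do not come from the thread or from any kernel mentioned in it, and it assumes little-endian byte order within each word. In a kernel the buffer would presumably be a __global or __local uint pointer instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store one byte at byte offset `index` of a buffer
 * that is accessed only as 32-bit words. The target byte lane is cleared
 * with a mask and the new value is OR-ed in, so the only memory write is
 * a full 32-bit store. Assumes little-endian lanes within each word. */
static inline void put_byte(uint32_t *buf, size_t index, uint8_t val)
{
    unsigned shift = (unsigned)(index & 3) * 8;   /* bit offset of the lane */
    uint32_t mask = (uint32_t)0xff << shift;      /* selects that lane      */
    uint32_t word = buf[index >> 2];              /* word holding the byte  */

    buf[index >> 2] = (word & ~mask) | ((uint32_t)val << shift);
}

/* Hypothetical counterpart: read the byte at byte offset `index` back. */
static inline uint8_t get_byte(const uint32_t *buf, size_t index)
{
    return (uint8_t)(buf[index >> 2] >> ((index & 3) * 8));
}
```

Each byte written this way costs a read-modify-write of a whole word, which is the trade-off against using the extension. Whether it ends up faster or slower on a given GPU is not something this sketch can say.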