Message-ID: <8cf9a2b5b9d0ff12683d25bf3db392f8@smtp.hushmail.com>
Date: Mon, 23 Apr 2012 03:07:39 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: New RAR OpenCL kernel

On 04/23/2012 12:02 AM, Claudio André wrote:
>
>> Would both these figures be closer to 100 in a dream scenario, or what?
>>
>> By the way my previous version of rar got an "occupancy" of 0.01 or so
>> (lol) in nvidia profiler. We'll see if there is any change now.
>>
>> magnum
>>
> I like the "dream scenario". Valid explanation. And 100 is the target.
>
> Alu packing has a "> 70" expectation.
> Alubusy is where 100% is optimal.
>
> I agree that sprofile is not very useful, but is better than nothing (or
> simple guessing). Since you have NVIDIA tools, it is not that important.

I think sprofile is useful, it's just that my laptop GPU is so weak I
can't draw any conclusions.

Your profiling info was with LWS=GWS. Please try this if you have the time:

1. Pull latest git
2. Run with KPC=0 (I expect it to pick 4096 or higher as best)
3. Do another profiling run with the best KPC

The ALU figures (and speed) should go up a lot (I hope). If they do not,
the profiling info should tell why.

thanks,
magnum