Message-ID: <51c80e7f232a45b9c78982c95828c583@smtp.hushmail.com>
Date: Thu, 22 Mar 2012 08:00:54 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: New patch for OpenCL SHA-512

On 03/22/2012 04:32 AM, Solar Designer wrote:
> On Mon, Mar 19, 2012 at 08:53:00AM -0300, Claudio André wrote:
>> For example of bad things in GPU code, data dependent decisions and
>> paths (not avoidable, right?????????):
> 
> A way to avoid them is to speculatively follow both code paths, then
> pick the right result with bitwise ops (bitselect() may help here).
> Whether this change would result in a speedup or slowdown depends on
> relative costs of the extra computation vs. branching.
> 
> BTW, speculative execution is commonly done by CPUs even when you do use
> branch instructions, although CPUs tend to do it for one of the code
> paths only (predicted-taken).  What I am proposing here is called "eager
> execution" in the Wikipedia article below:
> 
> http://en.wikipedia.org/wiki/Speculative_execution
> 
> "Eager execution is a form of speculative execution where both sides of
> the conditional branch are executed, however the results are committed
> only if the predicate is true.  With unlimited resources, eager execution
> (also known as oracle execution) would in theory provide the same
> performance as perfect branch prediction.  With limited resources eager
> execution should be employed carefully since the number of resources
> needed grows exponentially with each level of branches executed
> eagerly."

I believe both AMD's and Nvidia's compilers generate code for eager
execution in many cases (that is, unless the conditional blocks are too
large or too complex). I don't know many of the details, but I suppose
the real problems begin when they do not: normally all threads in a
workgroup execute in sync.
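
To make the "follow both paths, then pick with bitwise ops" idea
concrete, here is a minimal sketch in plain C rather than OpenCL. The
names bitselect32, step_branch and step_eager are made up for
illustration, and the toy computation (3x+1 for odd input, x/2 for
even) is only a stand-in for a real data-dependent branch. bitselect32
is meant to mirror the documented semantics of OpenCL's bitselect(a, b,
c): take the bit from b where c has a 1, otherwise from a.

```c
#include <stdint.h>

/* Hypothetical helper mirroring OpenCL bitselect(a, b, c):
 * result bit = bit of b where c is 1, bit of a where c is 0. */
static uint32_t bitselect32(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & ~c) | (b & c);
}

/* Branching version: only one of the two paths is computed. */
static uint32_t step_branch(uint32_t x)
{
    if (x & 1u)
        return x * 3u + 1u;   /* odd path */
    else
        return x >> 1;        /* even path */
}

/* Eager version: both paths are always computed, then a mask picks
 * the right result without a data-dependent branch. */
static uint32_t step_eager(uint32_t x)
{
    uint32_t odd_result  = x * 3u + 1u;
    uint32_t even_result = x >> 1;
    uint32_t mask = 0u - (x & 1u);  /* all ones if x is odd, else zero */

    return bitselect32(even_result, odd_result, mask);
}
```

Both functions should return the same value for any input, for example
3 for an input of 6 and 22 for an input of 7. Whether the eager form is
actually faster on a given GPU depends on the cost of the extra
computation vs. the branch, as Solar says above, and the compiler may
well do this transformation on its own for blocks this small.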

BTW I haven't yet found a good (easy) way to profile my OpenCL code. I'd
like to do it on CPU as well as Nvidia GPU, running Ubuntu/Mint. Any
suggestions welcome.

magnum
