Message-ID: <CAJAsdNiAvmE9j1an__MNKKSPKzyQ+e=TxnbjfiJTdO4j0e8Exg@mail.gmail.com>
Date: Tue, 18 Jun 2013 16:58:07 +0200
From: Dániel Bali <balijanosdaniel@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: sha3-opencl
Hello!
Here are some profiler results for the keccak256 OpenCL kernel.
The results are for a Turks (AMD Radeon HD 7600M Series) GPU, on which I got
919K c/s (real).
Here is some explanation of what the values mean:
http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-app-profiler/user-guide/app-profiler-session/
http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-app-profiler/user-guide/app-profiler-settings/
VGPRs: 43
ScratchRegs: 23 ("If non zero, this is typically the main bottleneck. To
reduce this number, reduce the number of GPRs used by the kernel.")
KernelOccupancy: 15.625% (The limiting factor is the number of VGPRs available)
ALUBusy (%): ~16 (This is low: the ALUs sit idle most of the time)
ALUPacking (%): 73 (This is okay, could be better)
CacheHit (%): 0 (No caching happens)
PathUtilization (%): 100
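
As a rough check of the occupancy figure above, here is a minimal sketch of how a VGPR-limited occupancy could be estimated. It is not the profiler's actual formula. The function name is hypothetical, the 248-register budget is the MaxVGPRs value the kernel analyzer reports for Turks, and the 32 wavefront slots per SIMD is my assumption for this architecture, not something the profiler output states.

```python
def vgpr_limited_occupancy(vgprs_used, max_vgprs, max_waves_per_simd):
    """Estimate occupancy (%) assuming VGPRs are the only limiting resource."""
    if vgprs_used <= 0:
        raise ValueError("vgprs_used must be positive")
    # Each resident wavefront needs its own copy of the kernel's VGPRs.
    waves_by_vgprs = max_vgprs // vgprs_used
    # Resident wavefronts cannot exceed the hardware slot count.
    waves = min(waves_by_vgprs, max_waves_per_simd)
    return 100.0 * waves / max_waves_per_simd


# Assumption: 32 wavefront slots per SIMD (not given in the profiler output).
ASSUMED_MAX_WAVES = 32

# 43 VGPRs used, 248 available: 248 // 43 = 5 wavefronts fit.
print(vgpr_limited_occupancy(43, 248, ASSUMED_MAX_WAVES))  # 15.625
```

Under these assumptions only 5 of 32 wavefront slots are filled, which matches the reported 15.625 and is consistent with the profiler naming VGPRs as the limiting factor.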
Another very useful feature is the kernel analyzer, which shows statistics
for different architectures. Here is how Tahiti and Turks compare:
(Tahiti / Turks)
ScratchRegs: 0 / 23
MaxVGPRs: 256 / 248
VGPRs: 199 / 43
This means that we aren't bottlenecked by ScratchRegs on Tahiti.
What's strange is that, even though Turks should allow 248 VGPRs, the kernel
only uses 43 in practice and still needs 23 scratch registers.
Regards,
Daniel