Message-ID: <20120514025144.GA7952@openwall.com>
Date: Mon, 14 May 2012 06:51:44 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Blowfish (bcrypt) on CPU vs. GPU (FX-8120 vs. HD 7970)

On Mon, May 14, 2012 at 05:25:02AM +0400, Solar Designer wrote:
> I tried to estimate possible speeds for bcrypt (Eksblowfish) on GCN (HD
> 7970), using the known speeds on Bulldozer (FX-8120) as reference (to
> verify my math, as well as to see the possible speedup over CPU, if any).

To make it clear: Blowfish encryption itself is implementable on GPUs
efficiently. That is, multiple data streams may be encrypted or
decrypted in parallel on a GPU efficiently, but only with the same key.
This is described here:

http://researchweb.iiit.ac.in/~rishabh_m/gpu_crypto.pdf

It's Blowfish key setup that is far more difficult, because we have to
maintain separate S-boxes per key. This is precisely what we need for
cracking of bcrypt hashes.

> We can issue up to 5 instructions per cycle per CU - apparently, this
> maximum is reached with 1 scalar and 4 SIMD instructions. With four
> 16-lane SIMD units, we'd normally have 64 work-items per wavefront, and
> we'd have at least 10 waves/SIMD, 40 waves/CU, 2560 work-items per CU
> (as per slide 11). However, the 64 KB of LDS only lets us keep up to
> 16 sets of Blowfish S-boxes in it (we need 4 KB per set). So maybe we
> can/should only run 16 work-items per CU, thus making use of only 1/4 of
> total available lanes and incurring stalls on data dependencies after
> high-latency instructions.

Relevant question/answers:

http://devgurus.amd.com/thread/159171

Alexander
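
For reference, the per-CU arithmetic in the quoted estimate can be re-derived with a short Python sketch. The hardware figures are the ones quoted above. The constant and function names are made up for illustration, and the 4 KB per set is derived from the standard Blowfish layout of four 256-entry tables of 32-bit words (the P-array is not counted).

```python
# Rough check of the per-CU figures quoted above. All names here are
# illustrative; the numbers come from the quoted estimate.
LDS_BYTES_PER_CU = 64 * 1024      # 64 KB of LDS per compute unit
SBOX_SET_BYTES = 4 * 256 * 4      # 4 S-boxes x 256 entries x 4 bytes = 4 KB
SIMDS_PER_CU = 4
LANES_PER_SIMD = 16
WAVEFRONT_SIZE = 64               # work-items per wavefront
WAVES_PER_SIMD = 10               # "at least 10 waves/SIMD"


def cu_estimate():
    """Return the per-CU figures implied by the constants above."""
    sbox_sets = LDS_BYTES_PER_CU // SBOX_SET_BYTES
    lanes = SIMDS_PER_CU * LANES_PER_SIMD
    waves_per_cu = WAVES_PER_SIMD * SIMDS_PER_CU
    return {
        "sbox_sets_in_lds": sbox_sets,
        "waves_per_cu": waves_per_cu,
        "work_items_normal": waves_per_cu * WAVEFRONT_SIZE,
        # One work-item per S-box set if every set must live in LDS.
        "work_items_bcrypt": sbox_sets,
        "lane_fraction_used": sbox_sets / lanes,
    }


if __name__ == "__main__":
    for name, value in cu_estimate().items():
        print(f"{name}: {value}")
```

This only restates the LDS capacity limit. It does not model the stalls on data dependencies mentioned above, nor the option of keeping some S-boxes outside LDS.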