Message-ID: <20121213014949.GA11207@openwall.com>
Date: Thu, 13 Dec 2012 05:49:49 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: GCN: indexed access to VGPRs

On Mon, Dec 10, 2012 at 03:44:36AM +0400, Solar Designer wrote:
> On Sun, Dec 09, 2012 at 02:38:18PM +0200, Milen Rangelov wrote:
> > Perhaps though, smaller chunk of the sbox in VGPRs would be beneficial, I
> > just did not try that possibility.
>
> We'd have an if/else then - and if it's implemented with eager
> execution, then we incur the LDS access latency even when the data is in
> fact in a register.  What we gain is a slightly higher number of
> concurrent bcrypt instances per CU (18 instead of 16 if we put one half
> of one S-box into registers?)

I've experimented with this a bit, based on Sayantan's code.

When using a "? ... :", I got a local maximum at 176 elements in the
private array.  However, in absolute terms the speed is poor (much worse
than LDS-only).

When trying to use 128 elements (one half of S-box 4) and bitselect(),
self-test fails on 7970 - but works fine on GTX 570 (slow).  I tried
re-arranging the code in various ways, but no luck.  Seems like we're
hitting some AMD bug.  Will need to try again after upgrading Catalyst
on bull (is there a newer version of the SDK too?)  I find it unlikely
that we'll see any performance gain from this, though.

The important lines are:

	tmp1 = L & 0x7f; \
	tmp1 = bitselect(Sptr4[tmp1], S4_2[tmp1], (uint) -(int) !!(L & 0x80)); \

and there are alternatives to them in the #if 0 ... #endif block, e.g.:

	tmp1 = L & 0x7f; \
	tmp1 = bitselect(Sptr4[tmp1], S4_2[tmp1], (uint)((int)(L & 0x80) << 24 >> 31)); \

(also tested on GTX 570, works fine there).

(Oh, I just realized that in the version with shifts, I don't need the
"& 0x80" since those 7 bits would be shifted out by the right shift.)

On the 7970, the versions with "? ... :" work, but all of those that use
a bitmask mysteriously fail.
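For reference, the arithmetic behind the two mask expressions can be
checked on the host.  Below is a minimal sketch in plain C, not the kernel
code: bitselect_u32() is a stand-in for OpenCL's bitselect() on 32-bit
scalars, the other function names are made up for illustration, and the
table is arbitrary filler rather than real Blowfish S-box data.  It assumes
the second array holds the upper 128 entries of the S-box, which is how the
"L & 0x80" mask selects it.  The signed right shift is emulated with
unsigned arithmetic so that the sketch stays within portable C.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-in for OpenCL bitselect(a, b, c) on 32-bit scalars:
 * each result bit comes from a where c has 0, from b where c has 1. */
static uint32_t bitselect_u32(uint32_t a, uint32_t b, uint32_t c)
{
	return (a & ~c) | (b & c);
}

/* Variant 1 from the message: (uint) -(int) !!(L & 0x80) */
static uint32_t mask_neg(uint32_t L)
{
	return (uint32_t)-(int32_t)!!(L & 0x80);
}

/* Variant 2: move bit 7 to bit 31, then replicate it across the word.
 * The kernel does this with a signed right shift; here the sign
 * extension is emulated with unsigned arithmetic.  No "& 0x80" is
 * needed: the left shift discards bits 8..31 and the right shift
 * discards bits 0..6. */
static uint32_t mask_shift(uint32_t L)
{
	return (uint32_t)(0u - ((L << 24) >> 31));
}

/* Hypothetical split lookup: lo[] and hi[] are the two 128-entry halves. */
static uint32_t lookup_split(const uint32_t *lo, const uint32_t *hi,
    uint32_t L, uint32_t mask)
{
	uint32_t i = L & 0x7f;
	return bitselect_u32(lo[i], hi[i], mask);
}

/* Compare both mask variants and the "? :" form against a plain
 * 256-entry lookup.  Returns the number of mismatches. */
static unsigned int count_mismatches(void)
{
	uint32_t full[256];
	unsigned int bad = 0;
	uint32_t L;

	for (L = 0; L < 256; L++)
		full[L] = 0x9e3779b9u * (L + 1); /* arbitrary filler */

	for (L = 0; L < 0x10000; L++) {
		uint32_t x = L | 0xa5a50000u; /* upper bits must not matter */
		uint32_t i = x & 0x7f;
		uint32_t want = full[x & 0xff];
		uint32_t ternary = (x & 0x80) ? full[128 + i] : full[i];

		bad += (mask_neg(x) != mask_shift(x));
		bad += (ternary != want);
		bad += (lookup_split(full, full + 128, x, mask_neg(x)) != want);
		bad += (lookup_split(full, full + 128, x, mask_shift(x)) != want);
	}
	return bad;
}

/* Aborts if any of the forms disagree. */
static void self_check(void)
{
	assert(count_mismatches() == 0);
}
```

Calling self_check() from a small main() is enough to run it.  This only
shows that the expressions agree arithmetically; it says nothing about what
the compiler on the 7970 does with them.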
Alexander

View attachment "bf_kernel.cl.diff" of type "text/plain" (4921 bytes)

View attachment "bf_kernel.cl" of type "text/plain" (9367 bytes)