Message-ID: <20150723020003.GB2172@openwall.com>
Date: Thu, 23 Jul 2015 04:00:03 +0200
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: yescrypt on GPU

On Thu, Jul 23, 2015 at 01:33:26AM +0200, magnum wrote:
> On 2015-07-23 00:36, Agnieszka Bielec wrote:
> >has anyone idea why copying parts of memory from __global to __private
> >makes my code slower when there are different passwords and faster
> >where all passwords are the same?

Why faster for same passwords:

This is puzzling, but my guess (which could well be wrong) is that the
remaining global memory accesses have better locality of reference
(resulting in better cache hit rate) and/or coalescing potential than
all of them did before you moved some to private memory.  In other
words, you moved the "bad" ones to private and kept the "good" ones in
global.  But they are only "good" when the passwords are the same (and
I guess the salts as well, or there are few different ones), so this is
of no practical use.

Why slower for different passwords:

I guess your LWS or/and GWS became lower.

> >I did in lyra2 something very
> >similar, maybe my code is too big and I have to do split kernels?

Split kernel may be good anyway, but this is most likely unrelated to
this specific occasion.

> Are there differences in length distribution in the two cases?

This should be irrelevant.  The PHC finalists process the plaintext
password into a hash early on, and do not use the plaintext password
frequently.  They are not like e.g. md5crypt in this respect.

> If not,
> Maybe in the slow case they end up spilling to local memory due to
> harder register pressure.

Maybe.  This is a possibility with any changes to a kernel.

Alexander
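
The locality/coalescing guess above can be illustrated with a toy model. This is only a sketch under stated assumptions, not yescrypt and not the kernel under discussion: it assumes an interleaved per-work-item memory layout, 128-byte memory segments, 16-byte elements, and groups of 32 work-items issuing loads together, and it uses SHA-256 merely as a stand-in for a password-dependent index. All names and constants are hypothetical.

```python
import hashlib

# All constants below are assumptions chosen for illustration only.
SEGMENT_BYTES = 128    # assumed size of one global memory transaction
ELEM_BYTES = 16        # assumed size of one element loaded per work-item
GROUP = 32             # assumed number of work-items issuing loads together
ELEMS_PER_ITEM = 1024  # assumed number of elements in each work-item's region


def data_dependent_index(password: bytes, step: int) -> int:
    """Stand-in for a password-dependent lookup index (not yescrypt's)."""
    digest = hashlib.sha256(password + step.to_bytes(4, "little")).digest()
    return int.from_bytes(digest[:4], "little") % ELEMS_PER_ITEM


def segments_touched(passwords, step: int) -> int:
    """Count distinct memory segments touched by one group in one step."""
    segments = set()
    for item, password in enumerate(passwords):
        j = data_dependent_index(password, step)
        # Interleaved layout: element j of work-item `item` sits next to
        # element j of its neighbours.
        address = (j * GROUP + item) * ELEM_BYTES
        segments.add(address // SEGMENT_BYTES)
    return len(segments)


same = [b"password"] * GROUP
different = [b"password%d" % i for i in range(GROUP)]

same_count = segments_touched(same, 0)
diff_count = segments_touched(different, 0)
print("same passwords:     ", same_count, "segments")
print("different passwords:", diff_count, "segments")
```

With identical passwords every work-item computes the same index, so the 32 loads are adjacent and fall into 4 segments in this model. With different passwords the indices scatter and the group touches many more segments (at most 32), which is why the "good" global accesses are only good in the same-password benchmark.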