Message-ID: <CAKGDhHWcMUQW-Bhj+_qfDmbWF35OYM8nb0tJ0CyyUzSJ6gQB0w@mail.gmail.com>
Date: Mon, 24 Aug 2015 15:42:30 +0200
From: Agnieszka Bielec <bielecagnieszka8@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Argon2 on GPU

2015-08-24 4:28 GMT+02:00 Solar Designer <solar@...nwall.com>:
> On Mon, Aug 24, 2015 at 01:52:35AM +0200, Agnieszka Bielec wrote:
>> 2015-08-23 8:15 GMT+02:00 Solar Designer <solar@...nwall.com>:
>> > While private memory might be larger and faster on specific devices, I
>> > think that not making any use of local memory is wasteful. By using
>> > both private and local memory at once, we should be able to optimally
>> > pack more concurrent Argon2 instances per GPU and thereby hide more of
>> > the various latencies.
>>
>> why will we pack more argon2 per gpu using both types of memory?
>> I'm using only very small portions of private memory.
>
> You're using several kilobytes per instance - that's not very small.
>
> If not this, then what is limiting the number of concurrent instances
> when we're not yet bumping into total global memory size? For some of
> the currently optimal LWS/GWS settings, we're nearly bumping into the
> global memory size, but for some (across the different GPUs, as well as
> 2i vs. 2d) we are not. And even when we are, maybe a higher LWS would
> improve performance when we can afford it.

The second option is that we have reached the point where increasing
the GWS no longer gives us more access to global memory, and most of
the work-items are waiting for memory. Argon2i is coalesced and it can
run using a higher GWS than Argon2d,
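
To make the question quoted above concrete, here is a rough sketch of how the number of concurrent instances can be bounded both by total global memory and by on-chip (private plus local) memory. Every number and function name below is a hypothetical placeholder chosen for illustration; none of them is a measurement from this thread or a property of a specific GPU.

```python
# Rough occupancy estimate. All sizes are hypothetical placeholders,
# not figures from this thread or from any particular device.

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def bound_by_global_memory(global_mem_bytes, bytes_per_instance):
    """Instances that fit if global memory were the only limit."""
    return global_mem_bytes // bytes_per_instance


def bound_by_on_chip_memory(on_chip_bytes_per_cu, bytes_per_instance, compute_units):
    """Instances that fit if per-compute-unit on-chip memory were the only limit."""
    return (on_chip_bytes_per_cu // bytes_per_instance) * compute_units


def concurrent_instances(global_bound, on_chip_bound):
    """The tighter of the two bounds decides how many instances can run at once."""
    return min(global_bound, on_chip_bound)


# Assumed device: 4 GiB global memory, 32 compute units,
# 256 KiB private (register) space and 64 KiB local memory per compute unit.
# Assumed instance: 1.5 MiB of global memory, 4 KiB of on-chip working state.
global_bound = bound_by_global_memory(4 * GIB, 1536 * KIB)

private_only = bound_by_on_chip_memory(256 * KIB, 4 * KIB, 32)
private_and_local = bound_by_on_chip_memory((256 + 64) * KIB, 4 * KIB, 32)

print("global memory bound:     ", global_bound)
print("private memory only:     ", concurrent_instances(global_bound, private_only))
print("private and local memory:", concurrent_instances(global_bound, private_and_local))
```

With these assumed sizes the on-chip bound is the tighter one, so adding local memory raises the instance count even though global memory is not yet full. The sketch deliberately ignores memory bandwidth, which is the other possible limit raised in the reply above.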