Message-ID: <20150816124820.GB20969@openwall.com>
Date: Sun, 16 Aug 2015 15:48:20 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Argon2 on GPU

On Sun, Aug 16, 2015 at 02:01:38PM +0200, Agnieszka Bielec wrote:
> now I was digging in argon2d (I discovered that this bug occurs after
> commit 9e96f452350c0f2cae32b38e4a4cd1f83d51a367)
> and before this commit the code was:
>
> bi = prev_block_offset = ((prev_slice * lanes + pos.lane + 1) *
>     segment_length - 1) * BLOCK_SIZE;
> for (i = 0; i < 64; i++)
> {
>     prev_block[i] = *(__global ulong2 *) (&memory[bi]);
>     bi += 16;
> }
>
> the slowdown on AMD occurs when I change this code to:
>
> bi = prev_block_offset = ((prev_slice * lanes + pos.lane + 1) *
>     segment_length - 1) * BLOCK_SIZE / 16;
> for (i = 0; i < 64; i++)
> {
>     prev_block[i] = ((__global ulong2*)memory)[bi+i];
> }
>
> does anyone see some logic here, or is this just a bug on AMD?

Why do you call this a bug? It isn't necessarily a bug when the
performance of code changes when you change the source code.

Anyway, it looks like in the second code version you rely on address
scaling by 16, and this is probably not available in the architecture
(scaling by up to 8 is what's usually available), so it requires extra
instructions (explicit left shifts).

Alexander
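
To make the address arithmetic concrete, here is a minimal host-side sketch in plain C (not OpenCL) of the two loops quoted above. It only shows that both forms read the same bytes and where the scaling by 16 enters; it says nothing about what any particular GPU compiler actually emits. The names `ulong2_t`, `load_by_byte_offset`, `load_by_element_index`, `loads_match` and `NUM_BLOCKS` are hypothetical, and `BLOCK_SIZE` is assumed to be 1024 bytes, which matches 64 loads of 16 bytes each.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 1024 /* assumption: 64 loads x 16 bytes per block */
#define NUM_BLOCKS 4    /* hypothetical size of the test buffer */

/* Hypothetical stand-in for OpenCL's 16-byte ulong2 vector type. */
typedef struct { uint64_t x, y; } ulong2_t;

/* First version: bi is a byte offset that is advanced by 16 per load. */
static void load_by_byte_offset(const unsigned char *memory,
                                size_t block_index, ulong2_t *out)
{
    size_t bi = block_index * BLOCK_SIZE;
    for (int i = 0; i < 64; i++) {
        memcpy(&out[i], &memory[bi], sizeof(ulong2_t));
        bi += 16;
    }
}

/* Second version: bi is an index in 16-byte elements, so every access
 * needs address = base + (bi + i) * 16. Without an addressing mode that
 * scales by 16, that multiplication becomes an explicit shift by 4. */
static void load_by_element_index(const unsigned char *memory,
                                  size_t block_index, ulong2_t *out)
{
    assert(BLOCK_SIZE % 16 == 0); /* the division below must be exact */
    size_t bi = block_index * BLOCK_SIZE / 16;
    for (int i = 0; i < 64; i++) {
        size_t byte_offset = (bi + (size_t)i) << 4; /* scale by 16 */
        memcpy(&out[i], memory + byte_offset, sizeof(ulong2_t));
    }
}

/* Returns 1 if both versions load identical data for the given block,
 * 0 if they differ or block_index is out of range. */
int loads_match(size_t block_index)
{
    static unsigned char memory[NUM_BLOCKS * BLOCK_SIZE];
    ulong2_t a[64], b[64];

    if (block_index >= NUM_BLOCKS)
        return 0;
    for (size_t k = 0; k < sizeof(memory); k++)
        memory[k] = (unsigned char)(k * 31 + 7); /* arbitrary fill pattern */

    load_by_byte_offset(memory, block_index, a);
    load_by_element_index(memory, block_index, b);
    return memcmp(a, b, sizeof(a)) == 0;
}
```

Calling `loads_match(0)` through `loads_match(NUM_BLOCKS - 1)` should return 1 in each case, since the two loops are functionally equivalent. Any performance difference would then come from how the index is turned into an address, not from what is loaded.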