Message-ID: <20150529075609.GB25177@openwall.com>
Date: Fri, 29 May 2015 10:56:09 +0300
From: Solar Designer <solar@...nwall.com>
To: Alain Espinosa <alainesp@...ta.cu>
Cc: john-dev@...ts.openwall.com
Subject: Re: bitslice SHA-256

On Fri, May 29, 2015 at 01:22:10AM -0400, Alain Espinosa wrote:
> ...I briefly experimented with merged ADDs in this md5slice.c revision
>
> I will take a look.
>
> ...add32c() is a 3-input ADD where one of the inputs is a constant
>
> I check this code searching how to reduce sum instructions count. If I
> understand it correctly you use more than 5 for one add (more than 10
> for 2, if I recall correctly you use 11).

My add32() appears to use 5 (not counting the loads and the store):

		a = *x++;
		b = *y++;
		*z++ = (p = a ^ b) ^ c;
		c = (p & c) | (a & b);

But you're right - my add32c()'s code path when the constant has a 1 bit
uses 11 (with XNOR) or 12 (without).  This feels wrong, and there has got
to be a way to optimize this to 10 or less within the same instruction
set.

Its code path for when the current constant bit is 0 has only 7
operations, though - so this demonstrates how the addition of a constant
can be cheaper than of a variable:

		a = *x++;
		b = *y++;
		if (c & 1) {
			*z++ = ~(a ^ b) ^ c1 ^ c2;
			c2 = (a & b & (p = c1 | c2)) | (c1 & c2 & (q = a | b));
			c1 = p | q;
		} else {
			*z++ = (q = (p = a ^ b) ^ c1) ^ c2;
			c1 = (p & c1) | (a & b);
			c2 &= q;
		}

Alexander
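
For anyone who wants to check the two loop bodies above against ordinary
integer addition, here is a minimal self-contained sketch.  It is not taken
from md5slice.c.  The names vec, BITS, to_planes(), from_planes() and
mismatches() are made up for illustration, the constant is called k rather
than c, array indexing replaces the pointer increments, and a "vector" is
assumed to be a plain 32-bit word holding 32 lanes.  Only the logic inside
the add32() and add32c() loops follows the snippets in the message.

```c
#include <stdint.h>

#define BITS 32            /* word width and, in this sketch, also the lane count */
typedef uint32_t vec;      /* one bit position across 32 independent words */

/* Transpose 32 ordinary words into bit planes: plane[i] bit j = bit i of w[j]. */
void to_planes(vec *plane, const uint32_t *w)
{
    for (int i = 0; i < BITS; i++) {
        plane[i] = 0;
        for (int j = 0; j < BITS; j++)
            plane[i] |= ((w[j] >> i) & 1u) << j;
    }
}

/* Inverse of to_planes(). */
void from_planes(uint32_t *w, const vec *plane)
{
    for (int j = 0; j < BITS; j++) {
        w[j] = 0;
        for (int i = 0; i < BITS; i++)
            w[j] |= ((plane[i] >> j) & 1u) << i;
    }
}

/* z = x + y: 5 bitwise operations per bit position. */
void add32(vec *z, const vec *x, const vec *y)
{
    vec c = 0;
    for (int i = 0; i < BITS; i++) {
        vec a = x[i], b = y[i], p = a ^ b;
        z[i] = p ^ c;
        c = (p & c) | (a & b);
    }
}

/* z = x + y + k, carrying two same-weight carry planes c1 and c2. */
void add32c(vec *z, const vec *x, const vec *y, uint32_t k)
{
    vec c1 = 0, c2 = 0;
    for (int i = 0; i < BITS; i++, k >>= 1) {
        vec a = x[i], b = y[i], p, q;
        if (k & 1) {                    /* constant bit is 1 */
            p = c1 | c2;
            q = a | b;
            z[i] = ~(a ^ b) ^ c1 ^ c2;
            c2 = (a & b & p) | (c1 & c2 & q);
            c1 = p | q;
        } else {                        /* constant bit is 0: 7 operations */
            p = a ^ b;
            q = p ^ c1;
            z[i] = q ^ c2;
            c1 = (p & c1) | (a & b);
            c2 &= q;
        }
    }
}

/* Count lanes where either adder disagrees with plain uint32_t addition. */
int mismatches(uint32_t k, uint32_t seed)
{
    uint32_t wx[BITS], wy[BITS], w2[BITS], w3[BITS];
    vec x[BITS], y[BITS], z2[BITS], z3[BITS];
    int bad = 0;

    for (int j = 0; j < BITS; j++) {
        /* simple LCG, used only to get varied test inputs */
        seed = seed * 1664525u + 1013904223u;
        wx[j] = seed;
        seed = seed * 1664525u + 1013904223u;
        wy[j] = seed;
    }
    to_planes(x, wx);
    to_planes(y, wy);
    add32(z2, x, y);
    add32c(z3, x, y, k);
    from_planes(w2, z2);
    from_planes(w3, z3);
    for (int j = 0; j < BITS; j++) {
        bad += w2[j] != (uint32_t)(wx[j] + wy[j]);
        bad += w3[j] != (uint32_t)(wx[j] + wy[j] + k);
    }
    return bad;
}
```

Calling mismatches() from a small main() with a few constants (all zeros,
all ones, mixed bits) is expected to return 0 each time.  The carry out of
the top bit is simply dropped, which matches addition modulo 2^32.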