Message-ID: <164d201ea9ecbf819d3b7619b33f3433@smtp.hushmail.com>
Date: Mon, 06 Apr 2015 19:42:34 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: New SIMD generations, code layout

On 2015-04-06 15:38, Lei Zhang wrote:
>> On Apr 6, 2015, at 8:33 PM, magnum <john.magnum@...hmail.com> wrote:
>> It looks good. What failure do you get? Your version fails even
>> with AVX2, with "FAILED (cmp_all(5))". That message means keys 0..4
>> were set, crypt_all(5) was called and then cmp_all(5) which did not
>> indicate anything was cracked. So everything worked correctly up to
>> 4, but 5th failed.
> 
> I find the problem to be with sha1_fmt_cmp_all

> In the original code, the stride (between two vtesteq_epi32s) is
> fixed to 4. I think I should adjust the stride according to the SIMD
> width, so I modified the code to what it looks like now. And as you
> mentioned, the new code fails even on AVX2. I just tried to revert it
> back to use the fixed stride of 4, and then it passed the self-test
> on AVX2, which is strange. I don't know why the stride isn't
> adjustable. And I can't try that fixed stride on MIC, because it
> won't guarantee the 64-byte alignment required by MIC.
> 
> Any thoughts?

Hmm, we do the correct thing in sha1_fmt_binary(), don't we? That is what
will later be B in cmp_all(). It looks right to me... No! Here is the
problem, and it's my bug. Look at this code:

	// One preprocessing step, if we calculate E80 rol 2 here, we
	// can compare it against A75 and save 5 rounds in crypt_all().
#if VWIDTH > 4
#if VWIDTH > 8
	result[15] = result[14] = result[13] = result[12] =
	result[11] = result[10] = result[9] = result[8] =
#endif
	result[7] = result[6] = result[5] =
#endif
	result[3] = result[2] = result[1] = result[0] =
		rotateleft(__builtin_bswap32(result[4]) - 0xC3D2E1F0, 2);

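Because of the byte-swap (and the subtract/rotate), the raw result[4] no
longer matches the other lanes, so for VWIDTH > 4 it has to be assigned
too. We need to do it like this: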

  	// One preprocessing step, if we calculate E80 rol 2 here, we
  	// can compare it against A75 and save 5 rounds in crypt_all().
  #if VWIDTH > 4
  #if VWIDTH > 8
  	result[15] = result[14] = result[13] = result[12] =
  	result[11] = result[10] = result[9] = result[8] =
  #endif
- 	result[7] = result[6] = result[5] =
+ 	result[7] = result[6] = result[5] = result[4] =
  #endif
  	result[3] = result[2] = result[1] = result[0] =
  		rotateleft(__builtin_bswap32(result[4]) - 0xC3D2E1F0, 2);
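
For illustration only, here is a minimal standalone sketch of the same idea, written as a loop instead of the chained assignment. It is not the actual JtR code: broadcast_e(), lanes_uniform() and MAX_VWIDTH are hypothetical names, and the lane count is passed at run time rather than taken from VWIDTH. The point it shows is that result[4] is read once as the source and then overwritten like every other lane whenever the width exceeds 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VWIDTH 16 /* hypothetical upper bound on SIMD lanes */

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rotl32(uint32_t x, unsigned n)
{
	return (x << n) | (x >> (32 - n));
}

/*
 * Replicate the preprocessed E word into lanes 0..vwidth-1 of result[].
 * result[] must hold at least 5 words, since result[4] is the source.
 */
static void broadcast_e(uint32_t *result, size_t vwidth)
{
	uint32_t v;
	size_t i;

	assert(result != NULL && vwidth >= 4 && vwidth <= MAX_VWIDTH);

	/* Read result[4] before any lane is overwritten. */
	v = rotl32(__builtin_bswap32(result[4]) - 0xC3D2E1F0u, 2);

	/* For vwidth > 4 this loop also covers index 4, the lane the
	 * original chained assignment skipped. */
	for (i = 0; i < vwidth; i++)
		result[i] = v;
}

/* Return 1 if lanes 0..vwidth-1 all hold the same value, else 0. */
static int lanes_uniform(const uint32_t *result, size_t vwidth)
{
	size_t i;

	for (i = 1; i < vwidth; i++)
		if (result[i] != result[0])
			return 0;
	return 1;
}
```

With the pre-fix chained assignment and a width of 8, a check like lanes_uniform(result, 8) would return 0 because index 4 still holds the raw E word. That is consistent with the self-test failing at the 5th key.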


magnum
