Message-ID: <CABh=JRHNCn1xxMfYOYNj2EhLgLituT_-Apa7x-yS_gDW2NE=oQ@mail.gmail.com>
Date: Sun, 27 Jan 2013 00:35:01 +0200
From: Milen Rangelov <gat3way@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Proposed optimizations to pwsafe
Just a side note: I had a look at your OpenCL pwsafe code and there are
some obvious optimizations that can be done. Some are minor, but the most
important is the following. You have this:
#define Ch(x, y, z) (z ^ (x & (y ^ z)))
#define Maj(x, y, z) ((y & z) | (x & (y | z)))
If you replace those with:
#define Ch(x,y,z) (bitselect(z,y,x))
#define Maj(x,y,z) (bitselect(y, x,(z^y)))
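bitselect(a, b, c) takes each result bit from b where c has a 1 and from a
where c has a 0. So you are effectively using just 1 ALU operation per Ch
as compared to 3, and 2 ALU ops per Maj (the XOR plus the bitselect) as
compared to 4.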
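If you want to convince yourself the replacement macros compute the same thing before touching the kernel, here is a rough host-side sketch in plain C. bitselect32() is my own stand-in for the OpenCL builtin, and the macro and function names (CH_REF, CH_SEL, MAJ_REF, MAJ_SEL, check_macros) are made up for this example; none of this is from the pwsafe code. Since all the operations are bitwise, three pattern words that cover every (x, y, z) bit combination should be enough to check.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-in (an assumption for this sketch) for OpenCL's
 * bitselect(a, b, c): bit from b where c is 1, bit from a where c is 0. */
static uint32_t bitselect32(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & ~c) | (b & c);
}

/* The macros currently in the kernel, fully parenthesized. */
#define CH_REF(x, y, z)  ((z) ^ ((x) & ((y) ^ (z))))
#define MAJ_REF(x, y, z) (((y) & (z)) | ((x) & ((y) | (z))))

/* The proposed replacements, using the stand-in above. */
#define CH_SEL(x, y, z)  (bitselect32((z), (y), (x)))
#define MAJ_SEL(x, y, z) (bitselect32((y), (x), ((z) ^ (y))))

/* Each bit position of (X, Y, Z) holds one of the 8 possible
 * input combinations, so one evaluation covers the full truth table. */
#define X 0xF0F0F0F0u
#define Y 0xCCCCCCCCu
#define Z 0xAAAAAAAAu

/* Aborts via assert() if the two forms ever disagree. */
static void check_macros(void)
{
    assert(CH_SEL(X, Y, Z) == CH_REF(X, Y, Z));
    assert(MAJ_SEL(X, Y, Z) == MAJ_REF(X, Y, Z));
}
```

Calling check_macros() from any small test program should pass silently. This only checks equivalence on the host; whether the compiler actually emits a single instruction for bitselect is something to confirm in the kernel disassembly.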
SHA256 has 64 rounds per block (80 is the SHA512 count), and each round
uses one Ch and one Maj, so at 4 ALU ops saved per round you save 256 ALU
ops per SHA256 block. bitselect is mapped to the hardware instruction
BFI_INT. This is applicable to AMD hardware only, not NVIDIA.
Hope that helps :)