Message-ID: <4F04C136.1070103@hushmail.com>
Date: Wed, 04 Jan 2012 22:14:30 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: SSE/intrinsics for sapB/sapG

On 12/31/2011 11:44 AM, magnum wrote:
> If we had a format that always needed n buffers we could have a GETPOS
> that actually spans n key buffers, and a crypt call (or macro) that do
> all of them. Then I think the fmt.c would not need to handle anything
> specially.

For sapG (intermediate key, up to 248 bytes = 4 limbs) I first tried to
come up with a GETPOS that would do the job, but this did not work well.
Here's how I did it instead:

1. Change GETPOS so we don't write past byte 63 but start over from 0:
Just change the "(i)&(0xffffffff-3))" to "(i)&60)".

2. Allocate a separate buffer for each 64-byte limb:

    unsigned char saved_key[4][80*4*NBKEYS];

3. This is a sample set_key loop:

    while((temp = *key++) && len < PLAINTEXT_LENGTH) {
        saved_key[len>>6][GETPOS(len, index)] = temp;
        len++;
    }

The [len>>6] will place each character in the correct buffer; the rest
is just normal procedure.

4. Save the length in the correct place:

    ((unsigned int*)saved_key[(len+8)>>6])[15*MMX_COEF + (index&3) +
        (index>>2)*80*MMX_COEF] = len << 3;

Here, the (len+8)>>6 will place this length word in the right buffer.
Other than that, just as usual.

5. Now everything is set. There's nothing more to it, except for this
problem:

> But I guess the real problem is if *some* of the keys are shorter than
> 56 bytes and some of them are longer.

That is: if some - but not all - of the keys in a batch are done, they
will be trashed by the next call to SSESHA1body(). I currently solve
this by crypting to a temporary output buffer. If I know a particular
index is "done", I copy it to the final buffer with a small inline
function; in sapG it's called crypt_done(). I keep track of lengths in
an array, call crypt_done() for the indexes that are done, and (only)
if needed, call another crypt.
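To make the indexing in steps 1 to 4 concrete, here is a minimal sketch. It is not the sapG code: the GETPOS macro is an assumption modeled on the usual interleaved SHA-1 layout with the "(i)&60" change applied, MMX_COEF is assumed to be 4, MAX_LEN is a hypothetical cap chosen so that (len+8)>>6 stays below the number of limbs, and limb_bytes() and store_key() are made-up helper names. The buffer is declared as 32-bit words of the same total size so the length store is aligned. The padding byte and the clearing of stale data are left out.

```c
#include <assert.h>
#include <string.h>

#define MMX_COEF 4      /* assumed SSE2 width: 4 interleaved keys */
#define NBKEYS   12     /* "typically 12" */
#define LIMBS    4      /* one buffer per 64-byte limb */
#define MAX_LEN  247    /* hypothetical cap: keeps (len+8)>>6 below LIMBS */

/* Assumed GETPOS, modeled on the usual interleaved SHA-1 layout, with the
 * "(i)&60" change from step 1 so the offset wraps every 64 bytes. */
#define GETPOS(i, index) \
    (((index) & (MMX_COEF - 1)) * 4 + ((i) & 60) * MMX_COEF + \
     (3 - ((i) & 3)) + ((index) >> 2) * 80 * 4 * MMX_COEF)

/* Same size as unsigned char saved_key[4][80*4*NBKEYS]; declared as words
 * here so that the 32-bit length store is properly aligned. */
static unsigned int saved_key[LIMBS][80 * NBKEYS];

/* Byte view of one limb buffer (hypothetical helper). */
static unsigned char *limb_bytes(unsigned int limb)
{
    return (unsigned char *)saved_key[limb];
}

/* Steps 3 and 4: scatter the key bytes over the limb buffers, then store
 * the bit length in word 15 of the limb that ends the message. */
static unsigned int store_key(const char *key, unsigned int index)
{
    unsigned int len = 0;
    unsigned char temp;

    assert(index < NBKEYS);
    while ((temp = (unsigned char)*key++) && len < MAX_LEN) {
        /* len>>6 selects the limb, GETPOS the offset inside it */
        limb_bytes(len >> 6)[GETPOS(len, index)] = temp;
        len++;
    }
    saved_key[(len + 8) >> 6][15 * MMX_COEF + (index & 3) +
                              (index >> 2) * 80 * MMX_COEF] = len << 3;
    return len;
}
```

With these assumptions, byte 64 of a key lands at the same offset as byte 0, only in saved_key[1] instead of saved_key[0], and a 55-byte key keeps its length word in limb 0 while a 56-byte key moves it to limb 1.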
No matter how we improve this, we will always have the problem that if
just one key out of NBKEYS (which is typically 12) needs another crypt,
we will get what could be viewed as a "12x slowdown" instead of the
"1/12x slowdown" that would happen with 1x code. But for sapG, this
does not seem to be much of a problem - nearly all keys need two crypts.

magnum
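The batching cost and the crypt_done() bookkeeping described above can be sketched as follows. This is an illustration under assumptions, not the sapG code: body_stub() is a hypothetical stand-in for the full-width SSE call and only stamps the pass number into a temporary output, limbs_needed() and crypt_batch() are made-up names, and the real outputs are hashes rather than single words.

```c
#include <string.h>

#define NBKEYS 12   /* keys per batch, "typically 12" */

static unsigned int tmp_out[NBKEYS];    /* temporary output, rewritten by every pass */
static unsigned int final_out[NBKEYS];  /* results that must survive later passes */

/* Hypothetical stand-in for one full-width SSE crypt call: it always
 * processes all NBKEYS lanes, whether or not a lane still needs it. */
static void body_stub(unsigned int pass)
{
    unsigned int i;
    for (i = 0; i < NBKEYS; i++)
        tmp_out[i] = pass + 1;
}

/* Copy a finished lane out before the next pass can overwrite it. */
static void crypt_done(unsigned int index)
{
    final_out[index] = tmp_out[index];
}

/* Number of 64-byte limbs a key of this length occupies, counting the
 * limb that holds the length word. */
static unsigned int limbs_needed(unsigned int len)
{
    return ((len + 8) >> 6) + 1;
}

/* Run passes until every lane is done; returns the number of body calls. */
static unsigned int crypt_batch(const unsigned int *lens)
{
    unsigned int pass = 0, remaining = NBKEYS, i;

    while (remaining) {
        body_stub(pass);
        pass++;
        for (i = 0; i < NBKEYS; i++) {
            if (limbs_needed(lens[i]) == pass) {
                crypt_done(i);
                remaining--;
            }
        }
    }
    return pass;
}
```

In this sketch, a batch of twelve 10-byte keys takes one body call, while the same batch with a single 60-byte key takes two full-width calls; the eleven short lanes keep the result copied out after the first pass.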