Message-ID: <20130725051805.GA13040@openwall.com>
Date: Thu, 25 Jul 2013 09:18:05 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: bcrypt

Katja -

On Thu, Jul 25, 2013 at 06:02:52AM +0400, Solar Designer wrote:
> |         "ldr r27, [r45], 0x1\n" \

I guess this is a read from the P-box.  You should be able to use ldrd
here, and thus only have this instruction in every other round (a total
of 9 instructions to read the 18 elements).  Don't forget that ldrd
needs an even-numbered first register.
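
To illustrate the instruction count only, here is a minimal portable C
sketch in which one 8-byte copy stands in for each ldrd.  The function
name load_words_pairwise() is hypothetical, not something from the
existing code, and a real doubleword load would presumably also want the
array to be 8-byte aligned:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t BF_word;

/*
 * Hypothetical helper: copy n 32-bit words from src to dst two at a time.
 * Each 8-byte memcpy() into "pair" models a single doubleword load, so
 * 18 P-box words cost 9 loads instead of 18.  Returns the load count.
 */
static int load_words_pairwise(const BF_word *src, BF_word *dst, int n)
{
    int loads = 0;

    assert(n % 2 == 0); /* pairs only, like the even/odd register pair */

    for (int i = 0; i < n; i += 2) {
        uint64_t pair;
        memcpy(&pair, &src[i], sizeof(pair)); /* one "ldrd" */
        memcpy(&dst[i], &pair, sizeof(pair)); /* split into two words */
        loads++;
    }
    return loads;
}
```

With n = 18 this performs 9 loads, which is the "every other round"
pattern described above.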

... In fact, for calls to BF_encrypt() where its "start" to "end" range
does not cover "P", you may load the entire P into registers before
BF_encrypt()'s loop.  Unfortunately, the call inside the main loop does
cover both "P" and "S", so you'd have to split it in order to take
advantage of this optimization (and it'd increase code size).  Yet I
think this is worth it.

Note that "load the entire P into registers" may mean something like:

	BF_word P0 = ctx->s.P[0];
...
	BF_word P17 = ctx->s.P[17];

and then specifying those variables as "r" (P0), etc. operands in the
"asm" block.
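
As a rough sketch of the same idea in plain C (no inline asm, so it
builds anywhere), the following compares a loop that reads P from memory
in every round against an unrolled variant that copies P into 18 locals
first.  struct BF_ctx_sketch, BF_F, BF_ROUND and both function names are
assumptions made for this illustration, and the context layout is
simplified compared to ctx->s.P above.  In the asm version each local
would become an "r" operand instead:

```c
#include <stdint.h>

typedef uint32_t BF_word;

/* Simplified, hypothetical context layout for this sketch only. */
struct BF_ctx_sketch {
    BF_word P[18];
    BF_word S[4][256];
};

/* Blowfish F function: S-box lookups on the four bytes of x. */
#define BF_F(ctx, x) \
    ((((ctx)->S[0][(x) >> 24] + (ctx)->S[1][((x) >> 16) & 0xff]) ^ \
      (ctx)->S[2][((x) >> 8) & 0xff]) + (ctx)->S[3][(x) & 0xff])

/* One round without the swap; callers alternate the roles of a and b. */
#define BF_ROUND(ctx, a, b, Pn) \
    do { (a) ^= (Pn); (b) ^= BF_F(ctx, a); } while (0)

/* Reference variant: P is read from memory in every round. */
static void encrypt_P_in_memory(const struct BF_ctx_sketch *ctx,
                                BF_word *L, BF_word *R)
{
    BF_word l = *L, r = *R, t;

    for (int i = 0; i < 16; i++) {
        l ^= ctx->P[i];
        r ^= BF_F(ctx, l);
        t = l; l = r; r = t;
    }
    t = l; l = r; r = t; /* undo the last swap */
    r ^= ctx->P[16];
    l ^= ctx->P[17];
    *L = l;
    *R = r;
}

/* Variant with the entire P loaded into locals before the rounds. */
static void encrypt_P_in_locals(const struct BF_ctx_sketch *ctx,
                                BF_word *L, BF_word *R)
{
    BF_word P0 = ctx->P[0], P1 = ctx->P[1], P2 = ctx->P[2];
    BF_word P3 = ctx->P[3], P4 = ctx->P[4], P5 = ctx->P[5];
    BF_word P6 = ctx->P[6], P7 = ctx->P[7], P8 = ctx->P[8];
    BF_word P9 = ctx->P[9], P10 = ctx->P[10], P11 = ctx->P[11];
    BF_word P12 = ctx->P[12], P13 = ctx->P[13], P14 = ctx->P[14];
    BF_word P15 = ctx->P[15], P16 = ctx->P[16], P17 = ctx->P[17];
    BF_word l = *L, r = *R;

    BF_ROUND(ctx, l, r, P0);  BF_ROUND(ctx, r, l, P1);
    BF_ROUND(ctx, l, r, P2);  BF_ROUND(ctx, r, l, P3);
    BF_ROUND(ctx, l, r, P4);  BF_ROUND(ctx, r, l, P5);
    BF_ROUND(ctx, l, r, P6);  BF_ROUND(ctx, r, l, P7);
    BF_ROUND(ctx, l, r, P8);  BF_ROUND(ctx, r, l, P9);
    BF_ROUND(ctx, l, r, P10); BF_ROUND(ctx, r, l, P11);
    BF_ROUND(ctx, l, r, P12); BF_ROUND(ctx, r, l, P13);
    BF_ROUND(ctx, l, r, P14); BF_ROUND(ctx, r, l, P15);

    /* Final whitening; the halves come out exchanged. */
    l ^= P16;
    r ^= P17;
    *L = r;
    *R = l;
}
```

This variant is only valid for calls whose "start" to "end" range does
not overwrite P, since the locals would otherwise go stale.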

Alexander
