Message-ID: <20130725051805.GA13040@openwall.com>
Date: Thu, 25 Jul 2013 09:18:05 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: bcrypt

Katja -

On Thu, Jul 25, 2013 at 06:02:52AM +0400, Solar Designer wrote:
> |         "ldr r27, [r45], 0x1\n" \

I guess this is read from the P-box.  You should be able to use ldrd
here, and thus only have this instruction in every other round (a total
of 9 instructions to read the 18 elements).  Don't forget that ldrd
needs an even-numbered first register.

... In fact, for calls to BF_encrypt() where its "start" to "end" range
does not cover "P", you may load the entire P into registers before
BF_encrypt()'s loop.  Unfortunately, the call inside the main loop does
cover both "P" and "S", so you'd have to split it in order to take
advantage of this optimization (and it'd increase code size).  Yet I
think this is worth it.

Note that "load the entire P into registers" may mean something like:

	BF_word P0 = ctx->s.P[0];
	...
	BF_word P17 = ctx->s.P[17];

and specifying those variable names as "r" (P0), etc. in the "asm"
block.

Alexander
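
A minimal sketch of the "load the entire P into registers" idea, written
in portable C rather than Epiphany assembly so that it can be compiled
and checked on any host.  The BF_ctx layout, bf_f(), the function names
and the pseudo-random test data are all assumptions made for
illustration; none of them is taken from the actual code under
discussion.  The sketch only shows that hoisting P into 18 local
variables before the unrolled rounds gives the same result as reading P
from memory in every round.

```c
#include <stdint.h>

typedef uint32_t BF_word;

/* Hypothetical context layout, modeled on the ctx->s.P naming above. */
typedef struct {
    struct {
        BF_word P[18];
        BF_word S[4][256];
    } s;
} BF_ctx;

/* Standard Blowfish F function over the four S-boxes. */
static BF_word bf_f(const BF_ctx *ctx, BF_word x)
{
    return ((ctx->s.S[0][x >> 24] + ctx->s.S[1][(x >> 16) & 0xff])
            ^ ctx->s.S[2][(x >> 8) & 0xff]) + ctx->s.S[3][x & 0xff];
}

/* Baseline: P is read from memory in every round. */
static void bf_encrypt_mem(const BF_ctx *ctx, BF_word *L, BF_word *R)
{
    BF_word l = *L ^ ctx->s.P[0], r = *R;
    for (int i = 1; i <= 16; i += 2) {
        r ^= bf_f(ctx, l) ^ ctx->s.P[i];
        l ^= bf_f(ctx, r) ^ ctx->s.P[i + 1];
    }
    *L = r ^ ctx->s.P[17];
    *R = l;
}

/* Two Blowfish rounds using P values that already live in locals. */
#define BF_ROUND2(Pa, Pb) \
    do { \
        r ^= bf_f(ctx, l) ^ (Pa); \
        l ^= bf_f(ctx, r) ^ (Pb); \
    } while (0)

/* Variant: all 18 P words are loaded once, before the rounds.
 * Valid only while nothing writes to ctx->s.P during the call. */
static void bf_encrypt_hoisted(const BF_ctx *ctx, BF_word *L, BF_word *R)
{
    const BF_word P0 = ctx->s.P[0],   P1 = ctx->s.P[1],   P2 = ctx->s.P[2];
    const BF_word P3 = ctx->s.P[3],   P4 = ctx->s.P[4],   P5 = ctx->s.P[5];
    const BF_word P6 = ctx->s.P[6],   P7 = ctx->s.P[7],   P8 = ctx->s.P[8];
    const BF_word P9 = ctx->s.P[9],   P10 = ctx->s.P[10], P11 = ctx->s.P[11];
    const BF_word P12 = ctx->s.P[12], P13 = ctx->s.P[13], P14 = ctx->s.P[14];
    const BF_word P15 = ctx->s.P[15], P16 = ctx->s.P[16], P17 = ctx->s.P[17];
    BF_word l = *L ^ P0, r = *R;

    BF_ROUND2(P1, P2);   BF_ROUND2(P3, P4);
    BF_ROUND2(P5, P6);   BF_ROUND2(P7, P8);
    BF_ROUND2(P9, P10);  BF_ROUND2(P11, P12);
    BF_ROUND2(P13, P14); BF_ROUND2(P15, P16);

    *L = r ^ P17;
    *R = l;
}

/* Fill the context with arbitrary deterministic data (an LCG).
 * These are NOT the real Blowfish initial constants. */
static void bf_fill_test_ctx(BF_ctx *ctx)
{
    uint32_t x = 12345u;
    for (int i = 0; i < 18; i++)
        ctx->s.P[i] = x = x * 1664525u + 1013904223u;
    for (int b = 0; b < 4; b++)
        for (int i = 0; i < 256; i++)
            ctx->s.S[b][i] = x = x * 1664525u + 1013904223u;
}

/* Returns 1 if both variants produce the same output block. */
int bf_selfcheck(void)
{
    BF_ctx ctx;
    BF_word L1 = 0x01234567u, R1 = 0x89abcdefu;
    BF_word L2 = L1, R2 = R1;

    bf_fill_test_ctx(&ctx);
    bf_encrypt_mem(&ctx, &L1, &R1);
    bf_encrypt_hoisted(&ctx, &L2, &R2);
    return L1 == L2 && R1 == R2;
}
```

In an inline-asm version, P0 through P17 would be the variables named
as "r" (P0), etc. operands.  The hoisting is only safe for calls whose
"start" to "end" range does not cover "P", since the local copies would
otherwise go stale once P is overwritten.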