Message-ID: <20130725153558.GA15090@openwall.com>
Date: Thu, 25 Jul 2013 19:35:58 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: bcrypt

Katja, Yaniv -

On Thu, Jul 25, 2013 at 08:06:48AM +0400, Solar Designer wrote:
> Maybe you'll come up with another clever/crazy idea on how to do right
> shifts with Epiphany's FPU instructions (like I mentioned, replacing one
> right shift with multiple FPU instructions is OK).

Here's an idea: use LDRB (load byte) instead of the right shifts by 6
and 14 bits, then use IMUL or IMADD to shift left by 2 bits (emulate the
non-existent index scaling on further loads, off-loading it to the FPU).

I think Epiphany's ISA registers are memory-mapped, so we can use LDRB
directly from the address of the register holding the L or R variable.
Yaniv - is this correct? Even if not, we can do a 32-bit store and we
still save 1 cycle.

Right now, we have these two right shifts and two ANDs. We replace
these four with one 32-bit store (if we have to, but I think we don't -
see above), two LDRBs, and two IMULs or IMADDs (but these are free for
us, since the FPU would otherwise be idle). So that's 3 (or 2) non-free
instructions instead of 4.

Yaniv - which is better: IMUL followed by simple LDR (no index) or IMADD
followed by LDR with index? In other words, is it better to use the
adder on the FPU or the adder in the IALU for our address calculation,
when we have the choice to use either? I think the code will run at the
same speed either way, but maybe there's a difference in power usage and
heat production by the chip?

Katja - I don't mention the right shift by 22 bits above, because this
one is easily replaced with right shift by 24 and IMUL (or IMADD) as I
pointed out in another message. So we avoid the AND for this one even
without having to use LDRB. You may try the LDRB approach for it as
well, but I think the right shift by 24 approach will result in either
the same or slightly better speed (I think loads have 1 cycle greater
latency, so they reduce your flexibility a little bit, compared to LSR).

Alexander
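
The equivalence the message relies on can be sketched in portable C. This is only an illustration, not code from the john-dev tree. It assumes the two ANDs mask with 0x3FC (a word-aligned byte offset into a 256-entry table of 32-bit words, as is common in bcrypt implementations) and that the 32-bit store is little-endian. All names here (sbox_offsets, offsets_shift_and, offsets_byte_load, store32_le, check_offsets) are hypothetical, and the multiplication by 4 stands in for the IMUL that would run on the otherwise idle FPU.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offsets for three of the S-box lookups (hypothetical struct). */
typedef struct {
    uint32_t top;   /* from bits 31..24 */
    uint32_t hi;    /* from bits 23..16 */
    uint32_t lo;    /* from bits 15..8  */
} sbox_offsets;

/* Current approach: right shifts plus ANDs (mask 0x3FC is an assumption). */
static sbox_offsets offsets_shift_and(uint32_t x)
{
    sbox_offsets o;
    o.top = (x >> 22) & 0x3FC;
    o.hi  = (x >> 14) & 0x3FC;
    o.lo  = (x >> 6) & 0x3FC;
    return o;
}

/* Models a little-endian 32-bit store, independent of the host byte order. */
static void store32_le(uint8_t *p, uint32_t x)
{
    p[0] = (uint8_t)x;
    p[1] = (uint8_t)(x >> 8);
    p[2] = (uint8_t)(x >> 16);
    p[3] = (uint8_t)(x >> 24);
}

/* Proposed approach: byte loads plus a multiply by 4, no ANDs. */
static sbox_offsets offsets_byte_load(uint32_t x)
{
    uint8_t mem[4];
    sbox_offsets o;
    store32_le(mem, x);      /* the one 32-bit store */
    o.top = (x >> 24) * 4u;  /* shift by 24, then multiply; no AND needed */
    o.hi  = mem[2] * 4u;     /* byte load replaces shift by 14 and AND */
    o.lo  = mem[1] * 4u;     /* byte load replaces shift by 6 and AND */
    return o;
}

/* Asserts that both approaches agree for one input value. */
static void check_offsets(uint32_t x)
{
    sbox_offsets a = offsets_shift_and(x);
    sbox_offsets b = offsets_byte_load(x);
    assert(a.top == b.top && a.hi == b.hi && a.lo == b.lo);
}
```

Calling check_offsets on a few values (for example 0, 0xFFFFFFFF and 0x12345678) exercises the identity. Whether the byte loads actually save a cycle on Epiphany depends on the instruction timings discussed above, which this sketch does not model.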