Message-ID: <20130526184250.GA22875@openwall.com>
Date: Sun, 26 May 2013 22:42:50 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: bcrypt

Hi Katja,

On Sun, May 26, 2013 at 08:31:15PM +0200, Katja Malvoni wrote:
> I tested the performance of bcrypt on one Epiphany core.

Thank you!

> These are the results:
>
> SIZE-OPTIMIZED bcrypt implementation compiled with -O1
> Message from eCore 0x88a ( 2, 2): Result:
> "$2a$05$CCCCCCCCCCCCCCCCCCCCC.E5YPO9kmyuRGyh0XouQYb4YMJKvyOeW"#
> Execution time - Epiphany: 50.024000 ms
[...]
> ORIGINAL bcrypt implementation compiled with -O1
> Message from eCore 0x88a ( 2, 2): Result:
> "$2a$05$CCCCCCCCCCCCCCCCCCCCC.E5YPO9kmyuRGyh0XouQYb4YMJKvyOeW"#
> Execution time - Epiphany: 47.794000 ms

These are roughly 5 times slower than they're "supposed" to be.

50 ms means 20 c/s, the speed JtR achieved at bcrypt on Pentium 120 MHz
when I first implemented and optimized the assembly code for the
Pentium.  Each Epiphany core is similar to the original Pentium in terms
of its processing power per-MHz (also dual-issue, and the original
Pentium needed the loads to be done with separate instructions in a
RISC-like fashion for optimal performance).  However, the clock rate on
the Epiphany prototypes we're using is 600 MHz, which is 5 times higher.
So with optimal code, we should expect to get 10 ms and 100 c/s.

This may require assembly programming, especially given that e-gcc
generated code probably keeps the FPU in floating-point mode, so we're
effectively using the cores as single-issue.  (This problem did not
exist in the original Pentium since it had two integer ALUs separate
from the FPU... but this sort of design would lower the efficiency of
Epiphany cores.)

For now, can you try -O2 instead of -O1?
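
The arithmetic behind the 20 c/s and 100 c/s figures can be sketched as follows. This is only a back-of-the-envelope illustration, not code from JtR: the function and constant names are hypothetical, and it assumes that throughput scales linearly with clock rate when per-MHz performance is equal, as the Pentium comparison above does.

```python
# Back-of-the-envelope check of the speed figures discussed above.
# All names here are illustrative; nothing below comes from JtR itself.

PENTIUM_MHZ = 120.0    # clock of the Pentium used as the reference point
EPIPHANY_MHZ = 600.0   # clock of the Epiphany prototype cores


def ms_to_cps(ms_per_hash):
    """Convert time per bcrypt hash (milliseconds) to hashes per second (c/s)."""
    return 1000.0 / ms_per_hash


# Assumption: equal work per clock cycle, so speed scales with the clock ratio.
clock_ratio = EPIPHANY_MHZ / PENTIUM_MHZ

# Measured single-core timings reported in the quoted message.
measured = (("size-optimized, -O1", 50.024), ("original, -O1", 47.794))
for label, ms in measured:
    print(f"{label}: {ms:.3f} ms -> {ms_to_cps(ms):.2f} c/s")

# Rounded estimate: about 50 ms now, divided by the clock ratio.
baseline_ms = 50.0
target_ms = baseline_ms / clock_ratio
print(f"clock ratio: {clock_ratio:.0f}x")
print(f"target: {target_ms:.0f} ms -> {ms_to_cps(target_ms):.0f} c/s")
```

The measured timings land near 20 c/s, while the target line shows the roughly 10 ms and 100 c/s that the clock ratio alone would suggest for one core.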

> ORIGINAL bcrypt implementation using legacy.ldf
> Message from eCore 0x88a ( 2, 2): Result:
> "$2a$05$CCCCCCCCCCCCCCCCCCCCC.E5YPO9kmyuRGyh0XouQYb4YMJKvyOeW"#
> Execution time - Epiphany: 40921.396000 ms

Ouch. :-)  Is this with both code and data in external RAM?

> I also ran the runtime self-test and it returned correct result.

Sounds good.  Obviously, we need to exclude this self-test for JtR
integration.  JtR performs its own self-test.

So your next steps may be:

1. Try -O2 and report the speed numbers in here.

2. Use all Epiphany cores, not just 1.

3. Integrate with JtR.

Steps 2 and 3 may be combined.

Alexander