Message-ID: <CA+EaD-aM3X1V_zAuR-MTsPrd0sZbYtLTUDirdJ+av+PV5vA=Sw@mail.gmail.com>
Date: Thu, 25 Jul 2013 17:13:09 +0200
From: Katja Malvoni <kmalvoni@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: bcrypt
On Thu, Jul 25, 2013 at 4:38 PM, Solar Designer <solar@...nwall.com> wrote:
> I haven't looked at the new code revision yet, but it sounds like you
> misunderstood me. I was referring to the case of having all 16 rounds
> within one asm block, roughly like this:
>
> #define BF_2ROUNDS \
> "asm code string " \
> "corresponding " \
> "to two rounds here"
>
> then in the function:
>
> __asm__ (
> "some init code"
> BF_2ROUNDS
> BF_2ROUNDS
> BF_2ROUNDS
> BF_2ROUNDS
> BF_2ROUNDS
> BF_2ROUNDS
> BF_2ROUNDS
> BF_2ROUNDS
> "some final code"
> : ...
> : ...
> : ...
> );
>
> So the 2 rounds' strings would all be concatenated into one string by
> the compiler. With this, there's no room for the compiler to get too
> smart and generate the extra MOVs you mention.
>
Ok, committed now.
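
For reference, here is a minimal sketch of the concatenation idea in portable C. It is not the committed code: the "init", "round_a"/"round_b" and "final" strings are hypothetical placeholders for the real Epiphany instructions, and bf_template, count_occurrences and check_template are names made up for this illustration. It only shows that eight expansions of BF_2ROUNDS end up as one string literal, which is what the single __asm__ block receives as its template.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder text standing in for the asm of two Blowfish rounds
 * (hypothetical; the real macro would hold Epiphany instructions). */
#define BF_2ROUNDS \
    "round_a\n" \
    "round_b\n"

/* Adjacent string literals are merged into a single literal at compile
 * time, so this whole initializer is one string, just like the template
 * of a single __asm__ block would be. */
const char bf_template[] =
    "init\n"
    BF_2ROUNDS
    BF_2ROUNDS
    BF_2ROUNDS
    BF_2ROUNDS
    BF_2ROUNDS
    BF_2ROUNDS
    BF_2ROUNDS
    BF_2ROUNDS
    "final\n";

/* Count non-overlapping occurrences of needle in haystack. */
size_t count_occurrences(const char *haystack, const char *needle)
{
    size_t count = 0;
    size_t len = strlen(needle);
    const char *p;

    if (len == 0)
        return 0;
    for (p = strstr(haystack, needle); p != NULL; p = strstr(p + len, needle))
        count++;
    return count;
}

/* Eight expansions of the two-round macro give 16 rounds in one string. */
void check_template(void)
{
    assert(count_occurrences(bf_template, "round_a\n") == 8);
    assert(count_occurrences(bf_template, "round_b\n") == 8);
}
```

In the real function the same sequence of literals would sit inside __asm__ ( ... ) followed by the output, input and clobber lists, so the compiler sees all 16 rounds as one opaque block and has no place to insert instructions between them.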
> BTW, you may try putting the most frequently used variables into
> registers r0 through r7, since instructions with only those low-numbered
> registers are smaller (16 bits).
>
I tried it, but it didn't change the speed: still 976 c/s.
Katja