Message-ID: <CA+EaD-aciOW=j4e-zDj73ZgspP0Mpi8ScFnPYmYKABSzCMi-ww@mail.gmail.com>
Date: Tue, 30 Jul 2013 22:57:09 +0200
From: Katja Malvoni <kmalvoni@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: bcrypt
Hello,
I implemented preloading of the second P array; the code is committed. I
freed one register by reusing one of the temporaries and another by
replacing two offsets with a single pointer. I'm getting 1192 c/s. I
expected a higher speed, and I think the reason is that something is not
dual-issued in the second BF_ROUND of the second instance in the macro.
At the end of the macro, the load from the P array was ensuring a 4-cycle
separation between the iadd and the corresponding eor. I still haven't
figured out what is not dual-issued and why.
Katja
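
For readers unfamiliar with the technique, the P preload for two
interleaved bcrypt instances can be sketched in portable C as below. This
is only an illustration of the data flow, not the committed Epiphany
assembly: the names bf_ctx, bf_f, BF_ROUND_X2, bf_encrypt_x2, bf_fill and
bf_selfcheck are hypothetical, the tables are filled with arbitrary test
data rather than the real Blowfish constants, and a C compiler is free to
reorder the loads, so nothing here reproduces the dual-issue behaviour
discussed above.

```c
#include <stdint.h>

#define BF_N 16

typedef struct {
    uint32_t P[BF_N + 2];
    uint32_t S[4][256];
} bf_ctx;

/* Blowfish F function: ((S0[a] + S1[b]) ^ S2[c]) + S3[d] */
static uint32_t bf_f(const bf_ctx *c, uint32_t x)
{
    return ((c->S[0][x >> 24] + c->S[1][(x >> 16) & 0xff])
            ^ c->S[2][(x >> 8) & 0xff]) + c->S[3][x & 0xff];
}

/* Reference: one instance, P[i] loaded inside the round that uses it. */
static void bf_encrypt_ref(const bf_ctx *c, uint32_t lr[2])
{
    uint32_t l = lr[0] ^ c->P[0], r = lr[1];
    for (int i = 1; i <= BF_N; i += 2) {
        r ^= bf_f(c, l) ^ c->P[i];
        l ^= bf_f(c, r) ^ c->P[i + 1];
    }
    lr[0] = r ^ c->P[BF_N + 1];
    lr[1] = l;
}

/*
 * One round for two instances. p0/p1 already hold the P entry for this
 * round; the entry for the next round is loaded here, one round early.
 * Relies on c0, c1, p0, p1 from the enclosing scope.
 */
#define BF_ROUND_X2(a0, b0, a1, b1, next) do { \
    uint32_t f0 = bf_f(c0, (a0));              \
    uint32_t f1 = bf_f(c1, (a1));              \
    (b0) ^= f0 ^ p0;                           \
    p0 = c0->P[(next)];                        \
    (b1) ^= f1 ^ p1;                           \
    p1 = c1->P[(next)];                        \
} while (0)

/* Two interleaved instances with P preloaded one round ahead. */
static void bf_encrypt_x2(const bf_ctx *c0, const bf_ctx *c1,
                          uint32_t lr0[2], uint32_t lr1[2])
{
    uint32_t l0 = lr0[0] ^ c0->P[0], r0 = lr0[1];
    uint32_t l1 = lr1[0] ^ c1->P[0], r1 = lr1[1];
    uint32_t p0 = c0->P[1], p1 = c1->P[1];  /* preload for round 1 */

    for (int i = 1; i <= BF_N; i += 2) {
        BF_ROUND_X2(l0, r0, l1, r1, i + 1);
        BF_ROUND_X2(r0, l0, r1, l1, i + 2);
    }
    /* After the last round p0/p1 hold P[17], used for the final XOR. */
    lr0[0] = r0 ^ p0; lr0[1] = l0;
    lr1[0] = r1 ^ p1; lr1[1] = l1;
}

/* Fill a context with arbitrary test data (NOT the Blowfish constants). */
static void bf_fill(bf_ctx *c, uint32_t seed)
{
    uint32_t x = seed;
    for (int i = 0; i < BF_N + 2; i++)
        c->P[i] = x = x * 1664525u + 1013904223u;
    for (int j = 0; j < 4; j++)
        for (int i = 0; i < 256; i++)
            c->S[j][i] = x = x * 1664525u + 1013904223u;
}

/* Returns 1 if the interleaved version matches the reference. */
static int bf_selfcheck(void)
{
    static bf_ctx c0, c1;
    uint32_t a[2] = { 0x01234567u, 0x89abcdefu }, a_ref[2];
    uint32_t b[2] = { 0xdeadbeefu, 0x0badf00du }, b_ref[2];

    bf_fill(&c0, 1u);
    bf_fill(&c1, 2u);
    a_ref[0] = a[0]; a_ref[1] = a[1];
    b_ref[0] = b[0]; b_ref[1] = b[1];

    bf_encrypt_ref(&c0, a_ref);
    bf_encrypt_ref(&c1, b_ref);
    bf_encrypt_x2(&c0, &c1, a, b);

    return a[0] == a_ref[0] && a[1] == a_ref[1]
        && b[0] == b_ref[0] && b[1] == b_ref[1];
}
```

Calling bf_selfcheck() from a main() should return 1, since both versions
compute the same rounds and differ only in when each P entry is loaded.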