Message-ID: <CAAepdCY0eu6P26447xnKJAiPvRTFPHJbbMxMRQGhZprCmXfRag@mail.gmail.com>
Date: Wed, 4 Sep 2013 04:54:55 +0100
From: Rafael Waldo Delgado Doblas <lord.rafa@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: Litecoin mining
Hello Alexander,
2013/9/4 Rafael Waldo Delgado Doblas <lord.rafa@...il.com>
> In addition, as you asked, this is the work I did today:
>
> I implemented a couple of salsa20_8 asm versions:
> The first one has the loops "Bor[i] = Bout[i] = (B[i] ^ Bx[i]);" and
> "Bout[i] += Bor[i];" rolled and uses the imadd instruction. It saves about
> 250 B, but performance drops by almost 0.5 khash/s.
> The second one keeps the loops unrolled and uses the imadd instruction. It
> saves only 50 B, but performance also drops by almost 0.5 khash/s.
>
> At this point it looks like the imadd instruction is too slow to be useful,
> but rolling the loops could be nice.
>
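For reference, the two loops quoted above look roughly like this in C when rolled. This is only a sketch: the array names come from the quoted statements, while the function names and the 16-word block size (the usual Salsa20 block of sixteen 32-bit words) are my assumptions, and the real code is in asm.

```c
#include <stdint.h>

/* Hypothetical helper: XOR the two input blocks and keep a copy
 * (Bor) of the result for the final add-back step. */
void xor_in_rolled(uint32_t Bout[16], uint32_t Bor[16],
                   const uint32_t B[16], const uint32_t Bx[16])
{
    for (int i = 0; i < 16; i++)
        Bor[i] = Bout[i] = (B[i] ^ Bx[i]);
}

/* Hypothetical helper: add the saved copy back into the output
 * once the Salsa20/8 rounds have run on Bout. */
void add_back_rolled(uint32_t Bout[16], const uint32_t Bor[16])
{
    for (int i = 0; i < 16; i++)
        Bout[i] += Bor[i];
}
```

The unrolled versions simply repeat each loop body 16 times with constant indices, which is where the code size difference comes from.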
BTW, I was debugging with e-gdb and found that even when I use the imadd
instruction, the binary image uses fmadd. This is not nice at all, because
floating-point math gives the wrong answer in this real scenario:
r44 0x80 128 1.79366203e-43
r61 0x1e 30 4.20389539e-44
r60 0x0 0 0
fmadd r60,r61,r44
It gives the following erroneous result:
r44 0x80 128 1.79366203e-43
r61 0x1e 30 4.20389539e-44
r60 0x0 0 0
Instead of the correct one:
r44 0x80 128 1.79366203e-43
r61 0x1e 30 4.20389539e-44
r60 0xF00 3840 ???????
Is there any way to use integer math? If not, there is no way to get an
improvement with this instruction.
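To show why the float version leaves r60 at 0, here is a small C sketch that treats the same register bit patterns first as IEEE-754 single-precision values and then as integers. The function names are made up for illustration, and this runs on a host machine rather than on the Epiphany core, so it only models the arithmetic. As floats, 0x80 and 0x1e are tiny denormals (the 1.79e-43 and 4.2e-44 that gdb prints), and their product underflows to zero.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float without changing the bits. */
static float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Reinterpret a float as its 32-bit pattern. */
static uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical model of the float case: acc + a * b computed on the
 * register contents interpreted as single-precision floats. */
uint32_t madd_as_float(uint32_t acc, uint32_t a, uint32_t b)
{
    float r = bits_to_float(acc) + bits_to_float(a) * bits_to_float(b);
    return float_to_bits(r);
}

/* Hypothetical model of the integer case: acc + a * b on plain integers. */
uint32_t madd_as_int(uint32_t acc, uint32_t a, uint32_t b)
{
    return acc + a * b;
}
```

With the values from the dump, madd_as_float(0x0, 0x1e, 0x80) returns 0x0, while madd_as_int(0x0, 0x1e, 0x80) returns 0xf00 (3840), which is the result I expected in r60.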
Regards,
Rafael.