Message-ID: <CAFYn=yCuO_C-ce9Wgo2ha-jdEx7Shr-m0ENLmxj-ZHiiAdngFg@mail.gmail.com>
Date: Thu, 29 Aug 2013 14:54:22 -0400
From: Yaniv Sapir <yaniv@...pteva.com>
To: john-dev <john-dev@...ts.openwall.com>
Subject: Re: Parallella: Litecoin mining
On Wed, Aug 28, 2013 at 9:37 PM, Solar Designer <solar@...nwall.com> wrote:
> Does this mean that replacing memcpy() improved the overall speed by as
> much as 15% or so? If so, this suggests that the code wastes too much
> time copying data, and needs to be revised at higher level (than memcpy()
> itself), in addition to optimizing memcpy().
>
> Also, I just took a look at your currently committed code - your
> memcpy() replacement, at least at source code level, copies data byte by
> byte. This is very slow, unless the compiler optimizes this into 32-bit
> or 64-bit loads and stores somehow. I doubt that replacing memcpy()
> with this implementation of blkcpy() provided any speedup (but I could
> be wrong - weird things happen).
>
> Ideally, your blkcpy() should be a partially unrolled loop with LDRD and
> STRD instructions in it, and all of the data needs to be 8 byte aligned.
Absolutely. Newlib's memcpy() implementation copies either bytes or words,
depending on the alignment of the pointers, but never shorts or doubles.
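
For illustration only, here is a rough sketch of the kind of copy loop
described above: 64-bit loads and stores, unrolled four times. It is not the
committed code. The name blkcpy64() is hypothetical, and the sketch assumes
that both pointers are 8-byte aligned and that the length is a multiple of
32 bytes. Whether the compiler actually turns each 64-bit access into LDRD
and STRD depends on the toolchain and optimization flags, so the generated
assembly would need to be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch, not the committed blkcpy().
 * Assumptions: dst and src are 8-byte aligned, do not overlap, and
 * len is a multiple of 32 bytes (four 64-bit words per iteration).
 */
static void blkcpy64(void *dst, const void *src, size_t len)
{
    uint64_t *d = dst;
    const uint64_t *s = src;
    size_t n = len / 32;

    /* Catch violations of the stated assumptions early. */
    assert(((uintptr_t)dst & 7) == 0);
    assert(((uintptr_t)src & 7) == 0);
    assert(len % 32 == 0);

    while (n--) {
        /* Issue all four loads before the stores so they can pipeline. */
        uint64_t w0 = s[0];
        uint64_t w1 = s[1];
        uint64_t w2 = s[2];
        uint64_t w3 = s[3];

        d[0] = w0;
        d[1] = w1;
        d[2] = w2;
        d[3] = w3;

        s += 4;
        d += 4;
    }
}
```

A caller would declare its buffers as uint64_t arrays (or otherwise force
8-byte alignment) and call blkcpy64(dst, src, sizeof(src)). Buffers declared
with a narrower type would need an alignment attribute, and accessing them
through uint64_t pointers raises strict-aliasing questions that this sketch
does not address.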