john-dev - Re: Parallella: Litecoin mining


Message-ID: <20130726231818.GC24959@openwall.com>
Date: Sat, 27 Jul 2013 03:18:18 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Parallella: Litecoin mining

Rafael, Yaniv -

On Fri, Jul 26, 2013 at 11:26:48PM +0100, Rafael Waldo Delgado Doblas wrote:
> Can I load the binary executable image in just one core
> memory area and run it from the rest of the cores?
> 
> If we can do that, the memory map will look something like:
> 
> 0-31K for V on first core
> 32-63K for V on second core
> ...
> 416-447K for V on fourteenth core
> 448-479K for V on fifteenth core
> 
> 480-?? for binary executable image
> ??-??? for XY B first core.
> ....
> ??-??? for XY B fifteenth core.
> 
> With this approach only 15 of 16 cores can be used because there is no
> memory for the sixteenth core.

Please don't do this.  The B's, the XY's, and the code are the most
frequently accessed - more so than each individual element of the V's.

Please just increase the TMTO factor enough that everything fits in each
core's own local memory.  Then, once you get things working e.g. with a
TMTO factor of 8 (should be easy), you can try optimizing sizes of
arrays other than V, of your code, and of the stack, and locations of
all of these, so that you can reduce the TMTO factor somewhat - ideally to 5
(25.6 KB for V) or 6 (21.3 KB for V).

A TMTO factor of 5 will probably result in some bank conflicts between
some of the V accesses and code running from the same bank (for a 1.6 KB
portion of V), so it may or may not be any faster than a factor of 6.

Alexander
