Message-ID: <7b43b6f9e32f47d3e3473317462fa7c7@smtp.hushmail.com>
Date: Fri, 17 Jul 2015 21:09:15 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Lyra2 on GPU
On 2015-07-17 20:41, magnum wrote:
> On 2015-07-17 20:03, Agnieszka Bielec wrote:
>> 2015-07-17 18:29 GMT+02:00 magnum <john.magnum@...hmail.com>:
>>> I tried building your code but it's broken for OSX:
>>
>> OS X doesn't have pthread_barrier_t but I had problems with speed on
>> lyra2-lm on super when I was using barriers in openmp, and only this
>> change made that the speed is normal, I will decide what I will do
>> with this error, if you want to test my code I recommend you to switch
>> to commit e6a532b40e4c98418913075b5407e50765f2298a because my newest
>> commit works on super on both cards but in my laptop doesn't work when
>> LWS=GWS (cmp_all(1) failed) and I don't know if this is bug in my code
>> or somewhere else. and to make my code compiling it's enough to remove
>> files whose name begin with "Lyra2"
>
> I'll try that.
>
> Perhaps you can use the pthread barrier stuff "#ifndef APPLE", with a
> fallback to OpenMP barriers. Or we could check for it in autoconf.
There are several other problems. The yescrypt-opencl format uses "ulong",
which doesn't exist on the host side here. You need to use cl_ulong or
uint64_t instead. I also see lots of uses of "long". That should never be
used in host code, because it may end up as 32-bit. You probably want
int64_t or cl_long for those.
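As a rough sketch of what that looks like on the host side (the struct and
function names here are made up for illustration and are not from the actual
format; only the stdint.h types are real):

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side mirror of a kernel struct whose fields are
 * declared as ulong/long in OpenCL C (always 64-bit on the device). */
typedef struct {
    uint64_t m_cost;   /* device side: ulong; cl_ulong would also do */
    int64_t  offset;   /* device side: long;  cl_long would also do */
} host_params;

/* Check the assumed sizes at compile time. */
static_assert(sizeof(uint64_t) == 8, "uint64_t must be 64 bits");
static_assert(sizeof(host_params) == 16, "host struct must be 16 bytes");

/* Byte count for a buffer holding n entries (hypothetical helper). */
static size_t host_params_bytes(size_t n)
{
    return n * sizeof(host_params);
}
```

With plain "long" in the struct instead, the size could differ between
platforms, which is the problem described above.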
The kernel build produces a boatload of warnings that you need to fix:
--8<------8<------8<------8<------8<------8<------8<----
$ ../run/john -test -form:lyra2-opencl -dev=2
Benchmarking: Lyra2-opencl [Lyra2 OpenCL (inefficient, development use
only)]... Device 2: GeForce GT 650M
Build log: <program source>:49:6: warning: no previous prototype for
function 'lyra2_initState'
void lyra2_initState(__global ulong * state)
^
<program source>:109:6: warning: no previous prototype for function
'lyra2_absorbInput'
void lyra2_absorbInput(__global ulong * memMatrixGPU,
^
<program source>:215:16: warning: comparison of integers of different
signs: 'int' and 'unsigned int'
for (i = 0; i < nBlocksInput * BLOCK_LEN_BLAKE2_SAFE_BYTES; i++) {
~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:220:2: warning: comparison of integers of different
signs: 'int' and 'uint' (aka 'unsigned int')
memcpy(ptrByte, ptrByteSource, inlen);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:47:32: note: expanded from macro 'memcpy'
#define memcpy(dst, src, size) gmemcpy(dst, src, size)
^~~~~~~~~~~~~~~~~~~~~~~
<program source>:43:16: note: expanded from macro 'gmemcpy'
for(mi=0;mi<(size);mi++) \
~~^
<program source>:227:2: warning: comparison of integers of different
signs: 'int' and 'uint' (aka 'unsigned int')
memcpy(ptrByte, ptrByteSource, saltlen);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:47:32: note: expanded from macro 'memcpy'
#define memcpy(dst, src, size) gmemcpy(dst, src, size)
^~~~~~~~~~~~~~~~~~~~~~~
<program source>:43:16: note: expanded from macro 'gmemcpy'
for(mi=0;mi<(size);mi++) \
~~^
<program source>:231:2: warning: comparison of integers of different
signs: 'int' and 'unsigned int'
glmemcpy(ptrByte, &kLen, sizeof(int));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:38:16: note: expanded from macro 'glmemcpy'
for(mi=0;mi<(size);mi++) \
~~^
<program source>:233:2: warning: comparison of integers of different
signs: 'int' and 'unsigned int'
glmemcpy(ptrByte, &inlen, sizeof(int));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:38:16: note: expanded from macro 'glmemcpy'
for(mi=0;mi<(size);mi++) \
~~^
<program source>:235:2: warning: comparison of integers of different
signs: 'int' and 'unsigned int'
glmemcpy(ptrByte, &saltlen, sizeof(int));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:38:16: note: expanded from macro 'glmemcpy'
for(mi=0;mi<(size);mi++) \
~~^
<program source>:237:2: warning: comparison of integers of different
signs: 'int' and 'unsigned int'
memcpy(ptrByte, &(salt->t_cost), sizeof(int));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:47:32: note: expanded from macro 'memcpy'
#define memcpy(dst, src, size) gmemcpy(dst, src, size)
^~~~~~~~~~~~~~~~~~~~~~~
<program source>:43:16: note: expanded from macro 'gmemcpy'
for(mi=0;mi<(size);mi++) \
~~^
<program source>:239:2: warning: comparison of integers of different
signs: 'int' and 'unsigned int'
memcpy(ptrByte, &(salt->m_cost), sizeof(int));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:47:32: note: expanded from macro 'memcpy'
#define memcpy(dst, src, size) gmemcpy(dst, src, size)
^~~~~~~~~~~~~~~~~~~~~~~
<program source>:43:16: note: expanded from macro 'gmemcpy'
for(mi=0;mi<(size);mi++) \
~~^
<program source>:241:2: warning: comparison of integers of different
signs: 'int' and 'unsigned int'
memcpy(ptrByte, &(salt->nCols), sizeof(int));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:47:32: note: expanded from macro 'memcpy'
#define memcpy(dst, src, size) gmemcpy(dst, src, size)
^~~~~~~~~~~~~~~~~~~~~~~
<program source>:43:16: note: expanded from macro 'gmemcpy'
for(mi=0;mi<(size);mi++) \
~~^
<program source>:247:3: warning: comparison of integers of different
signs: 'int' and 'unsigned int'
glmemcpy(ptrByte, &nPARALLEL, sizeof(int));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:38:16: note: expanded from macro 'glmemcpy'
for(mi=0;mi<(size);mi++) \
~~^
<program source>:251:3: warning: comparison of integers of different
signs: 'int' and 'unsigned int'
glmemcpy(ptrByte, &thread, sizeof(int));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:38:16: note: expanded from macro 'glmemcpy'
for(mi=0;mi<(size);mi++) \
~~^
<program source>:300:16: warning: comparison of integers of different
signs: 'int' and 'uint' (aka 'unsigned int')
for (i = 0; i < N_COLS; i++) {
~ ^ ~~~~~~
<program source>:349:16: warning: comparison of integers of different
signs: 'int' and 'uint' (aka 'unsigned int')
for (i = 0; i < N_COLS; i++) {
~ ^ ~~~~~~
<program source>:388:16: warning: comparison of integers of different
signs: 'int' and 'uint' (aka 'unsigned int')
for (i = 0; i < N_COLS; i++) {
~ ^ ~~~~~~
<program source>:424:6: warning: no previous prototype for function
'reducedDuplexRowFilling'
void reducedDuplexRowFilling(ulong * state,
^
<program source>:462:16: warning: comparison of integers of different
signs: 'int' and 'uint' (aka 'unsigned int')
for (i = 0; i < N_COLS; i++) {
~ ^ ~~~~~~
<program source>:518:6: warning: no previous prototype for function
'reducedDuplexRowWanderingParallel'
void reducedDuplexRowWanderingParallel(__global ulong * memMatrixGPU,
^
<program source>:553:16: warning: comparison of integers of different
signs: 'int' and 'uint' (aka 'unsigned int')
for (i = 0; i < N_COLS; i++) {
~ ^ ~~~~~~
<program source>:601:6: warning: no previous prototype for function
'absorbRandomColumn'
void absorbRandomColumn(__global ulong * in, ulong * state,
^
<program source>:633:6: warning: no previous prototype for function
'wanderingPhaseGPU2'
void wanderingPhaseGPU2(__global ulong * memMatrixGPU,
^
<program source>:778:16: warning: comparison of integers of different
signs: 'unsigned int' and 'int'
for (i = 0; i < fullBlocks; i++) {
~ ^ ~~~~~~~~~~
<program source>:788:6: warning: no previous prototype for function
'reducedDuplexRowFilling_P1'
void reducedDuplexRowFilling_P1(ulong * state,
^
<program source>:819:16: warning: comparison of integers of different
signs: 'int' and 'uint' (aka 'unsigned int')
for (i = 0; i < N_COLS; i++) {
~ ^ ~~~~~~
<program source>:875:6: warning: no previous prototype for function
'reducedDuplexRowWandering_P1'
void reducedDuplexRowWandering_P1(__global ulong * memMatrixGPU,
^
<program source>:897:16: warning: comparison of integers of different
signs: 'int' and 'uint' (aka 'unsigned int')
for (i = 0; i < N_COLS; i++) {
~ ^ ~~~~~~
<program source>:941:6: warning: no previous prototype for function
'wanderingPhaseGPU2_P1'
void wanderingPhaseGPU2_P1(__global ulong * memMatrixGPU,
^
<program source>:1046:16: warning: comparison of integers of different
signs: 'unsigned int' and 'int'
for (i = 0; i < fullBlocks; i++) {
~ ^ ~~~~~~~~~~
memory per hash : 384.00 kB
--8<------8<------8<------8<------8<------8<------8<----
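Most of the "comparison of integers of different signs" warnings above come
from an int loop index compared against an unsigned bound (uint, or the
result of sizeof). One way to silence them, sketched here in plain C with a
hypothetical helper name, is to give the index the same signedness as the
bound. In the kernel the same idea would apply to "mi" in the gmemcpy and
glmemcpy macros and to the "i" loop counters.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte-wise copy, standing in for the gmemcpy/glmemcpy macros.
 * The index is unsigned like the size, so "mi < size" compares
 * two values of the same signedness. */
static void byte_copy(uint8_t *dst, const uint8_t *src, size_t size)
{
    size_t mi;

    for (mi = 0; mi < size; mi++)
        dst[mi] = src[mi];
}
```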
The "no previous prototype" warnings can be avoided by always putting
"static" or "inline" before *all* non-kernel functions.
The auto-tune never seems to finish on my GT650M. On my even weaker
Intel HD4000, it just segfaults. On the CPU device, it fails at
cmp_all(1).
The pomelo-opencl format also produces a few warnings you need to fix,
but it works on the NVIDIA card:
Benchmarking: pomelo-opencl [OpenCL (inefficient, development use
only)]... Device 2: GeForce GT 650M
Build log: <program source>:283:16: warning: unused variable 'random_number'
unsigned long random_number, index_global, index_local;
^
<program source>:283:31: warning: unused variable 'index_global'
unsigned long random_number, index_global, index_local;
^
<program source>:283:45: warning: unused variable 'index_local'
unsigned long random_number, index_global, index_local;
^
<program source>:279:19: warning: unused variable 'j'
unsigned long i, j, y, from=loop->from;
^
<program source>:495:16: warning: unused variable 'random_number'
unsigned long random_number, index_global, index_local;
^
<program source>:495:45: warning: unused variable 'index_local'
unsigned long random_number, index_global, index_local;
^
<program source>:495:31: warning: unused variable 'index_global'
unsigned long random_number, index_global, index_local;
^
<program source>:491:19: warning: unused variable 'j'
unsigned long i, j, y;
^
memory per hash : 256.00 kB
DONE
Speed for cost 1 (t) of 2, cost 2 (m) of 2
Many salts: 6400 c/s real, 95085 c/s virtual
Only one salt: 6462 c/s real, 83200 c/s virtual
It passes self-test on the CPU device too, but on the Intel HD4000 it
fails at cmp_all(3).
magnum