Message-ID: <7b43b6f9e32f47d3e3473317462fa7c7@smtp.hushmail.com>
Date: Fri, 17 Jul 2015 21:09:15 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Lyra2 on GPU

On 2015-07-17 20:41, magnum wrote:
> On 2015-07-17 20:03, Agnieszka Bielec wrote:
>> 2015-07-17 18:29 GMT+02:00 magnum <john.magnum@...hmail.com>:
>>> I tried building your code but it's broken for OSX:
>>
>> OS X doesn't have pthread_barrier_t but I had problems with speed on
>> lyra2-lm on super when I was using barriers in openmp, and only this
>> change made that the speed is normal, I will decide what I will do
>> with this error, if you want to test my code I recommend you to switch
>> to commit e6a532b40e4c98418913075b5407e50765f2298a because my newest
>> commit works on super on both cards but in my laptop doesn't work when
>> LWS=GWS (cmp_all(1) failed) and I don't know if this is bug in my code
>> or somewhere else. and to make my code compiling it's enough to remove
>> files whose name begin with "Lyra2"
>
> I'll try that.
>
> Perhaps you can use the pthread barrier stuff "#ifndef APPLE", with a
> fallback to OpenMP barriers. Or we could check for it in autoconf.

Several other problems.

The yescrypt-opencl format use "ulong" which doesn't exist (on host
side) here. You need to use cl_ulong or uint64_t instead.

And I see lots of uses of "long" too. This should never be used in host
code - it may end up as 32-bit. You probably want to use int64_t or
cl_long for them.

The kernel build produces a boatload of warnings that you need to fix:

--8<------8<------8<------8<------8<------8<------8<----
$ ../run/john -test -form:lyra2-opencl -dev=2
Benchmarking: Lyra2-opencl [Lyra2 OpenCL (inefficient, development use only)]...
Device 2: GeForce GT 650M
Build log: <program source>:49:6: warning: no previous prototype for function 'lyra2_initState'
void lyra2_initState(__global ulong * state)
     ^
<program source>:109:6: warning: no previous prototype for function 'lyra2_absorbInput'
void lyra2_absorbInput(__global ulong * memMatrixGPU,
     ^
<program source>:215:16: warning: comparison of integers of different signs: 'int' and 'unsigned int'
    for (i = 0; i < nBlocksInput * BLOCK_LEN_BLAKE2_SAFE_BYTES; i++) {
                ~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:220:2: warning: comparison of integers of different signs: 'int' and 'uint' (aka 'unsigned int')
    memcpy(ptrByte, ptrByteSource, inlen);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:47:32: note: expanded from macro 'memcpy'
#define memcpy(dst, src, size) gmemcpy(dst, src, size)
                               ^~~~~~~~~~~~~~~~~~~~~~~
<program source>:43:16: note: expanded from macro 'gmemcpy'
    for(mi=0;mi<(size);mi++) \
             ~~^
<program source>:227:2: warning: comparison of integers of different signs: 'int' and 'uint' (aka 'unsigned int')
    memcpy(ptrByte, ptrByteSource, saltlen);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:47:32: note: expanded from macro 'memcpy'
#define memcpy(dst, src, size) gmemcpy(dst, src, size)
                               ^~~~~~~~~~~~~~~~~~~~~~~
<program source>:43:16: note: expanded from macro 'gmemcpy'
    for(mi=0;mi<(size);mi++) \
             ~~^
<program source>:231:2: warning: comparison of integers of different signs: 'int' and 'unsigned int'
    glmemcpy(ptrByte, &kLen, sizeof(int));
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:38:16: note: expanded from macro 'glmemcpy'
    for(mi=0;mi<(size);mi++) \
             ~~^
<program source>:233:2: warning: comparison of integers of different signs: 'int' and 'unsigned int'
    glmemcpy(ptrByte, &inlen, sizeof(int));
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:38:16: note: expanded from macro 'glmemcpy'
    for(mi=0;mi<(size);mi++) \
             ~~^
<program source>:235:2: warning: comparison of integers of different signs: 'int' and 'unsigned int'
    glmemcpy(ptrByte, &saltlen, sizeof(int));
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:38:16: note: expanded from macro 'glmemcpy'
    for(mi=0;mi<(size);mi++) \
             ~~^
<program source>:237:2: warning: comparison of integers of different signs: 'int' and 'unsigned int'
    memcpy(ptrByte, &(salt->t_cost), sizeof(int));
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:47:32: note: expanded from macro 'memcpy'
#define memcpy(dst, src, size) gmemcpy(dst, src, size)
                               ^~~~~~~~~~~~~~~~~~~~~~~
<program source>:43:16: note: expanded from macro 'gmemcpy'
    for(mi=0;mi<(size);mi++) \
             ~~^
<program source>:239:2: warning: comparison of integers of different signs: 'int' and 'unsigned int'
    memcpy(ptrByte, &(salt->m_cost), sizeof(int));
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:47:32: note: expanded from macro 'memcpy'
#define memcpy(dst, src, size) gmemcpy(dst, src, size)
                               ^~~~~~~~~~~~~~~~~~~~~~~
<program source>:43:16: note: expanded from macro 'gmemcpy'
    for(mi=0;mi<(size);mi++) \
             ~~^
<program source>:241:2: warning: comparison of integers of different signs: 'int' and 'unsigned int'
    memcpy(ptrByte, &(salt->nCols), sizeof(int));
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:47:32: note: expanded from macro 'memcpy'
#define memcpy(dst, src, size) gmemcpy(dst, src, size)
                               ^~~~~~~~~~~~~~~~~~~~~~~
<program source>:43:16: note: expanded from macro 'gmemcpy'
    for(mi=0;mi<(size);mi++) \
             ~~^
<program source>:247:3: warning: comparison of integers of different signs: 'int' and 'unsigned int'
        glmemcpy(ptrByte, &nPARALLEL, sizeof(int));
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:38:16: note: expanded from macro 'glmemcpy'
    for(mi=0;mi<(size);mi++) \
             ~~^
<program source>:251:3: warning: comparison of integers of different signs: 'int' and 'unsigned int'
        glmemcpy(ptrByte, &thread, sizeof(int));
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:38:16: note: expanded from macro 'glmemcpy'
    for(mi=0;mi<(size);mi++) \
             ~~^
<program source>:300:16: warning: comparison of integers of different signs: 'int' and 'uint' (aka 'unsigned int')
    for (i = 0; i < N_COLS; i++) {
                ~ ^ ~~~~~~
<program source>:349:16: warning: comparison of integers of different signs: 'int' and 'uint' (aka 'unsigned int')
    for (i = 0; i < N_COLS; i++) {
                ~ ^ ~~~~~~
<program source>:388:16: warning: comparison of integers of different signs: 'int' and 'uint' (aka 'unsigned int')
    for (i = 0; i < N_COLS; i++) {
                ~ ^ ~~~~~~
<program source>:424:6: warning: no previous prototype for function 'reducedDuplexRowFilling'
void reducedDuplexRowFilling(ulong * state,
     ^
<program source>:462:16: warning: comparison of integers of different signs: 'int' and 'uint' (aka 'unsigned int')
    for (i = 0; i < N_COLS; i++) {
                ~ ^ ~~~~~~
<program source>:518:6: warning: no previous prototype for function 'reducedDuplexRowWanderingParallel'
void reducedDuplexRowWanderingParallel(__global ulong * memMatrixGPU,
     ^
<program source>:553:16: warning: comparison of integers of different signs: 'int' and 'uint' (aka 'unsigned int')
    for (i = 0; i < N_COLS; i++) {
                ~ ^ ~~~~~~
<program source>:601:6: warning: no previous prototype for function 'absorbRandomColumn'
void absorbRandomColumn(__global ulong * in, ulong * state,
     ^
<program source>:633:6: warning: no previous prototype for function 'wanderingPhaseGPU2'
void wanderingPhaseGPU2(__global ulong * memMatrixGPU,
     ^
<program source>:778:16: warning: comparison of integers of different signs: 'unsigned int' and 'int'
    for (i = 0; i < fullBlocks; i++) {
                ~ ^ ~~~~~~~~~~
<program source>:788:6: warning: no previous prototype for function 'reducedDuplexRowFilling_P1'
void reducedDuplexRowFilling_P1(ulong * state,
     ^
<program source>:819:16: warning: comparison of integers of different signs: 'int' and 'uint' (aka 'unsigned int')
    for (i = 0; i < N_COLS; i++) {
                ~ ^ ~~~~~~
<program source>:875:6: warning: no previous prototype for function 'reducedDuplexRowWandering_P1'
void reducedDuplexRowWandering_P1(__global ulong * memMatrixGPU,
     ^
<program source>:897:16: warning: comparison of integers of different signs: 'int' and 'uint' (aka 'unsigned int')
    for (i = 0; i < N_COLS; i++) {
                ~ ^ ~~~~~~
<program source>:941:6: warning: no previous prototype for function 'wanderingPhaseGPU2_P1'
void wanderingPhaseGPU2_P1(__global ulong * memMatrixGPU,
     ^
<program source>:1046:16: warning: comparison of integers of different signs: 'unsigned int' and 'int'
    for (i = 0; i < fullBlocks; i++) {
                ~ ^ ~~~~~~~~~~
memory per hash : 384.00 kB
--8<------8<------8<------8<------8<------8<------8<----

The "no previous prototype" can be avoided by always putting "static" or
"inline" before *all* non-kernel functions.

The auto tune doesn't seem to ever end using my GT650M. Using my even
weaker Intel HD4000, it just segfaults. Using the CPU device, it fails
at cmp_all(1).

The pomelo-opencl format also produces a few warnings you need to fix,
but works on the nvidia:

Benchmarking: pomelo-opencl [OpenCL (inefficient, development use only)]...
Device 2: GeForce GT 650M
Build log: <program source>:283:16: warning: unused variable 'random_number'
    unsigned long random_number, index_global, index_local;
                  ^
<program source>:283:31: warning: unused variable 'index_global'
    unsigned long random_number, index_global, index_local;
                                 ^
<program source>:283:45: warning: unused variable 'index_local'
    unsigned long random_number, index_global, index_local;
                                               ^
<program source>:279:19: warning: unused variable 'j'
    unsigned long i, j, y, from=loop->from;
                     ^
<program source>:495:16: warning: unused variable 'random_number'
    unsigned long random_number, index_global, index_local;
                  ^
<program source>:495:45: warning: unused variable 'index_local'
    unsigned long random_number, index_global, index_local;
                                               ^
<program source>:495:31: warning: unused variable 'index_global'
    unsigned long random_number, index_global, index_local;
                                 ^
<program source>:491:19: warning: unused variable 'j'
    unsigned long i, j, y;
                     ^
memory per hash : 256.00 kB
DONE
Speed for cost 1 (t) of 2, cost 2 (m) of 2
Many salts:     6400 c/s real, 95085 c/s virtual
Only one salt:  6462 c/s real, 83200 c/s virtual

It passes self-test on CPU device too but on the Intel HD4000, it fails
at cmp_all(3).

magnum
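
A minimal sketch of the fixes suggested in this message, written as plain
C11 so it builds without the OpenCL headers. The names state_word,
copy_bytes and fold_words are hypothetical and are not taken from the
Lyra2, yescrypt or pomelo code. It illustrates three points: a fixed-width
64-bit type on the host side in place of "long"/"ulong" (cl_ulong from the
OpenCL headers would serve the same purpose), "static" on helper functions
so the compiler does not expect a prototype, and loop counters that have
the same signedness as their bound.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side element type: uint64_t is 64 bits on every
 * platform, whereas plain "long" may be only 32 bits. */
typedef uint64_t state_word;

static_assert(sizeof(state_word) == 8, "state words must be 64-bit");

/* "static" gives the helper internal linkage, so clang does not emit
 * "no previous prototype for function". */
static void copy_bytes(unsigned char *dst, const unsigned char *src,
                       unsigned int size)
{
	unsigned int mi; /* unsigned like "size": no sign-compare warning */

	for (mi = 0; mi < size; mi++)
		dst[mi] = src[mi];
}

/* Hypothetical helper: XOR-fold "count" 64-bit words into one word. */
static state_word fold_words(const state_word *words, unsigned int count)
{
	state_word acc = 0;
	unsigned int i; /* matches the unsigned bound "count" */

	for (i = 0; i < count; i++)
		acc ^= words[i];
	return acc;
}
```

In an OpenCL kernel the same two changes (static or inline helpers, and
counters declared with the type of their bound) would apply to the
functions around the __kernel entry point. Whether that removes every
warning in the logs above has not been checked here.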