Message-ID: <CAKGDhHVovVAx=dagO5e_wOigXCUy9M=cHC=V5hG+1bv0kwzAtw@mail.gmail.com>
Date: Sat, 6 Jun 2015 12:41:10 +0200
From: Agnieszka Bielec <bielecagnieszka8@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Lyra2 on CPU

It seems that the speed of versions b) and c), after allocating memory
outside the hash function and moving nCols and nThreads to the salt,
doesn't differ on my laptop.

Tests on super:

version c)

[a@...er run]$ ./john --test --format=lyra2
Will run 32 OpenMP threads
Benchmarking: Lyra2, Generic Lyra2 [ ]... (32xOMP) DONE
Speed for cost 1 (t) of 8, cost 2 (m) of 8
Many salts:     1394 c/s real, 44.8 c/s virtual
Only one salt:  1411 c/s real, 45.9 c/s virtual

[a@...er run]$ GOMP_CPU_AFFINITY=0-31 ./john --test --format=lyra2
Will run 32 OpenMP threads
Benchmarking: Lyra2, Generic Lyra2 [ ]... (32xOMP) DONE
Speed for cost 1 (t) of 8, cost 2 (m) of 8
Many salts:     17696 c/s real, 554 c/s virtual
Only one salt:  17664 c/s real, 553 c/s virtual

version b)

[a@...er run]$ ./john --test --format=lyra2
Will run 32 OpenMP threads
Benchmarking: Lyra2, Generic Lyra2 [ ]... (32xOMP) DONE
Speed for cost 1 (t) of 8, cost 2 (m) of 8
Many salts:     11904 c/s real, 372 c/s virtual
Only one salt:  11722 c/s real, 370 c/s virtual

[a@...er run]$ GOMP_CPU_AFFINITY=0-31 ./john --test --format=lyra2
Will run 32 OpenMP threads
Benchmarking: Lyra2, Generic Lyra2 [ ]... (32xOMP) DONE
Speed for cost 1 (t) of 8, cost 2 (m) of 8
Many salts:     112 c/s real, 3.8 c/s virtual
Only one salt:  112 c/s real, 3.8 c/s virtual

[a@...er run]$ GOMP_CPU_AFFINITY=0-31 ./john --test --format=lyra2

Version b) can be so slow because of my strange construction of
crypt_all(), but it looks like OpenMP has problems with barriers, or
there is something I don't know.

I wrote a simple program in C:

#include <stdio.h>
#include <omp.h>

static void func()
{
    printf("checkpoint 1\n");
    printf("threads_num=%d, my_thread_num=%d\n",omp_get_num_threads(),omp_get_thread_num());
    #pragma omp barrier
    printf("checkpoint 2\n");
    printf("threads_num=%d, my_thread_num=%d\n",omp_get_num_threads(),omp_get_thread_num());
    #pragma omp barrier
    printf("checkpoint 3\n");
    printf("threads_num=%d, my_thread_num=%d\n",omp_get_num_threads(),omp_get_thread_num());
    #pragma omp barrier
    printf("checkpoint 4\n");
    printf("threads_num=%d, my_thread_num=%d\n",omp_get_num_threads(),omp_get_thread_num());
}

int main()
{
    int i;
    #pragma omp parallel for
    for(i=0;i<2;i++)
    {
        func();
    }
}

and the output is:

none@...e ~/Desktop $ ./omp
checkpoint 1
checkpoint 1
threads_num=8, my_thread_num=1
threads_num=8, my_thread_num=0
checkpoint 2
threads_num=8, my_thread_num=1
checkpoint 2
threads_num=8, my_thread_num=0
[here the program blocks]

If I change the 2 in for(i=0;i<2;i++) to 8, the program works OK.
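
One possible reading of the hang, not confirmed anywhere in this message: the OpenMP specification requires each barrier to be encountered by all threads of the team or by none, and does not allow a barrier closely nested inside a worksharing loop. With 2 iterations and 8 threads, only two threads ever call func(), so the other six never reach its barriers. With 8 iterations every thread happens to get one, which would explain why that case runs. The sketch below is one conforming way to keep the checkpoint structure: the step loop runs in every thread, and the implicit barrier at the end of each "omp for" takes the place of the explicit barriers. The names checkpoint() and run_checkpoints() are made up for illustration, and the serial fallbacks are only there so the sketch also builds without OpenMP.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: only needed when built without OpenMP). */
static int omp_get_num_threads(void) { return 1; }
static int omp_get_thread_num(void) { return 0; }
#endif

/* Hypothetical helper: the work done for iteration i at one checkpoint. */
static void checkpoint(int step, int i)
{
    printf("checkpoint %d, i=%d, threads_num=%d, my_thread_num=%d\n",
           step, i, omp_get_num_threads(), omp_get_thread_num());
}

/* Runs 4 checkpoints over n iterations and returns the total number of
 * checkpoint() calls made by all threads together. */
static int run_checkpoints(int n)
{
    int calls = 0;

    #pragma omp parallel reduction(+:calls)
    {
        int step, i;

        /* Every thread of the team executes this outer loop, so every
         * thread encounters each worksharing loop below exactly once. */
        for (step = 1; step <= 4; step++) {
            #pragma omp for
            for (i = 0; i < n; i++) {
                checkpoint(step, i);
                calls++;
            }
            /* Implicit barrier here: threads that received no iterations
             * still arrive, so nobody waits for a thread that never comes. */
        }
    }
    return calls;
}
```

Built with gcc -fopenmp, calling run_checkpoints(2) from main() should print all four checkpoints for both iterations and then return, whatever the number of threads in the team.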