Message-ID: <CAKGDhHWdfFhFQmgV2oqiYSJPC__u=q-FPfQ5+S9JT-uSRDR-Kw@mail.gmail.com>
Date: Tue, 2 Jun 2015 20:36:36 +0200
From: Agnieszka Bielec <bielecagnieszka8@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Parallel in OpenCL

2015-06-02 5:37 GMT+02:00 Lukas Odzioba <lukas.odzioba@...il.com>:
> I just compared code on two branches and I don't think that what you
> did it is the proper way of doing split kernel...
> I guess it should be clear to see using profiler.
>
>> GCN without "add 0" optimization
>> Device 1: Tahiti [AMD Radeon HD 7900 Series]
>> Many salts: 45093 c/s real, 4915K c/s virtual
>
>> GCN with unrolling one loop
>> Many salts: 27536 c/s real, 3276K c/s virtual
>
> The result you are giving here is for add 0 optimization and 4
> kernels, and I guess the latter is the problem here.
> This really confused me until I noticed a real big difference between
> two branches - not just unrolling one loop.
> Please be more specific in the future, otherwise we will be wasting time.

The speed decreases after changing this loop:

for (int i = 0; i < 16; i++) {
    t1 = k[i] + w[i] + h + Sigma1(e) + Ch(e, f, g);
    t2 = Maj(a, b, c) + Sigma0(a);
    h = g;
    g = f;
    f = e;
    e = d + t1;
    d = c;
    c = b;
    b = a;
    a = t1 + t2;
}

to

for (int i = 0; i < 10; i++) {
    t1 = k[i] + w[i] + h + Sigma1(e) + Ch(e, f, g);
    t2 = Maj(a, b, c) + Sigma0(a);
    h = g;
    g = f;
    f = e;
    e = d + t1;
    d = c;
    c = b;
    b = a;
    a = t1 + t2;
}

for (int i = 10; i < 15; i++) {
    t1 = k[i] + h + Sigma1(e) + Ch(e, f, g);
    t2 = Maj(a, b, c) + Sigma0(a);
    h = g;
    g = f;
    f = e;
    e = d + t1;
    d = c;
    c = b;
    b = a;
    a = t1 + t2;
}

t1 = k[15] + w[15] + h + Sigma1(e) + Ch(e, f, g);
t2 = Maj(a, b, c) + Sigma0(a);
h = g;
g = f;
f = e;
e = d + t1;
d = c;
c = b;
b = a;
a = t1 + t2;

in the same branch, parallel_opt, as I told you on IRC. This can't be the
fault of the split kernels. Also, if I change the second loop in a similar
way, the speed decreases.
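For readers following along: the split version drops the w[i] term for rounds 10 to 14, which is only valid if those message words are zero (presumably the "add 0" optimization named in the quoted text). The sketch below checks that equivalence on the host in plain C. It is an illustration under stated assumptions, not code from either branch: the 64-bit SHA-512 definitions of Sigma0, Sigma1, Ch and Maj are assumed, the k[] values are arbitrary placeholders rather than the real round constants, and the helper names (round_step, rounds_original, rounds_split, demo_equal, self_check) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed SHA-512 style primitives; the email does not show the real ones. */
#define ROTR(x, n)   (((x) >> (n)) | ((x) << (64 - (n))))
#define Ch(x, y, z)  (((x) & (y)) ^ (~(x) & (z)))
#define Maj(x, y, z) (((x) & (y)) ^ ((x) & (z)) ^ ((y) & (z)))
#define Sigma0(x)    (ROTR(x, 28) ^ ROTR(x, 34) ^ ROTR(x, 39))
#define Sigma1(x)    (ROTR(x, 14) ^ ROTR(x, 18) ^ ROTR(x, 41))

/* One round on s[0..7] = a..h. Passing wi = 0 models the dropped w[i] term. */
void round_step(uint64_t s[8], uint64_t ki, uint64_t wi)
{
    uint64_t t1 = ki + wi + s[7] + Sigma1(s[4]) + Ch(s[4], s[5], s[6]);
    uint64_t t2 = Maj(s[0], s[1], s[2]) + Sigma0(s[0]);
    s[7] = s[6]; s[6] = s[5]; s[5] = s[4]; s[4] = s[3] + t1;
    s[3] = s[2]; s[2] = s[1]; s[1] = s[0]; s[0] = t1 + t2;
}

/* The original single loop over all 16 rounds. */
void rounds_original(uint64_t s[8], const uint64_t k[16], const uint64_t w[16])
{
    for (int i = 0; i < 16; i++)
        round_step(s, k[i], w[i]);
}

/* The split form from the email: rounds 10..14 skip the w[i] addition. */
void rounds_split(uint64_t s[8], const uint64_t k[16], const uint64_t w[16])
{
    for (int i = 0; i < 10; i++)
        round_step(s, k[i], w[i]);
    for (int i = 10; i < 15; i++)
        round_step(s, k[i], 0);
    round_step(s, k[15], w[15]);
}

/* Returns 1 if both forms give the same state. w[10..14] are zero except
 * w[12], which is set to w12 so the caller can break the precondition. */
int demo_equal(uint64_t w12)
{
    uint64_t k[16], w[16], s1[8], s2[8];
    for (int i = 0; i < 16; i++) {
        k[i] = 0x9E3779B97F4A7C15ULL * (uint64_t)(i + 1); /* placeholder, not real constants */
        w[i] = (i >= 10 && i < 15) ? 0 : 0x0123456789ABCDEFULL + (uint64_t)i;
    }
    w[12] = w12;
    for (int i = 0; i < 8; i++)
        s1[i] = s2[i] = 0x6A09E667F3BCC908ULL ^ (uint64_t)i; /* arbitrary start state */
    rounds_original(s1, k, w);
    rounds_split(s2, k, w);
    return memcmp(s1, s2, sizeof s1) == 0;
}

/* Usage: the two forms agree only while the skipped words are zero. */
void self_check(void)
{
    assert(demo_equal(0));
    assert(!demo_equal(1));
}
```

Since both forms compute the same values under that precondition, any speed difference would have to come from how the compiler handles the restructured code, not from the arithmetic itself.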

first loop:

Symbol table '.symtab' contains 15 entries:
   Num:    Value   Size Type    Bind   Vis      Ndx Name
     0: 00000000      0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000    186 OBJECT  LOCAL  DEFAULT    5 __OpenCL_compile_options
     2: 00000000    640 OBJECT  LOCAL  DEFAULT    6 __OpenCL_0_global
     3: 00000280    559 OBJECT  LOCAL  DEFAULT    6 __OpenCL_parallel_kernel_
     4: 00000000  40490 FUNC    LOCAL  DEFAULT    7 __OpenCL_parallel_kernel_
     5: 000004af     32 OBJECT  LOCAL  DEFAULT    6 __OpenCL_parallel_kernel_
     6: 000004cf    619 OBJECT  LOCAL  DEFAULT    6 __OpenCL_parallel_kernel_
     7: 00009e2a  25430 FUNC    LOCAL  DEFAULT    7 __OpenCL_parallel_kernel_
     8: 0000073a     32 OBJECT  LOCAL  DEFAULT    6 __OpenCL_parallel_kernel_
     9: 0000075a    633 OBJECT  LOCAL  DEFAULT    6 __OpenCL_parallel_kernel_
    10: 00010180  43430 FUNC    LOCAL  DEFAULT    7 __OpenCL_parallel_kernel_
    11: 000009d3     32 OBJECT  LOCAL  DEFAULT    6 __OpenCL_parallel_kernel_
    12: 000009f3    623 OBJECT  LOCAL  DEFAULT    6 __OpenCL_parallel_kernel_
    13: 0001ab26  43170 FUNC    LOCAL  DEFAULT    7 __OpenCL_parallel_kernel_
    14: 00000c62     32 OBJECT  LOCAL  DEFAULT    6 __OpenCL_parallel_kernel_

second loop:

   Num:    Value   Size Type    Bind   Vis      Ndx Name
     0: 00000000      0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000    186 OBJECT  LOCAL  DEFAULT    5 __OpenCL_compile_options
     2: 00000000    640 OBJECT  LOCAL  DEFAULT    6 __OpenCL_0_global
     3: 00000280    559 OBJECT  LOCAL  DEFAULT    6 __OpenCL_parallel_kernel_
     4: 00000000  40490 FUNC    LOCAL  DEFAULT    7 __OpenCL_parallel_kernel_
     5: 000004af     32 OBJECT  LOCAL  DEFAULT    6 __OpenCL_parallel_kernel_
     6: 000004cf    619 OBJECT  LOCAL  DEFAULT    6 __OpenCL_parallel_kernel_
     7: 00009e2a  56618 FUNC    LOCAL  DEFAULT    7 __OpenCL_parallel_kernel_
     8: 0000073a     32 OBJECT  LOCAL  DEFAULT    6 __OpenCL_parallel_kernel_
     9: 0000075a    633 OBJECT  LOCAL  DEFAULT    6 __OpenCL_parallel_kernel_
    10: 00017b54  43430 FUNC    LOCAL  DEFAULT    7 __OpenCL_parallel_kernel_
    11: 000009d3     32 OBJECT  LOCAL  DEFAULT    6 __OpenCL_parallel_kernel_
    12: 000009f3    623 OBJECT  LOCAL  DEFAULT    6 __OpenCL_parallel_kernel_
    13: 000224fa  43170 FUNC    LOCAL  DEFAULT    7 __OpenCL_parallel_kernel_
    14: 00000c62     32 OBJECT  LOCAL  DEFAULT    6 __OpenCL_parallel_kernel_
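
The two symbol tables differ only in FUNC entry 7 and in the offsets of the functions placed after it. As a rough aid for reading them, the C sketch below copies the Value and Size columns of the four FUNC entries from each listing, checks that the functions are laid out back to back, and reports the size change per entry. The struct and function names (func_sym, is_contiguous, size_delta) are hypothetical helpers introduced for illustration only.

```c
#include <stdint.h>

/* Value (offset) and Size of a FUNC symbol, as printed in the listings. */
struct func_sym {
    uint32_t value;
    uint32_t size;
};

/* FUNC entries 4, 7, 10 and 13 from the "first loop" listing. */
const struct func_sym first_listing[4] = {
    {0x00000000u, 40490u},
    {0x00009e2au, 25430u},
    {0x00010180u, 43430u},
    {0x0001ab26u, 43170u},
};

/* The same four entries from the "second loop" listing. */
const struct func_sym second_listing[4] = {
    {0x00000000u, 40490u},
    {0x00009e2au, 56618u},
    {0x00017b54u, 43430u},
    {0x000224fau, 43170u},
};

/* Returns 1 if every function starts exactly where the previous one ends. */
int is_contiguous(const struct func_sym *f, int n)
{
    for (int i = 1; i < n; i++)
        if (f[i].value != f[i - 1].value + f[i - 1].size)
            return 0;
    return 1;
}

/* Size change in bytes of FUNC entry idx (0..3) between the two listings. */
long size_delta(int idx)
{
    return (long)second_listing[idx].size - (long)first_listing[idx].size;
}
```

With these numbers, size_delta(1) is 31188 bytes (25430 to 56618, roughly 2.2 times larger) and the other three deltas are zero, so the shifted offsets of entries 10 and 13 are fully explained by the growth of that one kernel.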