Message-ID: <CAKGDhHWdfFhFQmgV2oqiYSJPC__u=q-FPfQ5+S9JT-uSRDR-Kw@mail.gmail.com>
Date: Tue, 2 Jun 2015 20:36:36 +0200
From: Agnieszka Bielec <bielecagnieszka8@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Parallel in OpenCL
2015-06-02 5:37 GMT+02:00 Lukas Odzioba <lukas.odzioba@...il.com>:
> I just compared code on two branches and I don't think that what you
> did it is the proper way of doing split kernel...
> I guess it should be clear to see using profiler.
>
>> GCN without "add 0" optimization
>> Device 1: Tahiti [AMD Radeon HD 7900 Series]
>> Many salts: 45093 c/s real, 4915K c/s virtual
>
>> GCN with unrolling one loop
>> Many salts: 27536 c/s real, 3276K c/s virtual
>
> The result you are giving here is for add 0 optimization and 4
> kernels, and I guess the latter is the problem here.
> This really confused me until I noticed a real big difference between
> two branches - not just unrolling one loop.
> Please be more specific in the future, otherwise we will be wasting time.
The speed decreases after changing this loop:
for (int i = 0; i < 16; i++) {
    t1 = k[i] + w[i] + h + Sigma1(e) + Ch(e, f, g);
    t2 = Maj(a, b, c) + Sigma0(a);
    h = g;
    g = f;
    f = e;
    e = d + t1;
    d = c;
    c = b;
    b = a;
    a = t1 + t2;
}
to
for (int i = 0; i < 10; i++) {
    t1 = k[i] + w[i] + h + Sigma1(e) + Ch(e, f, g);
    t2 = Maj(a, b, c) + Sigma0(a);
    h = g;
    g = f;
    f = e;
    e = d + t1;
    d = c;
    c = b;
    b = a;
    a = t1 + t2;
}
for (int i = 10; i < 15; i++) {
    t1 = k[i] + h + Sigma1(e) + Ch(e, f, g);
    t2 = Maj(a, b, c) + Sigma0(a);
    h = g;
    g = f;
    f = e;
    e = d + t1;
    d = c;
    c = b;
    b = a;
    a = t1 + t2;
}
t1 = k[15] + w[15] + h + Sigma1(e) + Ch(e, f, g);
t2 = Maj(a, b, c) + Sigma0(a);
h = g;
g = f;
f = e;
e = d + t1;
d = c;
c = b;
b = a;
a = t1 + t2;
This is in the same branch, parallel_opt, as I told you on IRC.
So this can't be the fault of the split kernels.
Also, if I change the second loop in a similar way, the speed decreases as well.
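For reference, here is a minimal host-side C sketch of the two loop variants above, to check that they compute the same state when w[10]..w[14] are zero. It assumes 64-bit words and the standard SHA-512 definitions of Sigma0, Sigma1, Ch and Maj; the k and w values are arbitrary test data, and round_step, rounds_full, rounds_split and split_matches_full are hypothetical names used only for this illustration, not code from the branch.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: SHA-512 style helpers on 64-bit words. */
#define ROTR(x, n)   (((x) >> (n)) | ((x) << (64 - (n))))
#define Sigma0(x)    (ROTR(x, 28) ^ ROTR(x, 34) ^ ROTR(x, 39))
#define Sigma1(x)    (ROTR(x, 14) ^ ROTR(x, 18) ^ ROTR(x, 41))
#define Ch(x, y, z)  (((x) & (y)) ^ (~(x) & (z)))
#define Maj(x, y, z) (((x) & (y)) ^ ((x) & (z)) ^ ((y) & (z)))

/* One round; s[0..7] holds a..h, kw is k[i] + w[i] (or just k[i]). */
static void round_step(uint64_t s[8], uint64_t kw)
{
    uint64_t t1 = kw + s[7] + Sigma1(s[4]) + Ch(s[4], s[5], s[6]);
    uint64_t t2 = Maj(s[0], s[1], s[2]) + Sigma0(s[0]);
    s[7] = s[6]; s[6] = s[5]; s[5] = s[4]; s[4] = s[3] + t1;
    s[3] = s[2]; s[2] = s[1]; s[1] = s[0]; s[0] = t1 + t2;
}

/* Original form: a single loop over all 16 rounds. */
static void rounds_full(uint64_t s[8], const uint64_t k[16], const uint64_t w[16])
{
    for (int i = 0; i < 16; i++)
        round_step(s, k[i] + w[i]);
}

/* Split form: rounds 10..14 skip the w[i] term. */
static void rounds_split(uint64_t s[8], const uint64_t k[16], const uint64_t w[16])
{
    for (int i = 0; i < 10; i++)
        round_step(s, k[i] + w[i]);
    for (int i = 10; i < 15; i++)
        round_step(s, k[i]);
    round_step(s, k[15] + w[15]);
}

/* Returns 1 if both variants give the same state for test data
   in which w[10]..w[14] are zero, 0 otherwise. */
int split_matches_full(void)
{
    uint64_t k[16], w[16], s1[8], s2[8];
    for (int i = 0; i < 16; i++) {
        k[i] = 0x9e3779b97f4a7c15ULL * (uint64_t)(i + 1); /* arbitrary */
        w[i] = 0x0123456789abcdefULL ^ ((uint64_t)i << 32); /* arbitrary */
    }
    for (int i = 10; i < 15; i++)
        w[i] = 0; /* the precondition that makes the split valid */
    for (int i = 0; i < 8; i++)
        s1[i] = s2[i] = 0x6a09e667f3bcc908ULL + (uint64_t)i; /* arbitrary */
    rounds_full(s1, k, w);
    rounds_split(s2, k, w);
    return memcmp(s1, s2, sizeof s1) == 0;
}
```

This only checks that the results are equal, it says nothing about speed on GCN. The split form is no longer one uniform loop, so the compiler may unroll and schedule it differently.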
first loop:
Symbol table '.symtab' contains 15 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 186 OBJECT LOCAL DEFAULT 5 __OpenCL_compile_options
2: 00000000 640 OBJECT LOCAL DEFAULT 6 __OpenCL_0_global
3: 00000280 559 OBJECT LOCAL DEFAULT 6 __OpenCL_parallel_kernel_
4: 00000000 40490 FUNC LOCAL DEFAULT 7 __OpenCL_parallel_kernel_
5: 000004af 32 OBJECT LOCAL DEFAULT 6 __OpenCL_parallel_kernel_
6: 000004cf 619 OBJECT LOCAL DEFAULT 6 __OpenCL_parallel_kernel_
7: 00009e2a 25430 FUNC LOCAL DEFAULT 7 __OpenCL_parallel_kernel_
8: 0000073a 32 OBJECT LOCAL DEFAULT 6 __OpenCL_parallel_kernel_
9: 0000075a 633 OBJECT LOCAL DEFAULT 6 __OpenCL_parallel_kernel_
10: 00010180 43430 FUNC LOCAL DEFAULT 7 __OpenCL_parallel_kernel_
11: 000009d3 32 OBJECT LOCAL DEFAULT 6 __OpenCL_parallel_kernel_
12: 000009f3 623 OBJECT LOCAL DEFAULT 6 __OpenCL_parallel_kernel_
13: 0001ab26 43170 FUNC LOCAL DEFAULT 7 __OpenCL_parallel_kernel_
14: 00000c62 32 OBJECT LOCAL DEFAULT 6 __OpenCL_parallel_kernel_
second loop:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 186 OBJECT LOCAL DEFAULT 5 __OpenCL_compile_options
2: 00000000 640 OBJECT LOCAL DEFAULT 6 __OpenCL_0_global
3: 00000280 559 OBJECT LOCAL DEFAULT 6 __OpenCL_parallel_kernel_
4: 00000000 40490 FUNC LOCAL DEFAULT 7 __OpenCL_parallel_kernel_
5: 000004af 32 OBJECT LOCAL DEFAULT 6 __OpenCL_parallel_kernel_
6: 000004cf 619 OBJECT LOCAL DEFAULT 6 __OpenCL_parallel_kernel_
7: 00009e2a 56618 FUNC LOCAL DEFAULT 7 __OpenCL_parallel_kernel_
8: 0000073a 32 OBJECT LOCAL DEFAULT 6 __OpenCL_parallel_kernel_
9: 0000075a 633 OBJECT LOCAL DEFAULT 6 __OpenCL_parallel_kernel_
10: 00017b54 43430 FUNC LOCAL DEFAULT 7 __OpenCL_parallel_kernel_
11: 000009d3 32 OBJECT LOCAL DEFAULT 6 __OpenCL_parallel_kernel_
12: 000009f3 623 OBJECT LOCAL DEFAULT 6 __OpenCL_parallel_kernel_
13: 000224fa 43170 FUNC LOCAL DEFAULT 7 __OpenCL_parallel_kernel_
14: 00000c62 32 OBJECT LOCAL DEFAULT 6 __OpenCL_parallel_kernel_
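To make the difference between the two dumps easier to see, here is a small C sketch that compares the FUNC sizes copied from the tables above (symbols 4, 7, 10 and 13). The array names and count_changed are hypothetical helpers for this illustration only.

```c
#include <stddef.h>

/* FUNC sizes in bytes, copied from the two symbol tables above
   (symbols 4, 7, 10 and 13, in that order). */
static const unsigned first_dump[4]  = {40490, 25430, 43430, 43170};
static const unsigned second_dump[4] = {40490, 56618, 43430, 43170};

/* Counts entries whose size differs; stores the index of the first
   differing entry in *first_changed when there is one. */
size_t count_changed(const unsigned *a, const unsigned *b, size_t n,
                     size_t *first_changed)
{
    size_t changed = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i]) {
            if (changed == 0)
                *first_changed = i;
            changed++;
        }
    }
    return changed;
}
```

With these values, only the second kernel (symbol 7) differs: 25430 bytes in the first dump, 56618 bytes in the second. The other three kernels have the same size in both.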