Message-ID: <CABob6iow51JQ2SptbYUjH7s1d33QFNRoM-jyo79z35iv+csZ2Q@mail.gmail.com>
Date: Wed, 14 Feb 2018 03:12:39 +0100
From: Lukas Odzioba <lukas.odzioba@...il.com>
To: john-dev@...ts.openwall.com
Subject: Profiling John

Hi all,

I tried to reproduce the old charts from the wiki:
http://openwall.info/wiki/john/development/GPU
But OpenCL no longer seems to be supported by the nV Visual Profiler.
AMD has its CodeXL profiler, which is kind of useful, but I switched to
nV due to constant driver issues and don't want to go back again.

As far as I know, gprof's flat profile should be what I need (time-based
sampling), but I don't quite trust what I see there:

ukasz@...ris ~/ $ ./john -test -format=md5crypt -length=8
Will run 8 OpenMP threads
Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 AVX 12x]... (8xOMP) DONE
Raw:    111744 c/s real, 14703 c/s virtual

ukasz@...ris ~/ $ gprof john | head -n 12
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 81.20      2.03     2.03  5444891     0.00     0.00  SSEmd5body
 16.00      2.43     0.40     5269     0.00     0.00  md5cryptsse
  2.80      2.50     0.07    16077     0.00     0.00  cfg_get_section
  0.00      2.50     0.00    33488     0.00     0.00  mem_alloc_tiny_func
  0.00      2.50     0.00    21299     0.00     0.00  str_alloc_copy_func
  0.00      2.50     0.00    15950     0.00     0.00  cfg_get_list
  0.00      2.50     0.00    11343     0.00     0.00  trim

ukasz@...ris ~/ $ ./john -test -format=md5crypt-opencl -length=8
Device 0: GeForce GTX 1060 6GB
Local worksize (LWS) 64, global worksize (GWS) 32768
Benchmarking: md5crypt-opencl, crypt(3) $1$ [MD5 OpenCL]... DONE
Raw:    303407 c/s real, 303407 c/s virtual

ukasz@...ris ~/ $ gprof john | head -n 12
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 75.00      0.06     0.06    16086     0.00     0.00  cfg_get_section
 12.50      0.07     0.01      100     0.10     0.10  cmp_all
 12.50      0.08     0.01        1    10.00    19.02  fmt_self_test
  0.00      0.08     0.00   326700     0.00     0.00  set_key
  0.00      0.08     0.00   196511     0.00     0.00  longcand
  0.00      0.08     0.00    33485     0.00     0.00  mem_alloc_tiny_func
  0.00      0.08     0.00    32858     0.00     0.00  get_key

crypt_all should be somewhere near the top, yet it is reported as 0% of
the time even though its call count is not zero.

To use gprof, I modified the Makefile after running configure so that
LDFLAGS, CFLAGS and CFLAGSX contain -pg (I also removed
-fomit-frame-pointer).

So far I have had the most success with callgrind on the CPU side.
I am curious what you guys use for profiling John these days (both GPU
and CPU).

Thanks,
Lukas
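
One way to read the md5crypt-opencl profile quoted in this message: each sample counts as 0.01 seconds, so a self time of 0.06 s is only a handful of samples. Below is a minimal Python sketch of that arithmetic. It assumes the rows are whitespace-separated with seven fields, as in the gprof output quoted here; the helper names `parse_flat_profile` and `samples_per_function` are made up for illustration and are not part of gprof or John.

```python
# Rows copied from the md5crypt-opencl flat profile quoted in this message.
# Only the rows printed by `head -n 12` are included.
FLAT_PROFILE = """\
 75.00      0.06     0.06    16086     0.00     0.00  cfg_get_section
 12.50      0.07     0.01      100     0.10     0.10  cmp_all
 12.50      0.08     0.01        1    10.00    19.02  fmt_self_test
  0.00      0.08     0.00   326700     0.00     0.00  set_key
  0.00      0.08     0.00   196511     0.00     0.00  longcand
  0.00      0.08     0.00    33485     0.00     0.00  mem_alloc_tiny_func
  0.00      0.08     0.00    32858     0.00     0.00  get_key
"""

SAMPLE_PERIOD = 0.01  # from "Each sample counts as 0.01 seconds."


def parse_flat_profile(text):
    """Parse seven-column flat-profile rows into a list of dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7:
            continue  # skip headers, blank lines, rows without call counts
        pct, cumulative, self_s, calls, _self_call, _total_call, name = fields
        rows.append({
            "name": name,
            "percent": float(pct),
            "cumulative_seconds": float(cumulative),
            "self_seconds": float(self_s),
            "calls": int(calls),
        })
    return rows


def samples_per_function(rows, period=SAMPLE_PERIOD):
    """Convert self time back into the number of samples that produced it."""
    return {r["name"]: round(r["self_seconds"] / period) for r in rows}


rows = parse_flat_profile(FLAT_PROFILE)
samples = samples_per_function(rows)
total_samples = sum(samples.values())

for name, count in samples.items():
    print(f"{name:24s} {count:3d} sample(s)")
print(f"{'total':24s} {total_samples:3d} sample(s)")
```

For the seven rows quoted here this comes to 8 samples in total (6 + 1 + 1), matching the 0.08 s cumulative figure. With so few samples attributed to the binary, a function can have a nonzero call count and still show 0.00 seconds.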