Date: Wed, 14 Feb 2018 03:12:39 +0100
From: Lukas Odzioba <lukas.odzioba@...il.com>
To: john-dev@...ts.openwall.com
Subject: Profiling John

Hi all,

I tried to reproduce the old charts from the wiki:
http://openwall.info/wiki/john/development/GPU
but OpenCL no longer seems to be supported by the NVIDIA Visual Profiler.
AMD has its CodeXL profiler, which is kind of useful, but I switched
to NVIDIA due to constant driver issues and don't want to go back.

As far as I know, gprof's flat profile should be what I need (time-based
sampling), but I don't quite trust what I see there:

ukasz@...ris ~/ $ ./john  -test -format=md5crypt -length=8
Will run 8 OpenMP threads
Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 AVX 12x]... (8xOMP) DONE
Raw: 111744 c/s real, 14703 c/s virtual

ukasz@...ris ~/ $ gprof john | head -n 12
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 81.20      2.03     2.03  5444891     0.00     0.00  SSEmd5body
 16.00      2.43     0.40     5269     0.00     0.00  md5cryptsse
  2.80      2.50     0.07    16077     0.00     0.00  cfg_get_section
  0.00      2.50     0.00    33488     0.00     0.00  mem_alloc_tiny_func
  0.00      2.50     0.00    21299     0.00     0.00  str_alloc_copy_func
  0.00      2.50     0.00    15950     0.00     0.00  cfg_get_list
  0.00      2.50     0.00    11343     0.00     0.00  trim
ukasz@...ris ~/ $ ./john  -test -format=md5crypt-opencl -length=8
Device 0: GeForce GTX 1060 6GB
Local worksize (LWS) 64, global worksize (GWS) 32768
Benchmarking: md5crypt-opencl, crypt(3) $1$ [MD5 OpenCL]... DONE
Raw: 303407 c/s real, 303407 c/s virtual

ukasz@...ris ~/ $ gprof john | head -n 12
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 75.00      0.06     0.06    16086     0.00     0.00  cfg_get_section
 12.50      0.07     0.01      100     0.10     0.10  cmp_all
 12.50      0.08     0.01        1    10.00    19.02  fmt_self_test
  0.00      0.08     0.00   326700     0.00     0.00  set_key
  0.00      0.08     0.00   196511     0.00     0.00  longcand
  0.00      0.08     0.00    33485     0.00     0.00  mem_alloc_tiny_func
  0.00      0.08     0.00    32858     0.00     0.00  get_key


crypt_all should be somewhere near the top, yet it is reported as 0% of
the time even though its call count is not zero.
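Since `head -n 12` only shows the first few rows, looking up a single function such as crypt_all is easier with the full flat profile parsed into a table. The following is a minimal sketch, not part of John or gprof: the helper name parse_flat_profile is hypothetical, and it assumes the seven-column row layout shown in the output above.

```python
def parse_flat_profile(text):
    """Parse gprof flat-profile rows into a dict keyed by function name."""
    rows = {}
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if not in_table:
            # The second header row starts with "time" and ends with "name".
            if fields and fields[0] == "time" and fields[-1] == "name":
                in_table = True
            continue
        if not fields:
            break  # a blank line ends the flat-profile table
        if len(fields) < 7:
            continue  # rows without call counts are skipped in this sketch
        try:
            percent, _cumulative, self_seconds = (float(x) for x in fields[:3])
            calls = int(fields[3])
        except ValueError:
            continue
        name = " ".join(fields[6:])
        rows[name] = {"percent": percent,
                      "self_seconds": self_seconds,
                      "calls": calls}
    return rows


# Rows copied from the md5crypt-opencl run quoted above.
PROFILE_TEXT = """\
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 75.00      0.06     0.06    16086     0.00     0.00  cfg_get_section
 12.50      0.07     0.01      100     0.10     0.10  cmp_all
 12.50      0.08     0.01        1    10.00    19.02  fmt_self_test
  0.00      0.08     0.00   326700     0.00     0.00  set_key
"""

rows = parse_flat_profile(PROFILE_TEXT)
print(rows["cmp_all"])
print(rows.get("crypt_all", "crypt_all not in the rows shown"))
```

With the complete gprof output saved to a file and passed in as text, rows.get("crypt_all") would return that function's self time and call count directly.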

To use gprof, I modified the Makefile after running configure to have -pg
in LDFLAGS, CFLAGS and CFLAGSX (and also removed -fomit-frame-pointer).
So far I have had the most success with callgrind on the CPU side.
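The Makefile change described above can be scripted so it is easy to reapply after each configure run. This is only a sketch under assumptions: the helper name enable_gprof is hypothetical, and it assumes each of the three variables is assigned on a single line (no backslash continuations), which may not match the generated Makefile exactly.

```python
import re

# Variables named in the message above.
FLAG_VARS = ("CFLAGS", "CFLAGSX", "LDFLAGS")


def enable_gprof(makefile_text):
    """Return Makefile text with -pg added and -fomit-frame-pointer removed."""
    out = []
    for line in makefile_text.splitlines():
        match = re.match(r"^(\w+)\s*[:+?]?=", line)
        if match and match.group(1) in FLAG_VARS:
            # Drop the frame-pointer omission, then append -pg once.
            line = re.sub(r"\s*-fomit-frame-pointer\b", "", line)
            if "-pg" not in line.split():
                line = line.rstrip() + " -pg"
        out.append(line)
    return "\n".join(out) + "\n"


# Illustrative input only; these are not the real John Makefile contents.
MAKEFILE_TEXT = (
    "CC = gcc\n"
    "CFLAGS = -c -O2 -fomit-frame-pointer -Wall\n"
    "CFLAGSX = -c -O2\n"
    "LDFLAGS = -lm\n"
)

print(enable_gprof(MAKEFILE_TEXT), end="")
```

Lines for other variables (CC in the sample) pass through unchanged, and running the function twice does not add a second -pg.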

I am curious what you guys use for profiling John these days (both GPU
and CPU).

Thanks,
Lukas
