Message-ID: <BLU159-W22F2ABF3D98BEE6FD9A668A4710@phx.gbl>
Date: Fri, 3 Feb 2012 15:33:28 +0000
From: Alex Sicamiotis <alekshs@...mail.com>
To: <john-users@...ts.openwall.com>
Subject: RE: DES with OpenMP

> No parameters to tweak, I think - it's just different code.
>
> You may try building with -D_OPENMP instead of -fopenmp - that is, don't
> actually enable OpenMP, but request that version of John's source code.
> This should complain on truly OpenMP-specific constructs such as calls
> to omp_get_max_threads(), which you'll need to remove (just put 1 for
> the threads count, etc.) It should also give warnings about the
> #pragma's, which you may ignore.
>
> You may analyze the generated assembly code and try to figure out why
> one version of it is faster than the other on your CPU.
>
> This is getting off-topic for john-users, though (not just tweaking, but
> source code changes) - want to join us on the john-dev list maybe?

Analyzing or writing code has not been my kind of thing for the last 17 years or so. I did some BASIC in the '80s, but for more serious programming I used Pascal. I never saw the need to learn another language like C, because my programming needs were not large enough for me to feel constrained by the language itself. There was no Linux back then either as an incentive to learn C. Out of curiosity (a friend was into assembly at that time) I did a bit of asm, which was intriguing, but my knowledge there was limited and has since been forgotten. Then I moved on to networks, HTML and then other things, losing track of programming. When I see asm code today I can't even recognize many of the instructions or how they operate, other than what I've read in a wiki about their functionality. So I wish I could help with the development, but I can't really claim I know C or asm.
I can make sense of some portions of C code that have comments (lol), but understanding, say, 10% of what you see and writing new, more optimized code are entirely different things. However, if you need any benchmarking, or require testing for something, I'll be more than willing to assist.

> With OpenMP,
> the code is thread-safe, so it references the DES_bs_all structure via a
> pointer. On one hand, this consumes a register (leaving fewer registers
> for other stuff), but on the other it may result in smaller code size
> (only need to encode offsets relative to a pointer rather than larger
> absolute addresses) and thus more other stuff staying in L1 instruction
> cache.

Theoretically, this *sounds* easy to replicate if someone alters the non-omp version to use a pointer just like the omp version. Compiling that with icc would then show whether the non-omp build benefits from this despite the wasted register. Of course I have no idea how much code change this requires and whether it's even worth the time (it could just slow things down because of the wasted register).

> An easy thing to check is "size DES_bs_b.o".

Given my lack of ability to discern asm differences, I just did the easy thing, heh. Out of curiosity I checked the sizes for GCC 4.6.2 and ICC 12.1 (.o files from plain compilation and from -fopenmp compilation). There were huge differences in size (more than 10x), which is counter-intuitive given the 32 KB of an L1 cache. Then I tried GCC 4.3.2, which was fast in the non-omp version. Indeed, this showed differences too. Anyway, I gathered most* benchmarks and file sizes in one spreadsheet:

http://imageshack.us/f/217/resultsarein2.png/

* I left out an ICC batch built with -march=core2. It doesn't differ much performance-wise, so I opted for the generic one.
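
For anyone who wants to try the pointer idea from this thread, here is a minimal sketch in C of the two access styles being compared. It is an illustration only, not John's actual code: `struct bs_state`, `global_state`, `mix_direct()` and `mix_via_pointer()` are hypothetical names, and the small struct is just a stand-in for the much larger DES_bs_all structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for DES_bs_all; the real structure is far larger. */
struct bs_state {
    uint32_t keys[64];
    uint32_t out[64];
};

/* Non-OpenMP style: a single static instance, addressed directly by name. */
static struct bs_state global_state;

static void mix_direct(void)
{
    for (size_t i = 0; i < 64; i++)
        global_state.out[i] = global_state.keys[i] ^ (uint32_t)i;
}

/*
 * OpenMP style: the state is reached through a pointer, so each thread can
 * be handed its own instance. The pointer occupies a register, but every
 * access is encoded as an offset from that pointer.
 */
static void mix_via_pointer(struct bs_state *s)
{
    for (size_t i = 0; i < 64; i++)
        s->out[i] = s->keys[i] ^ (uint32_t)i;
}
```

Both functions compute the same result, so any difference would show up only in the generated code. Compiling such a file to an object file and running `size` on it, or comparing the assembly output, is one way to see how a given compiler encodes the two styles. Whether either variant is smaller or faster depends on the compiler, the target and the CPU, so a toy example like this may well show no difference at all.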