Message-ID: <20150523161928.GB599@openwall.com>
Date: Sat, 23 May 2015 19:19:28 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: interleaving in SHA256 & SHA512

On Sat, May 23, 2015 at 02:27:47PM +0300, Aleksey Cherepanov wrote:
> I count instructions and bytes of code with the following 2 commands:
>
> objdump -d JohnTheRipper/src/rawSHA512_my_fmt_plug.o | sed -ne '/<crypt_all>/,/^$/ p' > asm && wc -l asm
> perl -pe 's/[^\t]*\t//; s/\t.*//' asm | tail -n +2 | perl -pe 's/\s+//g' | perl -lne 'print(length($_) / 2, " bytes of code")'

For code size, you may want to keep the relevant function in a separate
source file, producing a separate .o file, and simply use the size(1)
command on the .o file.  In fact, simply try:

size rawSHA512_my_fmt_plug.o

and see how it compares to your Perl's output above.

> It's on core i7 950, with 64kb L1 cache. So there should be only 32kb
> of cache for code.

Yes, modern Intel CPUs have 32 KB for code and 32 KB for data.  This is
shared between 2 threads running on a core, so you should target up to
16 KB for code and 16 KB for data.

As I told you via jabber, it is also possible to execute unrolled code
at full speed out of L2 cache if you're very careful about instruction
size - but you won't achieve that with gcc.  For Haswell, you need to
stay at <= 16 bytes per 3 instructions, which is do-able with careful
choice of registers+offsets for the "memory" operands (actually using
them as your extended virtual register file, giving up to 80
"registers").  I didn't test this on older CPUs.  It might or might not
be similar on Sandy/Ivy Bridge.  I think this approach only makes much
sense for bitslicing, and we should in fact explore it a bit later.

Alexander
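
The counting pipeline quoted above and the two size targets from the reply (16 KB of code per thread, and <= 16 bytes per 3 instructions on Haswell) can be combined into one small check. What follows is only a rough sketch, not anything from the JtR tree: the function name count_code, the constant names, and the sample disassembly are made up for illustration. It assumes the usual tab-separated "objdump -d" layout (address, hex bytes, mnemonic), and it applies the 16-bytes-per-3-instructions limit as an average over the whole function, which is a simplification of my own and weaker than checking every group of 3 instructions.

```python
import re

# Targets taken from the message: 32 KB of L1 code cache shared by 2 threads,
# and at most 16 bytes per 3 instructions (stated for Haswell only).
L1_CODE_BYTES_PER_THREAD = 16 * 1024
MAX_BYTES_PER_3_INSNS = 16

HEX_BYTE = re.compile(r"[0-9a-f]{2}")


def count_code(disassembly):
    """Return (instruction_count, code_bytes) for `objdump -d` style text.

    Hypothetical helper: assumes lines of the form
    "  addr:<TAB>hex bytes<TAB>mnemonic operands". Long instructions spill
    their remaining bytes onto a continuation line without a mnemonic.
    """
    instructions = 0
    code_bytes = 0
    for line in disassembly.splitlines():
        fields = line.split("\t")
        # Skip the "<symbol>:" label line, blank lines, and anything else
        # that does not start with an "addr:" field followed by a tab.
        if len(fields) < 2 or not fields[0].strip().endswith(":"):
            continue
        hex_bytes = fields[1].split()
        if not hex_bytes or not all(HEX_BYTE.fullmatch(b) for b in hex_bytes):
            continue
        code_bytes += len(hex_bytes)
        # Only lines carrying a mnemonic start a new instruction.
        if len(fields) >= 3 and fields[2].strip():
            instructions += 1
    return instructions, code_bytes


# Made-up sample: 4 instructions, one of them 10 bytes long and split
# across two lines the way objdump prints it.
sample = (
    "0000000000000000 <crypt_all>:\n"
    "   0:\t55                   \tpush   %rbp\n"
    "   1:\t48 89 e5             \tmov    %rsp,%rbp\n"
    "   4:\t48 b8 88 77 66 55 44 \tmovabs $0x1122334455667788,%rax\n"
    "   b:\t33 22 11 \n"
    "   e:\tc3                   \tretq\n"
)

insns, nbytes = count_code(sample)
fits_l1 = nbytes <= L1_CODE_BYTES_PER_THREAD
# Average-density form of "<= 16 bytes per 3 instructions" (my simplification).
dense_enough = nbytes * 3 <= MAX_BYTES_PER_3_INSNS * insns
print(insns, "instructions,", nbytes, "bytes of code")
print("fits 16 KB per-thread code target:", fits_l1)
print("average within 16 bytes per 3 instructions:", dense_enough)
```

To run it on real output, read the "asm" file produced by the objdump command above and pass its contents to count_code. Unlike "wc -l asm", this does not count the label line or continuation lines as instructions.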