Message-ID: <lusyxhpbcus8o6smktcgq825.1433352137819@email.android.com>
Date: Wed, 03 Jun 2015 13:24:42 -0400
From: Alain Espinosa <alainesp@...ta.cu>
To: john-dev@...ts.openwall.com
Subject: Re: bitslice SHA-256

Hi.

I had some free time and tried bitslice SHA256 in Neon. The results are as expected. The assembly output is 19KB, which is larger than the L1 code cache of this CPU, but I do not see any performance drop because of it.

Benchmark configuration: Android 4.4.2, GCC 4.6, Snapdragon 801 2.45GHz, only one thread

Performance is given in millions of keys per second
------------------------------------------------------------------
2.61 : Bitslice SHA256 implemented with hand-crafted Neon assembly
       (5.7% faster than normal, 35% faster than intrinsics)
2.47 : Normal SHA256 implemented with hand-crafted Neon assembly
1.94 : Bitslice SHA256 implemented with Neon intrinsics
0.83 : Bitslice SHA256 implemented with 64-bit code

Attached are the Neon intrinsics and hand-crafted assembly source files.

VBSL (the Neon bitselect) appears to be more costly than the normal bitwise instructions. For practical speed-ups with bitslice SHA256 we need the XOP or AVX512 instruction sets. AVX512 probably provides speed-ups for the SHA1 format too. The MD5/MD4 formats use fewer rotations/shifts, so bitslice is less useful there and probably never practical.

Regards,
Alain

Download attachment "bs_sha256_v3.zip" of type "application/zip" (15294 bytes)
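
For readers unfamiliar with the technique, here is a minimal sketch in portable C of why the number of rotations matters for bitslicing. It is not taken from the attached bs_sha256_v3.zip: the names bs_word, bs_pack, bs_lane, bs_big_sigma1, bs_ch and bs_count_mismatches are hypothetical, and it uses plain 64-bit integers (64 candidates per bit plane) instead of Neon registers. In bitslice form a rotate is only a renaming of bit planes, so SHA-256's Sigma1 costs two XORs per plane, while Ch is a bitselect, the operation VBSL performs on Neon.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bitsliced 32-bit word: plane b[i] holds bit i of
 * 64 independent candidates, one candidate per bit of the uint64_t. */
typedef struct { uint64_t b[32]; } bs_word;

/* Transpose 64 ordinary 32-bit words into bitslice form. */
static void bs_pack(bs_word *out, const uint32_t in[64]) {
    memset(out, 0, sizeof *out);
    for (int lane = 0; lane < 64; lane++)
        for (int bit = 0; bit < 32; bit++)
            out->b[bit] |= (uint64_t)((in[lane] >> bit) & 1u) << lane;
}

/* Extract one candidate's 32-bit value back out of bitslice form. */
static uint32_t bs_lane(const bs_word *w, int lane) {
    assert(lane >= 0 && lane < 64);
    uint32_t v = 0;
    for (int bit = 0; bit < 32; bit++)
        v |= (uint32_t)((w->b[bit] >> lane) & 1u) << bit;
    return v;
}

/* Sigma1(e) = ROTR6(e) ^ ROTR14(e) ^ ROTR25(e).
 * Bit i of ROTRn(x) is bit (i + n) mod 32 of x, so each rotate is
 * just an index offset: no shift instructions are executed. */
static void bs_big_sigma1(bs_word *out, const bs_word *e) {
    for (int i = 0; i < 32; i++)
        out->b[i] = e->b[(i + 6) & 31] ^ e->b[(i + 14) & 31]
                  ^ e->b[(i + 25) & 31];
}

/* Ch(e,f,g) selects f where e is 1 and g where e is 0 (a bitselect).
 * Without a bitselect instruction it takes three bitwise operations. */
static void bs_ch(bs_word *out, const bs_word *e,
                  const bs_word *f, const bs_word *g) {
    for (int i = 0; i < 32; i++)
        out->b[i] = g->b[i] ^ (e->b[i] & (f->b[i] ^ g->b[i]));
}

/* Ordinary scalar versions, used only as a reference. */
static uint32_t rotr32(uint32_t x, int n) {
    return (x >> n) | (x << (32 - n));
}
static uint32_t big_sigma1(uint32_t e) {
    return rotr32(e, 6) ^ rotr32(e, 14) ^ rotr32(e, 25);
}
static uint32_t ch(uint32_t e, uint32_t f, uint32_t g) {
    return (e & f) ^ (~e & g);
}

/* Compare bitslice and scalar results on 64 pseudo-random inputs;
 * returns the number of mismatching results. */
int bs_count_mismatches(void) {
    uint32_t e[64], f[64], g[64], s = 1u;
    for (int i = 0; i < 64; i++) {
        s = s * 1664525u + 1013904223u; e[i] = s;
        s = s * 1664525u + 1013904223u; f[i] = s;
        s = s * 1664525u + 1013904223u; g[i] = s;
    }
    bs_word be, bf, bg, bsig, bch;
    bs_pack(&be, e);
    bs_pack(&bf, f);
    bs_pack(&bg, g);
    bs_big_sigma1(&bsig, &be);
    bs_ch(&bch, &be, &bf, &bg);

    int bad = 0;
    for (int lane = 0; lane < 64; lane++) {
        bad += bs_lane(&bsig, lane) != big_sigma1(e[lane]);
        bad += bs_lane(&bch, lane) != ch(e[lane], f[lane], g[lane]);
    }
    return bad;
}
```

Calling bs_count_mismatches() should return 0 if the bitslice and scalar versions agree. The sketch leaves out modular addition, which in bitslice form needs an explicit carry chain across the 32 planes.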