Message-ID: <20150905025151.GB23332@openwall.com>
Date: Sat, 5 Sep 2015 05:51:51 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: md5crypt mmxput*()

magnum, Jim, Simon, Lei -

Some speedup for md5crypt on CPU might be possible through vectorizing
the mmxput*() functions, or through use of SHLD/SHRD instructions
(available since 386) or other archs' equivalents (I think ARM has this
too) in mmxput3() when not vectorized (somehow gcc does not do it for
us).

These functions are similar to buf_update() in cryptmd5_kernel.cl, where
I've added uses of amd_bitalign() and NVIDIA's funnel shifter recently
(analogous to SHLD/SHRD), and which obviously is processed on the SIMD
units on GPUs (can do it on CPUs as well, although no SHLD/SHRD then,
unless a given CPU architecture has them in SIMD form as well - need to
look into that).

Alexander
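
For reference, the double-word (funnel) shift that SHLD/SHRD and the GPU
equivalents perform can be sketched in portable C as below. This is only
an illustration of the operation, not code from mmxput3() or
buf_update(); the name funnel_shift_right() is hypothetical, and whether
a given compiler turns it into a single SHRD is not guaranteed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the John the Ripper sources).
 * Treats hi:lo as one 64-bit value, shifts it right by n bits and
 * returns the low 32 bits. For n in 0..31 this is the result x86 SHRD
 * leaves in its destination operand.
 */
static inline uint32_t funnel_shift_right(uint32_t lo, uint32_t hi,
                                          unsigned n)
{
    assert(n < 32); /* shift counts of 32 or more are out of scope here */
    uint64_t pair = ((uint64_t)hi << 32) | lo;
    return (uint32_t)(pair >> n);
}
```

With lo = 0x44332211 and hi = 0x88776655, a shift of 8 bits (one byte)
gives 0x55443322: one 32-bit word taken at a byte offset that straddles
two adjacent words, which is the kind of unaligned placement the
functions above have to deal with.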