Message-ID: <20110223020816.GA22205@openwall.com>
Date: Wed, 23 Feb 2011 05:08:16 +0300
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: bitslice DES on AVX

On Wed, Feb 23, 2011 at 04:13:42AM +0300, Solar Designer wrote:
> Now I need to figure out how to make DES_BS 3 work.

Well, _mm256_blendv_ps() was a wrong intrinsic for what I meant to use.
The correct one could be _mm256_cmov_ps(), but it is not recognized by
my gcc 4.5.0 for whatever reason. I was able to get the desired vpcmov
instructions generated by using -mxop and __builtin_ia32_vpcmov_v8sf256().

Bad news: I was wrong in thinking that Intel has since "imported" this
stuff into AVX. Apparently, they did not, and it's XOP-only, hopefully
to be found in AMD's CPUs to be made available later this year.

Benchmarking: Traditional DES [256/256 BS AVX]... Illegal instruction at address = 408e27: 8f 48 4c a2 c5 40 c5 7c 29 84 24 e8 05 00 00

solar@owl:~/john/john-1.7.6-avx/src $ objdump -d ../run/john | fgrep -w 408e27
  408e27:       8f 48 4c a2 c5 40       vpcmov %ymm4,%ymm13,%ymm6,%ymm8

At least it's not in the emulator. I guess Sandy Bridge CPUs are similar.

So we're stuck with DES_BS 1 for now, but we may use 256-bit vectors,
which may or may not be faster than 128-bit (depends on how current CPUs
implement them). If they're not faster than 128-bit, we may still
benefit from AVX' support for 3-operand instructions, so this is
something to be benchmarked. That is, benchmark not just 256-bit AVX
vs. SSE2, but also include 128-bit AVX in the comparison.

AVX+SSE2 at once (384-bit virtual vectors) probably makes little sense
since they will have to share the 16 registers. It's like 8 registers
per implementation. Yet it's worth trying if the compiler manages to
allocate registers to each implementation in a non-conflicting fashion.

AVX+MMX (320-bit) might turn out to work better - at least the registers
are separate. This might be worth trying too.

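To illustrate why _mm256_blendv_ps() does not fit here while vpcmov does, below is a minimal sketch in portable C rather than intrinsics. It models one 32-bit lane only. The function names (bit_select, bit_select_xor, lane_blend, check_equivalence) are made up for this illustration and are not taken from the John the Ripper sources. The assumption is that the operation wanted is a per-bit select, which is what vpcmov computes, whereas blendv_ps picks a whole 32-bit element based only on the sign bit of the mask element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-bit select, as vpcmov does: take the bit from a where the mask
 * bit is 1, and from b where the mask bit is 0. */
static uint32_t bit_select(uint32_t a, uint32_t b, uint32_t mask)
{
    return (a & mask) | (b & ~mask);
}

/* The same function built from XOR and AND only, which is one way to
 * express a select when no select instruction is available. */
static uint32_t bit_select_xor(uint32_t a, uint32_t b, uint32_t mask)
{
    return b ^ ((a ^ b) & mask);
}

/* Model of one 32-bit element of blendv_ps: the whole element is taken
 * from b if the mask's sign bit is set, otherwise from a. The other 31
 * mask bits are ignored. */
static uint32_t lane_blend(uint32_t a, uint32_t b, uint32_t mask)
{
    return (mask & 0x80000000u) ? b : a;
}

/* Check that the two per-bit formulations agree on a few bit patterns. */
static void check_equivalence(void)
{
    static const uint32_t v[] = {
        0u, 0xFFFFFFFFu, 0xFFFF0000u, 0x00FF00FFu, 0x0F0F0F0Fu, 0x80000000u
    };
    const size_t n = sizeof(v) / sizeof(v[0]);
    size_t i, j, k;

    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            for (k = 0; k < n; k++)
                assert(bit_select(v[i], v[j], v[k]) ==
                       bit_select_xor(v[i], v[j], v[k]));
}
```

With a mask such as 0x0F0F0F0F, bit_select() mixes bits from both inputs, while lane_blend() returns one input unchanged because only the mask's top bit matters. In bitslice code every bit position is an independent DES instance, so only the per-bit form is usable.
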
And it looks like we'll need to support XOP separately from AVX, so the
various combinations with XOP will need to be tried too...

Alexander