Message-ID: <20150427193948.GA15903@openwall.com>
Date: Mon, 27 Apr 2015 22:39:48 +0300
From: Aleksey Cherepanov <lyosha@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: experiment with macros for bitslice
I think it is possible to build a set of macros for writing bitsliced
code. The idea is partly inspired by DES_bs_b.c: it has macros that
provide a uniform interface to operations across different platforms
and vector sizes, so you write vxorf(a, b) and it expands into
((a) ^ (b)), _mm256_xor_ps((a), (b)), or something else. I tried to
write bitsliced variants in the same way.
I tried to make macros that expand xor(dst, a, b) into
dst_bit0 = a_bit0 ^ b_bit0;
dst_bit1 = a_bit1 ^ b_bit1;
...
dst_bit31 = a_bit31 ^ b_bit31;
rotate left by 3:
#define frotate3left32(r, var) \
r ## _bit3 = var ## _bit0; \
r ## _bit4 = var ## _bit1; \
[...]
r ## _bit31 = var ## _bit28; \
r ## _bit0 = var ## _bit29; \
r ## _bit1 = var ## _bit30; \
r ## _bit2 = var ## _bit31;
The _bit# naming approach means that every rotate and shift has to be
written out by hand, which is not nice. To avoid that, arrays could be
used to represent the "variables" instead of separate vars.
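
A rough sketch of the array idea, not taken from the attached t.c. All
names here (lane_t, bs_xor, bs_rotl, bs_set, bs_get, bs_check) are made
up for illustration, and a 32-bit word holding 32 parallel instances is
an assumption. Each "variable" becomes an array of 32 words, one word
per bit position, so a rotate is only a change of index:

```c
#include <stdint.h>

#define BITS 32             /* bit positions in one bitsliced "variable" */
typedef uint32_t lane_t;    /* one word = the same bit of 32 parallel instances */

/* dst = a ^ b, one bit position at a time */
void bs_xor(lane_t dst[BITS], const lane_t a[BITS], const lane_t b[BITS])
{
    for (unsigned i = 0; i < BITS; i++)
        dst[i] = a[i] ^ b[i];
}

/* dst = src rotated left by n: bit i of src becomes bit (i + n) % BITS.
   dst and src must not overlap. */
void bs_rotl(lane_t dst[BITS], const lane_t src[BITS], unsigned n)
{
    for (unsigned i = 0; i < BITS; i++)
        dst[(i + n) % BITS] = src[i];
}

/* Store an ordinary 32-bit value x into instance `lane` of v. */
void bs_set(lane_t v[BITS], unsigned lane, uint32_t x)
{
    for (unsigned i = 0; i < BITS; i++)
        v[i] = (v[i] & ~((lane_t)1 << lane))
             | ((lane_t)((x >> i) & 1) << lane);
}

/* Read instance `lane` of v back as an ordinary 32-bit value. */
uint32_t bs_get(const lane_t v[BITS], unsigned lane)
{
    uint32_t x = 0;
    for (unsigned i = 0; i < BITS; i++)
        x |= (uint32_t)((v[i] >> lane) & 1) << i;
    return x;
}

/* Returns 1 if rotl(x ^ y, 3) computed bitsliced in one lane matches
   the same computation done with plain integer ops. */
int bs_check(uint32_t x, uint32_t y, unsigned lane)
{
    lane_t a[BITS] = {0}, b[BITS] = {0}, t[BITS], r[BITS];
    bs_set(a, lane, x);
    bs_set(b, lane, y);
    bs_xor(t, a, b);
    bs_rotl(r, t, 3);
    uint32_t z = x ^ y;
    return bs_get(r, lane) == ((z << 3) | (z >> 29));
}
```

With a constant rotate count and full unrolling, a compiler may be able
to turn the copies in bs_rotl into plain renaming, but that is not
something this sketch verifies.
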
I have not checked the generated assembly to confirm that the optimizer
removes the assignments made by the rotation. I hope they compile to
no-ops.
The macro approach as a whole means that some macros have no easy way
to be filled in automatically. Simple bitwise ops can be shortened,
though:
#define f_regular_op32(r, a, b, op) \
op(r ## _bit0, a ## _bit0, b ## _bit0); \
op(r ## _bit1, a ## _bit1, b ## _bit1); \
[...]
op(r ## _bit31, a ## _bit31, b ## _bit31);
#define f_single_xor(r, a, b) ((r) = (a) ^ (b))
#define fxor32(r, a, b) f_regular_op32(r, a, b, f_single_xor)
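
To show how these pieces fit together, here is a minimal 8-bit sketch
in the same style. It is not the attached t.c: the 8-bit width and the
names lane_t, fdecl8, fload8, fstore8, f_regular_op8, fxor8,
frotate3left8 and f_check8 are assumptions chosen to keep the example
short. Every instance is loaded with the same value so that one lane
can be read back and compared with plain integer ops:

```c
#include <stdint.h>

typedef uint32_t lane_t;    /* hypothetical: 32 instances per word */

/* Declare the eight separate variables that make up one bitsliced value. */
#define fdecl8(v) \
    lane_t v ## _bit0, v ## _bit1, v ## _bit2, v ## _bit3, \
           v ## _bit4, v ## _bit5, v ## _bit6, v ## _bit7

/* Broadcast bit n of x to all lanes: all-ones if set, zero otherwise. */
#define fload8(v, x) do { \
    v ## _bit0 = -(lane_t)(((x) >> 0) & 1); \
    v ## _bit1 = -(lane_t)(((x) >> 1) & 1); \
    v ## _bit2 = -(lane_t)(((x) >> 2) & 1); \
    v ## _bit3 = -(lane_t)(((x) >> 3) & 1); \
    v ## _bit4 = -(lane_t)(((x) >> 4) & 1); \
    v ## _bit5 = -(lane_t)(((x) >> 5) & 1); \
    v ## _bit6 = -(lane_t)(((x) >> 6) & 1); \
    v ## _bit7 = -(lane_t)(((x) >> 7) & 1); \
} while (0)

/* Rebuild the ordinary 8-bit value held in lane 0. */
#define fstore8(v) \
    (((v ## _bit0 & 1u) << 0) | ((v ## _bit1 & 1u) << 1) | \
     ((v ## _bit2 & 1u) << 2) | ((v ## _bit3 & 1u) << 3) | \
     ((v ## _bit4 & 1u) << 4) | ((v ## _bit5 & 1u) << 5) | \
     ((v ## _bit6 & 1u) << 6) | ((v ## _bit7 & 1u) << 7))

/* Apply a three-operand op to every bit position. */
#define f_regular_op8(r, a, b, op) do { \
    op(r ## _bit0, a ## _bit0, b ## _bit0); \
    op(r ## _bit1, a ## _bit1, b ## _bit1); \
    op(r ## _bit2, a ## _bit2, b ## _bit2); \
    op(r ## _bit3, a ## _bit3, b ## _bit3); \
    op(r ## _bit4, a ## _bit4, b ## _bit4); \
    op(r ## _bit5, a ## _bit5, b ## _bit5); \
    op(r ## _bit6, a ## _bit6, b ## _bit6); \
    op(r ## _bit7, a ## _bit7, b ## _bit7); \
} while (0)

#define f_single_xor(r, a, b) ((r) = (a) ^ (b))
#define fxor8(r, a, b) f_regular_op8(r, a, b, f_single_xor)

/* Rotate left by 3: bit i of var becomes bit (i + 3) % 8 of r. */
#define frotate3left8(r, var) do { \
    r ## _bit3 = var ## _bit0; \
    r ## _bit4 = var ## _bit1; \
    r ## _bit5 = var ## _bit2; \
    r ## _bit6 = var ## _bit3; \
    r ## _bit7 = var ## _bit4; \
    r ## _bit0 = var ## _bit5; \
    r ## _bit1 = var ## _bit6; \
    r ## _bit2 = var ## _bit7; \
} while (0)

/* Returns 1 if the bitsliced rotl(x ^ y, 3) matches the plain 8-bit result. */
int f_check8(uint8_t x, uint8_t y)
{
    fdecl8(a); fdecl8(b); fdecl8(t); fdecl8(r);
    fload8(a, x);
    fload8(b, y);
    fxor8(t, a, b);
    frotate3left8(r, t);
    uint8_t z = (uint8_t)(x ^ y);
    uint8_t expected = (uint8_t)((z << 3) | (z >> 5));
    return fstore8(r) == expected;
}
```

Note that the destination and source of frotate3left8 must be different
names: the assignments run in sequence, so rotating a value into itself
would overwrite bits before they are read.
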
The code attached demonstrates rough implementations of xor and rotate
left by 3 with some test code.
Thanks!
--
Regards,
Aleksey Cherepanov
View attachment "t.c" of type "text/x-csrc" (26261 bytes)