Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20120322173544.GA1799@openwall.com>
Date: Thu, 22 Mar 2012 21:35:44 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Research ideas.

On Thu, Mar 22, 2012 at 09:09:42AM -0700, Alain Espinosa wrote:
> I was trying to implement and S-box assembly code generator. The new
> s-box are amazing but there is other optimization as well. Solar
> counts gates

Why, I actually count instructions as well.  Please take a look at those
cryptic comments before each S-box in nonstd.c and sboxes-s.c.

> but in s-box functions ~30% of code are implementation
> specific, like mov instructions. The s-box can be represented by a
> directed acyclic graph and develop an algotithm to generate code with
> the less instructions. I have done this manually i get ~20% better
> than Microsoft C compiler for SSE2 assembly and i think and automatic
> way speed up things more.

How does your code compare (e.g., in terms of instruction count) to
what's found in x86-64.S and x86-sse.S?  These were generated with my
Perl scripts, then hand-edited a bit.  For x86-64, gcc 4.3.x (only these
versions) generate very decent code as well (maybe even slightly better
than mine due to instruction scheduling across S-box boundaries), but
for 32-bit x86's SSE2 and MMX so far the Perl-generated and hand-edited
code is the best I have seen.

> There is also the fact that roman generate
> various possibles variants of each s-box.

There were thousands.  Sometimes tens of thousands even.  Roman and I
ran them through the Perl scripts I mentioned and picked those producing
better code - separately for different criteria: gate count, instruction
count on 2-op archs, register count, estimated stall counts on an
abstract single-issue CPU with different instruction latencies (2, 3, 4,
5, 6 cycles).  The latter also approximates multi-issue CPUs.  Since
there were multiple criteria, several versions of many of the S-boxes
were left in the released C files.

> > Another related project would be producing a bitslice implementation of
> > the Lotus5 hashing...
> 
> Is possible a bit-slice implementation of AES? I think is possible.

Yes.  You can find papers on that.

Alexander

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.