Message-ID: <20130407155304.GB31073@openwall.com>
Date: Sun, 7 Apr 2013 19:53:04 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com, deepika dutta <deepikadutta_19@...oo.com>
Subject: Re: des-bs-doubt

Deepika,

On Sat, Apr 06, 2013 at 01:42:15PM -0700, deepika dutta wrote:
> For plaintext: Swapping the 32-bit halves before doing encryption, though in the DES standard these halves are swapped after the last round.

These are two halves of the ciphertext, not the plaintext.  We're trying
to minimize the amount of work we have to do per candidate password
tested.  By swapping the two halves in the hashes loaded for cracking,
we avoid having to do it per candidate password.  With bitslicing, the
latter could be as simple as reading from element 32 instead of from
element 0, so there would be no obvious runtime cost anyway, but this
approach dates back to before JtR started using bitslice DES.
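
For illustration, here's a minimal sketch of that one-time swap at hash
load time (the function name and the packed 64-bit representation are
just for illustration, not the actual loader code):

#include <stdint.h>

/* Illustration only, not actual JtR loader code: swap the two 32-bit
 * halves of a 64-bit DES block once, when the hash is loaded, so that
 * nothing has to be swapped back per candidate password. */
static uint64_t swap_halves(uint64_t block)
{
	return (block << 32) | (block >> 32);
}

In the bitslice representation, the equivalent would be purely an
indexing change: comparing output bit i against element (i + 32) % 64
instead of element i.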

Additionally, JtR can still be built to use non-bitslice DES.  In fact,
"make generic" tries this, and on some purely 32-bit systems without
SIMD and without fast(er) L1 caches the non-bitslice implementation that
does two S-box lookups at a time (128 KB for the S-boxes) turns out to
be faster, so it gets used instead of the bitslice one.  I last saw this
happen in my testing in QEMU emulating 32-bit SPARC and ARM.
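
To make the 128 KB figure concrete, here's a rough sketch of the
two-lookups-at-a-time idea (the table name, entry width, and layout
below are assumptions for illustration, not the actual code):
4 S-box pairs x 4096 entries x 8 bytes per entry = 128 KB.

#include <stdint.h>

/* Illustration only: adjacent DES S-boxes paired so that one table
 * lookup covers 12 input bits at once.  Each entry is assumed to hold
 * the two 4-bit S-box outputs already spread through the P
 * permutation. */
static uint64_t SP_pair[4][4096];	/* assumed precomputed at startup */

/* S-box layer for one round, with the 48-bit expanded and key-mixed
 * input packed into the low 48 bits of a 64-bit word. */
static uint64_t sbox_layer(uint64_t in)
{
	return SP_pair[0][(in >> 36) & 0xfff] |
	       SP_pair[1][(in >> 24) & 0xfff] |
	       SP_pair[2][(in >> 12) & 0xfff] |
	       SP_pair[3][in & 0xfff];
}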

With the halves swapped, these different implementations are more
consistent in how they represent the loaded hashes' ciphertexts.

> For key: Storing the 7 bits per byte in little endian form in DES_bs_all.K

This is probably for historical / code evolution reasons.  The code was
not written straight from the DES spec into its current form, but
rather it evolved through many intermediate representations, each in
the form of C code, over many years.

> Is there any performance issue in this? If I want to swap the plaintext after the last round and store the 7 bits per byte in big-endian form in DES_bs_all.K, will there be any problem with this representation? (Of course, I would have to change the indexes of B and K referred to in the sbox functions.)

I think you can likely do both without incurring a performance impact,
if you do it right.  The change in memory access pattern may have some
performance impact on some systems, though.  It is not obvious to me
whether the change in key bit order will make things faster or slower
in case the data has to be read into L1 cache rather than already being
there.  You need to examine the order of key bits after key expansion
for that.  In general, sequential order (lowest address accessed first,
then higher addresses) is preferable.
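
One way to examine that order is with a small standalone program (just
a sketch written from the DES spec, not JtR code) that applies the
standard PC-2 selection and rotation schedule and prints, for each
round, which of the 56 post-PC-1 key bits each of the 48 subkey bits
comes from.  Comparing that against how the S-box functions index K
shows how far the accesses are from sequential:

#include <stdio.h>

/* Standard DES PC-2 selection: 1-based indices into the 56-bit C||D
 * block. */
static const int PC2[48] = {
	14, 17, 11, 24,  1,  5,  3, 28, 15,  6, 21, 10,
	23, 19, 12,  4, 26,  8, 16,  7, 27, 20, 13,  2,
	41, 52, 31, 37, 47, 55, 30, 40, 51, 45, 33, 48,
	44, 49, 39, 56, 34, 53, 46, 42, 50, 36, 29, 32
};

/* Left-rotation counts per round from the DES key schedule */
static const int SHIFTS[16] = {
	1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1
};

int main(void)
{
	int round, i, rot = 0;

	for (round = 0; round < 16; round++) {
		rot += SHIFTS[round];
		printf("round %2d:", round + 1);
		for (i = 0; i < 48; i++) {
			int sel = PC2[i] - 1; /* 0-based position in C||D */
			int src;
			if (sel < 28) /* bit comes from the C half */
				src = (sel + rot) % 28;
			else /* bit comes from the D half */
				src = 28 + (sel - 28 + rot) % 28;
			printf(" %2d", src);
		}
		printf("\n");
	}

	return 0;
}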

This is probably not a change we'll want to accept into JtR, unless you
show that it actually improves performance.

Alexander
