Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 18 Mar 2013 10:40:16 +0400
From: Solar Designer <solar@...nwall.com>
To: crypt-dev@...ts.openwall.com
Subject: Re: scrypt TMTO defeater

On Sun, Mar 17, 2013 at 08:03:21AM +0400, Solar Designer wrote:
> Unfortunately, it turns out that BlockMix's shuffling(*) gets in the way
> of efficient implementations, and _possibly_ the inefficiency mostly hurts
> cleaner CPU code, as opposed to hacks and ASICs.  Specifically, when the
> V element we write into is or might be the same as V_j, in a clean
> implementation we have to postpone those writes until the BlockMix
> completes, and this means re-reading the same values from memory
> (hopefully, from L1 cache) rather than simply storing register values
> into two locations at once (such as X and V_j).

The above applies when trying to store (or XOR) BlockMix output into a V
element, but that in itself is problematic: if V_j on the very next
iteration of the SMix loop is the same as the element we just wrote
into, the XORs may cancel each other, in some cases resulting in really
nasty consequences.  The previous TMTO defeater patches I posted in the
last 2 days suffer from this problem.  Oops.  I hope this is taken for
granted, but to be on the safe side let me say that none of these
early/experimental TMTO defeater patches are meant for actual use.

Now, I could fix this by adding even more complexity, such as to avoid
repeating j's (detect repeats and XOR with 1 if so, which may be done in
a branch-less fashion).  However, I chose to rethink the whole approach
instead, and realized that Anthony's original example did not have this
problem due to storage of the block value from just prior to a BlockMix.

A reason why I did not try implementing Anthony's original example right
away was that it was incompatible with an optimization I had made in
crypto_scrypt-sse.c, where the XORs were re-ordered.  Now I went for the
extra implementation complexity in this version of the code,
implementing a separate BlockMix function with the original order of
XORs so that the right value could be written back into V_j.  This
appears to work well, and is fast.  -ref and -nosse implement Anthony's
original example without such extra complexity.

New revision is attached.  Comments are welcome.

New MD5 of output of "tests" with defeat_tmto = 1:
2843045848a204727c8dd7677bd6b8e3

Because of all the partial bypasses of this kind of TMTO defeater (that
I've documented in separate postings), I am likely to also proceed to
implement a defeater by means of changes to the first SMix loop.  Then
the question will be whether to keep the defeater in the second loop as
well or not.

Alexander

View attachment "escrypt-defeat-tmto-30.diff" of type "text/plain" (17136 bytes)

Download attachment "escrypt-0.0.30.tar.gz" of type "application/x-gzip" (10529 bytes)

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.