crypt-dev - Re: Password Scrambling

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20131224004958.GA18444@openwall.com>
Date: Tue, 24 Dec 2013 04:49:58 +0400
From: Solar Designer <solar@...nwall.com>
To: crypt-dev@...ts.openwall.com
Subject: Re: Password Scrambling

Christian (one or/and the other), all -

On Tue, Sep 03, 2013 at 08:00:58PM +0200, CodesInChaos wrote:
> On 9/2/2013 9:58 AM, Christian Forler wrote:
> > Nevertheless, we came up with Catena, a new memory-hard
> > password scrambler based on the bit reversal function. A detailed
> > description of our scheme is available on eprint
> > (http://eprint.iacr.org/2013/525).
> 
> Doesn't the standard "store every 1/sqrt(n)th value and recompute the rest
> using parallel cores" attack using a parallel computer break this?

What's the current understanding on this?  I think the attack does work
against Catena, although I only skimmed over the paper.  Does this mean
some of the proofs in the paper are flawed?

Not indexing by secret is very nice, but I have yet to see a O(N^2) KDF
that avoids indexing by secret.  With typical/expected uses of scrypt
(and bcrypt) the problem happens to be mitigated by the fact that the
salt is normally not known to an attacker unless the hash is also known.
In either case (salt not known or salt+hash are both known), the
side-channel attack is almost irrelevant.  It can be used to infer which
user account a password is being tested against, but nothing about the
password and the hash.

> In phase 1 you output values in the order 1 to n.
> In phase 2 you iterate the values in some other, but fixed order.
> 
> Now an attacker uses two different machines for those two phases:
> 
> Phase 1: run phase 1 normally, but only store every 1/sqrt(n) th value in
> the natural order of phase 2 for storage. This step runs in time n and
> needs memory sqrt(n) for a total cost of n^1.5.
> Phase 2: iterate in the standard phase 2 order. You can recompute elements
> you didn't store in time sqrt(n), but since you can use sqrt(n) parallel
> cores and memory access is predictable, those elements are available
> instantly when the 1 sequential iterator needs them. Runs in time n and
> needs sqrt(n) space.
> 
> Total cost is n^1.5. But that contradicts your security claim (and proof)
> that space*time is n^2 on a parallel random access machine i.e. that the
> function is sequentially memory? Why doesn't this attack work?

Other assorted feedback on Catena (again, based on having skimmed over
the paper only):

Of the two properties "marketed" as unique to Catena - Server Relief and
Client-Independent Updates - I see most value in a sub-property of the
latter, namely being able to increase not only processing cost, but also
memory cost, without knowing the plaintext password.  However, I do not
(yet) fully understand how and whether this sub-property is achieved.
I think reviewing and experimenting with an actual implementation might
help me understand and evaluate this in less time than (re-)reading the
paper for real would.  Is an implementation publicly available?  (In any
case, it's not long until it must be, for PHC.)

I had very briefly commented on this desirable property here:

http://lists.randombit.net/pipermail/cryptography/2012-November/003451.html

"A much trickier task: support upgrades to a higher memory cost for the
already-computed iterations.  Sounds impossible at first?  Not quite.
This would probably require initial use of some secret component
(allowing for a lower-memory shortcut) and then dropping it at upgrade
time."

Does Catena drop a previously-available part of the salt+hash for the
memory cost upgrades?  Does it achieve the memory cost increase "for the
already-computed iterations", or maybe only for added iterations?
Or is the wording I had used somehow not applicable to Catena?

As to running time upgrades, they're very nice to have, but are
implementable on top of any existing KDF by rolling its output into
another invocation of it.  The advantage of having it built-in (and the
disadvantage of not having that in most other KDFs) is only in being
able to say simply that "we use [e.g.] Catena" rather than "we use
[e.g.] scrypt with its output passed back to input n times, as encoded
along with each hash in a custom way in our database".

A disadvantage of having the running time upgrades built-in, and not as
an additional optional outermost loop (like described above in the
around-scrypt example), is that one of the following must be true:

1. The inner workings of the KDF - the part that provides its area-time
cost for attackers - must use crypto strong enough to represent the
cryptographic strength of the entire KDF.  This no longer can be some
weaker "non-cryptographic" data mixing (whatever works best in area-time
terms), or we're no longer able to easily and convincingly show that our
KDF as a whole is not less cryptographically secure than e.g.
PBKDF2-HMAC-SHA-*.  As a result, we might have to compromise in terms of
the achieved area-time cost (albeit only by a constant factor).

2. The KDF must have some let's say sequence points, at which
data-mixing is interleaved with strong crypto.  These will be potential
cost upgrade points.  If we're not going to record them along with each
hash (or encrypted filesystem, etc.), separately for each cost upgrade
actually performed, then we need to have them pre-programmed into the
KDF, to occur at regular intervals.  This will likely reduce the KDF's
memory cost (the area component in the area-time cost for attacker).

Additionally:

3. At least with straightforward approaches, the memory cost is also
reduced by having to compact the internal state to match the KDF output
size (or less) at the sequence points mentioned above.  This may be
avoided in the way I had hinted at in the cryptography list posting I
referenced above (discard portions of secret inputs when upgrading).
Does Catena use this approach?  Actually, the paper sounds like it does.
Does this also happen to avoid the unfortunate choice between drawbacks
1 and 2 above, or does one of them apply to Catena (I guess it'd be
number 1 if so)?

As to the proposed Server Relief feature and algorithm, it is also
implementable on top of any other password hashing scheme, by wrapping
it in a fast hash (and this had been suggested on some occasions).
If custom processing or protocol like that can be implemented for a
given deployment, then chances are that the same server could also offer
e.g. SCRAM, which would also protect against replay (and would not make
use of the slow hash having Server Relief as a standard feature).  So I
am not sure if there's much value in having this property built into the
hash.  OK to have it for free, but implementing it is relevant to the
drawbacks 1 vs. 2 above.

Finally, the paper appears to use the word stretching to refer to both
iterations and throwing salt bits away.  IIRC, the Kelsey et al. paper
(reference [14] in the Catena paper) established the term stretching to
refer only to the iterations, and another term - strengthening - to
refer to dropping of salt bits.  Also, I previously saw the term pepper
as referring to a possible site-specific local parameter (site salt),
whereas the Catena paper uses it differently.  It could be preferable to
stick to the way these terms had been used historically.

I am sorry if I sound too negative.  I actually appreciate the effort
very much, and it might influence my own work.  It's just that I think
feedback on what might be drawbacks can be most helpful.

I am also sorry for not having read the paper thoroughly yet.  I thought
that it could help if I go ahead and comment anyway, instead of delaying
this to possibly after the PHC submission deadline (as I don't expect to
look into this much more closely until we're done with our own submission).

Thanks,

Alexander
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.