john-dev - Hashes with different tunable costs (was: Re: Handling of hashes with different iteration counts)

Message-ID: <538D8967.60405@mailbox.org>
Date: Tue, 03 Jun 2014 10:37:59 +0200
From: Frank Dittrich <frank.dittrich@...lbox.org>
To: john-dev@...ts.openwall.com
Subject: Hashes with different tunable costs (was: Re: Handling
 of hashes with different iteration counts)

Alexander,

On 04/11/2014 10:26 AM, Solar Designer wrote:
> On Fri, Apr 11, 2014 at 12:02:24AM +0200, Frank Dittrich wrote:
>> If p is 1, then N*r = m_cost = N*r*p = t_cost. This doesn't seem to be
>> very useful.
> 
> Yet it makes sense.
> 
>> For now (not committed yet, and thus no pull request), I decided to
>> report N*p as t_cost and r*p as m_cost.
>> (As long as p == 1, just N and r are reported; p > 1 influences both
>> t_cost and m_cost. Does that make sense?)
> 
> No, this makes no sense to me.
> 
> With scrypt, the only way to tune t_cost without affecting m_cost is via p.
> That's the reality.  What you're doing is reusing my suggested t_cost
> and m_cost names to mean something totally different.  That's confusing.
> Please don't do that!

I decided to report three tunable cost values for django-scrypt:
N, r, and p.
(I increased FMT_TUNABLE_COSTS to 3.)
I would have preferred to report values that correspond to memory or
time usage. But if that's not trivial, I'd rather report the real
value of each tunable cost parameter here.

1 0:00:00:00 Cost 1 (N) is 14 for all loaded hashes
1 0:00:00:00 Cost 2 (r) is 8 for all loaded hashes
1 0:00:00:00 Cost 3 (p) is 1 for all loaded hashes

The above lines are from the log file. If a particular tunable cost has
the same value for all remaining hashes, that value is reported in the
log file only. If the hashes have different values for a tunable cost,
these are reported both on stdout and in the log file.
This is another change, in a separate commit.
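The reporting rule can be sketched roughly as follows. This is only an illustration, not the code that was committed: struct cost_range, cost_range_of() and report_on_stdout() are hypothetical names, and I assume here that it is enough to track the smallest and largest value of one cost over all loaded salts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one tunable cost over all loaded salts. */
struct cost_range {
    unsigned int min;
    unsigned int max;
};

/* Scan the per-salt cost values and record the smallest and largest. */
static struct cost_range cost_range_of(const unsigned int *costs, size_t n)
{
    struct cost_range r;
    size_t i;

    assert(costs != NULL && n > 0);
    r.min = r.max = costs[0];
    for (i = 1; i < n; i++) {
        if (costs[i] < r.min)
            r.min = costs[i];
        if (costs[i] > r.max)
            r.max = costs[i];
    }
    return r;
}

/*
 * Nonzero if the cost should be printed on stdout as well as in the log,
 * i.e. when the loaded hashes do not all share one value.
 */
static int report_on_stdout(struct cost_range r)
{
    return r.min != r.max;
}
```

With all costs equal (say 256 for every salt), report_on_stdout() returns 0 and only the log file line is written; with mixed values it returns nonzero.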
I thought logging this information might help explain why the c/s
value is much lower than what you might expect from a john --test run:

$ ./john --test --format=bcrypt
Will run 8 OpenMP threads
Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X3]... (8xOMP) DONE
Raw:	3384 c/s real, 423 c/s virtual

$ ./john bcrypt
Loaded 14 password hashes with 7 different salts (bcrypt [Blowfish 32/64 X3])
Will run 8 OpenMP threads
Press 'q' or Ctrl-C to abort, almost any other key for status
0g 0:00:00:09 0.32% 2/3 (ETA: 23:45:49) 0g/s 56.00p/s 400.0c/s 816.0C/s conrad..keith
0g 0:00:00:18 0.57% 2/3 (ETA: 23:52:12) 0g/s 53.33p/s 384.7c/s 784.7C/s picasso..steven1
0g 0:00:00:30  2/3 0g/s 51.14p/s 371.9c/s 746.2C/s manson..oreo
Session aborted

0:00:00:00 Loaded a total of 14 password hashes with 7 different salts
0:00:00:00 Sorting salts, for performance
0:00:00:00 Cost 1 (iteration count) is 256 for all loaded hashes
0:00:00:00 - Hash type: bcrypt (lengths up to 72)
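To make the connection between the logged cost and the observed speed explicit, here is a small sketch of the arithmetic. It assumes that the work per candidate grows linearly with the iteration count, which is only an approximation; expected_cps() is a hypothetical helper, not a function in john.

```c
#include <assert.h>

/*
 * Scale a benchmark speed to a different cost setting, assuming the
 * work per candidate grows linearly with the cost value.
 */
static unsigned long expected_cps(unsigned long bench_cps,
                                  unsigned long bench_cost,
                                  unsigned long actual_cost)
{
    assert(actual_cost > 0);
    return bench_cps * bench_cost / actual_cost;
}
```

With the figures above, expected_cps(3384, 32, 256) evaluates to 423, which is in line with the roughly 370 to 400 c/s shown in the status lines.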

Both changes are now pulled into bleeding-jumbo.

> BTW, in yescrypt, I've introduced t - a way to tune t_cost without
> affecting either m_cost or p.  So with yescrypt you'd have:
> 
> m_cost = N * r
> t_cost = N * r * f(t) * ((flags & YESCRYPT_PARALLEL_SMIX) ? 1 : p)

Does that mean that for yescrypt I'll need to increase FMT_TUNABLE_COSTS
to 5 and report N, r, t or f(t), flags, and p?

> whereas with classic scrypt it is:
> 
> m_cost = N * r
> t_cost = N * r * p

[...]
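For reference, the classic scrypt formulas quoted above could be written as the following sketch. The struct and function names are hypothetical. N is taken to be the actual scrypt N (a power of two greater than 1), not its base-2 logarithm; my assumption is that the value 14 in the django-scrypt log lines is such a logarithm, which would correspond to N = 16384.

```c
#include <assert.h>

/* Hypothetical container for the three classic scrypt parameters. */
struct scrypt_params {
    unsigned long long N; /* actual N, a power of two greater than 1 */
    unsigned int r;
    unsigned int p;
};

/* m_cost = N * r */
static unsigned long long scrypt_m_cost(struct scrypt_params s)
{
    assert(s.N >= 2 && (s.N & (s.N - 1)) == 0);
    return s.N * s.r;
}

/* t_cost = N * r * p */
static unsigned long long scrypt_t_cost(struct scrypt_params s)
{
    return scrypt_m_cost(s) * s.p;
}
```

For p == 1 both functions return the same value, and raising p changes only the result of scrypt_t_cost(), which is the point made in the quoted text.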

> I understand we might want to group hashes not only by their full m_cost
> and t_cost, but also e.g. by N and r individually, as the differences in
> memory access pattern may turn into speed differences (this is why these
> settings are separately tunable, after all).  However, this simply does
> not fit into the m_cost and t_cost model.  If you want to include such
> support, you need to include it explicitly: as ability to group hashes
> by individual hash type specific parameters.  For yescrypt, you would
> then also need to support grouping separately by p, t, and flags.  Do we
> really want to introduce that support proactively?

The only changes required if a new format needs to report more than 3
tunable costs are:
- adjust the FMT_TUNABLE_COSTS definition in formats.h
- add more than 3 trivial functions (each returning an unsigned int
  value calculated from a particular salt) to this format
- (optionally) list more than 3 names/descriptions for the tunable costs
  of this format

No other format needs to change.
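As a rough, self-contained sketch of what such trivial functions look like: the salt layout (struct example_salt) and all example_* and tunable_cost_* names below are hypothetical, and the FMT_TUNABLE_COSTS define is only a stand-in for the real one in formats.h.

```c
#include <assert.h>
#include <stddef.h>

#define FMT_TUNABLE_COSTS 3 /* stand-in for the definition in formats.h */

/* Hypothetical decoded salt of an scrypt-style format. */
struct example_salt {
    unsigned int N;
    unsigned int r;
    unsigned int p;
};

/* One trivial function per tunable cost, each reading a single field. */
static unsigned int tunable_cost_N(void *salt)
{
    assert(salt != NULL);
    return ((struct example_salt *)salt)->N;
}

static unsigned int tunable_cost_r(void *salt)
{
    assert(salt != NULL);
    return ((struct example_salt *)salt)->r;
}

static unsigned int tunable_cost_p(void *salt)
{
    assert(salt != NULL);
    return ((struct example_salt *)salt)->p;
}

/* Names/descriptions, in the same order as the functions below. */
static const char *const example_cost_names[FMT_TUNABLE_COSTS] = {
    "N", "r", "p"
};

/* Table of cost functions, indexed like tunable_cost_value[0..2]. */
static unsigned int (*const example_cost_values[FMT_TUNABLE_COSTS])(void *) = {
    tunable_cost_N, tunable_cost_r, tunable_cost_p
};
```

Reporting a fourth cost would then mean raising the define to 4 and adding one more function and one more name to this format's tables.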

A minor adjustment for --list=format-methods is required if a user wants
to use
./john --list=format-methods:tunable_cost_value[3]
to see all formats which report at least 4 tunable costs.
Currently, only these are valid "method names":
./john --list=format-methods:tunable_cost_value
./john --list=format-methods:tunable_cost_value[0]
./john --list=format-methods:tunable_cost_value[1]
./john --list=format-methods:tunable_cost_value[2]

(bleeding-jumbo)src $ git grep -n FMT_TUNABLE_COSTS
...
listconf.c:562:#if FMT_TUNABLE_COSTS > 1
listconf.c:564:#if FMT_TUNABLE_COSTS > 2
...

Frank