
Message-ID: <CAKws9z2YU7371hEPA9xH=XLVh_x2bpRC3yKidtD9sBqPpQQ03Q@mail.gmail.com>
Date: Sat, 9 Jul 2016 14:09:57 -0400
From: Scott Arciszewski <scott@...agonie.com>
To: passwords@...ts.openwall.com
Subject: Re: Don't Scratch Your Entropy

Spot on. Entropy must describe the password pool your password exists in,
not the password itself.

An example most developers might be more familiar with: If you generate a
1000-character, random printable ASCII password, but use an LCG or MT19937
seeded with a 32-bit value to generate it, the maximum entropy you'll enjoy
is 32 bits. Naive entropy estimates (e.g. lg(95^1000), or about 6569.8556
bits) offer no insight into the resistance of your password to being guessed.
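A minimal sketch of that point, assuming CPython's random.Random (which is
MT19937) and a seed drawn from a 32-bit range. The helper name
password_from_seed and the 32-bit seed width are assumptions made for this
illustration, not anything a particular application is known to use:

```python
import math
import random

# Printable ASCII, space through tilde: 95 symbols.
ALPHABET = [chr(c) for c in range(32, 127)]

def password_from_seed(seed: int, length: int = 1000) -> str:
    """Derive a password from a Mersenne Twister seeded with `seed`."""
    rng = random.Random(seed)  # CPython's random.Random is MT19937
    return "".join(rng.choice(ALPHABET) for _ in range(length))

naive_bits = 1000 * math.log2(95)  # what lg(95^1000) reports
seed_bits = 32                     # assumed seed width (hypothetical)

# Same seed, same password: the seed is the only secret here, so an
# attacker has at most 2**32 candidates to enumerate.
assert password_from_seed(12345) == password_from_seed(12345)

print(f"naive estimate:            {naive_bits:.4f} bits")
print(f"upper bound from the seed: {seed_bits} bits")
```

The password is still 1000 characters long; only the number of passwords the
procedure can ever produce has shrunk.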
> The entropy is a function of a distribution of a random value.

Correct.

> (a) your password's entropy is 0

Mathematically, yes. In real terms, I would say it's a meaningless value to
measure, sort of like asking what the voltage is at a single point without
another point to compare the potential against.

> (b) every "security expert" pronouncing "entropy", without defining the
> distribution or at very least the pool of candidate passwords, is a brain
> dead buffoon.

That's a bit harsh.

Do you know of any particularly egregious sources of misinformation
offhand for this issue? If they're on Stack Exchange, for example, I can
make clarifying edits.

Scott Arciszewski
Chief Development Officer
Paragon Initiative Enterprises <https://paragonie.com>

On Sat, Jul 9, 2016 at 12:00 PM, e@...tmx.net <e@...tmx.net> wrote:
> I have a strong conviction that 99% of "security experts" do not know the
> definition of entropy. This conviction will certainly seem wildly
> deranged to you, unless you know the definition in question. So, let's
> begin with the definition, by the book.
>
> H = -sum(p_i * log(p_i))
>
> This is a function of the probability vector P = {..., p_i, ...} that
> represents a distribution of a random variable. Entropy is a characteristic
> of a distribution of a random variable. No more and no less.
>
> Let us find the entropy of your password. Your password's distribution
> vector is {1}, therefore your password's entropy is:
>
> H = -1 * log(1) = 0
>
> Your password's entropy is ZERO. Try log(1) in different bases on
> different computers if you are unsure.
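The definition above is short enough to check directly. A minimal sketch in
Python; the function name shannon_entropy and its base parameter are
illustrative choices, not part of the original message:

```python
import math

def shannon_entropy(probs, base=2.0):
    """H = -sum(p_i * log(p_i)), skipping zero-probability outcomes."""
    if not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must sum to 1")
    return sum(-p * math.log(p, base) for p in probs if p > 0)

# The degenerate distribution {1}: a single, already chosen password.
# log(1) is 0 in every base, so the result does not depend on the base.
for base in (2, math.e, 10):
    print(base, shannon_entropy([1.0], base))

# For contrast, a fair coin toss carries one bit.
print(shannon_entropy([0.5, 0.5]))
```

The input is a probability vector, never a string: there is no argument
through which a particular password could enter the calculation.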
>
> A sophisticated reader may ask: "What if we apply entropy to the password
> creation procedure?" It is doable in a seemingly reasonable way. We can model
> any password creation procedure as a random choice from a pool of candidate
> passwords, then characterize the password distribution over this pool with
> the entropy. The resulting number will tell us how much information our
> procedure represents. So what? Is this number of any use in the context of
> "password security"?
>
> Security experts usually jump in here and claim that this number
> represents the strength of the produced password. For argument's sake,
> let's accept this claim, and construct a password creation procedure as
> follows:
> the password pool is {"123", "password", "gtfr3467ujhbvcddgy6r5ddsefvvs",
> "###"},
> we toss two coins and pick one of these four according to the coin toss
> outcome.
>
> The entropy of this procedure is (given the coin toss produces uniformly
> distributed outcomes):
> H1 = -(1/4) * log2(1/4) * 4 = 2
>
> Now (according to the mainstream computer "science" (dictated by the NIST
> recommendations)) we must label all our passwords with this entropy value:
> "123" has the entropy based strength 2
> "password" has the entropy based strength 2
> "gtfr3467ujhbvcddgy6r5ddsefvvs" has the entropy based strength 2
> "###" has the entropy based strength 2.
>
> This looks somewhat counterintuitive, and not at all like the "entropy" you
> are used to hearing pronounced by a respectable "expert" with a straight
> face.
>
> Furthermore, we can define another password creation procedure:
> toss one coin and pick from the pool
> {"123","gtfr3467ujhbvcddgy6r5ddsefvvs"}.
> The entropy of this procedure is 1, half that of the previous one.
> Therefore:
> the password "123" has the entropy based strength 1.
>
> The very same password "123" that also has the strength 2. A password has
> two different strengths simultaneously. If we understand the "strength" as
> a likelihood of being guessed by the attacker, then a single password
> cannot have two different values, because the password alone is the input
> argument for the hypothetical attack, not the password creation procedure.
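The two procedures can be put side by side in a few lines. This is a sketch
that assumes fair coins, so each pool is sampled uniformly; the names
uniform_pool_entropy, pool_two_coins and pool_one_coin are made up for the
illustration:

```python
import math

def uniform_pool_entropy(pool):
    """Entropy in bits of a uniform random pick from the distinct items in `pool`."""
    return math.log2(len(set(pool)))

# The two pools from the message above.
pool_two_coins = ["123", "password", "gtfr3467ujhbvcddgy6r5ddsefvvs", "###"]
pool_one_coin = ["123", "gtfr3467ujhbvcddgy6r5ddsefvvs"]

h_two = uniform_pool_entropy(pool_two_coins)  # two coin tosses
h_one = uniform_pool_entropy(pool_one_coin)   # one coin toss

# "123" is a member of both pools, so it inherits both labels.
for name, pool, h in (("two coins", pool_two_coins, h_two),
                      ("one coin", pool_one_coin, h_one)):
    print(f'{name}: H = {h} bits, "123" in pool: {"123" in pool}')
```

Both numbers are properties of the procedures. Nothing in the calculation
inspects the characters of "123".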
>
> Thus, accepting the premise: "the password creation entropy characterizes
> a produced password", we end up with a contradiction. Entropy is
> demonstrated to be not a function of a password. However, in a little less
> mentally insane world I should have skipped this lengthy demonstration
> altogether. The entropy is just defined as a function of a random
> distribution; who would have thought that it is also NOT a function of
> anything else!
>
> But I am not a champion of taking the longer route to obvious conclusions.
> Matt Weir has conducted a meticulous experiment with leaked passwords to
> make the statement: "entropy based password strength measures do not
> provide any actionable information to the defender", and also: "there is no
> way to convert the notion of Shannon entropy into the guessing entropy of
> password creation policies". In other words, he gave us experimental
> evidence that the entropy is irrelevant to the password strength problem.
> Of course, it is irrelevant! This irrelevance is plainly written in the
> entropy definition. Matt, you could have just read the definition and said:
> "corollary, dear 'experts', don't scratch your entropy". Nevertheless,
> these experimental results are of great value for humanity, and I am glad
> we have them, the more evidence the better. In this world of imbeciles,
> even the most obvious facts require tons of "proofs", since the
> "experts" do not get along with mathematical logic very well.
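A toy illustration of why Shannon entropy alone cannot be converted into
guessing effort. This is not a reproduction of Weir's experiment: the two
distributions below are invented for the example, and expected_guesses
assumes an attacker who tries candidates in order of decreasing probability.

```python
import math

def shannon_entropy_bits(probs):
    """H = -sum(p_i * log2(p_i)) in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def expected_guesses(probs):
    """Average number of guesses when trying candidates most-likely-first."""
    ordered = sorted(probs, reverse=True)
    return sum(rank * p for rank, p in enumerate(ordered, start=1))

# Hypothetical distribution A: 64 equally likely passwords.
uniform = [1 / 64] * 64
# Hypothetical distribution B: one password chosen half the time,
# the other half spread evenly over 1024 alternatives.
skewed = [0.5] + [0.5 / 1024] * 1024

for name, dist in (("uniform", uniform), ("skewed", skewed)):
    print(name,
          "H =", shannon_entropy_bits(dist),
          "first-guess success =", max(dist),
          "expected guesses =", expected_guesses(dist))
```

Both distributions have the same Shannon entropy, yet the first guess
succeeds with very different probability and the average number of guesses
differs as well, so no function of the entropy value alone can return both
answers.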
>
> Still there is more to the topic! Not only is the entropy of an accurate
> password creation model irrelevant to the problem of password strength,
> but the model itself is also not possible in real-life use cases. What
> distribution are you going to apply to human-created passwords, given that
> (a) humans are incapable of randomization and (b) the pool of passwords they
> choose from is not accessible to us, not even by vivisection of the brain?
> This fact makes the entropy even worse than irrelevant, it makes the
> entropy ARBITRARY: whatever distribution we assume for a human-created
> password, it is inevitably baseless, arbitrary garbage.
>
> Let's recap:
>
> The entropy is a function of a distribution of a random value.
>
> Corollary:
>
> (a) your password's entropy is 0
>
> (b) every "security expert" pronouncing "entropy", without defining the
> distribution or at very least the pool of candidate passwords, is a brain
> dead buffoon.
>