crypt-dev - Re: Intentionally Increasing Collisions in Password Hashing Algorithms

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJ9ii1F3eMz0uYrLyxaOsUYQXZ2grWSJ-HUTdDysq18CoiyxYA@mail.gmail.com>
Date: Mon, 17 Dec 2012 00:19:40 -0500
From: Matt Weir <cweir@...edu>
To: crypt-dev@...ts.openwall.com
Subject: Re: Intentionally Increasing Collisions in Password
 Hashing Algorithms

One of the problems I’ve been running into is I don’t have a decent
model/framework to try and evaluate if increasing collisions is a good
idea or not. Below is my first attempt at creating that model for a
very, (very), low security case. I tried to give at least a basic
justification for all the assumptions I made, but everything here is
up for debate.

The first step is to define the maximum amount of risk allowed due to
collisions during an online attack. This will in turn determine the
minimum hash length and the maximum number of collisions we can
intentionally cause.

Online attack threat: This site makes no attempt to stop a
sophisticated attacker  who performs online guessing attacks. The
passwords simply have to stop unskilled attackers from logging in with
someone else’s account. By this it is assumed that the attackers would
manually type in the password guesses and will not devote much time to
making guesses. Think of your friends logging into your account so
they can change your status update to ‘butts’.

Number of guesses allowed: 100
Justification: It seems like after 100 guesses someone would get tired
of trying to log in.

Acceptable Risk: 1% risk of someone logging in
Justification: It’s rare enough that it would put a serious amount of
work on an unskilled attacker for a low chance of it succeeding. Yes
this is a wishy washy justification, but when have you last seen a
good justification for acceptable risk?

Plug those values into the equation:

2^H(x) = Number of Allowed Guesses / Chance of Success

Rounding up, (since you can’t have a fraction of a bit), a 14 bit hash
is the smallest hash we can use. Assuming the hash function is
equivalent to a random oracle, a collision is expected to occur once
every 16,384 guesses.

The next step is more complicated. How much will this impact an
attacker who has the goal of using this password against another site
and is performing an offline attack? First we need to estimate how
much effort an attacker is willing to devote to an offline cracking
attack per hash. Luckily we can ignore the number of compromised users
for now since it can be assumed that an unique password salt has been
used. While there is some potential speedup when attacking a large
number of salted hashes in GPUs, (especially with dictionary attacks
where the cost of transferring a guess to the GPU is expensive), we’re
not anywhere close to the point where we need to get into that level
of detail.

Hash Type: vBulletin (version 3.8.5)
Justification: This hash is A) weak, B) widely used, and C) salted.
Ideally we should be focusing on getting people to at least upgrade to
phpass before we start mucking around with hash collisions, but I
wanted to look at a really weak hash first.

Cracking speed: 4 billion checks a second
Justification: I’m basing my numbers off the following website:
http://thepasswordproject.com/oclhashcat_benchmarking
While an attacker can always throw more hardware at the problem, this
seems like a reasonable average case.

Time spent per hash before giving up: 5 minutes
Justification: There’s a lot of assumptions that go into this number.
My approach was to figure out how much a cracked hash is worth and
then compare that to the amount of money an attacker could make mining
bitcoins instead. I don’t know much about bitcoins, but according to
this website:

www.bitcoinx.com/profit

The yield from being able to perform 1.544 billion BC hashes a second,
(calculated using the above cracking setup though I might be widely
off), is around twelve cents a hour. What I don’t know is the going
rate for cracked password hashes at volume, (aka not a specific hash).
Referring back to the InsidePro board I do see one posting for $5 per
1k salted hashes cracked, (plus a $200 bonus for cracking a large
amount quickly), that received a number of replies/takers. Without the
bonus, that’s about 1/2 a cent per salted hash. With the bonus it goes
to 1 cent per hash. There’s a couple of other smaller postings as well
for around 1 cent a hash. So the break even point between bitcoin
mining and password cracking may be around 5 minutes per hash. But
wait, this completely ignores the fact that if the attacker doesn’t
crack the password in five minutes they get zero profit! Ideally I
should factor that in as well, but that gets into more math than I’m
willing to do with all the assumptions I’m already making. So I’m
sticking to five minutes for now.

I’d like to side track a bit and say how profoundly weird some of
these results are when you follow them through. It certainly *seems*
like someone’s password would be worth more than five minutes of work.
Of course i’m basing this of a public forum, (which has huge risks for
the seller, see the LinkedIn disclosure), so on private cracking
forums the cost might be much higher. I don’t have insight into those
unfortunately. Also this assumes all the crackers are intelligent
actors that are constantly trying to maximize their profits using all
the available information. Economists make this mistake *all* the
time. People fall into habits, don’t think about what they are doing,
or perhaps just prefer cracking passwords (heck I spend all this time
doing independent research vs getting a second job), so these numbers
might be completely bogus. As another side note, before I did the
above calculations I originally had 1 hour as a placeholder so my gut
feeling was *waaay* off.

Expected number of collisions if the password is not cracked:
=(4,000,000,000 guesses/s x 300 s) / 16,384

Which would mean the attacker can expect to generate an average of
73,242,187 collisions for each password they don’t manage to crack
during their cracking session.

Of course, as has been pointed out before, a significant amount of
passwords would be cracked very early on. Collisions after the true
password has been cracked can be ignored by an attacker because during
an online attack they’ll try the guesses in the order they were
cracked. For now we’ll also discount any additional effort an attacker
might put into queueing up cracked hashes before testing them against
other sites, (aka they might spend the full five minutes cracking a
hash regardless even if they crack the real password in the first
second). Unfortunately modeling password cracking attacks is fairly
difficult since there isn’t one standard way people perform these
attacks. Does the attacker use dictionary attacks; what dictionaries
do they use; which mangling rules do they use; what type of brute
force attacks do they run? The list goes on an on.

Type of Attack: Bruteforce, using JtR’s Incremental mode
Justification: The honest reason is I don’t want to bogged down
modeling a dictionary attack right now. I can say with a straight face
though that running brute force attacks against fast hashes is
generally more efficient when using a GPU cracker. The reason I picked
JtR’s Incremental mode vs HashCat’s BF++ is because Incremental is
much more effective. A graph of running JtR’s incremental mode,
(trained on a subset of the RockYou list and tested against an
different 1 million password subset of the RockYou list), for 20
billion guesses, (About a 5 second cracking session, *yikes*) can be
seen at the link below. Why only 5 seconds? Because my “checker”
program is really slow, (and most certainly not GPU friendly). I’m
going to post on the John-Users list to see if I can make better use
of JtR logs directly (which are much, much faster), but I wanted to
get some numbers out in the meantime. That being said, you can
reasonably estimate how a five minute cracking session would fare
based on the slope of the graph:

https://sites.google.com/site/reusablesec/Home/pictures/Incremental_20billion_guesses_Rockyou.png

So the short answer is around 48% of the passwords were cracked in a
“5 second” timeframe. Could a real attacker do better? Certainly. One
thing on my to-do list is to look at percentage of salted hashes
cracked for money on the InsidePro forum. Still this is a starting
number to work off of. For the rest of this document I’m just going to
base the expected collisions for a 5 second cracking session, (vs a 5
minute cracking session). Still, since the number of new cracked
passwords will fall dramatically as the cracking session goes on while
the rate of collisions stays the same, the percentage of collisions
per new hash cracked will rise over time.

So now we know for 52% of the un-cracked passwords, around 1.2 million
collisions will be generated in five seconds. Figuring out the
collisions for the other 48% cracked passwords is more complicated.
This is where Excel is really nice since I can simply estimate
collisions at points in time based on the data.  As a sanity check, my
Excel document is available below so you can check my math:

https://sites.google.com/site/reusablesec/Home/excel/14bithash_collisions_20billionGuesses_incremental_.xlsx

So on average, for a five second cracking session, an attacker can
expected to generate 635,215 collisions per hash even taking into
account that collisions don’t matter after the attacker has ‘cracked’
the real password.

I’d like to take a step back and say this is close to a best case
scenario for the defender as we’re likely to get. Aka I started with
the assumption that the original hashes were stored on a very low
security site. The real goal of this whole exercise was to try to
create a framework/model so that I could better define the assumptions
I was making. Also this provided a quick sanity check on if creating
collisions even has the possibility of being helpful. Based on the
initial results, I think this is worth following up further. 635k
collisions on average would be a huge cost to an attacker. Now there’s
many ways an attacker can respond to this. For example limit their
cracking time. Even with a brute-force session, an attacker can crack
around 8% of the passwords with less than 50 collisions expected per
hash; that will go down even further if a dictionary attack is used.
On the plus side for the defender, if an attacker adopts this
strategy, the attacker will only be cracking 8% of the passwords vs
48%.

There’s another potential benefit that I didn’t think about until
writing this. How would these collisions impact the password cracking
market? For example, how do you go about paying people to crack
passwords for you when there are a high number of collisions? Do you
ask for 1k possible values per hash? How do you ensure the crackers
are trying the most optimal guesses, (aka dictionary attacks or Markov
enhanced brute-force) vs just running dumb bruteforce attacks (which
may be faster), and selling you the collisions?

Now I can start filling in more conservative values to see the effect
this might have on higher security sites. I’m going to wait a little
bit to post on that though because this current email has grown way to
long, and quite honestly I’m interested in what everyone else thinks
of the assumptions I made.

Matt
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.