Message-ID: <CAJ9ii1Eoa8jpZG1EEK4hRSrhubb3_V2zEdQYtKTe4m6KjY+RVA@mail.gmail.com>
Date: Mon, 12 Mar 2018 16:19:53 -0400
From: Matt Weir <cweir@...edu>
To: passwords@...ts.openwall.com
Subject: Submitting Partial Password Hashes to Pwned Password Lookup

Background:

With over 500 million passwords in Troy Hunt’s Pwned Passwords V2, Troy and Cloudflare have partnered to provide an API lookup so sites don’t have to download the full list. Here is an example:

https://api.pwnedpasswords.com/range/21BD1

Users send the first five characters of their password hash to the Pwned Passwords server, and the server returns a list of all the hashes matching that prefix. The user can then check whether their full hash is in the list. The most prominent tool to use this capability so far is 1Password, though it’s starting to pop up in other places as well.

Reference links:

https://www.troyhunt.com/ive-just-launched-pwned-passwords-version-2/
https://www.troyhunt.com/i-wanna-go-fast-why-searching-through-500m-pwned-passwords-is-so-quick/
https://blog.cloudflare.com/validating-leaked-passwords-with-k-anonymity/
https://blog.agilebits.com/2018/02/22/finding-pwned-passwords-with-1password/

Concerns:

I just want to start by saying that I have the highest respect for Troy, Cloudflare and 1Password. Full disclaimer: I’m a happy user of HaveIBeenPwned and 1Password. As an additional disclaimer, in the past I’ve argued for investigating the use of partial password hashes to protect users from server compromises (link: http://www.openwall.com/lists/crypt-dev/2012/12/12/3 ).

The question raised by this approach is, “What is the risk to end users?” One way to grapple with risk is to use the “cyber attack lifecycle” methodology. Breaking out the tuple required for a successful online attack (usernames, passwords, sites), the chance that an adversary exploiting this service obtains all three, when they couldn’t obtain them in another way, is likely manageable.
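
For readers who haven’t used the service, here is a minimal sketch of the range lookup described under Background. It assumes the hash is SHA-1 in uppercase hex and that each response line has the form SUFFIX:COUNT, as described in the reference links; the function names and the User-Agent string are made up for illustration.

```python
import hashlib
from typing import Callable
from urllib.request import Request, urlopen

# Endpoint taken from the example URL above.
API_URL = "https://api.pwnedpasswords.com/range/"


def sha1_hex(password: str) -> str:
    """Uppercase hex SHA-1 of the password (assumed hash format)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def fetch_range(prefix: str) -> str:
    """Ask the server for every hash suffix that shares this 5-char prefix."""
    # The User-Agent value is a hypothetical placeholder.
    req = Request(API_URL + prefix, headers={"User-Agent": "range-lookup-sketch"})
    with urlopen(req) as resp:
        return resp.read().decode("utf-8")


def pwned_count(password: str, fetch: Callable[[str], str] = fetch_range) -> int:
    """Return how often the password appears in the list, or 0 if absent."""
    digest = sha1_hex(password)
    prefix, suffix = digest[:5], digest[5:]
    # Only the 5-character prefix leaves the client; the match is done locally.
    for line in fetch(prefix).splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count.strip())
    return 0
```

Passing a different `fetch` callable lets you exercise the matching logic offline, without sending anything to the server.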
My concern stems from the fact that this type of approach to risk management isn’t how the security of the service has been framed. Walking through the steps required for a successful exploitation can bring up additional security checks. For example, looking at the problem this way raises the question, “Are all of the replies from the Pwned Passwords service padded to the same size to protect against passive sniffing attacks?”

Instead, the conversation has focused on the idea of k-anonymity. To be blunt, I’ve become more and more convinced that k-anonymity is not the way to model the security of this system. The best analogy I can think of is the past use of Shannon entropy to measure password strength. I’ll agree that longer passwords “on average” are stronger than shorter passwords, but we’ve all seen Shannon entropy used to justify some completely unfounded security claims. The same goes for k-anonymity, where a focus on that property can potentially lead to some undesired outcomes.

Justification:

I’ve talked with several other experts and we all struggled with how to apply k-anonymity to this problem.

1) One could argue that k-anonymity is being applied to the Pwned Passwords list. The usernames and sites associated with the password hashes have been stripped. Unfortunately, the raw lists are available to most researchers/attackers if they know where to look. That’s how they made it into the list in the first place.

2) Likewise, it doesn’t fit to say that k-anonymity is being applied to the user submissions. Since the attacker knows the user (or site/IP) doing the submission, it isn’t anonymous. Yes, their query leaves open collisions with multiple passwords. But this better resembles a classic data leakage issue than an anonymity issue. To put it another way, if the user were submitting the first character of their plaintext password (don’t do this), we’d model this as a data leakage/keyspace reduction problem rather than a k-anonymity problem.
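
To make the data leakage/keyspace reduction framing concrete, here is a rough sketch of what an observed five-character prefix gives an attacker who already holds a candidate wordlist. It assumes SHA-1 in uppercase hex and a uniform spread of hashes across prefixes; the function names are made up for illustration.

```python
import hashlib
from typing import Iterable, List


def sha1_hex(password: str) -> str:
    """Uppercase hex SHA-1 of the password (assumed hash format)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def surviving_candidates(candidates: Iterable[str], leaked_prefix: str) -> List[str]:
    """Keep only the guesses whose hash starts with the observed prefix."""
    leaked_prefix = leaked_prefix.upper()
    return [c for c in candidates if sha1_hex(c).startswith(leaked_prefix)]


def expected_bucket_size(list_size: int, prefix_chars: int = 5) -> float:
    """Average number of list entries per prefix, assuming uniform hashes.

    Each hex character carries 4 bits, so 5 characters split the list
    into 16**5 = 1,048,576 buckets.
    """
    return list_size / 16 ** prefix_chars
```

Under those assumptions, `expected_bucket_size(500_000_000)` comes out to roughly 477, so a 500 million entry wordlist shrinks to a few hundred guesses once the prefix is known.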
This is a longer way of saying I’m doubtful that modeling the risk via k-anonymity tells the defender the security risk of potentially leaking the first K characters of the user’s password hash. My gut says other methods have a better chance of doing this, such as honeywords, password modeling techniques like PCFGs, and, ironically enough, Shannon entropy.

This e-mail has already grown too long as it is, but I’d be interested in other people’s thoughts on this subject. Am I misunderstanding the use of k-anonymity? How should we look at the security of this approach?

Cheers,
Matt