Message-ID: <CAJocqxP+VXC1qoR8wJfZpUmdnFzK8+GwW7FkxsHL6qJC61QHdw@mail.gmail.com>
Date: Sat, 10 Dec 2011 17:27:18 -0600
From: Wesley Tansey <tansey@...utexas.edu>
To: Per Thorsheim <per@...rsheim.net>
Cc: john-users@...ts.openwall.com
Subject: Re: Password datasets with creation rules?

Thanks Per.

>In short: even if you do find any leaks of passwords that are clearly from environments with creation policies in place (length/complexity), you won't become much wiser without lots of additional info.

Would you mind expanding on that? I'm not as interested in gaining summary statistics as I am in comparing the performance of a model on it.

I've done a pretty exhaustive search at this point, though, so I've kind of lost hope that I'll find one. The best I found was the MySpace dataset, which I believe required a non-alphabetic character. Of course, that is very noisy data that requires filtering: it was retrieved via phishing, so only 85% of the terms actually match that rule due to typos, mixups with a different password for some other site, etc. It's also a little small (35k after filtering), and the 6-7 letter passwords aren't as interesting from a cracking standpoint, so that leaves me with only about 7k. That makes it a very difficult dataset to work with.

>My presentation at Passwords^11 has some statistics based on environments where I've had almost complete control of the corporate environments.

Interesting presentation. Do you have a bibtex reference for it?

Wesley

On Sat, Dec 10, 2011 at 4:33 PM, Per Thorsheim <per@...rsheim.net> wrote:

> On Fri, 2011-12-09 at 18:21 -0600, Wesley Tansey wrote:
> > Does anyone happen to know of any decent-sized, real-world leaked/attacked
> > password datasets that are in the wild and employed password creation rules
> > such as "must contain a number" or "minimum 8 characters"? Plaintext,
> > hashed, or hashed/salted are all fine as long as I can make a guess against
> > each entry and query for its existence in the database. I'm looking for
> > full database releases, not just the cracked ones.
> >
> > All of the datasets I've found that have decent sample sizes (rockyou,
> > gawker, phpbb, battlefield heroes beta) seem to have no creation rules
> > enforced.
> >
> > Wesley
>
> I'm tempted to say "It's not that easy". Well, it's not that easy.
>
> Some of the leaks available may have had creation rules, either on
> "paper" or even technically implemented. However they may have changed
> over time, strengthened or weakened... who knows?
>
> At least to me, from pentesting corporate environments, it is very
> common to find written policies that are not technically implemented. Do
> the password cracking, and you'll find passwords that are not in
> compliance with either of the two. This could be due to lazy sysadmins, old
> & unused accounts, frequent changes in password policies etc.
>
> In short: even if you do find any leaks of passwords that are clearly
> from environments with creation policies in place (length/complexity),
> you won't become much wiser without lots of additional info.
>
> My presentation at Passwords^11 has some statistics based on
> environments where I've had almost complete control of the corporate
> environments. You can find it here:
> http://ftp.ii.uib.no/pub/passwords11/presentations/ (PDF, 1.1Mb)
>
> "The Exception" is the only environment I've ever seen where the average
> passwords were "much" longer than the minimum required (length 3, no
> complexity), see page 8. In environments where minimum length is 7+,
> you'll typically see 50% of all accounts having passwords at the minimum
> length.
>
> Pages 13 & 15, based on another data set, also show some very common
> patterns from corporate environments in areas of per-position entropy
> (total number of characters used in each position), and the most common
> password formats found in environments with Windows default complexity
> parameters (3 out of 4 character
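
The measurements discussed in this thread (filtering a noisy leak down to entries that satisfy a creation rule, the share of passwords sitting exactly at the minimum length, per-position character counts, and password "formats") can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the method used by either poster: the function names, the U/L/D/S format alphabet, the default thresholds, and the sample passwords are all made up for the example.

```python
from collections import Counter

def matches_policy(pw, min_len=8, need_non_alpha=True):
    """Return True if pw satisfies an assumed creation rule:
    at least min_len characters and, optionally, one non-alphabetic character."""
    if len(pw) < min_len:
        return False
    if need_non_alpha and pw.isalpha():
        return False
    return True

def password_format(pw):
    """Map each character to a class symbol (hypothetical alphabet):
    U = uppercase, L = lowercase, D = digit, S = anything else."""
    out = []
    for ch in pw:
        if ch.isupper():
            out.append("U")
        elif ch.islower():
            out.append("L")
        elif ch.isdigit():
            out.append("D")
        else:
            out.append("S")
    return "".join(out)

def share_at_min_length(passwords, min_len):
    """Fraction of passwords whose length equals min_len exactly."""
    if not passwords:
        return 0.0
    return sum(len(pw) == min_len for pw in passwords) / len(passwords)

def distinct_chars_per_position(passwords):
    """Number of distinct characters observed at each position."""
    seen = {}
    for pw in passwords:
        for i, ch in enumerate(pw):
            seen.setdefault(i, set()).add(ch)
    return [len(seen[i]) for i in sorted(seen)]

# Made-up sample standing in for a noisy leak.
sample = ["Summer11", "password", "Winter12", "abc", "Passw0rd!"]

# Keep only entries that match the assumed rule.
kept = [pw for pw in sample if matches_policy(pw)]

# Most common formats among the entries that survive filtering.
formats = Counter(password_format(pw) for pw in kept)

print(kept)
print(formats.most_common())
print(share_at_min_length(kept, 8))
print(distinct_chars_per_position(kept))
```

On a real leak the rule itself is usually unknown or only partly enforced, as noted above, so the `min_len` and `need_non_alpha` parameters would have to be treated as hypotheses to test rather than facts about the dataset.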