Message-ID: <20180926133532.GA22853@openwall.com>
Date: Wed, 26 Sep 2018 15:35:33 +0200
From: Solar Designer <solar@...nwall.com>
To: passwdqc-users@...ts.openwall.com
Subject: Re: pwqgen vs diceware

On Tue, Sep 25, 2018 at 10:00:43PM -0400, John Roman wrote:
> I'm certainly not here to start a flame war, but I had wondered casually
> which would be most suitable for a user generating a password: pwqgen,
> or diceware?

Of course, I prefer pwqgen, but I'm biased.

> what is the random dictionary used for pwqgen?

It's a list of 4096 English words found in the file wordset_4k.c in the
passwdqc source tree, and starting with the following comment:

 * 4096 English words for generation of easy to memorize random passphrases.
 * This list comes from the MakePass passphrase generator developed by
 * Dianelos Georgoudis <dianelos at tecapro.com>, which was announced on
 * sci.crypt on 1997/10/24. Here's a relevant excerpt from that posting:
 *
 * > The 4096 words in the word list were chosen according to the following
 * > criteria:
 * > - each word must contain between 3 and 6 characters
 * > - each word must be a common English word
 * > - each word should be clearly different from each other
 * > word, orthographically or semantically
 * >
 * > The MakePass word list has been placed in the public domain
 *
 * At least two other sci.crypt postings by Dianelos Georgoudis also state
 * that the word list is in the public domain, and so did the web page at:
 *
 * http://web.archive.org/web/%2a/http://www.tecapro.com/makepass.html
 *
 * which existed until 2006 and is available from the Wayback Machine as of
 * this writing (March 2010). Specifically, the web page said:
 *
 * > The MakePass word list has been placed in the public domain. To download
 * > a copy click here. You can use the MakePass word list for many other
 * > purposes.
 *
 * "To download a copy click here" was a link to free/makepass.lst, which is
 * currently available via the Wayback Machine:
 *
 * http://web.archive.org/web/%2a/http://www.tecapro.com/free/makepass.lst
 *
 * Even though the original description of the list stated that "each word
 * must contain between 3 and 6 characters", there were two 7-character words:
 * "England" and "Germany". For use in passwdqc, these have been replaced
 * with "erase" and "gag".
 *
 * The code in passwdqc_check.c and passwdqc_random.c makes the following
 * assumptions about this list:
 *
 * - there are exactly 4096 words;
 * - the words are of up to 6 characters long;
 * - although some words may contain capital letters, no two words differ by
 * the case of characters alone (e.g., converting the list to all-lowercase
 * would yield a list of 4096 unique words);
 * - the words contain alphabetical characters only;
 * - if an entire word on this list matches the initial substring of other
 * word(s) on the list, it is placed immediately before those words (e.g.,
 * "bake", "baker", "bakery").
 *
 * Additionally, the default minimum passphrase length of 11 characters
 * specified in passwdqc_parse.c has been chosen such that a passphrase
 * consisting of any three words from this list with two separator
 * characters will pass the minimum length check. In other words, this
 * default assumes that no word is shorter than 3 characters.

> are they similar?

No, they're not very similar. passwdqc (pwqgen) uses 4096 words of
lengths 3 to 6. Diceware uses 7776 "words" of a wider variety of
lengths, and some are not actually dictionary words (e.g., digits).
There are two major versions of Diceware - an older one, and a newer one
introduced by EFF. The EFF one is better in some ways.

As you note, passwdqc (pwqgen) typically alters the case of the first
letter of words, which results in 8192 possibilities.
Punctuation and digits between words add another 16 possibilities each
(so a word with its adjacent separator character encodes 131072
possibilities).

> as pwqgen generated phrases increase in size, so too do they increase in
> difficulty to remember. this difficulty is bolstered by the strength
> imparted by pwqgen's random inclusion of case, numerics, and specials.

passwdqc (pwqgen) chooses to alter the case of the first letter of words
or not, and to include the random separator characters or not, depending
on the amount of randomness you try to encode. For example, this
doesn't use case and separators:

$ for n in `seq 1 5`; do pwqgen random=24; done
Suez-psyche
emblem-unlike
tread-shire
afield-beetle
grace-pitch

("Suez" was already capitalized in the input wordlist, and the separator
is always a "-".)

This encodes 2 bits more by starting to use the case:

$ for n in `seq 1 5`; do pwqgen random=26; done
humane-Pump
Plunge-Orphan
expand-Creepy
tavern-lay
Loft-Dense

This encodes another 4 bits (6 bits more than the original) by also
randomizing the separator:

$ for n in `seq 1 5`; do pwqgen random=30; done
juice-Cheeky
Duly!philip
Ginger$depot
gloria9Fair
Bat3Relate

At these requested bit sizes, the alternative to using random case and
separators would have been to add a third word, which I think would have
been harder to memorize. But we may reasonably go for it when encoding
even more randomness:

$ for n in `seq 1 5`; do pwqgen random=36; done
maze-nape-really
tumble-ninety-Taiwan
tube-kin-small
Rhine-shape-shrill
regime-purge-quake

("Taiwan" and "Rhine" were that way in the wordlist.)

Again, adding 3 more bits starts to randomize the case:

$ for n in `seq 1 5`; do pwqgen random=39; done
Stench-scene-Fried
Legal-Style-bid
total-Prayer-Rid
book-Bony-Urine
Patent-Hamlet-Elder

Adding another 8 bits (11 more bits total) randomizes the separators:

$ for n in `seq 1 5`; do pwqgen random=47; done
climb3pelvic$Creek
Herb2Preach4Shoe
chose7furry8Club
flirt-cynic6ease
height4Fault2Thence

And the above is passwdqc's current default, encoding 47 bits in lengths
ranging from 11 to 20.

Unfortunately, 48 would be ambiguous in whether it'd request 4 words
without case toggling and without random separators, or addition of
another separator. So it does the latter and actually encodes 51 bits
in 12 to 21 characters, which is much shorter than adding a fourth word.

You can actually get four words starting with 52 bits, and that also
includes case toggling:

$ for n in `seq 1 5`; do pwqgen random=52; done
Better-shaft-pest-trophy
soil-him-cask-afghan
mostly-Likely-Bravo-Libya
Logic-Nasty-Gown-Lunar
marble-Vocal-baltic-code

Go for also using random separators, and you can encode 64 bits in four
words. For most current purposes, you won't need to use more than this:

$ for n in `seq 1 5`; do pwqgen random=64; done
summon-rival$Rough_And
pen5Wrist$swamp7tent
mirror2Blood7League6Candle
Deploy5Into4user3Tarzan
yeah!outset_Bench8Cobra

Note that 5 words without case toggling and without separators would
only give you 60 bits.

And the maximum you can currently use is 85 bits encoded in 5 words,
with all the case toggling and specials:

$ for n in `seq 1 5`; do pwqgen random=85; done
Slug2Index$Stony3Click=item4
Icon!dark5folly9thing6Tort_
reform2Cobalt6Senior!newark-Adjust-
Solemn=commit5uptake8Jersey*Cache$
danish-taxi&Differ5lounge8Damp6

Note that 7 words without case toggling and without separators would
only give you 84 bits.
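
The arithmetic behind these figures can be sketched as below. This is
only an illustration of the counting used in this message (12 bits per
word from the 4096-word list, 1 bit per word for case toggling, 4 bits
per random separator), not passwdqc's actual code. The names bits() and
length_range() are made up for this example.

```python
# Illustrative only: reproduces the bit counts quoted in this message.
# This is not how passwdqc is implemented; all names here are hypothetical.
WORD_BITS = 12       # log2(4096), one word from the list
CASE_BITS = 1        # first letter's case toggled or not
SEPARATOR_BITS = 4   # log2(16), one random separator character
MIN_WORD_LEN, MAX_WORD_LEN = 3, 6


def bits(words, toggle_case=False, separators=0):
    """Bits encoded by `words` words and `separators` random separators."""
    per_word = WORD_BITS + (CASE_BITS if toggle_case else 0)
    return words * per_word + separators * SEPARATOR_BITS


def length_range(words, separator_chars):
    """Shortest and longest passphrase length in characters.

    `separator_chars` counts every separator character, whether it is a
    fixed "-" or a random one.
    """
    return (words * MIN_WORD_LEN + separator_chars,
            words * MAX_WORD_LEN + separator_chars)


if __name__ == "__main__":
    print(bits(3, toggle_case=True, separators=2))  # the default, 47
    print(bits(3, toggle_case=True, separators=3))  # 51
    print(bits(5, toggle_case=True, separators=5))  # the maximum, 85
    print(bits(7))                                  # 7 plain words, 84
    print(length_range(3, 2))                       # (11, 20)
```

A fixed "-" separator adds a character to the length but no bits, which
is why it is passed to length_range() but not to bits().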

> diceware offers high entropy passphrases at a low entry cost for the
> user, but is a shorter 3 word pwqgen passphrase just as strong as a
> longer 6 word passphrase from diceware? entropically they seem
> identical.

No, and no. A 3 word pwqgen passphrase encodes 36 (no random toggling
and separators) to 51 bits (with random toggling, separators, and a
trailing character). By default, it's 47. A 6 word Diceware passphrase
encodes ~77 bits. Do you find 6 word Diceware easier to memorize and
quick enough to type? If so, go for it (or use fewer words, see below).

> pwqgen offers greater possibility of acceptance from legacy password
> systems that take fewer than 30 characters, but increases the potential
> that a character might be suspect or unsupported. Diceware in turn can
> be adulterated with a case, numeric, or special as needed, but might see
> length issues.

Right.

> pwqgen states it's capable of
> 24-85 for entropy. diceware seems to appreciate ~77 bits of entropy.

Right.

> I've been testing entropy from this page:
> http://rumkin.com/tools/password/passchk.php
>
> and here:
> https://www.rempe.us/diceware/#eff
>
> it's worth noting rumkin's calculation for entropy seems a little high...a
> 77 bit entropy phrase at diceware will yield a 200 entropy phrase, for
> example...I wonder too what the appropriate entropy calculation is?

You can't measure entropy of one password just by looking at it, nor by
having a program look at it. Entropy of a random variable is a property
of its distribution, not a property of any one value of it. Any
attempts to estimate entropy by looking at a password are thus at best
trying to make assumptions about the distribution, and those won't
perfectly match distribution of passwords in the real world, nor that of
a particular password generator.

Moreover, Shannon entropy is not an appropriate metric for password
strength (even if we could measure it accurately), except for passwords
that came from a uniform distribution.
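
To illustrate why Shannon entropy can mislead for a non-uniform
distribution, here is a small sketch. Both distributions are made up for
this example and do not model any real password set or generator. Each
is given as (probability, count) pairs, and the function names are
hypothetical.

```python
import math


def shannon_entropy(groups):
    """Shannon entropy in bits.

    `groups` is a list of (probability, count) pairs, meaning `count`
    distinct outcomes that each occur with `probability`.
    """
    return -sum(count * p * math.log2(p) for p, count in groups)


def first_guess_success(groups):
    """Chance that an attacker's single best guess is right."""
    return max(p for p, _ in groups)


# Uniform: 1024 equally likely passwords.
uniform = [(1 / 1024, 1024)]

# Skewed: one password is picked half the time, and the other half is
# spread evenly over 2**19 passwords.
skewed = [(0.5, 1), (0.5 / 2**19, 2**19)]

if __name__ == "__main__":
    print(shannon_entropy(uniform), first_guess_success(uniform))
    # 10.0 bits, first guess succeeds with probability 1/1024
    print(shannon_entropy(skewed), first_guess_success(skewed))
    # 10.5 bits, yet the first guess succeeds half the time
```

The skewed distribution has slightly higher Shannon entropy than the
uniform one, but half of its passwords fall to a single guess.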

So generated passwords are pretty much the only case where this metric
is applicable, and for those we're lucky to be able to calculate the
entropy from the generator's design and parameters - like we did above.
These are the numbers you should use if you assume that your adversary
knows (or can guess with almost no effort) that you use a password
generator and with what word list and parameters.

Also note that for most use cases going for pwqgen's maximum or
Diceware's recommended 6 words is overkill. For example, there's little
point in doing that for online services passwords if you use a unique
password per service and plan on changing the password should you become
aware that it leaked. It makes more sense to do it for data encryption,
such as on your PGP or SSH key or on your encrypted filesystem. Modern
implementations of those are switching to use of modern KDFs with high
enough parameters, which should let you use simpler passphrases (albeit
perhaps not as simple as those I'd recommend you use as unique passwords
for online services), but you'd need to consider what KDF and with what
parameters your software uses (or else use a 1-2 word longer passphrase).

Here's a relevant answer I gave a few days ago:

https://www.whonix.org/pipermail/whonix-devel/2018-September/001255.html

I hope this helps, and I'm sorry if it's more than you wanted to know.

Alexander