|
Message-ID: <87v9t17pfq.fsf@gmail.com> Date: Sun, 06 Oct 2019 20:29:29 +0300 From: Aleksey Cherepanov <lyosha@...nwall.com> To: john-users@...ts.openwall.com Subject: approaches to use old password as baseword for new hash matching by username/login (as in CMIYC 2019) Beware of spoilers. In "Crack Me If You Can 2019" online hash cracking competition held at DEFCON, KoreLogic made a distilled task to reflect real life scenario: attacker got old and new hashes with usernames/logins for each user. Old hashes use weak scheme, new hashes are slow and salted. But the weakness is in small change between passwords (if any at all). In this scenario, old passwords are very good basewords to crack new passwords. But the hashes are too slow to try all old passwords against all new hashes, so usernames should be used to match old and new hashes to transit findings. Working on this task, we (john-users team) got interesting findings not specific to the contest. TL;DR: ---------------------------------------------------------------------- In some cases, single mode (aka --single=<rules>) is a good choice to target attack on individual hashes one by one in a batch. I hope new options will be introduced to make it the best choice in most cases. The following tweaks are needed for CMIYC's scenario: - pack file with hashes as 'baseword:hash' or ':hash:::baseword', - other separator may be needed (e.g. --field-separator-char='\x01'), - use 'SingleRetestGuessed = N' or --single-retest-guess=N, - use 'SingleWordsPairMax = 0' and then 'JumboSingleWords = N', - for ':hash:::baseword' variant, add 'PristineGecos = Y' and then 'JumboSingleWords = N', - use --single=none to try basewords without default mutations. Simple scripting is not very bad performance wise, but has terrible UX. ---------------------------------------------------------------------- More info about the contest and its hashes may be found there: https://contest-2019.korelogic.com/stats-hashsets.html Described findings are not documentation. It is a representation of current state. Described behaviour may be changed later. ______________________________________________________________________ Findings shortly: - Sometimes new password is a small variation of old password, so it may be trivial to find new password having old password. - So let's say we cracked the fast hashes and have thousands of old passwords together with usernames. We need to try each old password against respective new hash with the same username (approximately 1:1). - We could try all old passwords against all new hashes. It may have sense when hashes are from one source. - But it is not the case in CMIYC. It would be too slow to try all old passwords as basewords for all new hashes. - We could extract new hash and respective old password to run separate attack. It should be easy with some scripting, right? Let's do a loop over username:old_password pairs, put old_password into file and pick target hashes by username. - File with username:old_password pairs was produced by other script as a part of centralized processing of team's results. - Good news: john's --users=... option allows to pick hash automatically by username or usernames. - In CMIYC, there were some users with suffixes "-a", "-b", "-c", so --users="$u,$u-a,$u-b,$u-c" was used to pick all variants. - Extreme case of 1 candidate for 1 hash in attack has noticeable overhead (e.g. when old password is the perfect match for new hash or there is 100% reliable rule to get new password from old). But it is not a problem when one needs to mutate old password in different way and try all variants. - The script got several improvements over time, so the task is a bit harder than it seems. - Also naive implementation would be suboptimal for recurring salts. In CMIYC, all salts were different. - It is hard to run it in parallel. - Terrible usability! - After all, a dispatcher was written to run as many loops as CPU threads available and to pick parts of work from a shared pool. As a bonus, it could run loops on remote machines (but pool could not be shared with other team members). It was terrible too, but it was much much better than one loop. The dispatcher showed how much we missed in CMIYC using the loop as the only approach and delaying implementation of dispatcher. - Alternatively we could pack old passwords and new hashes together as "old_password:new_hash" into one file and use single mode (--single), so john would try old password against only 1 respective new hash. - --single option may take parameter with name of rules section or with inline rules, so --single=None or --single=':' would try old password without modifications. - But there is a caveat: by default, each successful crack would be tried against all other new hashes in the file. --single-retest-guess=N on command line or 'SingleRetestGuessed = N' in config can disable it. - But some of old passwords contain ':', that is default field separator in files with hashes for john. - It is possible to inspect candidates in single mode with debugging format 'plaintext' (tag "$0$") and --verbosity=6: try to attack 'a,y:$0$p:b:c:d,x:e:f:g:h:i:j:k' with --single=None --verbosity=6. - It is possible to change the field separator with option --field-separator-char=C. - But the same field separator would be used .pot file. So it may be tricky to process results. - It is possible to specify the char in hex as \xNM, e.g. --field-separator-char='\x01'. - None of passwords in CMIYC contained '\x01'. So it was possible to use only '\x01' and ':', it would make scripts for processing quite easy. - It is possible to put old password into GECOS field (e.g. like ':new_hash:::old_password'), so the hash would be loaded but the baseword with ':' would be cut. - Leaving username in place (e.g. like 'u:h:::gecos') would produce more candidates. - Also 'PristineGecos = Y' and then 'JumboSingleWords = N' are required to reduce number of candidates. See other point. - 'PristineGecos = Y' does not solve the problem with ':'. It has other meaning. See other point. - There were not many passwords in CMIYC with ':', so it was possible to ignore them for single mode and cover by scripting approach. - Any separator chars (roughly anything that is not in set A-Za-z0-9) in username or in GECOS field gives additional candidates because john extracts words. - "a,b" in login field produces "a,b", "a,ba", "aa", "a,bb", "ab", "a", "aa,b", "b", "ba,b", "ba". - 'JumboSingleWords = Y' option in config would increase splitting (e.g. like JEdgarHoover to J Edgar Hoover). - "a,b" in GECOS field produces "a", "ab", "b", "ba". - 'PristineGecos = Y' in config would keep variants with separators as within login field. - 'PristineGecos = Y' turns on JumboSingleWords. So additional 'JumboSingleWords = N' is needed then. - 'SingleWordsPairMax = 0' reduces the number of additional candidates, but it does not disable splitting fully. Log shows that the option is increased automatically to cover more candidates than format's max. keys per crypt (mkpc). - 'SingleWordsPairMax = 0' turns on JumboSingleWords. So additional 'JumboSingleWords = N' is needed then. - mkpc is 8 for django-scrypt and plaintext formats. --mkpc=1 option does not affect the increase of SingleWordsPairMax for these formats. - I don't know any way to disable extraction of words without patching. Is there any? - Many old passwords in CMIYC had special chars, so single mode is slower than perfect. - In CMIYC, some sets of old passwords did not contain separator chars at all, so there would not be the performance penalty from single mode working on them. - Below there are rough tests with CMIYC's "putty" set (almost all passwords have '?', like "23?d?d1036?d") and bogus words without separators, with one candidate per hash (from --rules=':Az"?a"') and multiple candidates from bogus rules: - 1 candidate per hash, with separators, most are cracks: single mode with all tweaks is ~3x faster than the script, - 1 candidate per hash, with separators, no cracks: single mode with all tweaks is ~2x slower than the script, - multiple candidates per hash, with separators: single mode with all tweaks is ~11x slower than the script, - multiple candidates per hash, without separators: single mode is close to the perfect time, the script is slightly slower. - Single mode with 1 candidate per hash does not perform well with --fork= option: only one thread does all the work, others exit immediately. - Alternatively username matcher could be implemented right in john, so it would be possible to combine files of format 'username:new_hash' and of format 'username:old_password'. Maybe we'll see this feature in john. :-) ______________________________________________________________________ Some examples: Example of our script (reformatted): ---------------------------------------------------------------------- while IFS=: read -r u p; do printf '%s\n' "$p" > twl && grep -- "$u" results/uncracked/0.django-scrypt.slow-salted.target.pw && echo "$u" && ./JohnTheRipper/run/john \ --users="$u,$u-a,$u-b,$u-c" \ results/uncracked/0.*.target.pw \ --wordlist=twl \ --rules=': sq1 sw2 se3 sr4 st5 sy6 su7 si8 so9 sp0'; done < results/pair_user_crack/14.raw-md5.fast-nosalt.log2.txt ---------------------------------------------------------------------- So we read user:password pairs of cracked hint hashes, save password into temporary wordlist, pick users from the whole target file with --users="$u,$u-a,$u-b,$u-c". I added grep for speed. This version was written after the contest. One more improvement is possible: redirect grep into temporary file to be used by john instead of the full file. The script was supposed to be adapted for every pack of hints separately. It was a flexible approach. But it was hard to manage. The script was hard to run in parallel. But it had bearable performance for packs of hint hashes with one 100% good rule as the following packs: - loga3: --rules=':R' - putty: --mask="?w??a" (or --rules=':Az"?a"') - Alaska: md5($p) - Log2: --rules=': sq1 sw2 se3 sr4 st5 sy6 su7 si8 so9 sp0' But JBJ pack had a set of different rules to convert passwords from hint to passwords for target. It was the point when it should be obvious that performance mattered. Introspecting into generation of additional candidates in single mode: ---------------------------------------------------------------------- $ echo 'a:$0$p:b:c:d:e:f:g:h:i:j:k' > plaintext.pw $ ./JohnTheRipper/run/john plaintext.pw --verbosity=6 --single=: [...] set_key(a, 0) set_key(ad, 1) set_key(d, 2) set_key(da, 3) [...] $ echo 'abc123:$0$p' > plaintext.pw $ ./JohnTheRipper/run/john plaintext.pw --verbosity=6 --single=: [...] set_key(abc123, 0) [...] $ echo 'abc,123:$0$p' > plaintext.pw $ ./JohnTheRipper/run/john plaintext.pw --verbosity=6 --single=: [...] set_key(abc,123, 0) set_key(abc,123abc, 1) set_key(aabc, 2) set_key(abc,123123, 3) set_key(a123, 4) set_key(abc, 5) set_key(abcabc,123, 6) set_key(aabc,123, 7) Almost done: Processing the remaining buffered candidate passwords, if any. set_key(abc123, 0) set_key(123, 1) [...] ---------------------------------------------------------------------- Effect of 'SingleWordsPairMax = 0': ---------------------------------------------------------------------- $ echo 'a,b,c,d:$0$p' > plaintext.pw $ ./JohnTheRipper/run/john plaintext.pw --verbosity=6 --single=: 2>&1 | grep -c 'set_key(' 26 $ echo 'a,b,c,d:$0$p' > plaintext.pw $ printf '.include <john.conf>\n[Options]\nSingleWordsPairMax = 0\n' > t.conf $ cat t.conf .include <john.conf> [Options] SingleWordsPairMax = 0 $ ./JohnTheRipper/run/john plaintext.pw --verbosity=6 --single=: --config=t.conf --log-stderr [...] 0:00:00:00 - SingleWordsPairMax increased to 3 for high KPC (8) 0:00:00:00 - SingleWordsPairMax used is 3 [...] $ ./JohnTheRipper/run/john plaintext.pw --verbosity=6 --single=: --config=t.conf 2>&1 | grep -c 'set_key(' 15 ---------------------------------------------------------------------- Let's measure speeds. single_9_nocracked.pw was made from "putty" set and target hashes (with 180 hashes replaced by new variants; old variants were not included). The combination is not accurate, so I'll omit it. Script is modified to work with this one file. ---------------------------------------------------------------------- $ wc -l single_9_nocracked.pw 1751 single_9_nocracked.pw $ head -n 1 single_9_nocracked.pw ?d?d3?d?d3966:scrypt$J2yu/dctwknD$15$8$1$64$Tbtp3zmykGFe/rrg70yQbffw/ssrM1wEsS53zI40QN6n3evK4tENbechTGTdWRhb5/S/lOH+YwTcF+XYfAhrDQ== $ rm s9.pot ; time ./JohnTheRipper/run/john --no-log single_9_nocracked.pw --single=':Az"?a"' --pot=s9.pot [...] (It stalls quickly due to retests) $ rm s9.pot ; time ./JohnTheRipper/run/john --no-log single_9_nocracked.pw --single=':Az"?a"' --pot=s9.pot --single-retest-guess=N [...] 1571g 0:00:04:06 DONE (2019-10-06 16:13) 6.370g/s 19.53p/s 20.26c/s 20.26C/s dd39?a $ printf '.include <john.conf>\n[Options]\nSingleWordsPairMax = 0\nJumboSingleWords = N\n' > swpm0.conf $ cat swpm0.conf .include <john.conf> [Options] SingleWordsPairMax = 0 JumboSingleWords = N $ rm s9.pot ; time ./JohnTheRipper/run/john --no-log single_9_nocracked.pw --single=':Az"?a"' --pot=s9.pot --single-retest-guess=N --config=swpm0.conf [...] 1571g 0:00:03:09 DONE (2019-10-06 16:17) 8.279g/s 19.33p/s 20.27c/s 20.27C/s 10?a $ rm s9.pot ; time sh -c 'while IFS=: read p h; do echo "$p" > twl && echo "$h" > t.pw && ./JohnTheRipper/run/john --no-log t.pw --wordlist=twl --rules=":Az~?a~" --pot=s9.pot --format=django-scrypt; done < single_9_nocracked.pw 2>/dev/null 1>/dev/null'; wc -l s9.pot real 9m56.385s user 4m46.458s sys 2m13.355s 1571 s9.pot ---------------------------------------------------------------------- Rough estimate would be 88s to try 1 candidate per hash for 1751 hashes with unique salts at 20 c/s. All results together: - rough estimation: 1m28s - single, no retest: 4m06s - + swpm0.conf: 2m58s (the crack is the first in bunch) - with :Az"no" rule: 16m08s (all additional candidates are tried) - modified script: 9m56s Let's pick 20 hashes and try 100x more rules (':Az~[0-9][0-9]~'): - rough estimation: 1m40s - single, no retest: 28m12s - + swpm0.conf: 18m29s - modified script: 1m44s Let's replace basewords with words without delimiters: ---------------------------------------------------------------------- $ head -n 20 single_9_nocracked.pw | perl -C0 -pe 's/^[^:]*:/abc$.qwe:/' > t.pw ---------------------------------------------------------------------- So 20 hashes with 100 rules, basewords don't contain separators: - rough estimation: 1m40s - single, no retest: 1m40s - + swpm0.conf: 1m40s - modified script: 1m44s Stats about separators against our cracks including cracks found after the contest (# with separators, total #, file name): ---------------------------------------------------------------------- $ for f in results/cracked/[1-9]* ; do \ printf '%5d %5d %s\n' \ "$(grep -c '[^A-Za-z0-9]' "$f")" \ "$(wc -l < "$f")" \ "$(basename $f)"; \ done (w/sep total file name) 612 1508 10.raw-sha1.fast-nosalt.alaska.txt 4 21 11.nt.fast-nosalt.alaska.txt 147 957 12.nt.fast-nosalt.bluemangroup.txt 200 932 13.raw-sha1.fast-nosalt.coredump.txt 0 1001 14.raw-md5.fast-nosalt.log2.txt 168 758 1.raw-md5.fast-nosalt.seeessseeessteedee.txt 1418 3770 2.nt.fast-nosalt.buckbuckbuck.txt 0 1001 3.raw-md5.fast-nosalt.log1.txt 0 1317 4.raw-md5.fast-nosalt.speak.txt 743 1801 5.raw-sha1.fast-nosalt.s8sux.txt 3655 9971 6.nt.fast-nosalt.jbj.txt 0 1001 7.raw-md5.fast-nosalt.loga3.txt 8186 9868 8.raw-md5.fast-nosalt.wolf.txt 2305 2308 9.mysql-sha1.fast-nosalt.putty.txt ---------------------------------------------------------------------- Thanks! -- Regards, Aleksey Cherepanov
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.