john-users - team john-users write-up for "Crack Me If You Can 2018" contest at DEFCON

Message-ID: <855zzv3n86.fsf@gmail.com>
Date: Mon, 27 Aug 2018 17:20:25 +0300
From: Aleksey Cherepanov <lyosha@...nwall.com>
To: john-users@...ts.openwall.com
Subject: team john-users write-up for "Crack Me If You Can 2018" contest at DEFCON

As announced, we competed in the Pro category of the Crack Me If You
Can 2018 hash cracking competition held at the DEFCON conference. We
are grateful to KoreLogic, who organized this exciting contest. The
contest allowed us to test new code in John the Ripper under extreme
conditions and learn more about the art of cracking. We would also like
to thank the other teams for an impressive tug of war. You make us
better! Thank you!

Contest Website [1]:
[1] https://contest-2018.korelogic.com/


    Team Members

Aleksey Cherepanov
Andreas Egeberg
Luis Rocha
Matt Weir
rofl0r
soxrok2212
trebla

We had 7 active members. Only 4 members were able to dedicate 2 full
days to the contest.


    Hardware Used

CPUs:       ~164 cores / ~328 threads.
CPUs, peak: ~492 cores / ~984 threads (only last 3.5 hours).

GPUs: ~10.

FPGAs:        8 chips / 2 ZTEX 1.15y boards.
FPGAs, peak: 24 chips / 6 ZTEX 1.15y boards (only last 14 hours).

Peak speed: 44-45k c/s on bcrypt (cost 10):
- 22k c/s by 6 ZTEX boards,
- 20k c/s by 396 CPU cores in 13 spot instances in Amazon EC2,
- 1500-3000 c/s by part of other hardware including CPUs and GPUs,
and some more hardware was working on other formats.


    Software Used

- John the Ripper bleeding-jumbo [2]
- hashcat [3]
- hashcat utils [4]
- PACK [5]
- chasm [6]
- Probabilistic Context Free Grammar (PCFG) trainer [7]
- custom scripts written for this contest [8]
- auto chmod [9]
- non-specific software like ssh, irc, mua, text editors, web-browsers

I'd like to give credit to some of the awesome developers of John the
Ripper whose work we used a lot in this contest:

- Denis Burykin aka Apingis, who implemented the bcrypt-ztex format
  that rips bcrypt hashes at an incredible rate comparable to hundreds
  of CPU cores.

- magnum, who improves john constantly and made john very pleasant to
  use.

- Claudio André, who prepared binary packages [10] right before the
  start.

- Sayantan Datta, who implemented the "fast hashes on GPU" idea in GSoC
  with incredible quality that has impressed me for 3 years in a row.

In previous contests, we had slow starts due to very careful handling
of hashes: we prepared canonical forms of all hashes before
submissions. But in this contest, every hash was a race. So Aleksey
dropped all previous scripts and wrote his own from scratch. A
continuous uploader (pot_sender.py and pot_receiver.py) was written an
hour before the start, and a system for submissions (simple.py and
submit1.sh) was written after the uploader with simplicity as the main
priority. It passed lines from .pot files to the organizers, skipping
duplicates. A slightly more complicated handler for cracks (results.py)
was written a bit later and contained inconvenient bugs (e.g. the
cumulative john.pot was incomplete)... The scripts with all the bugs
are available at [8] for your amusement (not for production use). ;-)
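
The "pass lines from .pot files, avoiding dupes" idea can be sketched
in a few lines. This is only an illustration and not the actual
simple.py from [8]; the function name and the sample .pot lines below
are made up for the example.

```python
def new_pot_lines(lines, seen):
    """Yield .pot lines that were not submitted before.

    `seen` is a set shared between calls; every yielded line is
    recorded in it, so a later batch skips what was already sent.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line in seen:
            continue
        seen.add(line)
        yield line


if __name__ == "__main__":
    seen = set()
    # Hypothetical "hash:plaintext" lines, not real contest data.
    first_batch = ["hashA:password1\n", "hashB:letmein\n"]
    second_batch = ["hashB:letmein\n", "hashC:Summer2018\n"]
    print(list(new_pot_lines(first_batch, seen)))
    print(list(new_pot_lines(second_batch, seen)))
```

Keeping the whole set of sent lines in memory is the simplest choice
and is fine for a contest-sized .pot file.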

[2] https://github.com/magnumripper/JohnTheRipper
[3] https://hashcat.net/hashcat/
[4] https://github.com/hashcat/hashcat-utils
[5] https://github.com/iphelix/pack
[6] https://github.com/Cynosureprime/chasm
[7] https://github.com/lakiw/pcfg_cracker
[8] https://github.com/AlekseyCherepanov/lq_CMIYC2018
[9] https://github.com/AlekseyCherepanov/contest-tools
[10] https://github.com/claudioandre-br/packages/tree/master/john-the-ripper


    The first cracked bcrypt

Luis Rocha cracked a bcrypt hash before any other team did. He started
with wide mask-mode attacks on GPU against raw-md5 to explore the
landscape quickly. This revealed a pattern: two characters followed by
"Pass". So he dispatched --mask=?l?lPass on 36 cores against bcrypt as
soon as possible. The result was the first bcrypt in the competition.
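
To show how small that keyspace is, here is a sketch that enumerates
what the mask ?l?lPass covers (?l being a lowercase letter). The
function name is made up, and the order of candidates is not
necessarily the order john uses.

```python
import itertools
import string


def two_lower_plus_suffix(suffix="Pass"):
    """Yield every candidate of the form <lower><lower><suffix>."""
    for a, b in itertools.product(string.ascii_lowercase, repeat=2):
        yield a + b + suffix


if __name__ == "__main__":
    candidates = list(two_lower_plus_suffix())
    # 26 * 26 candidates in total
    print(len(candidates), candidates[:3], candidates[-1])
```

With so few candidates, even a slow hash like bcrypt can be attacked
across many salts.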


    The list of cracked hashes

In this contest, a list of all hashes that had already been cracked was
provided to all teams, with timestamps. These hashes gave only 1/4 of
the normal points, but, given the high number of salts, they were
easier to crack.

Each salted hash type was represented by 300k hashes with unique
salts. So one needed to try 300k candidate+salt combinations just to
check 1 password against all the hashes. But once a team cracked some
hashes, all other teams knew that these hashes were easier than the
others and, in the case of slow hashes like bcrypt and md5crypt,
probably came from a single pattern.
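
A rough calculation of what 300k salts mean for a slow hash, using the
peak bcrypt speed from the hardware section. The 1000-hash subset is
just an illustrative size, and the names are made up for the example.

```python
def seconds_per_candidate(num_salts, hashes_per_second):
    """Time to test one candidate password against every salt."""
    return num_salts / hashes_per_second


ALL_SALTS = 300_000          # hashes with unique salts per type
PEAK_BCRYPT_SPEED = 44_000   # c/s, lower end of our peak figure

if __name__ == "__main__":
    full = seconds_per_candidate(ALL_SALTS, PEAK_BCRYPT_SPEED)
    subset = seconds_per_candidate(1_000, PEAK_BCRYPT_SPEED)
    print(f"all salts:   {full:.1f} s per candidate")
    print(f"1000 hashes: {subset * 1000:.1f} ms per candidate")
```

Even at peak speed, a single candidate costs several seconds against
all salts, which is why a known-crackable subset is so valuable.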

rofl0r used this property to crack several hundred bcrypt hashes at
the end of the first day with little computing power.


    The good pattern

An attack on a subset of hashes is a good trick for salted formats.
But it needs a fairly big pattern that covers a noticeable fraction of
hashes.

We found such a pattern: combinations of names of members of other
teams. We repeated the scraping of write-ups several times and got a
fairly big list. But the list was dirty and/or incomplete.

To improve it, Aleksey wrote scripts to extract parts automatically
(parts.py to find popular pairs and subs.py to find just popular
substrings [8]). The resulting lists were too big and dirty, but they
were used with hybrid mask mode to find more parts by brute-forcing
the beginning and the end of passwords. This brought new names.
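
The idea of counting popular substrings can be sketched as below.
This is an illustration only, not the actual subs.py from [8]; the
function name, the length limits and the sample passwords (glued
together from names in this write-up) are assumptions.

```python
from collections import Counter


def popular_substrings(passwords, min_len=4, max_len=5, min_count=2):
    """Count in how many passwords each substring occurs.

    Returns {substring: count} for substrings of min_len..max_len
    characters that occur in at least min_count passwords.
    """
    counts = Counter()
    for pw in passwords:
        # A set, so a substring counts once per password.
        subs = {pw[i:i + n]
                for n in range(min_len, max_len + 1)
                for i in range(len(pw) - n + 1)}
        counts.update(subs)
    return {s: c for s, c in counts.items() if c >= min_count}


if __name__ == "__main__":
    # Made-up sample cracks, not real contest data.
    sample = ["atomradix", "radixhops", "hopsatom"]
    for sub, count in sorted(popular_substrings(sample).items()):
        print(count, sub)
```

On this sample, the real parts ("atom", "hops", "radix") come out
together with fragments like "radi" and "adix", which shows on a
small scale why such lists end up big and dirty.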

Meanwhile, incremental mode revealed combinations of three 3-letter
words like "werrayLOL" and "wowbbbLOL".

So Aleksey wrote a script (check-parts.py [8]) whose only purpose was
to find combinations of words. The idea of fully automatic extraction
was postponed, so the script showed statistics for a manually prepared
wordlist of parts. After some work, we got a fairly big list of words
that could be combined. Then a list of names was extracted.
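
One way to get such statistics is to try to split every cracked
password into known parts and count which parts get used. The sketch
below does that; it is not the actual check-parts.py from [8], and the
function names and sample passwords are made up.

```python
from collections import Counter


def split_into_parts(password, parts):
    """Return one split of `password` into `parts`, or None.

    best[i] holds a list of parts that concatenate to password[:i].
    """
    best = {0: []}
    for end in range(1, len(password) + 1):
        for part in parts:
            start = end - len(part)
            if part and start in best and password.startswith(part, start):
                best[end] = best[start] + [part]
                break
    return best.get(len(password))


def part_usage(passwords, parts):
    """Count used parts; also return passwords that do not split."""
    usage, unexplained = Counter(), []
    for pw in passwords:
        split = split_into_parts(pw, parts)
        if split is None:
            unexplained.append(pw)
        else:
            usage.update(split)
    return usage, unexplained


if __name__ == "__main__":
    parts = ["atom", "blaz", "blazer", "2018", "Defcon"]
    cracks = ["atomblazer", "Defcon2018", "werrayLOL"]  # made-up sample
    usage, unexplained = part_usage(cracks, parts)
    print(usage.most_common())
    print("not covered:", unexplained)
```

Passwords that do not split are the interesting ones for a human:
they either belong to another pattern or contain a part that is still
missing from the wordlist.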

The list:
K9
User
tony
Rolf
Amd
radix
T0XlC
Minga
Milzo
2018
hops
cvsi
atom
blaz
Waffle
Crack
2018!
xmisery
philsmd
usasoft
legion
espira
Defcon
dEFCON
DEFCON
blazer
2018!!
Hydraze
Xanadrel
NullMode
chancas
EvilMog
CrackMe
m3g9tr0n
winxp5421
BlandyUK
Dropdead
rurapenthe
unix-ninja
CrackMeIf
purehate
LasVegas
kontrast23
dakykilla

The list was sorted by frequency of use in the cracks we had at the
moment of its creation. After the contest, we learned that we had
missed at least "Jimbas" and "s3in!c".

This pattern covered 3% of our raw-md5. That was enough for it to show
up on a subset: 1000 hashes should give ~30 cracks.

So Aleksey combined the parts using PRINCE mode (the --prince= option
in john), tried the candidates on a random subset of 1000 uncracked
bcrypt hashes, and got ~30 cracks with the first 20% (~15k
candidates). It took 20 minutes on 4 ztex boards.
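
The numbers above fit together, as a quick check shows. The sketch
assumes that the 3% coverage seen on raw-md5 carries over to bcrypt
and that every candidate is tested against every hash of the subset;
the function names are made up.

```python
def expected_cracks(subset_size, coverage):
    """Expected number of cracks if `coverage` of hashes match."""
    return subset_size * coverage


def implied_speed(candidates, subset_size, seconds):
    """Candidate+salt combinations per second for a finished run."""
    return candidates * subset_size / seconds


if __name__ == "__main__":
    print(f"expected cracks: ~{expected_cracks(1000, 0.03):.0f}")
    speed = implied_speed(15_000, 1000, 20 * 60)
    print(f"implied speed: ~{speed:.0f} c/s on 4 boards")
```

The implied speed of roughly 12.5k c/s is in line with the 22k c/s we
measured on 6 boards.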

trebla found that passwords for bcrypt hashes started with the letters
h-m or H-M (or with punctuation/specials). So the combinations were
generated and saved into a file (with --stdout) and then filtered with
`grep`. Then the first 15872 candidates were chosen for the work (the
number was based on the initial 20%, which was no longer meaningful
after filtering, and it was aligned for buffering of ztex boards).
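
The filtering step was done with `grep`; the same logic in Python
could look like the sketch below. The exact set of
"punctuation/specials" is an assumption here (Python's
string.punctuation), and the function name and sample candidates are
made up.

```python
import string

# First characters allowed for bcrypt candidates: h-m, H-M, or
# punctuation (assumed to be ASCII punctuation for this sketch).
ALLOWED_FIRST = set("hijklm") | set("HIJKLM") | set(string.punctuation)


def filter_candidates(candidates, limit=15872):
    """Keep candidates with an allowed first char, up to `limit`."""
    kept = []
    for word in candidates:
        if word and word[0] in ALLOWED_FIRST:
            kept.append(word)
            if len(kept) == limit:
                break
    return kept


if __name__ == "__main__":
    # Made-up candidates built from parts of the list above.
    sample = ["atomhops", "hopsatom", "Minga2018", "!2018", "Rolftony"]
    print(filter_candidates(sample))
```

The order of the input is preserved, so the most probable candidates
produced by PRINCE mode stay at the top after filtering.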

The resulting wordlist was as efficient as expected: for every subset
of 1000 hashes, it gave 20-40 cracks. We did not have enough power to
try the whole pattern, but we could still become #1 by applying it to a
fraction of bcrypt hashes. We had 10 hours left until the end.

A simplistic system (client_run.sh and get_part1.sh [8]) for
distribution of workload was written, so this pattern against bcrypt
was running on 20 computers with 3 GPUs and 2 sets of ztex boards, but
only during the last 3.5 hours of the contest. Unfortunately, writing
the scripts took more time than expected. It also distracted us from
the list of cracked hashes.

The cracking was finished after the contest and gave 8k additional
cracks, i.e. more than half of the wordlist entries were actual
passwords.

Possible improvement 0 (obvious): prepare for the contest better and
earlier: write scripts before the contest and set up access to all
hardware beforehand, to save contest time for actual cracking.

Possible improvement 1: distribution could have been done manually
without scripts at all, because the size of a subset determined the
cracking time, so a convenient size could have been chosen to finish
exactly at the end. It would have saved the time spent on coding.

Possible improvement 2: more attention should have been paid to the
list of cracked hashes. There were 3300+ hashes that could be cracked
with the wordlist. That would have been 825M points in 1-2 hours. It
would also have removed 20% of the wordlist, speeding up the further
cracking.


    Full disclosure: breaking a new rule

One of our members brought in a coworker for 2 hours of cracking. The
new member cracked 14 md5crypts. Most probably, this violated the
following rule:

"Professional teams roster of member must be FIRM before the start of
the contest. There is NO trading of plain-texts between teams."

While we competed in the Pro category, we had casual players in
different years. It is cool to be able to involve people in cracking
as it happens. OTOH, a complete stranger should be added beforehand to
avoid a rush and reduce the possibility of mistakes.


Thanks for reading!

-- 
Regards,
Aleksey Cherepanov