Message-ID: <20160709214614.GA31350@openwall.com>
Date: Sun, 10 Jul 2016 00:46:14 +0300
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Loading a large password hash file

On Thu, Jul 07, 2016 at 12:38:05AM -0400, Matt Weir wrote:
> More of a general question, but what should the default behavior of JtR be
> when you give it an unreasonably large password hash file to crack?

It doesn't know what's reasonable and what's not on a given system and
for a given use case, so it just keeps trying. Do you feel the default
should be different?

> For example, let's say you give it 270 million Sha1 hashes?

This isn't necessarily unreasonable. It should load those if memory
permits. I guess this is related to:
http://reusablesec.blogspot.com/2016/07/cracking-myspace-list-first-impressions.html

In that blog post, you write that after "sort -u" you had an 8 GB file,
which means about 200 million unique SHA-1 hashes. So I just generated
a fake password hash file using:

perl -e 'use Digest::SHA1 qw(sha1_hex); for ($i = 0; $i < 200000000; $i++) { print sha1_hex($i), "\n"; }'

which is 8200000000 bytes. On a machine with enough RAM, JtR loaded it
in 6 minutes, and the running "john" process uses 13 GB.
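For reference, with that one-liner's output saved to a file (the name
below is arbitrary), the run is then simply something like:

perl -e '...' > fake-sha1.txt	# the one-liner above, redirected
./john --format=raw-sha1 fake-sha1.txt	# --format avoids ambiguity with other 40-hex formats
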
I guess the loading time could be reduced by commenting out "#define
REVERSE_STEPS" in rawSHA1_fmt_plug.c and rebuilding, but I haven't tried
that. Maybe we should optimize a few things in that format to speed up
the loading.
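If anyone wants to try that quickly, a rough sketch (untested, and it
assumes the define sits on a line of its own in bleeding-jumbo's tree):

sed -i 's|^#define REVERSE_STEPS|//#define REVERSE_STEPS|' src/rawSHA1_fmt_plug.c
cd src && make -s clean && make -sj4
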
> Currently if I
> leave it running for a day or two it just hangs trying to process the file.

That's unreasonable.

> This was with bleeding-jumbo.
>
> Aka I realize the hash file was way too big. Heck the file was large enough
> I couldn't fit the whole thing in RAM on the machine I was using.

Clearly, you need more RAM, or you could probably load half that file at
a time.
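For example, with GNU split (the file and prefix names below are
arbitrary):

split -n l/2 fake-sha1.txt half-	# split into two halves without breaking lines
./john --format=raw-sha1 half-aa
./john --format=raw-sha1 half-ab

Cracked passwords from both runs accumulate in the same john.pot by
default; the downside is that every candidate password gets hashed
twice, once per half.
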
There's also the --save-memory option, which may actually speed things
up when you don't have enough RAM. But that's sub-optimal, and high
memory saving levels may hurt cracking speed a lot. They also hurt
loading time when there would have been enough RAM to load the hashes
without memory saving. I've just tried --save-memory=2 on the 200M
SHA-1 file, and it looks like it'll load in about 1 hour (instead of
6 minutes), consuming something like 11 GB. So probably not worth it in
this case.
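For reference, that test amounts to the same invocation with the option
added, e.g.:

./john --save-memory=2 --format=raw-sha1 fake-sha1.txt
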
> I'm more curious about how JtR should respond to that situation.

I think the current behavior is fine. There are many OS-specific ways
in which the memory available to a process could be limited, and indeed
the RAM vs. swap distinction is also system-specific. It'd add quite
some complexity to try and fetch and analyze that info, and to try and
guess (possibly wrongly) what the user's preference would be.
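If you do want a hard cap, it's easy enough to impose one from outside
of john; e.g., on Linux with bash, something like this limits the
address space to roughly 8 GB (the value is in KB), and john would
simply fail once an allocation is refused:

ulimit -v 8000000
./john --format=raw-sha1 fake-sha1.txt
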
Alexander