Message-ID: <20160709214614.GA31350@openwall.com>
Date: Sun, 10 Jul 2016 00:46:14 +0300
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Loading a large password hash file

On Thu, Jul 07, 2016 at 12:38:05AM -0400, Matt Weir wrote:
> More of a general question, but what should the default behavior of JtR be
> when you give it an unreasonably large password hash file to crack?

It doesn't know what's reasonable and what's not on a given system and for
a given use case, so it just keeps trying. Do you feel the default should
be different?

> For example, let's say you give it 270 million Sha1 hashes?

This isn't necessarily unreasonable. It should load those if memory
permits. I guess this is related to:

http://reusablesec.blogspot.com/2016/07/cracking-myspace-list-first-impressions.html

In that blog post, you write that after "sort -u" you had an 8 GB file,
which means about 200 million unique SHA-1 hashes. So I just generated a
fake password hash file using:

perl -e 'use Digest::SHA1 qw(sha1_hex); for ($i = 0; $i < 200000000; $i++) { print sha1_hex($i), "\n"; }'

which is 8200000000 bytes. On a machine with enough RAM, JtR loaded it in
6 minutes, and the running "john" process uses 13 GB.

I guess the loading time could be reduced by commenting out
"#define REVERSE_STEPS" in rawSHA1_fmt_plug.c and rebuilding, but I haven't
tried that. Maybe we should optimize a few things in that format to speed
up the loading.

> Currently if I
> leave it running for a day or two it just hangs trying to process the file.

That's unreasonable.

> This was with bleeding-jumbo.
>
> Aka I realize the hash file was way too big. Heck the file was large enough
> I couldn't fit the whole thing in RAM on the machine I was using.

Clearly, you need more RAM, or you could probably load half that file at a
time.

There's also the --save-memory option, which may actually speed things up
when you don't have enough RAM. But that's sub-optimal, and high memory
saving levels may hurt cracking speed a lot. They also hurt loading time
when there would have been enough RAM to load the hashes without memory
saving. I've just tried --save-memory=2 on the 200M SHA-1s file, and it
looks like it'll load in about 1 hour (instead of 6 minutes), consuming
something like 11 GB. So probably not worth it in this case.

> I'm more curious about how JtR should respond to that situation.

I think the current behavior is fine. There are many OS-specific ways in
which the memory available to a process could be limited, and indeed the
RAM vs. swap distinction is also system-specific. It'd add quite some
complexity to try and fetch and analyze that info, and to try and guess
(possibly wrongly) what the user's preference would be.

Alexander
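
For reference, the "8 GB means about 200 million unique SHA-1 hashes"
estimate above checks out, assuming the "sort -u" output is plain
LF-terminated lowercase hex, i.e. 40 hex characters plus a newline
(41 bytes) per hash:

$ echo $((8200000000 / 41))
200000000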
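
A minimal sketch of the "load half that file at a time" approach, assuming
GNU coreutils split and the jumbo raw-sha1 format; the file and session
names below are made up for illustration:

# split the hash file (hypothetical name) into two halves without breaking lines
split -n l/2 myspace-sha1.txt half.
# crack each half in its own named session
./john --format=raw-sha1 --session=half-aa half.aa
./john --format=raw-sha1 --session=half-ab half.ab

Passwords cracked in either run go into the same john.pot, so a later
"./john --show --format=raw-sha1" against the full file picks up results
from both halves.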