Message-ID: <20240815170241.GA14897@openwall.com>
Date: Thu, 15 Aug 2024 19:02:42 +0200
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: problem running out of memory - john 20240608.1

On Thu, Aug 15, 2024 at 03:32:29PM +0200, Solar Designer wrote:
> One way to approach this problem is to not split that file. Try loading
> them all at once, however don't use --fork at first. Maybe this will
> fit in memory for you. As most hashes will get cracked after a while,
> you'll be able to slowly increase the --fork count for further attacks
> (but those should be different attacks).
>
> I've just tested on Linux, and 100M of unique NTLM hashes take 11 GB RAM
> with default settings, taking about 1 minute to load. So HIBPv8 847M
> will probably fit in 128 GB RAM.

I've just tested loading the whole HIBPv8 on Linux, and it works:

time ./john --format=nt --verbosity=1 --wordlist --rules --dupe=0 --no-loader-dupe-check pp8
No dupe-checking performed when loading hashes.
Using default input encoding: UTF-8
Loaded 847223402 password hashes with no different salts (NT [MD4 128/128 AVX 4x3])

One important option I keep forgetting to use at first is
--no-loader-dupe-check. With this option, the above took 8 minutes to
load. Since it's known that HIBPv8 has no duplicate hashes, there's no
need to waste time checking for duplicates at load time. It looks like
our duplicate hash checks scale really well to 100M, but stop scaling
nearly so well at hundreds of millions.

Anyway, the above uses 46 GB RAM. It cracked 2M hashes in the first
6 seconds, 5M in 14 seconds, 50M in 3.5 minutes, 100M in 13 minutes,
130M in 25 minutes, 140M in 32 minutes, and 146M by attack completion
in 38 minutes. So even on one CPU core (in this case, of an old Xeon
E5-2670), you can eliminate lots of hashes in under an hour:

146633482g 0:00:38:09 DONE (2024-08-15 17:46) 64055g/s 2171Kp/s 2171Kc/s 1587TC/s Robyn2638..Sambarock38
Session completed.

real    46m11.607s
user    44m26.731s
sys     1m16.396s

Looks like a moderate --fork count could be used right away, but it's
hard to tell exactly which. On a 128 GB RAM machine, certainly at
least 2 would work right away, but probably more. The loaded hashes
are initially shared between the forked processes (copy-on-write), but
as more hashes get cracked the in-memory "databases" become more
diverse between the processes, which then uses more memory.
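If you want to keep an eye on how much of that sharing remains while a
forked session runs, something like the following should give a rough
total (a minimal sketch I haven't tested here; it assumes Linux with
/proc/<pid>/smaps_rollup available and the binary being named "john" -
PSS apportions the shared pages across the processes, whereas summing
plain RSS would count them once per process):

for pid in $(pgrep -x john); do
    awk '/^Pss:/ { print $2 }' /proc/$pid/smaps_rollup
done | awk '{ sum += $1 } END { printf "%.1f GiB total PSS\n", sum / 1024 / 1024 }'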
Another option you could use, but I forgot to use above, is --no-log.
I got a 7 GB john.pot and a 3 GB john.log. The latter could be avoided
if not needed, which would also speed things up a bit.

Another relevant option is --save-memory, but I don't recommend using
it in this case since it'd likely slow things down a lot per-process,
while probably allowing only for a moderately higher process count.

Now, to proceed further with the remaining hashes, you could simply
keep loading the whole file, and the checks against john.pot would
remove the already cracked hashes from further cracking. However,
those already cracked hashes may still waste memory. So you could once
in a while use --show=left to obtain the remaining hash list:

./john --format=nt --show=left pp8 > pp8-left

(This will prefix the hashes with "?:$NT$", for the username
placeholder and the hash type identifier, but that should not matter.)

However, --show=left after the above appeared to take ages, and it
wouldn't accept --no-loader-dupe-check on the command line. What
helped is setting:

NoLoaderDupeCheck = Y

in john.conf (just don't forget to revert this edit when you work on
smaller hash lists that may have duplicates). With this, it completed
in 46 minutes, using 44 GB of RAM. I must admit that's rather long.

146633482 password hashes cracked, 700589920 left

real    46m1.249s
user    37m59.769s
sys     8m1.868s

As expected, pp8-left has 847223402-146633482 = 700589920 lines.
Since this file already has the cracked hashes removed, you don't need
to use the existing john.pot when loading it for further attacks. You
can use the --pot option to specify an alternative pot file name,
which will speed up the loading.
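Putting this together, a follow-up session on the reduced list could
look something like this (untested sketch; the --fork count is only a
guess for a 128 GB machine, new.pot is just an example name for a
fresh pot file, and WORDLIST stands for whatever wordlist your next,
different attack uses):

./john --format=nt --fork=4 --pot=new.pot --no-log --no-loader-dupe-check --wordlist=WORDLIST --rules pp8-left

Alexander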