Message-ID: <a5a000c0-e357-ea6e-17a7-f131229ef6cb@matlink.fr>
Date: Wed, 12 Sep 2018 13:37:31 +0200
From: Matlink <matlink@...link.fr>
To: john-users@...ts.openwall.com
Subject: good program for sorting large wordlists

On 11/09/2018 at 17:42, Solar Designer wrote:
> Hi,
>
> On Tue, Sep 11, 2018 at 05:19:18PM +0200, JohnyKrekan wrote:
>> Hello, I would like to ask whether someone has experience with good
>> tool to sort large text files with possibilities such as gnu sort. I
>> am using it to sort wordlists but when I tried to sort 11 gb wordlist,
>> it crashed while writing final output file after writing around 7 gb
>> of data and did not delete some temp files. When I was sorting smaller
>> (2gb) wordlist it took me just about 15 minutes while this 11 gb took
>> 4.5 hours (Intel core I 7 2.6ghz, 12 gb ram, ssd drives).
>
> As to sorting, recent GNU sort from the coreutils package works well.
> You'll want to use the "-S" option to let it use more RAM, and less
> temporary files, e.g. "-S 5G". You can also use e.g. "--parallel=8".
>
> As to it running out of space for the temporary files, perhaps you have
> your /tmp on tmpfs, so in RAM+swap, and this might be too limiting. If
> so, you may use the "-T" option, e.g. "-T /home/user/tmp", to let it use
> your SSDs instead. Combine this with e.g. "-S 5G" to also use your RAM.

As Alexander said, you should use the "--parallel" option for such big
files. And yes, you'll need temporary files, and therefore a directory
that can hold huge files.

I usually sort files of tens of gigabytes, and it takes time, but rarely
more than 1 hour.

--
Matlink - Sysadmin matlink.fr
Stay protected, encrypt your emails: https://café-vie-privée.fr/
XMPP/Jabber: matlink@...link.fr
PGP public key: 0x186BB3CA
Off-the-record fingerprint: 572174BF 6983EA74 91417CA7 705ED899 DE9D05B2
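
For reference, the options mentioned in this thread ("-S", "-T" and "--parallel") can be combined in a single GNU sort invocation. The following is only a minimal sketch, assuming GNU sort from coreutils: the function name sort_wordlist, its argument order and its default values are hypothetical, and "-o" (write the result to a file) is added here for convenience and was not discussed above.

```shell
#!/bin/sh
# Sketch only: sort_wordlist, its arguments and its defaults are hypothetical.
# The flags are the ones named in the thread (-S, -T, --parallel) plus -o.

sort_wordlist() {
    input=$1          # wordlist to sort
    output=$2         # file that receives the sorted result
    tmpdir=$3         # directory on a disk with enough free space
    mem=${4:-5G}      # in-memory buffer size, passed to -S
    threads=${5:-8}   # number of sorts run concurrently, passed to --parallel

    # Make sure the temporary directory exists before sort needs it.
    mkdir -p "$tmpdir" || return 1

    sort -S "$mem" --parallel="$threads" -T "$tmpdir" -o "$output" "$input"
}
```

It might be called as: sort_wordlist wordlist.txt wordlist.sorted /home/user/tmp
The buffer size and thread count would need adjusting to the RAM and cores actually available, and the temporary directory should sit on a filesystem with enough free space to hold the temporary files.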