Message-ID: <87B97CE45CFD4B96B9F0F9587764CEB1@apple9d23c8f76>
Date: Wed, 12 Sep 2018 14:07:04 +0200
From: "JohnyKrekan" <krekan@...nykrekan.com>
To: <john-users@...ts.openwall.com>
Subject: Re: good program for sorting large wordlists

My question now is not about sorting but about which wordlist, already
saved on disk, you would use for hash testing: the smaller one (all words
lowercase) or the bigger one (mixed case). Is it better to let the program
(for example, EWSA) make the case modifications, or to use the bigger
wordlist and disable all the case-modifying rules?

Johny Krekan

----- Original Message -----
From: "Solar Designer" <solar@...nwall.com>
To: <john-users@...ts.openwall.com>
Sent: Wednesday, September 12, 2018 1:49 PM
Subject: Re: [john-users] good program for sorting large wordlists

> On Wed, Sep 12, 2018 at 01:14:22PM +0200, JohnyKrekan wrote:
>> Thanks for the info. After I raised the memory sizes and the space for
>> temp, the sort went well. I was sorting it to find out how many
>> duplicates (when ignoring the character case) are in the superwpa
>> wordlist. The original file size was approx. 10.7 GB, after sorting it
>> was 7.05 GB, so roughly 4 GB was taken by the same words with modified
>> character case.
>
> It's a case where you don't need to sort. You could use:
>
> ./unique -v output.lst < input.lst
>
> or e.g.:
>
> tr 'A-Z' 'a-z' < input.lst | ./unique -v output.lst
>
> Testing this on JtR's bundled password.lst:
>
> $ tr 'A-Z' 'a-z' < password.lst | ./unique output.lst
> Total lines read 3559 Unique lines written 3422
>
> If you're interested in sizes in bytes as well, use "ls -l" or "wc -c"
> on the two files.
>
> For tiny wordlists like password.lst, "sort -u" is more convenient in
> that it can output to a pipe, so you can do:
>
> $ tr 'A-Z' 'a-z' < password.lst | sort -u | wc -l
> 3422
>
> But for large wordlists "sort" may be slower, even with the "-S" and
> "--parallel" options.
>
> Alexander
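
To make the comparison between the mixed-case and the lowercased wordlist concrete, here is a minimal sketch of the case-folding and deduplication step quoted above, run on a tiny made-up wordlist. The file names sample.lst and lower.lst and the six sample words are assumptions for illustration only. The sketch uses awk '!seen[$0]++' as a stand-in for JtR's ./unique so that it runs without a JtR build; it is not the tool the thread recommends, and it keeps every distinct line in memory, so it may not scale to multi-gigabyte lists.

```shell
#!/bin/sh
# Illustrative only: sample.lst and lower.lst are hypothetical file names,
# and the six words below are made-up sample data.
workdir=$(mktemp -d)
printf '%s\n' Password password PASSWORD letmein LetMeIn dragon > "$workdir/sample.lst"

# Fold ASCII upper case to lower case, then keep only the first occurrence
# of each line. awk '!seen[$0]++' stands in for JtR's ./unique here.
tr 'A-Z' 'a-z' < "$workdir/sample.lst" | awk '!seen[$0]++' > "$workdir/lower.lst"

# Count lines before and after (awk avoids the padding some wc builds add).
total=$(awk 'END { print NR }' "$workdir/sample.lst")
unique=$(awk 'END { print NR }' "$workdir/lower.lst")

echo "Total lines read $total Unique lines written $unique"
echo "Case-only duplicates: $((total - unique))"

# Sizes in bytes of the two files, as suggested in the quoted reply.
wc -c "$workdir/sample.lst" "$workdir/lower.lst"
```

On this sample, six lines go in and three distinct lowercase words come out, so three lines differed from an earlier one only by character case. The same two counts, taken on a real wordlist, show how much of the larger file is case variation that case-modifying rules could regenerate.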