Message-ID: <555A56FB.8020101@gmail.com>
Date: Mon, 18 May 2015 23:17:47 +0200
From: Marek Wrzosek <marek.wrzosek@...il.com>
To: john-users@...ts.openwall.com
Subject: Re: Advise on best approach (truecrypt pw based on pdf file)

Hi

Finally I got it: two (almost) perfect filters. "Almost" because each is
built from two awk commands, and one of them also needs grep.

The first filter prints, only once, every substring that starts with a
capital letter. The first awk treats 'I' like a lowercase letter when it
looks for the start of the next substring, so the second awk goes back
over each result and also prints the substrings that start with 'I'.
This is the first filter:

env LC_COLLATE=C awk '{line=$1;while(match(line,/[A-Z][a-zI]+/)>0){print substr(line,RSTART,RLENGTH);line=substr(line,RSTART+RLENGTH);}}' |
  awk '{line=$1;print line;while(match(line,"I")>0){if(RSTART>1){line=substr(line,RSTART);print line;}else line=substr(line,2);}}'

The second filter is almost the same as the one from my previous e-mails,
but it does not stop after 12 letters. It consists of the first awk from
the filter above, followed by the second awk (with grep) from my previous
e-mail. This is the second filter:

env LC_COLLATE=C awk '{line=$1;while(match(line,/[A-Z][a-zI]+/)>0){print substr(line,RSTART,RLENGTH);line=substr(line,RSTART+RLENGTH);}}' |
  awk '{for(i=1;i<=length($1);i++){for(j=i+4;j<=length($1);j++){print substr($1,i,j-i+1);}}}' |
  grep '^[A-Z]'

You can use the unique command from JtR to get rid of repeated passwords.
I think the passwords from the first filter (after unique) could be good
as a training sequence for the Markov model (but I'm not sure).

Best Regards

On 18.05.2015 at 20:50, Demian Smith wrote:
> Hi Marek,
>
> thanks a mill for the quick reply - slowly this is feeling like I have
> taken over the john list :p
>
> I am creating the wordlist right now with your new command, if that
> fails, I will run Markov, if that fails as well, I run incremental
> forever and a day ...
>
>> I've been thinking about rewriting this awk command to search a capital
>> letter other than 'I' at the end of string to break the inner loop.
>> Maybe I'll get rid of grep and change loop type.
>
> I hope all of this will help someone else eventually as well =) and I
> can't point out enough how grateful I am for the ongoing help on this list
>
> Best regards and thank you ever so much,
> Demian
>
> ★ On 15/05/18 07:42 p.m. Marek Wrzosek wrote ★
>> Hi Demian
>>
>> There should be a space between grep and '^[A-Z]'. The ^[A-Z] regular
>> expression matches lines whose first letter is a capital.
>> I've been thinking about rewriting this awk command to search a capital
>> letter other than 'I' at the end of string to break the inner loop.
>> Maybe I'll get rid of grep and change loop type.
>>
>> Best Regards
>>
>> On 18.05.2015 at 18:32, Demian Smith wrote:
>>> Hi Marek,
>>>
>>> I tried Markov over night, but it doesn't look really good - I had
>>> trained it on a pwd file generated from the pdf (with keeping only first
>>> letters of each word), but it created mostly candidates like
>>> titttttttttttaief
>>>
>>> So, I wanted to get back to the awk version and run it on a similar file
>>> created from the pdfs in the relevant folder, alas, I get
>>> cat all5 | awk '{for (i = 1; i <= length($1);
>>> i++){for(j=i+4;j<i+12&&j<=length($1);j++){print substr($1, i,
>>> j-i+1);}}}' | grep'^[A-Z]' > all6
>>>
>>> grep^[A-Z]: command not found
>>>
>>> I don't know enough about bash programming to sort this one out and
>>> hence would come back to your advice, if you don't mind ...
>>>
>>> Thanks,
>>> Demian
>>> --
>>> 'It's no measure of mental health to be well adjusted
>>> to a profoundly sick society.'
>>>
>>> Sinéad O'Connor
>>>
>>> ★ On 15/05/18 04:57 a.m. Marek Wrzosek wrote ★
>>>> awk '{for (i = 1; i <= length($1); i++){for
>>>> (j=i+4;j<i+12&&j<=length($1);j++){print substr($1, i, j-i+1);}}}'|grep
>>>> '^[A-Z]'
>>

--
Marek Wrzosek
marek.wrzosek@...il.com

View attachment "first" of type "text/plain" (262 bytes)
View attachment "second" of type "text/plain" (239 bytes)
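
For anyone who wants to try the two filters on a small input before running
them on a real file, here is a minimal sketch. It wraps the awk programs from
the message in shell functions, reflowed and commented. The function names
(split_runs, filter_first, filter_second, dedupe) and the sample line are made
up for illustration and do not come from the thread. Two things are
assumptions of this sketch rather than part of the original commands: LC_ALL=C
is set on every stage (the message sets LC_COLLATE=C on the first awk only),
and the scan resumes at RSTART+RLENGTH, so a match that does not begin in
column 1 is not scanned twice. dedupe is only a stand-in for JtR's unique, so
the sketch runs without John the Ripper installed.

```shell
#!/bin/sh
# Sketch only: names and sample data are hypothetical, not from the thread.

# Split each input line into runs of "capital letter + (lowercase or I)".
split_runs() {
  LC_ALL=C awk '{
    line = $1
    while (match(line, /[A-Z][a-zI]+/) > 0) {
      print substr(line, RSTART, RLENGTH)
      # resume scanning right after the end of this match
      line = substr(line, RSTART + RLENGTH)
    }
  }'
}

# First filter: each run, plus every suffix of the run that starts at an "I".
filter_first() {
  split_runs | LC_ALL=C awk '{
    line = $1
    print line
    while (match(line, "I") > 0) {
      # an "I" in column 1 was already printed, so just step over it
      if (RSTART > 1) { line = substr(line, RSTART); print line } else { line = substr(line, 2) }
    }
  }'
}

# Second filter: every substring of 5 or more characters of each run,
# keeping only the ones that begin with a capital letter.
filter_second() {
  split_runs | LC_ALL=C awk '{
    for (i = 1; i <= length($1); i++) {
      for (j = i + 4; j <= length($1); j++) {
        print substr($1, i, j - i + 1)
      }
    }
  }' | LC_ALL=C grep "^[A-Z]"
}

# Stand-in for JtR unique: drop repeated lines, keep first-seen order.
dedupe() {
  awk '!seen[$0]++'
}

# Two made-up lines of "first letters of each word".
sample='TqbfaIjOld
Old'

echo "first filter:"
printf '%s\n' "$sample" | filter_first | dedupe

echo "second filter:"
printf '%s\n' "$sample" | filter_second | dedupe
```

On this sample the first filter yields the runs TqbfaIj and Old plus the
suffix Ij, and the second filter yields Tqbfa, TqbfaI and TqbfaIj (Old is
shorter than five characters, so it produces nothing there). With John the
Ripper installed, dedupe would be replaced by the unique program; as far as I
know it reads candidates on standard input and takes the output file name as
its argument, but check the usage message of your build.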