john-users - Re: New 'mode' in JtR external

Message-ID: <56DF0281.4020701@openwall.net>
Date: Tue, 8 Mar 2016 10:49:05 -0600
From: jfoug <jfoug@...nwall.net>
To: john-users@...ts.openwall.com
Subject: Re: New 'mode' in JtR external

There is a new hybrid-external mode added to bleeding-jumbo.

There are 2 new external functions added:    next() and new()
There are 2 new global variables added:      hybrid_total and hybrid_resume

The functions work like this:

new() is called at each new word generated by other means (mask,
incremental, wordlist, etc).
   - When new() is called, the global array word[] will be set up with the
     'base' word.
   - new() should set the hybrid_total global to the total number of words
     that will be generated from word[], if that can be easily computed.
   - new() should do any other saving of data into the script's global
     variables which will be needed to later create the words.
   - new() may not be called after a restore().  See the explanation under
     restore().

next() is called repeatedly within a cracker loop.
   - next() should transform the global data and return (in the word[]
     array) the next iteration of the base word with whatever modifications
     the script is supposed to do.
   - When next() is called and there are NO more values to return, word[0]
     should be set to 0. This is the indication that this word has been
     completely processed.

filter() is called after next(), and before the word is used. filter() can
itself 'modify' the word, or say to NOT use this word.
   - filter() is an optional function. If it is not present, then all words
     are used.
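
The new()/next()/filter() sequence above can be sketched in plain C.  This
is only an illustration under stated assumptions, not the example script
shipped in john.conf: ext_new(), ext_next() and ext_filter() are stand-in
names for the script's new(), next() and filter(); word[] and hybrid_total
stand in for the globals john provides; run_base_word() is a hypothetical
driver imitating the cracker loop.  The script appends one digit (0-9) to
the base word, and the filter arbitrarily rejects candidates ending in '0'
by zeroing word[0].

```c
#include <assert.h>
#include <string.h>

#define MAX_LEN 125

/* Stand-ins for globals that john provides to an external script. */
int word[MAX_LEN + 1];
int hybrid_total;

/* Script-private state saved by new() for later use by next(). */
static int base[MAX_LEN + 1];
static int base_len;
static int digit;               /* next digit to append, 0..9 */

/* new(): word[] holds the base word; save it and report the total. */
void ext_new(void)
{
    base_len = 0;
    while (base_len < MAX_LEN - 1 && word[base_len]) {
        base[base_len] = word[base_len];
        base_len++;
    }
    digit = 0;
    hybrid_total = 10;          /* ten candidates per base word */
}

/* next(): put the next candidate in word[], or set word[0] = 0 when done. */
void ext_next(void)
{
    int i;

    if (digit >= hybrid_total) {
        word[0] = 0;            /* base word completely processed */
        return;
    }
    for (i = 0; i < base_len; i++)
        word[i] = base[i];
    word[base_len] = '0' + digit++;
    word[base_len + 1] = 0;
}

/* filter(): optional; here it rejects candidates that end in '0'. */
void ext_filter(void)
{
    int len = 0;

    while (word[len])
        len++;
    if (len && word[len - 1] == '0')
        word[0] = 0;            /* tell the caller NOT to use this word */
}

/* Hypothetical driver imitating the cracker loop for one base word.
   Copies accepted candidates into out[] and returns how many there were. */
int run_base_word(const char *base_word, char out[][MAX_LEN + 1], int max_out)
{
    int i, n = 0;

    assert(strlen(base_word) < MAX_LEN);
    for (i = 0; base_word[i]; i++)
        word[i] = (unsigned char)base_word[i];
    word[i] = 0;

    ext_new();
    for (;;) {
        ext_next();
        if (!word[0])
            break;              /* next() has no more candidates */
        ext_filter();
        if (!word[0])
            continue;           /* filter() rejected this candidate */
        if (n < max_out) {
            for (i = 0; word[i]; i++)
                out[n][i] = (char)word[i];
            out[n][i] = 0;
            n++;
        }
    }
    return n;
}
```

With the base word "pass", this driver collects pass1 through pass9; pass0
is generated by ext_next() but dropped by ext_filter().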

restore() should do whatever work is required to put the script's global
data into a state ready to generate the word just after where we left off.
   - The word[] array is the original base word.
   - hybrid_resume is the number of times next() was called on this word.
     When the next call to next() is done, the script should return the
     correct value, and each subsequent call to next() should also return
     the correct data, i.e. the same list as if we had not saved and
     restored the session.
   - restore() must set the global variable hybrid_total to the correct
     total count of items which the script would process for this base-word.
     NOTE, this is the total count, NOT the number of words still left to
     be output.
   - The cracker will compare hybrid_total with what it was told originally
     when new() was called on the original run (before the restore).  If
     this count matches, then the new() function will NOT be called (since
     the script state should already be set up properly).
   - If this count does not match, then the cracker will resume by doing
     a call to new() and then hybrid_resume number of calls to next() within
     the script, but doing nothing with the data. This will allow the
     cracker to 'brute force' restore.
   - So, if the script is some function which really is NOT easy to restore
     to some random location within the word stream, then all the script
     writer needs to do is have an empty restore() function, and within the
     new() function, set the global hybrid_total to 0, telling the cracker
     that we have no easy way to restore to a random spot.  When the cracker
     sees this, it simply does the brute force restore method.
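
The two resume paths described above can be modelled as follows.  This is a
sketch of the behaviour as described in this message, not a copy of john's
cracker code: cracker_resume() and first_after_resume() are hypothetical
helpers, ext_new()/ext_next()/ext_restore() are stand-in names for the
script's new(), next() and restore(), and the can_seek switch exists only
so that one listing can show both a script that restores directly and one
with an empty restore() that reports hybrid_total = 0.

```c
#include <assert.h>
#include <string.h>

#define MAX_LEN  125
#define VARIANTS 10

/* Stand-ins for globals that john provides to an external script. */
int word[MAX_LEN + 1];
int hybrid_total;
int hybrid_resume;

static int base[MAX_LEN + 1];
static int base_len;
static int digit;
static int can_seek = 1;   /* demo switch: 0 imitates a "hard to restore" script */

static void save_base(void)
{
    base_len = 0;
    while (base_len < MAX_LEN - 1 && word[base_len]) {
        base[base_len] = word[base_len];
        base_len++;
    }
}

void ext_new(void)
{
    save_base();
    digit = 0;
    hybrid_total = can_seek ? VARIANTS : 0;   /* 0: no easy way to restore */
}

void ext_next(void)
{
    int i;

    if (digit >= VARIANTS) {
        word[0] = 0;
        return;
    }
    for (i = 0; i < base_len; i++)
        word[i] = base[i];
    word[base_len] = '0' + digit++;
    word[base_len + 1] = 0;
}

void ext_restore(void)
{
    if (!can_seek)
        return;                 /* the "empty restore()" case */
    save_base();
    digit = hybrid_resume;      /* jump straight to where we left off */
    hybrid_total = VARIANTS;    /* total count, NOT the remaining count */
}

static void set_word(const char *s)
{
    int i;

    assert(strlen(s) < MAX_LEN);
    for (i = 0; s[i]; i++)
        word[i] = (unsigned char)s[i];
    word[i] = 0;
}

/* Model of the resume logic as described in the text.  saved_total is the
   hybrid_total that new() reported before the session was saved. */
void cracker_resume(const char *base_word, int saved_total, int resume_count)
{
    int i;

    set_word(base_word);
    hybrid_resume = resume_count;
    ext_restore();
    if (saved_total != 0 && hybrid_total == saved_total)
        return;                 /* counts match: new() is NOT called */

    /* Brute-force restore: replay from the start, discarding the output. */
    set_word(base_word);
    ext_new();
    for (i = 0; i < resume_count; i++)
        ext_next();
}

/* Demo helper: resume, then copy the first candidate produced into out. */
void first_after_resume(const char *base_word, int seekable,
                        int resume_count, char *out)
{
    int i;

    can_seek = seekable;
    cracker_resume(base_word, seekable ? VARIANTS : 0, resume_count);
    ext_next();
    for (i = 0; word[i]; i++)
        out[i] = (char)word[i];
    out[i] = 0;
}
```

Resuming the base word "pass" after three calls to next() yields pass3 as
the next candidate on either path; only the amount of work differs.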

Following the above rules, this new external 'hybrid' mode can perform
anything that can be done in the external scripting language.

Some of the things I have thought of are redoing some of the existing
external generators, with the ability to easily set global data.  For
instance:

a smart keyboard mode could be made, where the input line was formatted
with the starting string to prime the keyboard with, with a min-length
and a max-length.  Keyboard is a great mode, but there are many 'dead'
zones in the full working set, and some pretty good zones. So we could
prime the pump, and run keyboard, but only running it for 'q' 'a' '1' 'h'
'z' 'f' 'k' starting letters.  Thus, we would run that mode but only on
some better 'hot spot' data. We would skip running it on '>' or '}'
type starting letters, not because they are never seen as passwords,
but that they are seen much less often than something like qwert4321
or qwsaqwsaqwsa.

Also, many users use Perl filters (or other filters). Some of these can
now easily be done using external-hybrid mode. One great benefit is that
john will properly save and later resume a run, whereas with a filter
that is not done automatically.

Also a very strong 1337 speak converter can be easily done. This has been
worked on within john's rules, but there are limitations there on just how
robust this form can be. It is very hard to build efficient rules for this,
and almost impossible to build rules that only 1337 SOME of the letters,
leaving others alone. So getting the word 600b!35 (boobies) is very hard
with rules, since there is a b->6 and a b->b in there.  Some of this can
also be done using rexgen, but that is a non-default library, and actually
pretty slow. The external script language can be written to be as fast or
faster than rexgen.
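
As a rough illustration of how a hybrid script could replace only SOME of
the letters, the sketch below treats each replaceable position as one bit
of a counter, so a base word with k replaceable letters yields 2^k
candidates (the all-zero mask being the unmodified word).  Everything here
is an assumption made for the example: the five-entry substitution table,
the MAX_SUBS cap, the nth_variant() helper, and the stand-in names
ext_new(), ext_next() and ext_restore() for the script's new(), next() and
restore().  Because the state is a single counter, restoring is just a
matter of setting it to hybrid_resume.

```c
#include <assert.h>
#include <string.h>

#define MAX_LEN  125
#define MAX_SUBS 16   /* cap on tracked positions so 2^k fits in an int */

/* Stand-ins for globals that john provides to an external script. */
int word[MAX_LEN + 1];
int hybrid_total;
int hybrid_resume;

static int base[MAX_LEN + 1];
static int base_len;
static int pos[MAX_SUBS];   /* positions in base[] that can be replaced */
static int npos;
static int mask;            /* bit i set: replace the letter at pos[i] */

/* Illustrative table only; returns 0 when there is no substitute. */
static int leet(int c)
{
    switch (c) {
    case 'b': return '6';
    case 'o': return '0';
    case 'i': return '!';
    case 'e': return '3';
    case 's': return '5';
    }
    return 0;
}

/* Save the base word, find replaceable positions, compute the total. */
static void scan_base(void)
{
    base_len = npos = 0;
    while (base_len < MAX_LEN && word[base_len]) {
        base[base_len] = word[base_len];
        if (npos < MAX_SUBS && leet(word[base_len]))
            pos[npos++] = base_len;
        base_len++;
    }
    hybrid_total = 1 << npos;
}

void ext_new(void)
{
    scan_base();
    mask = 0;
}

void ext_next(void)
{
    int i;

    if (mask >= hybrid_total) {
        word[0] = 0;            /* all subsets have been produced */
        return;
    }
    for (i = 0; i < base_len; i++)
        word[i] = base[i];
    word[base_len] = 0;
    for (i = 0; i < npos; i++)
        if (mask & (1 << i))
            word[pos[i]] = leet(base[pos[i]]);
    mask++;
}

void ext_restore(void)
{
    scan_base();                /* also sets hybrid_total to the full count */
    mask = hybrid_resume;       /* one counter is the whole script state */
}

/* Demo helper: jump to candidate number n (counting from 0) via restore(). */
void nth_variant(const char *base_word, int n, char *out)
{
    int i;

    assert(strlen(base_word) < MAX_LEN);
    for (i = 0; base_word[i]; i++)
        word[i] = (unsigned char)base_word[i];
    word[i] = 0;

    hybrid_resume = n;
    ext_restore();
    ext_next();
    for (i = 0; word[i]; i++)
        out[i] = (char)word[i];
    out[i] = 0;
}
```

For "boobies" all seven letters are replaceable, giving 128 candidates;
candidate number 119 replaces every letter except the second 'b' and comes
out as 600b!35.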

There is a very trivial, commented 'example' external script in john.conf,
showing how to use hybrid-external scripts.  This script was used in
testing, and to make sure that things like save/resume were working
properly during development.  It is not an exceedingly useful script;
it was written simply to document how to write such scripts.

This version can be found at:
https://github.com/magnumripper/JohnTheRipper/archive/bleeding-jumbo.zip


On 3/3/2016 5:31 PM, jfoug wrote:
> I have added a new 'mode'  (still working on it, but it is actually 
> running already) to JtR's external scripting language.
>