john-users - Re: External filter question

Message-Id: <201109121035.10348.pfernandez@13t.org>
Date: Mon, 12 Sep 2011 10:35:09 +0200
From: Pablo Fernandez <pfernandez@....org>
To: john-users@...ts.openwall.com
Subject: Re: External filter question

Hi,

Thanks a lot for your answers!

> This sounds even worse than the [List.External:Parallel] sample, which
> is pretty bad on its own.  I recommend that you consider other approaches:

Actually, I have been running some numbers, and it turns out to be
better than Parallel under the same conditions: jobs have a limited
duration.
It is better because each job only "skips" until its turn comes, computes
its block, and then exits. With Parallel, every job keeps skipping for its
whole run. For example, with blocks, say you want to test the first 10k
passwords with 10 jobs:
- First job doesn't skip. It computes 1k passwords and exits.
- Second job skips 1k, computes 1k, exits.
- Last job skips 9k, computes 1k, exits.
All in all, candidates skipped: (0+1+2+3+..+9) * 1k = 45k skips.
The same, with Parallel, 10k passwords, 10 jobs:
- All jobs compute 1k, and skip 9k.
All in all, candidates skipped: 10 * 9k = 90k skips.
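
As a quick sanity check on those two totals, here is a small Python sketch.
It only counts skipped candidates; it assumes that skipping a candidate
costs the same in both schemes and that the total divides evenly among the
jobs. The function names are mine, for illustration only.

```python
def block_skips(total, jobs):
    """Candidates skipped when job i skips i blocks, computes one, and exits."""
    block = total // jobs
    return sum(i * block for i in range(jobs))


def parallel_skips(total, jobs):
    """Candidates skipped when every job walks the whole range and
    computes only every Nth candidate."""
    per_job = total // jobs
    return jobs * (total - per_job)


if __name__ == "__main__":
    print(block_skips(10_000, 10))     # 45000
    print(parallel_skips(10_000, 10))  # 90000
```

In general the block scheme skips roughly half as much as Parallel, since
the sum 0+1+..+(N-1) is N*(N-1)/2 blocks versus N*(N-1) blocks.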

Does this make sense?


> Is your reason to go for "blocks" instead of having each node try every
> Nth password (like [List.External:Parallel] does) to be able to retry
> skipped/failed blocks from a master node, and to adjust the number of
> nodes on the fly?  If so, this makes sense.  But having each node skip
> to its starting password number each time it's invoked on a relatively
> small block is going to kill performance unless you're running against
> salted and extremely slow hashes.

Indeed, yes. I wanted variable-size blocks, sized according to the
"expected" free time available in the cluster, and also a variable number
of compute nodes, you never know... so, as flexible as possible. No
MPI/OpenMP, by design. And Markov is a bit too strict.

Anyway, Parallel or Block are not too bad, if you have slow hashes (with Linux 
- my target system - you barely see DES) and/or you have many salts.


> Worse, it will also not even find some passwords, because there's not
> only I/O buffering, but also crypto algorithm related buffering in JtR.

Is there a "safe" limit I can use? I could make each block perform, let's
say, 200 more hashes than it should (it would take less than a second), even
if it overlaps with the next block. What I can't accept is performing *less*
than it should.
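
To make the overlap idea concrete, here is a rough Python sketch of how the
per-job ranges could be laid out. The margin of 200 is just my guess from
above, not a known safe value, and the function name and the (skip, count)
layout are hypothetical.

```python
def block_ranges(total, jobs, margin=200):
    """Return one (skip, count) pair per job.

    Each job skips to the start of its block and then computes its block
    plus `margin` extra candidates, capped at `total`, so neighbouring
    blocks overlap rather than leave a gap.
    NOTE: margin=200 is an assumption, not a verified safe limit.
    """
    block = -(-total // jobs)  # ceiling division, so no tail is left over
    ranges = []
    for i in range(jobs):
        start = i * block
        if start >= total:
            break
        end = min(start + block + margin, total)
        ranges.append((start, end - start))
    return ranges


if __name__ == "__main__":
    for skip, count in block_ranges(10_000, 10):
        print(skip, count)
```

With 10k candidates and 10 jobs, every job except the last computes 1200
candidates instead of 1000, so the extra work is 200 hashes per block.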


> We might introduce a way for filter() to ask for process termination in
> a later version.  In fact, maybe external mode functions should also be
> able to ask for the status line to be displayed (simulate keypress).

Those two would be awesome to have! Any idea how to do it? Maybe I could do
my work on a "patched" version, aiming for it to also work in a future release.

 
> > - Or maybe, is there anything I can do to force John to flush the IO
> > buffer before doing the word=1/0 operation?
> No, not from external mode, and it's not only about the I/O anyway.

I have one question here... what happens if you send john a SIGKILL? Does it
exit cleanly (flushing the I/O and crypto buffers)? Is there a way to send a
message to an external application, so that it can send the SIGKILL back to
john?


Thanks again,
Pablo