john-dev - Re: Signal handling within john formats.

Message-ID: <20150530050855.GA17099@openwall.com>
Date: Sat, 30 May 2015 08:08:56 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Signal handling within john formats.

On Fri, May 29, 2015 at 06:40:49PM +0530, Sayantan Datta wrote:
> On Fri, May 29, 2015 at 2:05 PM, Solar Designer <solar@...nwall.com> wrote:
> > How exactly setitimer() interacts with alarm() may vary across systems.
> > You should not mix them.  You may do what you need with setitimer()
> > alone, but even then you still have a race condition between replacing
> > the signal handler and replacing the timer settings (and ditto when
> > restoring them).  To avoid it, you may block/unblock the signal.  Once
> > finally implemented correctly, this should actually work... but it gets
> > complicated.
> 
> It's not that I must use alarm() or setitimer() as I planned, but just
> out of curiosity, I'd like to know why there is a race condition between
> replacing the signal handler and replacing the timer settings.  If the
> process is single-threaded, how does the race condition arise?

If you replace the signal handler first and the timer settings next,
then it may happen that your new signal handler will be called when the
old timer ticks (before you had a chance to replace the timer settings) -
perhaps way sooner than you wanted - and then once again when your new
timer ticks.  Perhaps this will result in behavior other than what you
intended, especially if the old timer was about to tick.

If you replace the timer settings first and the signal handler next,
then it may happen that the old signal handler will be called when the
new timer ticks (before you had a chance to replace the signal handler).
This is unlikely under sane load, but may happen when the system is
heavily overloaded or when your process is at a very low scheduling
priority and there are enough higher priority processes to almost stop
yours from proceeding to make its next syscall until the timer expires.
If this has occurred and your new timer is one-off, then your new signal
handler will never be called.

This issue is present both when replacing and when restoring these two
things.  As an alternative to blocking the signal, you may first reset
the timer so that it won't trigger, then replace the signal handler,
then set (or restore) the timer.

A classic example of a similar issue is when using alarm() to interrupt
syscalls such as connect().  Normally, a SIGALRM caught by a handler
installed without SA_RESTART interrupts blocking syscalls, making them
fail with EINTR.  This is commonly used to
implement timeouts when attempting to connect to a remote server (which
might be down), etc.  However, under heavy load or with extremely low
scheduling priority, the alarm might go off before the next syscall
enters the kernel - resulting in the process getting stuck without the
intended timeout should the syscall actually block.

> > As an alternative, you could add support for registering
> > of your additional handlers (or just one additional handler) to the
> > global alarm handler in JtR.  This would be JtR-specific, which is good
> > in that it won't introduce dependency on extra features of the
> > underlying system working correctly.
> 
> I don't understand!  Until now, my understanding was that there can be
> only one signal handler at any moment for each signal, in my case SIGALRM.
> So how can I register an additional handler for SIGALRM, and how would the
> system know which one to use?

The system would not know.  I was referring to a feature you could add
to JtR's signals.c, to have it invoke an extra handler out of the one
handler that it registers with the kernel.

BTW, here's yet another alternative, unrelated to the above: there's
also a separate timer and a separate signal you could use -
ITIMER_VIRTUAL and SIGVTALRM.  We don't use these in JtR yet, so there
would be no conflict if you use them.  Of course, this is only
appropriate if you're OK with it being based on virtual (CPU) rather
than real time, and if you're OK with depending on this functionality
being supported by the underlying system.

Of course, my recommendation remains that you reconsider and somehow
avoid needing to do this.

> > Better yet, though, you'd avoid the need for this.  Why exactly do you
> > need it?  Let's discuss the actual problem you're trying to solve first.

> This is a part of my experiment/learning exercise/tinkering scheme and it
> may or may not make its way into jumbo. The problem is, I'm using a
> randomized algorithm and sometimes it is profitable (performance-wise) to
> bail out and restart the algorithm with a new set of parameters. When I see
> there is not enough progress within a stipulated amount of time, I'd like
> to bail out and restart.

OK.

> This is why I need alarm()/setitimer().

This is not convincing.  Why not base your logic on the number of iterations
of your algorithm (when it reaches a threshold, restart) or, if you
really want to base it on real time, then why not e.g. check an integer
variable once per crypt_all() and increment it in sig_handle_timer()?
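
Both options can be sketched in a few lines.  The names below are
hypothetical: timer_ticks stands for an integer that the existing timer
handler would increment, and the policy functions are made up for the
example.

```c
#include <signal.h>

/* Assumed to be incremented by the existing timer signal handler. */
static volatile sig_atomic_t timer_ticks;

struct restart_policy {
	unsigned long max_iterations; /* 0 disables the iteration limit */
	int max_ticks;                /* 0 disables the timer tick limit */
	unsigned long iterations;     /* iterations since policy_begin() */
	sig_atomic_t start_tick;      /* timer_ticks at policy_begin() */
};

/* Call when (re)starting the algorithm with new parameters. */
static void policy_begin(struct restart_policy *p)
{
	p->iterations = 0;
	p->start_tick = timer_ticks;
}

/*
 * Call once per iteration of the algorithm (or once per crypt_all()).
 * Returns nonzero when it is time to bail out and restart.
 */
static int policy_should_restart(struct restart_policy *p)
{
	p->iterations++;
	if (p->max_iterations && p->iterations >= p->max_iterations)
		return 1;
	if (p->max_ticks && timer_ticks - p->start_tick >= p->max_ticks)
		return 1;
	return 0;
}
```

Either way the decision is made in normal program flow, with no second
signal handler and no second timer involved.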

> I need to
> run the algorithm within fmt->restart() and within fmt->crypt_all() once in
> a while (i.e., after some hashes are cracked).

OK, so that's where you'd check your iterations counter or the timer
ticks counter variable.

Out of curiosity: is your randomized algorithm trying to optimize the
candidate passwords stream (in other words, making it a specialized
cracking mode) or the hashing speed (such as continuously re-tuning it
for whatever candidate passwords it actually sees and for the target
compute device)?

Alexander