Date: Thu, 8 Aug 2013 07:30:11 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: --fork using different OpenCL devices

magnum, Claudio, all -

On Wed, Aug 07, 2013 at 09:32:49PM +0200, magnum wrote:
> Claudio had an idea a while ago that I think still hasn't been discussed on list so here goes:
> 
> The idea is to have -fork pick a different device (starting from 0 or picking from a given list) for each child. Picture having two 7990 cards for a total of four devices. Using "-fork=4" with an OpenCL format would pick device 0 for the mother process, device 1 for first child and so on.

This would provide poor man's multi-GPU support.  Unfortunately, in the
current implementation of --fork there's some use of signals - such as
to get the status line printed by all children on a keypress - and this
appears incompatible with AMD's SDK.

> Only very fast formats [where set_key() is a bottleneck] would benefit.

This is confused/confusing.  What I think was meant here is that if we
_don't_ direct the different fork'ed processes to different GPUs (let
them all use one GPU), then we'll hide the latency of key setup and key
transfers.  This is similar to how I sometimes invoke Sayantan's
descrypt-opencl on one GPU multiple times to achieve much better
cumulative speed than is possible with one invocation.  Yes, --fork
would help here, even in its current implementation with no changes,
except for the issue with AMD's SDK that I mentioned above.  On NVIDIA
GPUs, this just works.

> I think it's a cool idea and Claudio has a trivial PoC patch. Should we do this? It will hopefully be obsoleted by mask mode and other planned things. OTOH I would not mind at all applying it.

I don't see mask mode as obsoleting it.

I don't recall what exactly Claudio's patch did, though.  Like I said,
for hiding the latencies for key setup/transfers with fast hashes, no
patch is needed (but there's an issue with AMD SDK, which we have no
patch for).  However, for poor man's multi-GPU a patch would in fact be
needed (but it will similarly be problematic with AMD SDK).
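
To make the device-per-process idea concrete, here is a minimal sketch in
Python rather than in John's C code.  It is not Claudio's patch and does
not reflect how --fork is implemented; pick_device() and
fork_with_devices() are hypothetical names, and the device list stands in
for whatever OpenCL device IDs the user would supply.  The parent keeps
index 0 and each child takes the next index, wrapping around when there
are more processes than devices.

```python
import os


def pick_device(index, devices):
    """Map a process index (0 = parent) to a device ID, round-robin."""
    if not devices:
        raise ValueError("device list must not be empty")
    return devices[index % len(devices)]


def fork_with_devices(n, devices):
    """Fork n - 1 children (POSIX only).

    Returns (index, device) in every resulting process: the parent gets
    index 0, the first child index 1, and so on.
    """
    for index in range(1, n):
        if os.fork() == 0:
            # Child: stop forking and take this slot's device.
            return index, pick_device(index, devices)
    # Parent: keeps index 0.
    return 0, pick_device(0, devices)
```

With four processes and devices [0, 1, 2, 3], each process gets its own
device.  With only [2, 5], indices 0 and 2 share device 2, and indices 1
and 3 share device 5.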

Maybe we should revise --fork such that it would not use signals (would
use solely other IPC mechanisms).  Or maybe AMD will fix their SDK soon
(wishful thinking).
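
As a rough illustration of what signal-free status reporting could look
like, here is a sketch in Python (POSIX only) that uses a pair of pipes
per child.  Everything in it is an assumption made for illustration: the
one-byte commands "s" and "q", the function names, and the status line
format are invented, and the select() timeout merely stands in for a
batch of real work.  Whether avoiding signals would actually resolve the
incompatibility with AMD's SDK is not established here.

```python
import os
import select


def child_loop(cmd_r, status_w, index):
    """Child: work in batches, polling the command pipe between them."""
    done = 0
    while True:
        # The timeout is a stand-in for processing one batch of work.
        ready, _, _ = select.select([cmd_r], [], [], 0.01)
        done += 1
        if ready:
            cmd = os.read(cmd_r, 1)
            if cmd == b"s":  # status request from the parent
                os.write(status_w, f"{index}: {done} batches\n".encode())
            elif cmd in (b"q", b""):  # quit command, or parent went away
                os._exit(0)


def spawn(index):
    """Fork one child; return (pid, cmd_w, status_r) to the parent."""
    cmd_r, cmd_w = os.pipe()
    status_r, status_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(cmd_w)
        os.close(status_r)
        child_loop(cmd_r, status_w, index)  # never returns
    os.close(cmd_r)
    os.close(status_w)
    return pid, cmd_w, status_r


def request_status(children):
    """Ask every child for a status line and collect the replies."""
    for _, cmd_w, _ in children:
        os.write(cmd_w, b"s")
    return [os.read(status_r, 256).decode().strip()
            for _, _, status_r in children]


def shutdown(children):
    """Tell every child to quit, then reap it and close its pipes."""
    for pid, cmd_w, status_r in children:
        os.write(cmd_w, b"q")
        os.waitpid(pid, 0)
        os.close(cmd_w)
        os.close(status_r)
```

On a keypress the parent would call request_status() and print the lines
it gets back.  The cost of this design is that a child only notices a
request between batches, so status latency is bounded by the batch time.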

Alexander
