Message-ID: <50FAA598.6040508@gmail.com>
Date: Sat, 19 Jan 2013 11:54:32 -0200
From: Claudio André <claudioandre.br@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Mulit-gpu using claudio's interfaces

On 19-01-2013 07:55, magnum wrote:
> On 19 Jan, 2013, at 9:04 , Sayantan Datta <std2048@...il.com> wrote:
>
>> I have integrated the claudio's interfaces in mscash2-opencl:
>>
>> std2048@...l:~/Jtr3/run$ ./john -te -fo=mscash2-opencl -dev=0,1
>> Device 0: GeForce GTX 570
>> Optimal Work Group Size:256
>> Kernel Execution Speed (Higher is better):0.475557
>> Device 1: Tahiti (AMD Radeon HD 7900 Series)
>> Optimal Work Group Size:128
>> Kernel Execution Speed (Higher is better):1.556817
>> Benchmarking: M$ Cache Hash 2 (DCC2) PBKDF2-HMAC-SHA-1 [OpenCL]... DONE
>> Raw: 125427 c/s real, 125427 c/s virtual
>>
>> std2048@...l:~/Jtr3/run$ ./john -te -fo=mscash2-opencl -dev=gpu
>> Device 0: GeForce GTX 570
>> Optimal Work Group Size:128
>> Kernel Execution Speed (Higher is better):0.474687
>> Device 1: Tahiti (AMD Radeon HD 7900 Series)
>> Optimal Work Group Size:128
>> Kernel Execution Speed (Higher is better):1.557031
>> Benchmarking: M$ Cache Hash 2 (DCC2) PBKDF2-HMAC-SHA-1 [OpenCL]... DONE
>> Raw: 126030 c/s real, 125728 c/s virtual
> Good stuff. Ideally we should have a GTX 590 or GTX 690 and an AMD 7990 in Bull so we could test multi-device CUDA as well as multi-device OpenCL with homogenous or heterogenous devices.

For debugging, absolutely.

> I'm not sure about iterations and key length, but shouldn't mscash2 ideally perform similar or better than wpapsk?
>
> Also, there's some problem using three devices:
>
> ../run/john -t -fo:mscash2-opencl --dev=all
> Device 0: GeForce GTX 570
> Optimal Work Group Size:128
> Kernel Execution Speed (Higher is better):0.474681
> Device 1: Tahiti (AMD Radeon HD 7900 Series)
> Optimal Work Group Size:256
> Kernel Execution Speed (Higher is better):1.556828
> Device 2: AMD FX(tm)-8120 Eight-Core Processor
> Optimal Work Group Size:4
> Kernel Execution Speed (Higher is better):0.001337
> Benchmarking: M$ Cache Hash 2 (DCC2) PBKDF2-HMAC-SHA-1 [OpenCL]... Segmentation fault
>
> Maybe they are just too different in speed.
>
> magnum

The '--dev=all' option is something that worries me on a real run. If you have free time to debug it and gather more information, that would be useful. It could also be a bug in common code.

-----
BTW: is this hard? For example:

my_format_crypt_all() // crypt_all inside a format file.
{
    ...
    // Put work on N cards
    enqueue()
    ...
    while (anyGpu.hasWork_toDo) {
        // With this, the core can check all candidates from 16000 to 32000
        // while tasks on the other GPUs have not finished yet.
        if (Gpu[x].finished)
            core_send_event(result = WORK_DONE, start = 16000, finish = 32000)
    }
}
----

OK, the single (main) thread part is a limitation.

Claudio
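
The polling idea in the pseudocode above can be sketched in plain C. This is only an illustration under stated assumptions, not John the Ripper code: device_t, event_t, record_event() and crypt_all_sketch() are hypothetical names, device progress is simulated with a tick counter, and record_event() stands in for the core_send_event() call proposed above. Real code would presumably query each device's OpenCL event (for example with clGetEventInfo and CL_EVENT_COMMAND_EXECUTION_STATUS) instead of decrementing a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state; not a John the Ripper structure. */
typedef struct {
    int start;       /* first candidate index assigned to this device */
    int finish;      /* one past the last candidate index */
    int ticks_left;  /* simulated remaining kernel time */
    bool reported;   /* true once the core was told about this range */
} device_t;

/* One WORK_DONE notification, stored in arrival order. */
typedef struct {
    size_t device;
    int start;
    int finish;
} event_t;

/* Stand-in for the proposed core_send_event(): log the finished range. */
static size_t record_event(event_t *log, size_t count,
                           size_t device, const device_t *d)
{
    log[count].device = device;
    log[count].start = d->start;
    log[count].finish = d->finish;
    return count + 1;
}

/*
 * Poll every device from a single thread until all have finished.
 * As soon as one device is done, its range is reported, so the core
 * could start checking those candidates while slower devices still run.
 * The caller must supply a log with room for ndev entries.
 * Returns the number of events logged.
 */
static size_t crypt_all_sketch(device_t *devs, size_t ndev, event_t *log)
{
    size_t events = 0;
    bool any_busy = true;

    assert(devs != NULL && log != NULL);

    while (any_busy) {
        any_busy = false;
        for (size_t i = 0; i < ndev; i++) {
            if (devs[i].reported)
                continue;
            if (devs[i].ticks_left > 0)
                devs[i].ticks_left--;  /* simulated progress only */
            if (devs[i].ticks_left == 0) {
                events = record_event(log, events, i, &devs[i]);
                devs[i].reported = true;
            } else {
                any_busy = true;
            }
        }
    }
    return events;
}
```

With two simulated devices where the second one finishes first, the range 16000 to 32000 is reported before the range 0 to 16000, which is the behaviour the pseudocode asks for. The loop still runs on one thread and busy-waits, so the main-thread limitation mentioned above remains.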