Message-ID: <CABob6irSwZK+60=K8nqOvn+oN1vLi-oQMGpO3qybWtr_50UT6w@mail.gmail.com>
Date: Thu, 25 Aug 2011 19:15:25 +0200
From: Lukas Odzioba <lukas.odzioba@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Lukas's Status Report - #15 of 15

2011/8/25 Solar Designer <solar@...nwall.com>:
> After applying your john-1.7.8-mscash2cuda-0.diff, I changed:
>
> mscash2_init(1);
>
> to:
>
> mscash2_init(0);
>
> or it was failing trying to use a non-existent second GPU as far as I
> could tell. Do you have two NVidia GPUs in your machine now? :-)

Yes, I've got a GTX 460 in a PCIe x16 slot and a 9800 GT in a PCIe x4
slot. This bug was caused by code duplication: every format had its own
init function. In the next revision there will be one common init taking
a command-line parameter.

> Running two instances at once, I got:
>
> Raw: 784 c/s real, 819 c/s virtual
> Raw: 934 c/s real, 961 c/s virtual
>
> which is slightly faster (1700 c/s combined).

On my PC, CPU time is 0.6% of the total. With a slower CPU (in terms of
per-thread speed) it might be even more, so a small cpu_speed/gpu_speed
ratio increases GPU idle time. I can divide the computation into two
parts and compute the second CPU part while the first GPU part executes.
For now it is sequential: do_cpu -> do_gpu.

> Overall, this feels somewhat slow - comparable to a quad-core CPU.
> There's probably a lot of room for optimization.
>
> Your 8160 c/s for a faster GPU is much better, though. :-)

The mscash2 and sha512crypt kernels require quite a lot of registers,
and because Fermi has more of them, it is able to run more threads at
once and hide memory latency better.

>> Patch is configured for older devices (sm=10,128threads) to be more
>> portable. As Solar stated only pbkdf2 is on gpu side.
>
> Yet you implemented the on-CPU mscash portion of mscash2 in the .cu
> source file - wouldn't it be cleaner/easier to have it in .c? (Maybe
> this is how it should be. I am merely asking.)

I assumed it would be better to have all the computing code in one file
rather than have the CPU, GPU, and common function preproc duplicated in
two separate files.

>> It is basicly Sn3f's implementation with JimF's optimizations, and
>> it's not (yet) fully optimal. I estimate that optimal should do around
>> 13k c/s on gtx460.
>
> How did you arrive at this estimate?

Someone stated on the john-contest list that an AMD 5870 running
oclHashcat does 59k c/s. I took Ivan Golubev's SHA-1 estimates for both
cards and compared the results. Yes, it's not exact, but it gives some
overview.

Lukas
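
To illustrate the common init with a command-line parameter mentioned above, here is a minimal sketch in plain C. It is not code from the patch: the name parse_gpu_index and its arguments are hypothetical, and the device count is passed in by the caller. In a real CUDA build the count would presumably come from cudaGetDeviceCount() and the validated index would go to cudaSetDevice().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from the patch): turn a command-line string
 * into a GPU index and check it against the number of devices present.
 * Returns the index on success, or -1 if the string is not a plain
 * decimal number or names a device that does not exist.
 */
int parse_gpu_index(const char *arg, int device_count)
{
    char *end;
    long idx;

    assert(device_count >= 0);

    if (arg == NULL || *arg == '\0')
        return -1;

    errno = 0;
    idx = strtol(arg, &end, 10);
    if (errno != 0 || *end != '\0')
        return -1;              /* overflow or trailing garbage */

    if (idx < 0 || idx >= device_count)
        return -1;              /* e.g. index 1 on a single-GPU machine */

    return (int)idx;
}
```

With a check like this shared by all formats, asking for device 1 on a machine with one GPU would be rejected in a single place instead of failing inside each format's own init.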