john-users - Re: JtR vs. hashcat on /r/crypto



Message-ID: <168e182.5148.1396fef8583.Webtop.0@cox.net>
Date: Tue, 28 Aug 2012 21:12:11 -0400 (EDT)
From: jfoug@....net
To: john-users@...ts.openwall.com
Subject: Re: JtR vs. hashcat on /r/crypto




On Tue, Aug 28, 2012 at 5:41 PM, Solar Designer wrote:

> On Tue, Aug 28, 2012 at 02:21:07PM -0500, jfoug wrote:
>>> From: Solar Designer [mailto:solar@...nwall.com]
>>>
>>> 2. It turns out (was news to me) that hashcat added SunMD5 support
>>> recently (on CPU).  According to atom, it does not use SIMD, yet is
>>> faster than ours with SIMD (JimF's unreleased code in magnum-jumbo).
>>> I've asked atom for specific speed numbers, but we might want to do
>>> our own benchmarks as well (Jim?), if we don't mind running the
>>> closed-source hashcat for that. ;-)
>>
>> I have a strong belief the coin flip logic we have (the original sun
>> logic), is where the speedup can be found. Yes, we did remove a %5 in
>> one of the loops.  But there still has to be a LOT of optimization
>> left. There is a lot of temp memory usage, and memory movement.  It
>> 'could' be some other factor, but I really think not.  This is why I
>> was surprised by only a 3.5x improvement when going to SSE2 code.  I
>> expected a much higher rate, since we modify that large buffer so
>> little.
>
> Well, it's 4.4x with XOP, but I wouldn't be surprised by a higher
> speedup (over the original Sun code or equivalent) with further
> optimizations on top of SIMD usage.  What surprises me is that atom
> says he achieved greater speed "by not using SIMD".

The SIMD was 'hard' due to having to find, load, process and later 
unwind the two differently sized input buffers. However, since only a 
16-byte block is modified, this wind/unwind is actually very cheap 
(in CPU cost).
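
To illustrate why the wind/unwind is cheap, here is a minimal sketch. It is not the JtR or Sun code. It assumes each round hashes the previous 16-byte digest, optionally followed by a large constant block, followed by the round number as decimal text. PHRASE, placeholder_coin, round_naive and RoundBuffers are hypothetical names, and the coin flip shown is a stand-in, not the real bit-selection logic. The point is only that the large buffer is filled once, and each round rewrites just the 16-byte digest plus a few suffix bytes.

```python
import hashlib

# Hypothetical stand-in for the large constant block (NOT the real text).
PHRASE = b"placeholder constant text " * 40
MAX_SUFFIX = 20  # room for the decimal round number


def placeholder_coin(digest: bytes) -> bool:
    # Stand-in only: the real coin flip picks bits from the digest in a
    # much more involved way.
    return bool(digest[0] & 1)


def round_naive(digest: bytes, use_long: bool, rnd: int) -> bytes:
    # Rebuilds the whole input every round (lots of memory movement).
    data = digest + (PHRASE if use_long else b"") + str(rnd).encode()
    return hashlib.md5(data).digest()


class RoundBuffers:
    """Two preallocated inputs (short and long), patched in place."""

    def __init__(self) -> None:
        self.short_buf = bytearray(16 + MAX_SUFFIX)
        self.long_buf = bytearray(16 + len(PHRASE) + MAX_SUFFIX)
        # The large constant part is copied exactly once.
        self.long_buf[16:16 + len(PHRASE)] = PHRASE

    def hash_round(self, digest: bytes, use_long: bool, rnd: int) -> bytes:
        suffix = str(rnd).encode()
        if use_long:
            buf, tail = self.long_buf, 16 + len(PHRASE)
        else:
            buf, tail = self.short_buf, 16
        buf[0:16] = digest                       # the only 16 bytes that change
        buf[tail:tail + len(suffix)] = suffix    # plus the round number
        return hashlib.md5(memoryview(buf)[:tail + len(suffix)]).digest()


if __name__ == "__main__":
    a = b = bytes(16)
    bufs = RoundBuffers()
    for r in range(1000):
        coin = placeholder_coin(a)
        a = round_naive(a, coin, r)
        b = bufs.hash_round(b, coin, r)
    print(a == b)
```

A SIMD version would keep one such pair of buffers per lane, which is where the bookkeeping for the two sizes comes from; the per-round copying stays small either way.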


>> Possibly there is something Atom was able to find, that busted the
>> coinflip, and found some way to compute it in a deterministic (or
>> nearly deterministic) manner.
>
> Even if so, I don't see how that alone would provide more than a ~4x
> speedup without going SIMD.  Didn't the original Sun code use the
> non-SIMD MD5 code fairly optimally (well, except for wasting a little
> bit of time on the modulo division and such)?

Not in my opinion.  There is a lot of array loading of temp values. 
There HAS to be a much better way to work down to that 1 bit.

> Maybe I need to take a closer look at the code myself, but for now
> I'll just wait to see the performance numbers for hashcat's SunMD5.
> Perhaps someone can try it out and post in here?
