Message-ID: <20150905051639.GA25038@openwall.com>
Date: Sat, 5 Sep 2015 08:16:39 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: MD5 on XOP, NEON, AltiVec

On Sat, Sep 05, 2015 at 07:17:49AM +0300, Solar Designer wrote:
> On Sat, Sep 05, 2015 at 05:25:16AM +0300, Solar Designer wrote:
> > Here's what we had last year:
> >
> > Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 XOP 8x]... (8xOMP) DONE
> > Raw: 201472 c/s real, 25152 c/s virtual
> >
> > Here's what we have now:
> >
> > Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 XOP 4x2]... (8xOMP) DONE
> > Raw: 150272 c/s real, 18784 c/s virtual
>
> I sort of found it: somehow the code handling SSEi_FLAT_OUT, when
> compiled in, changes the stack frame layout in such a way that
> performance drops. I wasn't yet able to tell why it drops. The
> offsets look properly aligned to me either way.
>
> With SSEi_FLAT_OUT support #if 0'ed out, I get:
>
> Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 XOP 4x2]... (8xOMP) DONE
> Raw: 223232 c/s real, 27869 c/s virtual
>
> And that's not even with the MD5_I change yet (haven't tried it yet).

With optimized I():

Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 XOP 4x2]... (8xOMP) DONE
Raw: 228352 c/s real, 28579 c/s virtual

Patch attached.  It also includes MD4 optimizations, which I'll describe
separately.

Alexander
View attachment "john-md5i-md4g.diff" of type "text/plain" (2494 bytes)
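
For readers without the attachment, here is a minimal scalar sketch of one way
I() can be shortened on hardware that has a bitwise select instruction (XOP's
vpcmov is one such instruction).  This is an assumption about the general
technique, not a copy of the attached patch.  The names md5_i_ref(),
bitselect(), md5_i_select() and md5_i_self_test() are hypothetical and exist
only for this illustration.  The sketch rests on the identity
x | ~z == (x & z) | (~0 & ~z), so the OR-NOT pair becomes a single select
against an all-ones constant, followed by the XOR with y.

```c
#include <assert.h>
#include <stdint.h>

/* Reference form of MD5's round 4 function (RFC 1321): I = y ^ (x | ~z). */
static uint32_t md5_i_ref(uint32_t x, uint32_t y, uint32_t z)
{
	return y ^ (x | ~z);
}

/*
 * Bitwise select: bits of a where c is 1, bits of b where c is 0.
 * This models in scalar C what one vector select instruction computes
 * per lane (hypothetical helper, not a John the Ripper function).
 */
static uint32_t bitselect(uint32_t a, uint32_t b, uint32_t c)
{
	return (a & c) | (b & ~c);
}

/*
 * Same function built from one select and one XOR:
 * bitselect(x, ~0, z) == (x & z) | ~z == x | ~z.
 */
static uint32_t md5_i_select(uint32_t x, uint32_t y, uint32_t z)
{
	return y ^ bitselect(x, 0xffffffffU, z);
}

/* Compare both forms on pseudo-random inputs from a simple LCG. */
void md5_i_self_test(void)
{
	uint32_t s = 1;
	int n;

	for (n = 0; n < 100000; n++) {
		uint32_t x = s = s * 1664525U + 1013904223U;
		uint32_t y = s = s * 1664525U + 1013904223U;
		uint32_t z = s = s * 1664525U + 1013904223U;

		assert(md5_i_select(x, y, z) == md5_i_ref(x, y, z));
	}
}
```

Calling md5_i_self_test() from a test program checks the two forms against
each other.  Whether the select form is actually faster depends on the target
and on how the compiler schedules it, so it is worth benchmarking rather than
assuming.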