john-dev - Re: AMD Bulldozer and XOP

Message-ID: <20120315214523.GA9644@openwall.com>
Date: Fri, 16 Mar 2012 01:45:23 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: AMD Bulldozer and XOP

On Thu, Mar 15, 2012 at 11:24:07PM +0200, Milen Rangelov wrote:
> I can help with some kernels. In fact, JtR is very inspiring project. I
> like to look at how people solved similar problems often in different ways.

Great.

> So 256-bit XOP is slower than 128-bit one?

According to benchmarks that were sent to me before I got a Bulldozer of
my own, yes: it runs at about half the speed per bit, which is a quarter
of the speed per instruction.  (I haven't tested 256-bit XOP on my own
FX-8120 yet.  I will do so a bit later.)
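
To spell out how the per-bit and per-instruction figures relate, here is
a minimal sketch of the arithmetic.  The function name and the 0.5 input
are illustrative assumptions taken from the rough figure above, not new
measurements.

```python
def per_instruction_ratio(per_bit_ratio, narrow_bits=128, wide_bits=256):
    """Relative instruction rate (wide vs. narrow) implied by a
    per-bit throughput ratio.  Hypothetical helper for illustration."""
    # bits/cycle = instructions/cycle * bits/instruction, so
    # wide_rate / narrow_rate = per_bit_ratio * narrow_bits / wide_bits
    return per_bit_ratio * narrow_bits / wide_bits


if __name__ == "__main__":
    # Half the per-bit throughput at twice the width means each
    # 256-bit instruction issues at a quarter of the 128-bit rate.
    print(per_instruction_ratio(0.5))
```

If 256-bit operations merely matched 128-bit ones per bit (a ratio of
1.0), the same formula would give half the instruction rate, which is
what one would expect from splitting each 256-bit operation in two.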

> This reminds me of SSE2 and some old Pentium 4 CPUs :)

I think you mean SSE and Pentium 3.  Yes, that was disappointing.  In
fact, the cause might be similar: officially, those wider registers and
operations on them are "floating point" (true for both the original SSE
and now for 256-bit AVX and XOP), so there might be some overhead on
updating some CPU-internal floating-point state (perhaps flags
reflecting the current values in the vector elements as if they were
interpreted as floating-point).  That's just a guess, though.

Alexander
