Message-ID: <339abcd2990460a240d1dbba12d8af3d@smtp.hushmail.com>
Date: Wed, 19 Sep 2012 20:34:38 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: 1.7.9-jumbo-7

It's an Intel CPU (quad i7), so it's just the llvm compiler that can't
compete with gcc (let alone icc) when compiling the intrinsics. For some
reason I can't use an -x86-64i build either (llvm can't seem to compile
the assembler). I'll look into that some day.

I can switch to gcc whenever I want and everything will be normal, but I
wanted to squeeze out most of the "native OSX" bugs now that I have a
chance.

magnum

On 19 Sep, 2012, at 19:31 , jfoug <jfoug@....net> wrote:

> Btw, any idea on why the code which uses the circular temp buffer was slower
> on this system? The code I wrote does use just a touch more CPU to roll the
> temp vars through a circular buffer, but I would think the memory savings
> would more than make up for that, especially (IIRC), since the last few
> loops do not write back to memory since it will never be accessed. Possibly
> this system simply has a very tiny L1 cache or something, where the memory
> stall reduction does not offset the CPU overhead.
>
> Jim.
>
>> From: magnum [mailto:john.magnum@...hmail.com]
>>
>> What you write is true in general, but the bug in question was not about
>> that: The SHA_BUF_SIZ I'm talking about only exists in JtR's own
>> sse-intrinsics.c code. It's set to 80 for Simon's original 80x4 buffer SSE2
>> SHA-1, and 16 for your later 16x4 code that uses buffers similar to MD4
>> and MD5 (except for endianness). Your code is faster on every platform I
>> have tried except OSX w/ llvm. So I modified x86-64.h to use 80 for these
>> builds - and that triggered the bug!
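
For reference, the difference between the two buffer sizes discussed in
this thread can be sketched in plain scalar C. This is only an
illustration of the general technique (a full 80-word SHA-1 message
schedule versus a 16-word circular one), not the SSE2 code in
sse-intrinsics.c. The names ROTL32, schedule_full, schedule_circular and
schedules_agree are hypothetical and exist only for this sketch; the real
code processes four such buffers in parallel (80x4 and 16x4).

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
#define ROTL32(x, n) (((x) << (n)) | ((x) >> (32 - (n))))

/* Full schedule: all 80 words are kept in memory
 * (comparable to a buffer size of 80). */
static void schedule_full(const uint32_t block[16], uint32_t w[80])
{
    int t;

    for (t = 0; t < 16; t++)
        w[t] = block[t];
    for (t = 16; t < 80; t++)
        w[t] = ROTL32(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1);
}

/* Circular schedule: only 16 words are kept (comparable to a buffer
 * size of 16). Must be called with t = 0, 1, ..., 79 in order.
 * W[t] overwrites the slot that held W[t-16], which is no longer needed. */
static uint32_t schedule_circular(uint32_t w[16], int t)
{
    assert(t >= 0 && t < 80);
    if (t >= 16) {
        /* (t - 16) & 15 == t & 15, so w[t & 15] still holds W[t-16]. */
        uint32_t x = w[(t - 3) & 15] ^ w[(t - 8) & 15] ^
                     w[(t - 14) & 15] ^ w[t & 15];
        w[t & 15] = ROTL32(x, 1);
    }
    return w[t & 15];
}

/* Returns 1 if both variants produce the same 80 schedule words. */
static int schedules_agree(const uint32_t block[16])
{
    uint32_t full[80], ring[16];
    int t;

    schedule_full(block, full);
    for (t = 0; t < 16; t++)
        ring[t] = block[t];
    for (t = 0; t < 80; t++)
        if (schedule_circular(ring, t) != full[t])
            return 0;
    return 1;
}
```

The circular variant spends a few extra index-masking operations per
round in exchange for touching 64 bytes instead of 320 bytes per lane,
which is the trade-off Jim describes above. Whether that is a net win
depends on the compiler and CPU.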