Date: Mon, 14 Sep 2015 23:43:31 +0300
From: Solar Designer <>
Subject: Re: md5crypt mmxput*()

On Mon, Sep 14, 2015 at 09:46:56PM +0200, magnum wrote:
> BTW the total size of simd-intrinsics.o (after stripping) actually 
> increased. I'm not sure how to get detailed figures (eg. per function or 
> something?).

With "nm -S", which prints each symbol's size next to its address.

What matters even more is the size of the loops, excluding any
relatively rarely performed initialization.  For example, for md5crypt
the size of initialization before the 1000 MD5s loop doesn't matter as
much as the size of that 1000 MD5s loop - but after inlining both are
counted towards the size of just one function, so a per-function figure
alone doesn't tell them apart.

What also matters is how the various blocks of code are re-ordered by
the compiler.  We could try using gcc's __builtin_expect(), like we
already do in compiler.c's virtual machine.  We could wrap those calls
in likely() and unlikely() macros like the ones the Linux kernel uses,
and put the macros in common.h.  This could reduce the address range
corresponding to the loop's body (with infrequently used conditional
blocks moved outside of that range), which might help on CPUs with low
L1 instruction cache associativity (like Bulldozer).
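
To illustrate, here is a minimal sketch of what such macros might look
like.  The macro bodies follow the well-known Linux kernel pattern; the
fallback for non-gcc compilers and the count_rare_rounds() function are
hypothetical, made up for this example, and are not taken from the John
the Ripper tree.

```c
/*
 * Sketch only: branch hint macros modeled on the Linux kernel's.
 * The !! normalizes any non-zero condition to 1, since
 * __builtin_expect() compares against an exact expected value.
 */
#if defined(__GNUC__) || defined(__clang__)
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)
#else
/* Assumed fallback: no hint, same truth value */
#define likely(x)	(!!(x))
#define unlikely(x)	(!!(x))
#endif

/*
 * Hypothetical stand-in for a hot loop with one infrequently taken
 * branch.  It merely counts how often the branch is taken; in real
 * code the hinted block would hold the rarely needed work, which the
 * compiler may then place outside of the loop's main address range.
 */
static unsigned int count_rare_rounds(unsigned int rounds)
{
	unsigned int i, rare = 0;

	for (i = 0; i < rounds; i++) {
		if (unlikely(i % 7 == 0)) {
			/* Infrequent path, hinted as not expected */
			rare++;
			continue;
		}
		/* Frequent path would go here */
	}

	return rare;
}
```

The hints only affect code layout and static branch prediction, not
results, so whether this helps would have to be measured per compiler
version and per CPU.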

