Message-ID: <a89d8e.8a12.13990880b2b.Webtop.0@cox.net>
Date: Tue, 4 Sep 2012 05:06:38 -0400 (EDT)
From: jfoug@....net
To: john dev <john-dev@...ts.openwall.com>
Subject: Re: memory usage (was: [john-users] JTR against 135
  millions MD5 hashes)

On Tue, Sep 4, 2012 at 1:57 AM, magnum wrote:
> On 4 Sep, 2012, at 5:47 , Solar Designer <solar@...nwall.com> wrote:
>
>> Jim -
>>
>> On Mon, Sep 03, 2012 at 11:12:18PM -0400, jfoug@....net wrote:
>>> On Mon, Sep 3, 2012 at 8:49 PM, Solar Designer wrote:
>>>> I did not realize that there were wasteful allocations in prepare() 
>>>> and valid().  Weren't they temporary?
>>>
>>> They were done using alloc_memory_tiny.  Now, I simply use static 
>>> ....
>>
>> Did you include fixes for this in the fixes branch?  If not yet, can 
>> you
>> do that?  I guess those changes are non-invasive.
>
> We were not using mem_alloc_tiny but alloc/free so the problem was not 
> so much wasted memory but fragmentation. Anyway, it is improved in all 
> branches including -fixes.

If you look back to before March or so, I am pretty sure mem_alloc_tiny 
calls were being used. Those are all gone now; simple local static 
variables are used instead, or in some cases even 'normal' stack-local 
char arrays.  alloca() would be another option.
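
To illustrate the static-buffer approach, here is a minimal sketch. It is 
not code from any actual format: sketch_prepare and SKETCH_HEX_LEN are 
hypothetical names, and the function only assumes a 32-character hex 
ciphertext (as for raw MD5). It reuses a single static buffer rather than 
allocating on every call.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define SKETCH_HEX_LEN 32	/* hypothetical: raw MD5 as 32 hex digits */

/*
 * Hypothetical stand-in for a format's prepare(): lower-cases a hex
 * ciphertext into a static buffer instead of allocating a fresh copy
 * on every call.  Returns NULL if the input is not SKETCH_HEX_LEN hex
 * digits.  The result is only valid until the next call.
 */
static char *sketch_prepare(const char *ciphertext)
{
	static char out[SKETCH_HEX_LEN + 1];
	size_t i;

	assert(ciphertext != NULL);
	if (strlen(ciphertext) != SKETCH_HEX_LEN)
		return NULL;
	for (i = 0; i < SKETCH_HEX_LEN; i++) {
		if (!isxdigit((unsigned char)ciphertext[i]))
			return NULL;
		out[i] = (char)tolower((unsigned char)ciphertext[i]);
	}
	out[SKETCH_HEX_LEN] = '\0';
	return out;
}
```

The trade-off is that the returned pointer is overwritten by the next 
call, so a caller that needs to keep the string must copy it, and the 
function is not reentrant.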

>>>> Another potential source of memory usage reduction are the 
>>>> alignments.
>>>> For raw MD5 hashes, a 4-byte alignment should suffice (they're 4x4
>>>> bytes), yet we were using 8-byte alignment on 64-bit builds.
>>>
>>> Very good point. I had not even thought of things like these new 
>>> alignment requirements.
>
> I do not know of *any* format using larger alignment than ARCH_WORD_32 
> except for salts (where we sometimes pass a pointer).

Do the builds drop to binary align=1 on Intel boxes?  We probably 
should, or at least allow a compile-time define to control this, in case 
it causes a slight runtime slowdown.  Even for salts, I think we should 
drop to align=1 on systems allowing unaligned access, and fix any 
formats which dump core due to actually having alignment issues (such as 
SIMD access).  Then there are the systems that require alignment (like 
sparc), but even there, it might be better to have the default be 1, and 
fix any format with issues.  Possibly it would be better to have more 
than one default value.

// note, just coded in the email, not fully thought through or tested.
#define DEFAULT_ALIGN 1
#ifdef allow_unaligned
#  ifdef force_align_build
#    define DEFAULT_ALIGN32   4
#    define DEFAULT_ALIGN_ARCH_WORD sizeof(ARCH_WORD)
#  else
#    define DEFAULT_ALIGN32   1
#    define DEFAULT_ALIGN_ARCH_WORD 1
#  endif
#  if defined (MMX_COEF) || defined (force_align_build)
#    define DEFAULT_ALIGN64   8
#    define DEFAULT_ALIGN128 16
#  else
#    define DEFAULT_ALIGN64   1
#    define DEFAULT_ALIGN128 1
#  endif
#else
#  define DEFAULT_ALIGN32   4
#  define DEFAULT_ALIGN_ARCH_WORD sizeof(ARCH_WORD)
#  define DEFAULT_ALIGN64   8
#  define DEFAULT_ALIGN128 16
#endif

Then for most formats which access the binary the way most do 
(ARCH_WORD_32 dereferences), simply use DEFAULT_ALIGN32, and let the 
system 'pick'. Or if the format accesses it as ARCH_WORD (I think some 
of the DES formats are this way), then use DEFAULT_ALIGN_ARCH_WORD, and 
again, let the system (or build) determine.  The DEFAULT_ALIGN64/128 may 
only mean something on SIMD builds.
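
To get a feel for what a given alignment value costs in memory, here is 
a rough sketch of the padding arithmetic. It assumes a simple bump 
allocator that packs same-sized items back to back; sketch_align_up and 
sketch_pool_bytes are hypothetical helpers, not functions from the tree.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align (a power of two). */
static size_t sketch_align_up(size_t offset, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (offset + align - 1) & ~(align - 1);
}

/*
 * Bytes consumed by a simple bump allocator (an assumption, not the
 * real allocator) when storing `count` items of `size` bytes each,
 * with every item starting on an `align` boundary.
 */
static size_t sketch_pool_bytes(size_t count, size_t size, size_t align)
{
	size_t offset = 0, i;

	for (i = 0; i < count; i++)
		offset = sketch_align_up(offset, align) + size;
	return offset;
}
```

For example, three 10-byte items take 30 bytes at align=1, 34 at 
align=4 and 42 at align=8.  In this simplified model an item whose size 
is already a multiple of the alignment incurs no padding, so any saving 
depends on what else shares the pool.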

> BTW nearly all formats introduced since we implemented this got the 
> DEFAULT_ALIGN so the problem has got worse instead of better. Maybe we 
> should set DEFAULT_ALIGN to -1 and have the self-test bitch about any 
> use of it (but this should be muted in releases).
