Date: Sun, 6 Sep 2015 12:02:07 -0500
From: JimF <>
Subject: Re: Large stack alignment

I fixed things here.  We still must use mem_align(), but it is now a
#define macro that does its 'magic' without writing to any memory
variables.  It is simply a pointer readjustment.

Here is the macro:

#define mem_align(a,b) 

Now to align, just build a buffer that contains align_wanted extra bytes:

unsigned long _buf[whatever_size + (SIMD_ALIGN)/sizeof(unsigned long)];

then use the macro:
unsigned long *buf = mem_align(_buf, SIMD_ALIGN);

buf will be properly aligned somewhere in the range of _buf to
&((char*)_buf)[0] + SIMD_ALIGN.
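
To make the pointer readjustment concrete, here is a minimal sketch of
the technique.  It is not necessarily the definition used in the tree:
MEM_ALIGN_SKETCH and aligned_offset_demo are hypothetical names, and
SIMD_ALIGN is hard-coded to 32 here only as an assumption for the
example (the real value comes from the project headers).  The macro
assumes the alignment is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for illustration; the real one comes from the project. */
#define SIMD_ALIGN 32

/*
 * Hypothetical sketch: round ptr up to the next multiple of align
 * (align must be a power of two).  Pure pointer arithmetic, no stores.
 */
#define MEM_ALIGN_SKETCH(ptr, align) \
	((void *)(((uintptr_t)(ptr) + ((align) - 1)) & \
	          ~(uintptr_t)((align) - 1)))

/*
 * Hypothetical helper: over-allocate a stack buffer by SIMD_ALIGN bytes,
 * align into it, and return how many bytes were skipped at the front.
 */
size_t aligned_offset_demo(void)
{
	enum { WORDS = 64 };	/* stands in for "whatever_size" */
	unsigned long _buf[WORDS + SIMD_ALIGN / sizeof(unsigned long)];
	unsigned long *buf = MEM_ALIGN_SKETCH(_buf, SIMD_ALIGN);
	size_t offset = (size_t)((char *)buf - (char *)_buf);
	size_t i;

	/* The adjusted pointer is aligned and stays within the padding. */
	assert((uintptr_t)buf % SIMD_ALIGN == 0);
	assert(offset < SIMD_ALIGN);

	/* All WORDS elements are usable without running past _buf. */
	for (i = 0; i < WORDS; i++)
		buf[i] = i;
	assert(buf[WORDS - 1] == WORDS - 1);

	return offset;
}
```

The extra SIMD_ALIGN bytes in _buf are what make the rounding safe:
however the compiler places _buf, the aligned pointer plus WORDS
elements still fits inside the array.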

On 9/6/2015 9:17 AM, Solar Designer wrote:
> On Sun, Sep 06, 2015 at 03:15:11PM +0200, magnum wrote:
>> On 2015-09-06 13:20, magnum wrote:
>>> So Lei reminded me of this:
>>> We have an issue for changing all stack allocs using MEM_ALIGN_SIMD to
>>> align ourselves.
>> This is fixed in 1bd8d9d.
> I think we (including me) still don't have adequate understanding of
> the problem.  This commit fixes the instances where we were using gcc's
> alignment attributes, but it does not touch explicit vector variables on
> the stack such as SIMDmd5body()'s:
> 	vtype w[16*SIMD_PARA_MD5];
> 	vtype a[SIMD_PARA_MD5];
> 	vtype b[SIMD_PARA_MD5];
> 	vtype c[SIMD_PARA_MD5];
> 	vtype d[SIMD_PARA_MD5];
> 	vtype tmp[SIMD_PARA_MD5];
> 	vtype tmp2[SIMD_PARA_MD5];
> 	vtype mask;
> We're hoping these will be in registers, but when not will they be
> properly aligned for AVX2?  I guess this is just as dependent on stack
> pointer alignment and gcc's capabilities as uses of gcc's alignment
> attribute were.  (And if so, the 1bd8d9d commit makes little sense.)
> Also, when gcc spills AVX2 registers to stack, does it ensure proper
> alignment?  Or does it use unaligned-capable instructions?  Or neither?
> We need to figure all of this out.
> A related issue is library callbacks, when our code is called by a
> library that was compiled with a smaller -mpreferred-stack-boundary or
> with non-gcc (and for an ABI permitting a smaller stack alignment than
> we need).  This may be the cause of some of the problems, such as what
> Lei and Jim saw with OpenMP (the threads are started via OpenMP runtime
> library), even when the program initially started with proper stack
> alignment (which is also dependent not only on gcc, but on dynamic
> linker and libc startup code).
> We might want to consider reading up on and using -mstackrealign or/and
> -mincoming-stack-boundary.  And maybe we'd be able to revert the
> explicit alignment commits then, which as I suggested above are likely
> not a complete solution anyway.  I expect that forcing gcc to realign
> the stack would have performance impact, though.  And then there are
> non-gcc compilers.
> It's a mess.
> Alexander
