Message-ID: <3fhhfnntv2x4c9nyvgj2xxwe.1436742719069@email.android.com>
Date: Sun, 12 Jul 2015 19:20:18 -0400
From: Alain Espinosa <alainesp@...ta.cu>
To: john-dev@...ts.openwall.com
Subject: Re: extend SIMD intrinsics
-------- Original message --------
From: Solar Designer <solar@...nwall.com>
Date:07/12/2015 12:52 PM (GMT-05:00)
To: john-dev@...ts.openwall.com
Cc:
Subject: Re: [john-dev] extend SIMD intrinsics
...Right. However, what if you move test_array to global scope, or declare
it "static" in the function? I am wondering if the compiler possibly
doesn't want to rely on the stack being 16-byte aligned (as it normally
is per x86_64 ABI).
I just did some retests. I was generalizing from AVX2 to SSE2, and that was wrong. Take this code:
#include <immintrin.h>

#define SIMD_WORD __m128i
void test()
{
    SIMD_WORD test_array[16];
    ...
    SIMD_WORD test_var = test_array[6];
    ...
}
- This generates movdqa instructions whether we put test_array in global scope or on the stack.
- When we change SIMD_WORD to __m256i, the code generates vmovdqu whether test_array is in global scope or on the stack. If we change the assignment to use _mm256_load_si256, it generates a vmovdqa instruction.
- When we change SIMD_WORD to __m256, the code generates vmovups whether test_array is in global scope or on the stack.
So the unaligned accesses happen with the AVX/AVX2 types only.
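For reference, here is a fuller, self-contained sketch of the __m256i case (my assumptions: <immintrin.h>, a compiler invoked with -mavx2, and static storage; only the names from the snippet above are reused):

#include <immintrin.h>

static __m256i test_array[16];  /* global/static storage, 32-byte aligned by the compiler */

void test(void)
{
    /* Plain assignment: the compiler may emit vmovdqu (unaligned load). */
    __m256i test_var = test_array[6];

    /* Explicit aligned load: emits vmovdqa and requires a 32-byte aligned address. */
    __m256i test_var2 = _mm256_load_si256(&test_array[6]);

    (void)test_var;
    (void)test_var2;
}

The difference matters because vmovdqa faults on a misaligned address, while vmovdqu (and _mm256_loadu_si256) does not.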
Regards,
Alain