Message-Id: <08B798D4-6A4A-4A65-9EC1-AA0BAE0961DB@gmail.com>
Date: Wed, 10 Jun 2015 23:59:05 +0800
From: Lei Zhang <zhanglei.april@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Interleaving of intrinsics

> On Jun 9, 2015, at 8:46 PM, Lei Zhang <zhanglei.april@...il.com> wrote:
>
> I tried to see the 'size' of sse-intrinsics.o under different interleaving factors and compiled by clang and icc respectively.
>
> lei-mac:src lei$ size clang/*
> __TEXT  __DATA  __OBJC  others  dec     hex
> 122863  0       0       26572   149435  247bb   clang/x1.o
> 127951  0       0       28699   156650  263ea   clang/x2.o
> 128479  0       0       28614   157093  265a5   clang/x3.o
> 127679  0       0       28527   156206  2622e   clang/x4.o
>
> lei-mac:src lei$ size icc/*
> __TEXT  __DATA  __OBJC  others  dec     hex
> 102084  7545    0       50442   160071  27147   icc/x1.o
> 113012  9799    0       49375   172186  2a09a   icc/x2.o
> 113348  9799    0       51275   174422  2a956   icc/x3.o
> 114740  9799    0       53235   177774  2b66e   icc/x4.o

I looked further into the asm code that icc generates for x1 and x2 (SIMD_PARA_SHA256) on my laptop (AVX).

In SSESHA256body, there are about 200 vmovdqu instructions under x1, and 260 under x2. Most of the vmovdqu instructions seem to load or store xmm registers; only a few are register-to-register moves. I think the additional vmovdqu instructions under x2 are likely due to register spilling.

Lei
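
For anyone wanting to repeat this kind of count, here is a minimal sketch of how vmovdqu instructions in an assembly listing could be tallied. It is not the procedure used for the numbers above. The function name count_vmovdqu and the sample listing are made up for illustration, and the classification assumes that any operand containing "(" (AT&T syntax) or "[" (Intel syntax) is a memory access, with "#" starting a comment.

```python
def count_vmovdqu(asm_text):
    """Tally vmovdqu instructions as memory accesses or register moves.

    Assumption: an operand list containing '(' or '[' touches memory;
    anything else is treated as a register-to-register move.
    """
    counts = {"memory": 0, "register": 0}
    for line in asm_text.splitlines():
        code = line.split("#", 1)[0].strip()   # drop trailing comment
        fields = code.split(None, 1)           # mnemonic, operand text
        if len(fields) != 2 or fields[0] != "vmovdqu":
            continue
        operands = fields[1]
        kind = "memory" if ("(" in operands or "[" in operands) else "register"
        counts[kind] += 1
    return counts


# Hypothetical four-instruction listing, not taken from the real object file.
sample = """
        vmovdqu   (%rdi), %xmm0            # load
        vmovdqu   %xmm0, 16(%rsp)          # store to the stack
        vmovdqu   %xmm1, %xmm2             # register-to-register
        vpaddd    %xmm0, %xmm1, %xmm1
"""

print(count_vmovdqu(sample))
```

A higher "memory" count at x2 than at x1 would be consistent with spilling, but it does not prove it: stores to the stack would still need to be told apart from ordinary loads and stores of input and output data.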