Message-ID: <CAK1hOcOaN4SnpO2jMGib3tFEf+c8=Tu8Nwi2YnOhzefpSSqTng@mail.gmail.com>
Date: Tue, 17 Feb 2015 17:51:11 +0100
From: Denys Vlasenko <vda.linux@...glemail.com>
To: Rich Felker <dalias@...c.org>
Cc: musl <musl@...ts.openwall.com>
Subject: Re: [PATCH] x86_64/memset: use "small block" code for blocks
 up to 30 bytes long

On Tue, Feb 17, 2015 at 5:12 PM, Rich Felker <dalias@...c.org> wrote:
> On Tue, Feb 17, 2015 at 02:08:52PM +0100, Denys Vlasenko wrote:
>> >> Please see attached file.
>> >
>> > I tried it and it's ~1 cycle slower for at least sizes 16-30;
>> > presumably we're seeing the cost of the extra compare/branch at these
>> > sizes but not at others. What does your timing test show?
>>
>> See below.
>> First column - result of my2.s
>> Second column - result of vda1.s
>>
>> Basically, the "rep stosq" code path got a bit faster, while
>> small memsets stayed the same.
>
> Can you post your test program for me to try out? Here's what I've
> been using, attached.

With your program I see similar results:

...
size 50: min=10, avg=10           min=10, avg=10
size 52: min=10, avg=10           min=10, avg=10
size 54: min=10, avg=11           min=10, avg=11
size 56: min=10, avg=11           min=10, avg=11
size 58: min=10, avg=11           min=10, avg=10
size 60: min=10, avg=10           min=10, avg=12
size 62: min=10, avg=10           min=10, avg=11
size 64: min=18, avg=18           min=18, avg=22
size 96: min=17, avg=17           min=18, avg=18
size 128: min=31, avg=32          min=32, avg=32
size 160: min=35, avg=37          min=33, avg=37
size 192: min=40, avg=40          min=36, avg=37
size 224: min=43, avg=43          min=40, avg=40
size 256: min=44, avg=47          min=43, avg=43
size 288: min=47, avg=48          min=46, avg=47
size 320: min=50, avg=52          min=52, avg=52
size 352: min=53, avg=54          min=52, avg=60
size 384: min=56, avg=57          min=55, avg=57
size 416: min=59, avg=60          min=62, avg=63
size 448: min=63, avg=65          min=66, avg=66
size 480: min=66, avg=71          min=69, avg=69
size 512: min=73, avg=74          min=73, avg=76
size 1024: min=127, avg=129       min=127, avg=129
size 2048: min=221, avg=236       min=221, avg=236
size 4096: min=425, avg=444       min=424, avg=450
size 8192: min=831, avg=881       min=830, avg=883
size 16384: min=1644, avg=1717    min=1643, avg=1748

My test program is attached. I build it with:

gcc -O2 -Wall memset-cycles.c FOO.s
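
For readers without the attachment, here is a minimal sketch of how
min/avg figures like the ones above could be gathered. It is not the
attached program. The names ticks(), measure(), print_row() and
struct timing are made up for illustration, and the tick source is an
assumption: the TSC on x86, falling back to nanoseconds from
clock_gettime() elsewhere, so the numbers are only comparable with
each other on the same machine.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define BUF_SIZE 16384

/* Hypothetical tick source: TSC on x86, nanoseconds elsewhere. */
uint64_t ticks(void)
{
#if defined(__x86_64__) || defined(__i386__)
    return __builtin_ia32_rdtsc();
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}

typedef void *(*memset_fn)(void *, int, size_t);

struct timing { uint64_t min, avg; };

/* One spare byte past BUF_SIZE so overruns are easy to spot. */
unsigned char buf[BUF_SIZE + 1];

/* Time `runs` calls of fn on `size` bytes; return minimum and mean ticks. */
struct timing measure(memset_fn fn, size_t size, unsigned runs)
{
    struct timing t = { UINT64_MAX, 0 };
    uint64_t total = 0;
    /* Call through a volatile pointer so the compiler cannot
     * replace or drop the call being timed. */
    memset_fn volatile f = fn;

    assert(runs > 0 && size <= BUF_SIZE);

    for (unsigned i = 0; i < runs; i++) {
        uint64_t start = ticks();
        f(buf, 0x55, size);
        uint64_t elapsed = ticks() - start;

        if (elapsed < t.min)
            t.min = elapsed;
        total += elapsed;
    }
    t.avg = total / runs;
    return t;
}

/* Print one row with two result columns, one per implementation. */
void print_row(size_t size, struct timing a, struct timing b)
{
    printf("size %zu: min=%llu, avg=%llu    min=%llu, avg=%llu\n", size,
           (unsigned long long)a.min, (unsigned long long)a.avg,
           (unsigned long long)b.min, (unsigned long long)b.avg);
}
```

A main() would call measure() once per size for each implementation
under test and hand both results to print_row(). The timed interval
includes the overhead of reading the counter itself, so small sizes
are dominated by that constant cost unless it is measured separately
and subtracted.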

View attachment "t.c" of type "text/x-csrc" (3388 bytes)
