Message-ID: <CF1B278A1B1D40F68C8C33CC16CC5E8D@H270>
Date: Sat, 7 Aug 2021 15:12:14 +0200
From: "Stefan Kanthak" <stefan.kanthak@...go.de>
To: "Rich Felker" <dalias@...c.org>
Cc: "Alexander Monakov" <amonakov@...ras.ru>,
	"Szabolcs Nagy" <nsz@...t70.net>,
	<musl@...ts.openwall.com>
Subject: Re: [Patch] src/math/i386/remquo.s: remove conditional branch, shorter bit twiddling

Rich Felker <dalias@...c.org> wrote:


> On Fri, Aug 06, 2021 at 07:23:19PM +0200, Stefan Kanthak wrote:
>> Rich Felker <dalias@...c.org> wrote:

>> > The path forward for all the math asm is moving it to inline asm in C
>> > files, with no flow control or bit/register shuffling in the asm, only
>> > using asm for the single instructions.

I'd even go a step further and put all those short functions into header
files to give the compiler a chance to properly optimize them in context.
And then get rid of all the __builtin_* bloat GCC unfortunately carries
with it...

>> > I haven't read the mul trick here in detail but I believe it should be
>> > duplicable with plain C * operator.
>> 
>> It is.
> 
> Great! Does it improve the code generation?

Of course: the same 3 instructions Alexander showed here:

remquol:
        fldt    16(%esp)
        fldt    4(%esp)
.L2:
#APP
# 19 "remquol.c" 1
        fprem1; fnstsw %ax
# 0 "" 2
#NO_APP
        testb   $4, %ah
        jne     .L2
        fstp    %st(1)
        andl    $17152, %eax         #
        movzbl  15(%esp), %ecx       # this should be movb 15(%esp), %cl
        imull   $9502720, %eax, %eax #
        shrl    $29, %eax            #
        movl    %eax, %edx
        negl    %edx
        xorb    27(%esp), %cl
        cmovs   %edx, %eax
        movl    28(%esp), %edx
        movl    %eax, (%edx)
        ret
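
For readers following along, the and/imul/shr sequence in the listing above
can be sketched in plain C as below. This is an illustration, not the code
from the patch: the names qbits_mul and qbits_ref are hypothetical, and the
sketch assumes the status word left by fprem1 has been copied into a 32-bit
unsigned value. After fprem1 the low three quotient bits sit in C0 (bit 8,
Q2), C3 (bit 14, Q1) and C1 (bit 9, Q0); the multiplication moves them to
bits 31, 30 and 29 without any carries, so one shift finishes the job.

```c
#include <stdint.h>

/* Hypothetical helper: gather Q2:Q1:Q0 from an x87 status word using
   the mask/multiply/shift constants seen in the generated code. */
static unsigned qbits_mul(uint32_t sw)
{
    sw &= UINT32_C(0x4300);                     /* 17152: keep C3, C1, C0 */
    sw = (uint32_t)(sw * UINT32_C(0x910000));   /* 9502720 = 1<<23 | 1<<20 | 1<<16 */
    return sw >> 29;                            /* top three bits are Q2, Q1, Q0 */
}

/* Hypothetical reference version: test each condition bit separately. */
static unsigned qbits_ref(uint32_t sw)
{
    unsigned q = 0;
    if (sw & (UINT32_C(1) << 8))  q |= 4;       /* C0 -> Q2 */
    if (sw & (UINT32_C(1) << 14)) q |= 2;       /* C3 -> Q1 */
    if (sw & (UINT32_C(1) << 9))  q |= 1;       /* C1 -> Q0 */
    return q;
}
```

Both functions should agree for every 16-bit status word; the multiply
version merely trades three tests for one imull.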

The code to flip the sign is, however, not as short as in the assembly
version I posted; the following change may give shorter code:

    int sign = (*cx^*cy) >> 7;
    qbits += sign;
    qbits ^= sign;
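
A minimal standalone sketch of that add-then-xor idiom, for illustration
only: the names sign_mask and flip_sign are hypothetical, and the sketch
assumes the two sign bytes are read as signed char and that >> on a negative
int is an arithmetic shift (implementation-defined in C, but what GCC does).
Under those assumptions sign is 0 or -1, and (q + sign) ^ sign yields q or -q.

```c
/* Hypothetical helper: 0 if both sign bytes have the same sign bit,
   -1 if they differ. Relies on arithmetic right shift of negative ints. */
static int sign_mask(signed char hx, signed char hy)
{
    return (hx ^ hy) >> 7;
}

/* Hypothetical helper: branchless conditional negation.
   sign ==  0: q + 0 = q,      q ^ 0  = q
   sign == -1: q - 1,          ~(q - 1) = -q (two's complement) */
static int flip_sign(int q, int sign)
{
    q += sign;
    q ^= sign;
    return q;
}
```

If the bytes were read as unsigned char instead, sign would be 0 or 1 and
the idiom would no longer negate, so the signedness matters here.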

>> > I really do not want to review/merge asm changes that keep this kind
>> > of complex logic in asm when there's no strong motivation for it (like
>> > fixing an actual bug, vs just reducing size or improving speed). The
>> > risk to reward ratio is just not reasonable.
>> 
>> Final patch attached!
> 
> Thanks. I don't mind making any final tweaks needed, if necessary.

Do so as you will and wish: I don't mind^W^Wappreciate if someone can
improve code I've written.

Stefan
