Message-ID: <Zh8sNfTmTwtR4Sn1@voyager>
Date: Wed, 17 Apr 2024 03:56:05 +0200
From: Markus Wichmann <nullplan@....net>
To: musl@...ts.openwall.com
Cc: Viktor Reznov <yann.collet.is.not.a.perfectionist@...il.com>
Subject: Re: [PATCH] Decreasing the number of divisions

On Wed, Apr 17, 2024 at 01:25:18AM +0000, NRK wrote:
> > I played around with this change on godbolt: https://godbolt.org/z/9PoGK9zae
>
> You're looking at clang -O3. If you use gcc -Os (usual for musl
> users/distros), you'll notice that gcc actually ends up emitting a div
> instruction, which is known to be slow.
>
> But I don't think trying to optimize around gcc's bad codegen is the
> right move. It's better to just not use -Os with gcc. Which musl already
> does since commit b90841e25832.
>
> - NRK

Well, yeah, if you use -Os and expect fast code, you are doing it wrong.
-Os explicitly asks for small code rather than fast code. It is
appropriate for code that has to fit in a ROM or a boot sector or
something similar, but otherwise I can't really see the point.
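
To illustrate the size-versus-speed trade-off, here is a minimal sketch,
assuming the division in question is by a compile-time constant (this
message does not say what the patch actually divides by). The function
names are hypothetical. The first function is the plain form, for which
a compiler optimizing for size may emit a div instruction; the second
is the multiply-and-shift form that compilers optimizing for speed
commonly emit instead.

```c
#include <stdint.h>

/* Plain division by a constant. A compiler optimizing for size may
 * keep this as a div instruction because that encoding is shorter. */
uint32_t div10_plain(uint32_t x)
{
	return x / 10;
}

/* Multiply-and-shift equivalent for 32-bit unsigned x.
 * 0xCCCCCCCD is ceil(2^35 / 10), so the 64-bit product shifted
 * right by 35 gives the same quotient as x / 10. */
uint32_t div10_mulshift(uint32_t x)
{
	return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}
```

Both functions return the same value for every 32-bit input; only the
instruction sequence a compiler is likely to choose differs.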

I remember once reading the insane claim that -Os code ends up being
faster than -O3 code because it fits in cache, but, much like the OP in
this thread, the author provided no benchmark to actually show this. I
call it insane because -O3 explicitly tells GCC to optimize for speed,
and in my experience the resulting code generally is faster.
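
A claim like that could be checked with a small timing harness. The
following is only a sketch under stated assumptions: workload() is a
hypothetical stand-in for the code under test, and clock() measures CPU
time coarsely, so n would have to be large enough to give a measurable
run. The same file would be built once with -Os and once with -O3, and
the reported times compared.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical workload: sum of i / 10 for all i below n. */
uint64_t workload(uint32_t n)
{
	uint64_t sum = 0;
	for (uint32_t i = 0; i < n; i++)
		sum += i / 10;
	return sum;
}

/* Returns the CPU seconds spent in one call to workload(n) and stores
 * its result in *out. The volatile store keeps the compiler from
 * discarding the computation as unused. */
double time_workload(uint32_t n, uint64_t *out)
{
	clock_t start = clock();
	volatile uint64_t result = workload(n);
	clock_t end = clock();

	*out = result;
	return (double)(end - start) / CLOCKS_PER_SEC;
}
```

An optimizing compiler may vectorize or otherwise transform such a
simple loop, so a real comparison should time the actual code in
question rather than a toy loop like this one.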

Ciao,
Markus
