Date: Mon, 18 Mar 2024 17:15:09 +0100
From: Markus Wichmann <nullplan@....net>
To: musl@...ts.openwall.com
Subject: Re: x86 fma with run-time switch?

On Sat, Mar 16, 2024 at 04:37:29AM +0100, Markus Wichmann wrote:
> I am also unsure whether you need much more than __hwcap (and possibly
> __hwcap2) for the string functions. I don't know what optimizations you
> have in mind for those, but if you need AVX support, for example, then
> CPUID information does not help you. You need confirmation from the
> kernel that it supports AVX, or else you will catch a SIGILL in the
> attempt to use it. Very few ISA extensions can make do without kernel
> support, after all.
>

*sigh* I had another look at how this all is supposed to work now. So
__hwcap on x86_64 is just the contents of EDX from CPUID function 1.
That's not very helpful at all. There is __hwcap2, which at least
contains a bit that tells us whether wrfsbase is OK to use (which we
might want to integrate into x86_64's version of __set_thread_area()),
but that is it. For AVX and AVX512, the way you are supposed to do it
now is:

1. Test the OSXSAVE bit from CPUID (function 1, ECX bit 27).
2. If set, use XGETBV to read XCR0 and test the YMM/ZMM state bits.

And that only tells you whether you can access the registers. But every
other instruction is actually part of another ISA extension, so now you
need to also check the corresponding CPUID bit for the ISA extension.
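To make the sequence concrete, here is a minimal sketch of that check for FMA, not anything musl actually ships. All names (can_use_fma, os_enables_ymm, read_xcr0, cpu_has_usable_fma, and the macros) are made up for illustration. The bit positions are the ones I know from the Intel SDM: CPUID function 1 ECX bit 12 (FMA) and bit 27 (OSXSAVE), XCR0 bit 1 (SSE state) and bit 2 (YMM state). The decision logic is kept as a pure function of the register values so it can be checked without running CPUID.

```c
#include <stdint.h>

/* CPUID function 1, ECX */
#define CPUID1_ECX_FMA     (1u << 12)
#define CPUID1_ECX_OSXSAVE (1u << 27)

/* XCR0 state-component bits */
#define XCR0_SSE (1u << 1)
#define XCR0_YMM (1u << 2)

/* Steps 1 and 2: OS has enabled XSAVE and saves SSE + YMM state. */
static int os_enables_ymm(uint32_t ecx1, uint64_t xcr0)
{
	if (!(ecx1 & CPUID1_ECX_OSXSAVE))
		return 0; /* XGETBV would fault, XCR0 is meaningless */
	return (xcr0 & (XCR0_SSE | XCR0_YMM)) == (XCR0_SSE | XCR0_YMM);
}

/* Registers usable AND the ISA extension itself is advertised. */
static int can_use_fma(uint32_t ecx1, uint64_t xcr0)
{
	return os_enables_ymm(ecx1, xcr0) && (ecx1 & CPUID1_ECX_FMA);
}

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h> /* GCC/Clang helper header, an assumption here */

/* Only call this after OSXSAVE has been confirmed. */
static uint64_t read_xcr0(void)
{
	uint32_t lo, hi;
	__asm__ volatile ("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
	return (uint64_t)hi << 32 | lo;
}

static int cpu_has_usable_fma(void)
{
	unsigned a, b, c, d;
	if (!__get_cpuid(1, &a, &b, &c, &d))
		return 0;
	if (!(c & CPUID1_ECX_OSXSAVE))
		return 0;
	return can_use_fma(c, read_xcr0());
}
#endif
```

For AVX512 the same shape would apply, with more XCR0 bits to test and the feature bits coming from CPUID function 7 instead.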

Incidentally, even when __hwcap is useful, it doesn't help too much
if the interface is dumb. Because I was also looking up when fsel became
non-optional in PowerPC, and it turns out it was at the same time as
fsqrt: Namely in Power ISA 2.03. (PowerPC 2.02 still lists it as
optional). Right, so we only need to check the v2.03 bit in __hwcap,
right? Nope. Because for one, there is no such bit defined, and for two,
its closest proxy, the POWER5+ bit, is not set on the implementations
that came after it. It took until 2.06 for them to keep the bit around for
good. So you have to test for three different bits. Thankfully not
spilling into __hwcap2.
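A rough sketch of what that three-bit test could look like. Which three bits are meant is my assumption here: POWER5+, ARCH_2_05 and ARCH_2_06, with the values as I know them from the Linux powerpc cputable.h header. The function name ppc_has_fsel_fsqrt is made up for illustration.

```c
/* AT_HWCAP bits, values as in Linux's powerpc asm/cputable.h
 * (assumed to be the three bits in question) */
#define PPC_FEATURE_POWER5_PLUS 0x00020000UL
#define PPC_FEATURE_ARCH_2_05   0x00001000UL
#define PPC_FEATURE_ARCH_2_06   0x00000100UL

/* Nonzero if hwcap indicates an ISA 2.03 or later implementation,
 * i.e. one where fsel and fsqrt are no longer optional. */
static int ppc_has_fsel_fsqrt(unsigned long hwcap)
{
	const unsigned long mask = PPC_FEATURE_POWER5_PLUS
	                         | PPC_FEATURE_ARCH_2_05
	                         | PPC_FEATURE_ARCH_2_06;
	return (hwcap & mask) != 0;
}
```

On a real system the argument would be __hwcap inside libc, or getauxval(AT_HWCAP) from an application.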

Ciao,
Markus
