Message-Id: <20230211093936.46b9a2f044052552be38cdb2@zhasha.com>
Date: Sat, 11 Feb 2023 09:39:36 +0100
From: Joakim Sindholt <opensource@...sha.com>
To: musl@...ts.openwall.com
Subject: Re: qsort

On Sat, 11 Feb 2023 06:44:29 +0100, "alice" <alice@...ya.dev> wrote:
> based on the glibc profiling, glibc also has their natively-loaded-cpu-specific
> optimisations, the _avx_ functions in your case. musl doesn't implement any
> SIMD optimisations, so this is a bit apples-to-oranges unless musl implements
> the same kind of native per-arch optimisation.
> 
> you should rerun these with GLIBC_TUNABLES, from something in:
> https://www.gnu.org/software/libc/manual/html_node/Hardware-Capability-Tunables.html
> which should let you disable them all (if you just want to compare C to C code).
> 
> ( unrelated, but has there been some historic discussion of implementing
>   something similar in musl? i feel like i might be forgetting something. )

There already are arch-specific asm implementations of functions like
memcpy. As I see it there are 3 issues standing between musl and the
glibc approach of writing a new function every time Intel or AMD
releases a new core design:
1) ifunc resolvers don't work on statically linked binaries.
2) If they did, it would mean shipping 12 different implementations of
   each optimized function, making the binary huge, mostly for no good
   reason.
3) The esoteric bug is no longer in memcpy but in one of memcpy_c,
   memcpy_mmx, memcpy_3dnow, memcpy_sse2, memcpy_sse3, memcpy_ssse3,
   memcpy_sse41, memcpy_sse42, memcpy_avx, memcpy_avx2, memcpy_avx512,
   memcpy_amx, or whatever else is added in the future, in a
   never-ending spiral of implementations piling up.
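
To make the pattern concrete, here is a minimal sketch of per-CPU
dispatch done by hand with a function pointer instead of an ifunc
resolver. The names my_copy, pick_copy, copy_generic and copy_avx2 are
hypothetical and are not taken from musl or glibc, and the "AVX2"
variant is only a stand-in that defers to memcpy. It assumes GCC or
Clang on x86 for the __builtin_cpu_supports check and falls back to the
generic routine everywhere else.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical variants: in a real libc each one would be a separate
 * hand-tuned routine. In this sketch both just copy bytes. */
static void *copy_generic(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

static void *copy_avx2(void *dst, const void *src, size_t n)
{
    /* Stand-in for an AVX2 routine (assumption: no real SIMD here). */
    return memcpy(dst, src, n);
}

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Plays the role an ifunc resolver would play: pick one variant
 * based on what the running CPU reports. */
static copy_fn pick_copy(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return copy_avx2;
#endif
    return copy_generic;
}

/* Public entry point: resolve on first use, then reuse the choice.
 * The lazy initialisation is not synchronised between threads; a real
 * implementation would have to take care of that. */
void *my_copy(void *dst, const void *src, size_t n)
{
    static copy_fn impl;
    if (!impl)
        impl = pick_copy();
    return impl(dst, src, n);
}
```

Every variant reachable from pick_copy ends up in the binary whether or
not the running CPU can use it, which is the size cost in point 2, and
each one is a separate place for the bug in point 3 to live.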

It is my opinion that musl should remain small and concise to allow it
to effectively serve both the "small" and "gotta go fast" markets. I say
both because you can always haul in libreallyreallyfastsort.a/so but you
can't take the 47 qsort/memcpy implementations out of libc.
