Message-ID: <20220905163830.GP1320090@port70.net>
Date: Mon, 5 Sep 2022 18:38:30 +0200
From: Szabolcs Nagy <nsz@...t70.net>
To: Paul Zimmermann <Paul.Zimmermann@...ia.fr>
Cc: dalias@...c.org, musl@...ts.openwall.com
Subject: Re: Re: integration of CORE-MATH routines into Musl?

* Paul Zimmermann <Paul.Zimmermann@...ia.fr> [2022-09-05 16:39:02 +0200]:
> Dear Szabolcs,
>
> > when i worked on exp and log i noticed that for single prec it is
> > easy to do correct rounding with only minor overhead, but it required
> > either a bit bigger lookup table or a bit bigger polynomial vs going
> > for < 1 ulp error only.
>
> please have a look at https://gitlab.inria.fr/core-math/core-math/-/blob/master/src/binary32/exp/expf.c: no big lookup table, degree 5 only.

"a bit bigger".

in this case the polynomial is bigger: order 5 instead of 3.
(order 3 is enough for < 1 ulp error).

the code size is also bigger:

core-math: size -G (x86_64 -O3):
      text       data        bss      total filename
       464        352          0        816 exp2/exp2f.o
       398        348          0        746 exp/expf.o

musl: size -G:
(data is shared between expf, exp2f and powf)
      text       data        bss      total filename
         0        328          0        328 exp2f_data.o
       202         12          0        214 exp2f.o
       211         16          0        227 expf.o

i'd expect at least a bit of overhead between <1 ulp and
cr functions (but not significant overhead in case of binary32).
so when core-math is faster, it should be possible to write an
even faster version that only aims to be <1 ulp (but the perf
diff will not be huge).

in case of binary64: i'd expect one can turn a close to 0.5 ulp
implementation into a cr one with small overhead by testing for
near halfway cases in the end and having a slow path for those.
but the slow path will be much slower and bigger (and harder to test).