similar to exp.c cleanup: use scalbnf, don't return excess precision,
drop some optimizations.
exp.c was changed to be more consistent with expf.c code.
overflow and underflow handling was incorrect when the result was not stored.
an optimization for the 0.5*ln2 < |x| < 1.5*ln2 domain was removed.
did various cleanups around static constants and made the comments
consistent with the code.
old code was correct only if the result was stored (without the
excess precision) or musl was compiled with -ffloat-store.
now we use STRICT_ASSIGN to work around the issue.
(see note 160 in c11 section 6.8.6.4)
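a minimal sketch of the store-based workaround, for illustration only.
this is not musl's STRICT_ASSIGN macro: store_float and mul_stored are
hypothetical names, and the volatile store is just one common way to
force the value to be rounded to its declared type.

```c
#include <float.h>

/* hypothetical helper (not musl's STRICT_ASSIGN): a store through a
   volatile object forces the value to be rounded to float, dropping
   any extra range or precision the expression was evaluated with */
static float store_float(float x)
{
	volatile float stored = x;
	return stored;
}

/* hypothetical example: with FLT_EVAL_METHOD != 0 the product may be
   held in a wider format, so an overflow only becomes visible once
   the value is actually rounded to float */
float mul_stored(float x, float y)
{
	return store_float(x * y);
}
```

with the store in place, mul_stored(FLT_MAX, 2.0f) overflows to
infinity regardless of how the multiplication was evaluated.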
old code was correct only if the result was stored (without the
excess precision) or musl was compiled with -ffloat-store.
(see note 160 in n1570.pdf section 6.8.6.4)
this function never existed historically; since the float/double
functions it's based on are nonstandard and deprecated, there's really
no justification for its existence except that glibc has it. it can be
added back if there's ever really a need...
The long double adjustment was wrong:
The usual check is
  (mant_bits & 0x7ff) == 0x400
before doing a mant_bits++ or mant_bits-- adjustment, since
this is the only case where rounding an inexact ld80 into
double can go wrong (and only in nearest rounding mode).
After such a check the ++ and -- are ok (the mantissa will end
in 0x401 or 0x3ff).
fma is a bit different (we need to add 3 numbers with correct
rounding: hi_xy + lo_xy + z so we should survive two roundings
at different places without precision loss)
The adjustment in fma only checks for zero low bits
  (mant_bits & 0x3ff) == 0
so that the adjusted value is correct when rounded to
double or *less* precision.
(this is an important piece in the fma puzzle)
Unfortunately in this case the -- is not a correct adjustment,
because mant_bits might underflow, so further checks are needed;
this was the source of the bug.
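The two checks and the problematic decrement can be sketched on plain
integers. This is an illustration under stated assumptions, not the
actual fix: struct ld80_parts and all three function names are
hypothetical, and the renormalization in decrement() is only one
plausible form the "further checks" could take when the borrow clears
the explicit top bit of the 64-bit mantissa.

```c
#include <stdint.h>

/* hypothetical stand-in for the sign-less part of an ld80 value:
   64-bit mantissa with an explicit top bit, plus a biased exponent */
struct ld80_parts {
	uint64_t m;
	int e;
};

/* the usual check: the low 11 bits (those dropped when a 64-bit
   mantissa is rounded to 53 bits) are exactly halfway */
int is_halfway(uint64_t m)
{
	return (m & 0x7ff) == 0x400;
}

/* the fma-style check: the low 10 bits are all zero */
int low_bits_zero(uint64_t m)
{
	return (m & 0x3ff) == 0;
}

/* decrement the mantissa; if the borrow clears the explicit top bit,
   renormalize by doubling the mantissa and lowering the exponent */
struct ld80_parts decrement(struct ld80_parts v)
{
	v.m--;
	if (v.m == UINT64_MAX / 2) {	/* m was 0x8000000000000000 */
		v.m *= 2;
		v.e--;
	}
	return v;
}
```

Note that 0x8000000000000000 passes low_bits_zero(), so a bare
decrement would leave the mantissa without its top bit set.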
apparently initializing a variable is not "using" it but assigning to
it is "using" it. i don't really like this fix, but it's better than
trying to make a bigger cleanup just before a release, and it should
work fine (tested against nsz's math tests).
old: 2*atan2(sqrt(1-x),sqrt(1+x))
new: atan2(fabs(sqrt((1-x)*(1+x))),x)
improvements:
* all edge cases are fixed (sign of zero in downward rounding)
* a bit faster (here a single call is about 131ns vs 162ns)
* a bit more precise (at most 1ulp error on 1M uniform random
samples in [0,1); the old formula gave some 2ulp errors as well)
this is a nonstandard function so it's not clear what conditions it
should satisfy. my intent is that it be fast and exact for positive
integral exponents when the result fits in the destination type, and
fast and correctly rounded for small negative integral exponents.
otherwise we aim for at most 1ulp error; it seems to differ from pow
by at most 1ulp and it's often 2-5 times faster than pow.
special care is taken to avoid any inexact computations when either arg
is zero (in which case the exact absolute value of the other arg
should be returned) and to support the special condition that
hypot(±inf,nan) yields inf.
hypotl is not yet implemented since avoiding overflow is nontrivial.