1. 15 Dec, 2012 1 commit
    • S
      math: fix i386/expl.s with more precise x*log2e · a8f73bb1
      Szabolcs Nagy committed
      with naive exp2l(x*log2e) the last 12 bits of the result were incorrect
      for x with large absolute value
      
      now hi + lo = x*log2e is calculated to 128 bits of precision and then
        expl(x) = exp2l(hi) + exp2l(hi) * f2xm1(lo)
      this gives <1.5ulp measured error everywhere in nearest rounding mode
      a8f73bb1
  2. 12 Dec, 2012 1 commit
  3. 09 Aug, 2012 1 commit
  4. 08 May, 2012 1 commit
  5. 05 May, 2012 1 commit
    • N
      math: change the formula used for acos.s · f697d66b
      nsz committed
      old: 2*atan2(sqrt(1-x),sqrt(1+x))
      new: atan2(fabs(sqrt((1-x)*(1+x))),x)
      improvements:
      * all edge cases are fixed (sign of zero in downward rounding)
      * a bit faster (here a single call is about 131ns vs 162ns)
      * a bit more precise (at most 1ulp error on 1M uniform random
      samples in [0,1), the old formula gave some 2ulp errors as well)
      f697d66b
  6. 04 Apr, 2012 1 commit
    • N
      math: fix x86 asin accuracy · 37eaec3a
      nsz committed
      use (1-x)*(1+x) instead of (1-x*x) in asin.s
      the latter can be inaccurate with upward rounding when x is close to 1
      37eaec3a
  7. 29 Mar, 2012 1 commit
    • N
      math: remove x86 modf asm · d79ac8c3
      nsz committed
      the int part was wrong when -1 < x <= -0 (+0.0 instead of -0.0)
      and the size and performance gain of the asm version was negligible
      d79ac8c3
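The sign requirement is easy to check from C. The helper below is a hypothetical test function, not part of musl; it reports whether modf stores a negative zero as the integral part.

```c
#include <math.h>

/* Returns nonzero if the integral part stored by modf(x) is -0.0.
 * The C standard requires both parts to carry the sign of x, so this
 * should hold for every x with -1 < x <= -0. */
int modf_int_part_is_negative_zero(double x)
{
    double ip;
    modf(x, &ip);
    return ip == 0.0 && signbit(ip) != 0;
}
```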
  8. 28 Mar, 2012 1 commit
  9. 23 Mar, 2012 1 commit
    • R
      asm for hypot and hypotf · ad2d2b96
      Rich Felker committed
      special care is taken to avoid any inexact computations when either arg
      is zero (in which case the exact absolute value of the other arg
      should be returned) and to support the special condition that
      hypot(±inf,nan) yields inf.
      
      hypotl is not yet implemented since avoiding overflow is nontrivial.
      ad2d2b96
  10. 22 Mar, 2012 1 commit
  11. 20 Mar, 2012 5 commits
    • R
      optimize scalbn family · baa43bca
      Rich Felker committed
      the fscale instruction is slow everywhere, probably because it
      involves a costly and unnecessary integer truncation operation that
      ends up being a no-op in common usages. instead, construct a floating
      point scale value with integer arithmetic and simply multiply by it,
      when possible.
      
      for float and double, this is always possible by going to the
      next-larger type. we use some cheap but effective saturating
      arithmetic tricks to make sure even very large-magnitude exponents
      fit. for long double, if the scaling exponent is too large to fit in
      the exponent of a long double value, we simply fall back to the
      expensive fscale method.
      
      on atom cpu, these changes speed up scalbn by over 30%. (min rdtsc
      timing dropped from 110 cycles to 70 cycles.)
      baa43bca
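The idea of building the scale factor with integer arithmetic can be illustrated for the float case. The sketch below is not the committed assembly: `scalbnf_sketch` is a hypothetical name, plain comparisons replace the saturating tricks the message refers to, and IEEE 754 binary64 layout is assumed for double.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: scale a float by 2^n via one multiply in double. */
float scalbnf_sketch(float x, int n)
{
    /* clamp n so that 2^n is a normal double; anything beyond this
     * range already overflows or underflows every float */
    if (n > 1023) n = 1023;
    if (n < -1022) n = -1022;

    /* biased exponent in bits 52..62, zero sign and mantissa: 2^n */
    uint64_t bits = (uint64_t)(n + 1023) << 52;
    double scale;
    memcpy(&scale, &bits, sizeof scale);

    /* for results in float range the product is exact in double, so
     * the conversion back to float is the only rounding step */
    return (float)(x * scale);
}
```

Going to the next-larger type is what makes a single multiply sufficient: the double can represent every scale factor that matters for a float argument.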
    • R
      remquo asm: return quotient mod 8, as intended by the spec · 7513d3ec
      Rich Felker committed
      this is a lot more efficient and also what is generally wanted.
      perhaps the bit shuffling could be more efficient...
      7513d3ec
    • R
      804fbf0b
    • R
      fix exp asm · acb74492
      Rich Felker committed
      exponents (base 2) near 16383 were broken due to (1) wrong cutoff, and
      (2) inability to fit the necessary range of scalings into a long
      double value.
      
      as a solution, we fall back to using frndint/fscale for insanely large
      exponents, and also have to special-case infinities here to avoid
      inf-inf generating nan.
      
      thankfully the costly code never runs in normal usage cases.
      acb74492
    • R
      bug fix: wrong opcode for writing long long · d9c1d72c
      Rich Felker committed
      d9c1d72c
  12. 19 Mar, 2012 15 commits
  13. 16 Mar, 2012 2 commits
  14. 15 Mar, 2012 2 commits
    • R
    • R
      correctly rounded sqrt() asm for x86 (i387) · 809556e6
      Rich Felker committed
      the fsqrt opcode is correctly rounded, but only in the fpu's selected
      precision mode, which is 80-bit extended precision. to get a correctly
      rounded double precision output, we check for the only corner cases
      where two-step rounding could give different results than one-step
      (extended-precision mantissa ending in 0x400) and adjust the mantissa
      slightly in the opposite direction of the rounding which the fpu
      already did (reported in the c1 flag of the fpu status word).
      
      this should have near-zero cost in the non-corner cases and at worst
      very low cost.
      
      note that in order for sqrt() to get used when compiling with gcc, the
      broken, non-conformant builtin sqrt must be disabled.
      809556e6
  15. 14 Mar, 2012 1 commit
  16. 13 Mar, 2012 1 commit
    • R
      first commit of the new libm! · b69f695a
      Rich Felker committed
      thanks to the hard work of Szabolcs Nagy (nsz), identifying the best
      (from correctness and license standpoint) implementations from freebsd
      and openbsd and cleaning them up! musl should now fully support c99
      float and long double math functions, and has near-complete complex
      math support. tgmath should also work (fully on gcc-compatible
      compilers, and mostly on any c99 compiler).
      
      based largely on commit 0376d44a890fea261506f1fc63833e7a686dca19 from
      nsz's libm git repo, with some additions (dummy versions of a few
      missing long double complex functions, etc.) by me.
      
      various cleanups still need to be made, including re-adding (if
      they're correct) some asm functions that were dropped.
      b69f695a
  17. 27 Jun, 2011 1 commit
  18. 12 Feb, 2011 1 commit