1. 16 May, 2013 1 commit
    • math: use double_t for temporaries to avoid stores on i386 · e216951f
      Szabolcs Nagy committed
      When FLT_EVAL_METHOD!=0 (only i386 with x87 fp) the excess
      precision of an expression must be removed in an assignment.
      (gcc needs -fexcess-precision=standard or -std=c99 for this)
      
      This is done by extra load/store instructions, which add code
      bloat when a lot of temporaries are used and make the result
      less precise in many cases.
      Using double_t and float_t avoids these issues on i386 and
      makes no difference on other archs.
      
      For now only a few functions are modified where the excess
      precision is clearly beneficial (mostly polynomial evaluations
      with temporaries).
      
      object size differences on i386, gcc-4.8:
                   old   new
      __cosdf.o    123    95
      __cos.o      199   169
      __sindf.o    131    95
      __sin.o      225   203
      __tandf.o    207   151
      __tan.o      605   499
      erff.o      1470  1416
      erf.o       1703  1649
      j0f.o       1779  1745
      j0.o        2308  2274
      j1f.o       1602  1568
      j1.o        2286  2252
      tgamma.o    1431  1424
      math/*.o   64164 63635
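The pattern described in the commit message can be sketched as follows. This is only an illustration, not the code from the commit: the function name `sin_poly` and the coefficients (plain Taylor terms of sin) are assumptions made for the example. The point is that the temporaries are declared as `double_t`, so with x87 evaluation the compiler does not have to round each intermediate to `double` through a store and reload.

```c
#include <math.h>   /* double_t (C99) */

/* Hypothetical coefficients: Taylor terms of sin(x), chosen for
   illustration only. They are not the values used in the commit. */
static const double S1 = -1.0 / 6.0;     /* -1/3! */
static const double S2 =  1.0 / 120.0;   /*  1/5! */
static const double S3 = -1.0 / 5040.0;  /* -1/7! */

/* Approximates sin(x) for small |x| as x + x^3*(S1 + z*S2 + z^2*S3), z = x*x. */
float sin_poly(double x)
{
    /* double_t is the type in which double expressions are evaluated:
       plain double when FLT_EVAL_METHOD==0, long double when it is 2.
       Assigning to it therefore never forces the excess precision away. */
    double_t z, w, r;

    z = x * x;
    w = z * x;
    r = S1 + z * (S2 + z * S3);
    return (float)(x + w * r);  /* the only rounding to the narrow type */
}
```

On targets where `FLT_EVAL_METHOD` is 0, `double_t` is simply `double`, so the same source should compile to the same code as a version using `double` temporaries.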
  2. 12 Dec, 2012 1 commit
    • math: add a non-dummy tgamma implementation · 0f53c1a4
      Szabolcs Nagy committed
      uses the lanczos approximation method with the usual tweaks.
      the same parameters were selected as in boost and python.
      (avoids some extra work and special casing found in boost,
      so the precision is not that good: measured error is <5ulp for
      positive x and <10ulp for negative)
      
      an alternative lgamma_r implementation is also given in the same
      file which is simpler and smaller than the current one, but less
      precise so it's ifdefed out for now.
  3. 28 Mar, 2012 1 commit