1. 24 May, 2013 2 commits
    • change underlying type of clock_t to be uniform and match ABI · 5e642b5a
      Rich Felker committed
      previously we were using an unsigned type on 32-bit systems so that
      subtraction would be well-defined when it wrapped, but since wrapping
      is non-conforming anyway (when clock() overflows, it has to return -1)
      the only use of unsigned would be to buy a little bit more time before
      overflow. this does not seem worth having the type vary per-arch
      (which leads to more arch-specific bugs) or disagree with the ABI musl
      (mostly) follows.
    • fix overflow behavior of clock() function · 05453b37
      Rich Felker committed
      per Austin Group interpretation for issue #686, which cites the
      requirements of ISO C, clock() cannot wrap. if the result is not
      representable, it must return (clock_t)-1. in addition, the old code
      was performing wrapping via signed overflow and thus invoking
      undefined behavior.
      
      since it seems impossible to accurately check for overflow with the
      old times()-based fallback code, I have simply dropped the fallback
      code for now, thus always returning -1 on ancient systems. if there's
      a demand for making it work and somebody comes up with a way, it could
      be reinstated, but the clock() function is essentially useless on
      32-bit systems anyway (it overflows in less than an hour).
      
      it should be noted that I used LONG_MAX rather than ULONG_MAX, despite
      32-bit archs using an unsigned type for clock_t. this discrepancy with
      the glibc/LSB type definitions will be fixed now; since wrapping of
      clock_t is no longer supported, there's no use in it being unsigned.
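
The overflow rule described above can be sketched as follows. This is a minimal illustration, not musl's actual source: it assumes a POSIX system with clock_gettime and CLOCK_PROCESS_CPUTIME_ID, and the names ticks_or_fail and sketch_clock are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <limits.h>
#include <time.h>

/* Hypothetical helper: convert a CPU-time reading to microsecond ticks
 * (CLOCKS_PER_SEC is 1000000 on XSI systems). Returns -1 when the
 * result would not fit in a long, instead of wrapping. */
static long ticks_or_fail(long long sec, long nsec)
{
    if (sec < 0 || sec > LONG_MAX / 1000000
        || nsec / 1000 > LONG_MAX - 1000000 * sec)
        return -1;
    return sec * 1000000 + nsec / 1000;
}

/* Hypothetical clock() replacement built on the helper above. */
static clock_t sketch_clock(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts))
        return (clock_t)-1;
    return (clock_t)ticks_or_fail(ts.tv_sec, ts.tv_nsec);
}
```

The bound is checked against LONG_MAX before any multiplication, so no signed overflow can occur while deciding whether the value is representable.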
  2. 19 May, 2013 3 commits
    • math: add fma TODO comments about the underflow issue · 1e5eb735
      Szabolcs Nagy committed
      The underflow exception is not raised correctly in some
      corner cases (see previous fma commit), added comments
      with examples for fmaf, fmal and non-x86 fma.
      
      In fmaf store the result before returning so it has the
      correct precision when FLT_EVAL_METHOD!=0
    • math: fix two fma issues (only affects non-nearest rounding mode, x86) · ffd8ac2d
      Szabolcs Nagy committed
      1) in downward rounding fma(1,1,-1) should be -0 but it was 0 with
      gcc, the code was correct but gcc does not support FENV_ACCESS ON
      so it used common subexpression elimination where it shouldn't have.
      now volatile memory access is used as a barrier after fesetround.
      
      2) in directed rounding modes there is no double rounding issue
      so the complicated adjustments done for nearest rounding mode are
      not needed. the only exception to this rule is raising the underflow
      flag: assume "small" is an exactly representable subnormal value in
      double precision and "verysmall" is a much smaller value so that
      	(long double)(small plus verysmall) == small
      then
      	(double)(small plus verysmall)
      raises underflow because the result is an inexact subnormal, but
      	(double)(long double)(small plus verysmall)
      does not because small is not a subnormal in long double precision
      and it is exact in double precision.
      now this problem is fixed by checking inexact using fenv when the
      result is subnormal
    • Merge remote-tracking branch 'nsz/review' · 83af1dd6
      Rich Felker committed
  3. 18 May, 2013 5 commits
    • math: sin cos cleanup · bfda3793
      Szabolcs Nagy committed
      * use unsigned arithmetic
      * use unsigned to store arg reduction quotient (so n&3 is understood)
      * remove z=0.0 variables, use literal 0
      * raise underflow and inexact exceptions properly when x is small
      * fix spurious underflow in tanl
    • make err.h functions print __progname · 69ee9b2c
      Rich Felker committed
      patch by Strake. previously it was not feasible to duplicate this
      functionality of the functions these were modeled on, since argv[0]
      was not saved at program startup, but now that it's available it's
      easy to use.
    • math: tan cleanups · 1d5ba3bb
      Szabolcs Nagy committed
      * use unsigned arithmetic on the representation
      * store arg reduction quotient in unsigned (so n%2 would work like n&1)
      * use different convention to pass the arg reduction bit to __tan
        (this argument used to be 1 for even and -1 for odd reduction
        which meant obscure bithacks, the new n&1 is cleaner)
      * raise inexact and underflow flags correctly for small x
        (tanl(x) may still raise spurious underflow for small but normal x)
        (this exception raising code increases codesize a bit, similar fixes
        are needed in many other places, it may be worth investigating at some
        point if the inexact and underflow flags are worth raising correctly
        as this is not strictly required by the standard)
      * tanf manual reduction optimization is kept for now
      * tanl code path is cleaned up to follow similar logic to tan and tanf
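
The point about storing the reduction quotient in an unsigned type can be illustrated with a small sketch. The function names are hypothetical and this is not the musl code; it only shows why n%2 and n&1 disagree for a signed n but agree for an unsigned one.

```c
/* With a signed quotient, n % 2 truncates toward zero, so a negative
 * odd n yields -1 rather than 1. */
static int parity_signed_mod(int n)
{
    return n % 2;
}

/* With an unsigned quotient, n % 2 and n & 1 are always the same
 * value (0 or 1), so either form can be used for the reduction bit. */
static unsigned parity_unsigned(unsigned n)
{
    return n & 1;
}
```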
    • add FLT_TRUE_MIN, etc. macros from C11 · 22730d65
      Rich Felker committed
      there was some question as to how many decimal places to use, since
      one decimal place is always sufficient to identify the smallest
      denormal uniquely. for now, I'm following the example in the C
      standard which is consistent with the other min/max macros we already
      had in place.
    • remove the __STDC_FORMAT_MACROS nonsense from inttypes.h · ec9f5353
      Rich Felker committed
      somehow I missed this when removing the corresponding
      __STDC_LIMIT_MACROS and __STDC_CONSTANT_MACROS nonsense from stdint.h.
      these were all attempts by the C committee to guess what the C++
      committee would want, and the guesses turned out to be wrong.
  4. 17 May, 2013 1 commit
    • fix mknod and mknodat to accept large dev_t values · 1a70198b
      Rich Felker committed
      support for these was recently added to sysmacros.h. note that the
      syscall argument is a long, despite dev_t being 64-bit, so on 32-bit
      archs the high bits will be lost. it appears the high bits are just
      glibc silliness and not part of the kernel api, anyway, but it's nice
      that we have them there for future expansion if needed.
  5. 16 May, 2013 2 commits
    • math: use double_t for temporaries to avoid stores on i386 · e216951f
      Szabolcs Nagy committed
      When FLT_EVAL_METHOD!=0 (only i386 with x87 fp) the excess
      precision of an expression must be removed in an assignment.
      (gcc needs -fexcess-precision=standard or -std=c99 for this)
      
      This is done by extra load/store instructions, which add code
      bloat when a lot of temporaries are used and make the result
      less precise in many cases.
      Using double_t and float_t avoids these issues on i386 and
      it makes no difference on other archs.
      
      For now only a few functions are modified where the excess
      precision is clearly beneficial (mostly polynomial evaluations
      with temporaries).
      
      object size differences on i386, gcc-4.8:
                   old   new
      __cosdf.o    123    95
      __cos.o      199   169
      __sindf.o    131    95
      __sin.o      225   203
      __tandf.o    207   151
      __tan.o      605   499
      erff.o      1470  1416
      erf.o       1703  1649
      j0f.o       1779  1745
      j0.o        2308  2274
      j1f.o       1602  1568
      j1.o        2286  2252
      tgamma.o    1431  1424
      math/*.o   64164 63635
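
A minimal sketch of the technique, assuming a hypothetical cubic polynomial (poly3 is not a musl function): the temporary is declared as double_t, so on targets where FLT_EVAL_METHOD!=0 it can stay in the wider evaluation format instead of being rounded to double after each step.

```c
#include <math.h>

/* Hypothetical example: evaluate c[0] + c[1]*x + c[2]*x^2 + c[3]*x^3
 * by Horner's rule. The double_t temporary avoids a store/load pair
 * per assignment on i386 with x87; on other targets double_t is
 * simply double. Rounding to double happens once, at the return. */
static double poly3(double x, const double c[4])
{
    double_t r = c[3];
    r = r * x + c[2];
    r = r * x + c[1];
    r = r * x + c[0];
    return r;
}
```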
  6. 07 May, 2013 1 commit
    • remove compound literals from math.h to please c++ · 2897bfdd
      Szabolcs Nagy committed
      __FLOAT_BITS and __DOUBLE_BITS macros used union compound literals,
      now they are changed into static inline functions. A good C compiler
      generates the same code for both and the latter is C++ conformant.
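
The two forms can be sketched side by side. The names FLOAT_BITS_LITERAL and float_bits are illustrative stand-ins, not the identifiers in math.h, and the example assumes float is IEEE 754 binary32.

```c
#include <stdint.h>

/* Illustrative macro using a union compound literal: valid C99,
 * but not valid C++. */
#define FLOAT_BITS_LITERAL(f) \
    (((union { float f_; uint32_t i_; }){ (float)(f) }).i_)

/* Equivalent static inline function: no compound literal needed,
 * so the same pattern can appear in a header shared with C++. */
static inline uint32_t float_bits(float f)
{
    union { float f_; uint32_t i_; } u;
    u.f_ = f;
    return u.i_;
}
```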
  7. 06 May, 2013 2 commits
    • fix incorrect clock tick scaling in fallback case of clock() · da49b872
      Rich Felker committed
      since CLOCKS_PER_SEC is 1000000 (required by XSI) and the times
      syscall reports values in 1/100 second units (Linux), the correct
      scaling factor is 10000, not 100. note that only ancient kernels which
      lack clock_gettime are affected.
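
The arithmetic behind the corrected factor, as a small sketch. The macro and function names are hypothetical; the tick rate of 100 per second is the value stated in the commit message.

```c
#define SKETCH_CLOCKS_PER_SEC 1000000L /* required by XSI */
#define SKETCH_TIMES_HZ 100L           /* times() units: 1/100 second */

/* Convert a times() tick count to clock() units. The factor is
 * 1000000 / 100 = 10000. Overflow is deliberately not handled here. */
static long times_ticks_to_clock(long ticks)
{
    return ticks * (SKETCH_CLOCKS_PER_SEC / SKETCH_TIMES_HZ);
}
```

For example, 250 ticks (2.5 seconds) become 2500000 clock units; with the old factor of 100 the same input would have produced 25000.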
    • do not interpret errors in return value of times() syscall · 9293b765
      Rich Felker committed
      all return values are valid, and on 32-bit systems, values that look
      like errors can and will occur. since the only actual error this
      function could return is EFAULT, and it is only returnable when the
      application has invoked undefined behavior, simply ignore the
      possibility that the return value is actually an error code.
  8. 27 April, 2013 10 commits
    • transition to using functions for internal signal blocking/restoring · 2c074b0d
      Rich Felker committed
      there are several reasons for this change. one is getting rid of the
      repetition of the syscall signature all over the place. another is
      sharing the constant masks without costly GOT accesses in PIC.
      
      the main motivation, however, is accurately representing whether we
      want to block signals that might be handled by the application, or all
      signals.
    • optimize/debloat raise · d53c92c9
      Rich Felker committed
      use __syscall rather than syscall when failure is not possible or not
      to be considered.
    • synccall signal handler need not handle dead threads anymore · 47d2bf51
      Rich Felker committed
      they have already blocked signals before decrementing the thread
      count, so the code being removed is unreachable in the case where the
      thread is no longer counted.
    • fix clobbering of signal mask when creating thread with sched attributes · 082fb4e9
      Rich Felker committed
      this was simply a case of saving the state in the wrong place.
    • make last thread's pthread_exit give exit(0) a consistent state · d0ba0983
      Rich Felker committed
      the previous few commits ended up leaving the thread count and signal
      mask wrong for atexit handlers and stdio cleanup.
    • use atomic decrement rather than cas in pthread_exit thread count · c3a6839c
      Rich Felker committed
      now that blocking signals prevents any application code from running
      while the last thread is exiting, the cas logic is no longer needed to
      prevent decrementing below zero.
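
The difference between the two approaches can be sketched with C11 atomics. This is an illustration only: musl uses its own internal atomic primitives, and the function names below are hypothetical.

```c
#include <stdatomic.h>

/* CAS loop: decrements only while the counter is positive, so it can
 * never go below zero even if other code races with it.
 * Returns 1 if it decremented, 0 if the counter was already zero. */
static int dec_with_cas(atomic_int *p)
{
    int old = atomic_load(p);
    while (old > 0)
        if (atomic_compare_exchange_weak(p, &old, old - 1))
            return 1;
    return 0;
}

/* Plain atomic decrement: simpler, but relies on the caller knowing
 * that nothing else can run and push the count below zero.
 * Returns the value held before the decrement. */
static int dec_plain(atomic_int *p)
{
    return atomic_fetch_sub(p, 1);
}
```

The second form is only safe under the invariant described in the commit message, namely that no application code can run while the last thread is exiting.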
    • add comments on some of the pthread_exit logic · 6e531f99
      Rich Felker committed
    • always block signals in pthread_exit before decrementing thread count · 23f21c30
      Rich Felker committed
      the thread count (1+libc.threads_minus_1) must always be greater than
      or equal to the number of threads which could have application code
      running, even in an async-signal-safe sense. there is at least one
      dangerous race condition if this invariant fails to hold: dlopen could
      allocate too little TLS for existing threads, and a signal handler
      running in the exiting thread could claim the allocated TLS for itself
      (via __tls_get_addr), leaving too little for the other threads it was
      allocated for and thereby causing out-of-bounds access.
      
      there may be other situations where it's dangerous for the thread
      count to be too low, particularly in the case where only one thread
      should be left, in which case locking may be omitted. however, all
      such code paths seem to arise from undefined behavior, since
      async-signal-unsafe functions are not permitted to be called from a
      signal handler that interrupts pthread_exit (which is itself
      async-signal-unsafe).
      
      this change may also simplify logic in __synccall and improve the
      chances of making __synccall async-signal-safe.
    • remove explicit locking to prevent __synccall setuid during posix_spawn · a0473a0c
      Rich Felker committed
      for the duration of the vm-sharing clone used by posix_spawn, all
      signals are blocked in the parent process, including
      implementation-internal signals. since __synccall cannot do anything
      until successfully signaling all threads, the fact that signals are
      blocked automatically yields the necessary safety.
      
      aside from debloating and general simplification, part of the
      motivation for removing the explicit lock is to simplify the
      synchronization logic of __synccall in hopes that it can be made
      async-signal-safe, which is needed to make setuid and setgid, which
      depend on __synccall, conform to the standard. whether this will be
      possible remains to be seen.
  9. 23 April, 2013 1 commit
  10. 22 April, 2013 1 commit
  11. 21 April, 2013 4 commits
  12. 20 April, 2013 1 commit
    • make dynamic linker accept : or \n as path separator · 8c203eae
      Rich Felker committed
      this allows /etc/ld-musl-$(ARCH).path to contain one path per line,
      which is much more convenient for users than the :-delimited format,
      which was a source of repeated and unnecessary confusion. for
      simplicity, \n is also accepted in environment variables, though it
      should probably not be used there.
      
      at the same time, issues with overly long paths invoking UB or getting
      truncated have been fixed. such issues should not have arisen with the
      environment (which is size-limited) but could have been generated by a
      path file larger than 2**31 bytes in length.
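
Accepting either separator can be sketched with strcspn. The helper name next_path is hypothetical and this is not the dynamic linker's code; lengths are kept in size_t so that very long inputs do not overflow an int.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the length of the first entry in s,
 * where entries are separated by ':' or '\n'. *next is set to the
 * start of the following entry, or NULL when s was the last one. */
static size_t next_path(const char *s, const char **next)
{
    size_t len = strcspn(s, ":\n");
    *next = s[len] ? s + len + 1 : NULL;
    return len;
}
```

A path file with one directory per line and a colon-delimited environment variable are then walked by the same loop.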
  13. 14 April, 2013 1 commit
  14. 11 April, 2013 1 commit
    • make ifaddrs.h expose sys/socket.h · 4ba3ebdc
      Rich Felker committed
      the getifaddrs interface seems to have been invented by glibc, and
      they expose socket.h, so for us not to do so is just gratuitous
      incompatibility with the interface we're mimicking.
  15. 09 April, 2013 5 commits
    • getifaddrs: implement proper ipv6 netmasks · 9947ed5c
      rofl0r committed
    • mbrtowc: do not leave mbstate_t in permanent-fail state after EILSEQ · 23ab8c25
      Rich Felker committed
      the standard is clear that the old behavior is conforming: "In this
      case, [EILSEQ] shall be stored in errno and the conversion state is
      undefined."
      
      however, the specification of mbrtowc has one peculiarity when the
      source argument is a null pointer: in this case, it's required to
      behave as mbrtowc(NULL, "", 1, ps). no motivation is provided for this
      requirement, but the natural one that comes to mind is that the intent
      is to reset the mbstate_t object. for stateful encodings, such
      behavior is actually specified: "If the corresponding wide character
      is the null wide character, the resulting state described shall be the
      initial conversion state." but in the case of UTF-8 where the
      mbstate_t object contains a partially-decoded character rather than a
      shift state, a subsequent '\0' byte indicates that the previous
      partial character is incomplete and thus an illegal sequence.
      
      naturally, applications using their own mbstate_t object should clear
      it themselves after an error, but the standard presently provides no
      way to clear the builtin mbstate_t object used when the ps argument is
      a null pointer. I suspect this issue may be addressed in the future by
      specifying that a null source argument resets the state, as this seems
      to have been the intent all along.
      
      for what it's worth, this change also slightly reduces code size.
    • implement mbtowc directly, not as a wrapper for mbrtowc · ea34b1b9
      Rich Felker committed
      the interface contract for mbtowc admits a much faster implementation
      than mbrtowc can achieve; wrapping mbrtowc with an extra call frame
      only made the situation worse.
      
      since the regex implementation uses mbtowc already, this change should
      improve regex performance too. it may be possible to improve
      performance in other places internally by switching from mbrtowc to
      mbtowc.
    • optimize mbrtowc · a49e038b
      Rich Felker committed
      this simple change, in my measurements, makes about a 7% performance
      improvement. at first glance this change would seem like a
      compiler-specific hack, since the modified code is not even used.
      however, I suspect the reason is that I'm eliminating a second path
      into the main body of the code, allowing the compiler more flexibility
      to optimize the normal (hot) path into the main body. so even if it
      weren't for the measurable (and quite notable) difference in
      performance, I think the change makes sense.
    • fix out-of-bounds access in UTF-8 decoding · 8f06ab0e
      Rich Felker committed
      SA and SB are used as the lowest and highest valid starter bytes, but
      the value of SB was one-past the last valid starter. this caused
      access past the end of the state table when the illegal byte '\xf5'
      was encountered in a starter position. the error did not show up in
      full-character decoding tests, since the bogus state read from just
      past the table was unlikely to admit any continuation bytes as valid,
      but would have shown up had we tested feeding '\xf5' to the
      byte-at-a-time decoding in mbrtowc: it would cause the function to
      wrongly return -2 rather than -1.
      
      I may eventually go back and remove all references to SA and SB,
      replacing them with the values; this would make the code more
      transparent, I think. the original motivation for using macros was to
      allow misguided users of the code to redefine them for the purpose of
      enlarging the set of accepted sequences past the end of Unicode...
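
The off-by-one can be illustrated with a small sketch. The SKETCH_ names and starter_index are hypothetical; the byte values assume the standard UTF-8 range of multibyte starter bytes, 0xc2 through 0xf4 inclusive.

```c
#define SKETCH_SA 0xc2u /* lowest valid starter byte */
#define SKETCH_SB 0xf4u /* highest valid starter byte, inclusive */

/* A state table indexed by (byte - SA) needs exactly this many slots. */
enum { SKETCH_TABLE_SIZE = SKETCH_SB - SKETCH_SA + 1 };

/* Hypothetical lookup: return the table index for a starter byte,
 * or -1 if the byte cannot start a multibyte sequence. If SB were
 * one past the last valid starter, 0xf5 would pass this check and
 * index one element beyond the end of the table. */
static int starter_index(unsigned char b)
{
    if (b < SKETCH_SA || b > SKETCH_SB)
        return -1;
    return (int)(b - SKETCH_SA);
}
```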