1. 10 Jan, 2018 4 commits
    • revise the definition of multiple basic locks in the code · 32482f61
      Jens Gustedt committed
      In all cases this is just a change from two volatile ints to one.
    • consistently use the LOCK and UNLOCK macros · c4bc0b1a
      Jens Gustedt committed
      In some places the functions were called directly. Use the
      macros consistently everywhere, such that it might be easier later on to
      capture the fast path directly inside the macro and only have the call
      overhead on the slow path.
    • new lock algorithm with state and congestion count in one atomic int · 47d0bcd4
      Jens Gustedt committed
      A variant of this new lock algorithm has been presented at SAC'16, see
      https://hal.inria.fr/hal-01304108. A full version of that paper is
      available at https://hal.inria.fr/hal-01236734.
      
      The main motivation of this is to improve on the safety of the basic lock
      implementation in musl. This is achieved by squeezing a lock flag and a
      congestion count (= threads inside the critical section) into a single
      int. Thereby an unlock operation does exactly one memory
      transfer (a_fetch_add) and never touches the value again, but still
      detects if a waiter has to be woken up.
      
      This is a fix of a use-after-free bug in pthread_detach that had
      temporarily been patched. Therefore this patch also reverts
      
               c1e27367
      
      This is also the only place where internal knowledge of the lock
      algorithm is used.
      
      The main price for the improved safety is slightly larger code.
      
      Under high congestion, the scheduling behavior will be different
      compared to the previous algorithm. In that case, a successful
      put-to-sleep may appear out of order compared to the arrival in the
      critical section.
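The idea of packing a lock flag and a congestion count into one int can be sketched with C11 atomics. This is a spin-only illustration under stated assumptions, not the code from the commit: the names `sketch_lock`, `sketch_unlock` and `LOCKED_ONE` are hypothetical, the sign bit is assumed to be the lock flag, and the futex wait and wake calls of a real implementation are replaced by a retry loop and a return value.

```c
#include <limits.h>
#include <stdatomic.h>

/* Assumed layout: sign bit = "locked" flag, remaining bits = number of
 * threads currently inside the lock code (the holder plus any waiters). */
#define LOCKED_ONE (INT_MIN + 1)   /* flag set, count == 1 */

static void sketch_lock(atomic_int *l)
{
	/* Fast path: uncontended, 0 -> locked with a count of 1. */
	int cur = 0;
	if (atomic_compare_exchange_strong(l, &cur, LOCKED_ONE)) return;

	/* Slow path: add ourselves to the count, then try to set the flag. */
	cur = atomic_fetch_add(l, 1) + 1;
	for (;;) {
		if (cur < 0) {
			/* Held by another thread. A real lock would sleep here;
			 * this sketch only reloads the value and retries. */
			cur = atomic_load(l);
			continue;
		}
		/* cur > 0 and already counts us: set the flag, keep the count.
		 * On failure, cur is refreshed with the current value. */
		if (atomic_compare_exchange_weak(l, &cur, INT_MIN + cur)) return;
	}
}

/* Returns nonzero if another thread was counted and would need a wake-up. */
static int sketch_unlock(atomic_int *l)
{
	/* A single atomic update clears the flag and removes us from the
	 * count; the returned old value is all that is inspected afterwards. */
	return atomic_fetch_add(l, -LOCKED_ONE) != LOCKED_ONE;
}
```

Because the unlock path never reads the int again after its one update, the memory holding the lock may be freed by another thread immediately afterwards, which is the property the commit message relies on for pthread_detach.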
    • add additional uapi guards for Linux kernel header files · b583c5d3
      Hauke Mehrtens committed
      With Linux kernel 4.16 it will be possible to guard more parts of the
      Linux header files from a libc. Make use of this in musl to guard all
      the structures and other definitions from the Linux header files which
      are also defined by the header files provided by musl. This will make
      it possible to compile source files which include both the libc
      headers and the kernel userspace headers.
      
      This extends the definitions done in commit 04983f22 ("make
      netinet/in.h suppress clashing definitions from kernel headers")
  2. 19 Dec, 2017 6 commits
    • fix iconv output of surrogate pairs in ucs2 · 628cf979
      Rich Felker committed
      in the unified code for handling utf-16 and ucs2 output, the check for
      ucs2 wrongly looked at the source charset rather than the destination
      charset.
    • add support for BOM-determined-endian UCS2, UTF-16, and UTF-32 to iconv · 95c6044e
      Rich Felker committed
      previously, the charset names without endianness specified were always
      interpreted as big endian. unicode specifies that UTF-16 and UTF-32
      have BOM-determined endianness if BOM is present, and are otherwise
      big endian. since commit 5b546faa
      added support for stateful encodings, it is now possible to implement
      BOM support via the conversion descriptor state.
      
      for conversions to these charsets, the output is always big endian and
      does not have a BOM.
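The BOM rule described above (use the BOM if present, otherwise assume big endian) can be illustrated for UTF-16 as follows. This is a minimal sketch; the helper names and the `byte_order` enum are hypothetical and do not come from the iconv source.

```c
#include <stddef.h>

enum byte_order { ORDER_BE, ORDER_LE };

/* Inspects the first two bytes of a UTF-16 input. Stores the detected
 * byte order and returns the number of BOM bytes to skip (0 or 2). */
static size_t utf16_detect_bom(const unsigned char *in, size_t len,
                               enum byte_order *order)
{
	*order = ORDER_BE;                 /* default when no BOM is present */
	if (len < 2) return 0;
	if (in[0] == 0xFE && in[1] == 0xFF) return 2;
	if (in[0] == 0xFF && in[1] == 0xFE) {
		*order = ORDER_LE;
		return 2;
	}
	return 0;
}

/* Reads one 16-bit code unit in the given byte order. */
static unsigned utf16_unit(const unsigned char *p, enum byte_order order)
{
	return order == ORDER_LE ? p[0] | p[1] << 8 : p[0] << 8 | p[1];
}
```

In a stateful converter the detected order would be stored in the conversion descriptor state after the first call, so later calls do not look for a BOM again.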
    • add cp866 (dos cyrillic) to iconv · 9d4d0ee4
      Rich Felker committed
    • update case mappings to unicode 10.0 · 54941edd
      Rich Felker committed
      the mapping tables and code are not automatically generated; they were
      produced by comparing the output of towupper/towlower against the
      mappings in the UCD, ignoring characters that were previously excluded
      from case mappings or from alphabetic status (micro sign and circled
      letters), and adding table entries or code for everything else
      missing.
      
      based very loosely on a patch by Reini Urban.
    • update ctype tables to unicode 10.0 · c72c1c52
      Rich Felker committed
    • reformat ctype tables to be diff-friendly, match tool output · d3f23337
      Rich Felker committed
      the new version of the code used to generate these tables forces a
      newline every 256 entries, whereas at the time these files were
      originally generated and committed, it only wrapped them at 80
      columns. the new behavior ensures that localized changes to the
      tables, if they are ever needed, will produce localized diffs.
      
      commit d060edf6 made the corresponding
      changes to the iconv tables.
  3. 16 Dec, 2017 1 commit
  4. 15 Dec, 2017 6 commits
    • J
    • remove unused explicit dependency rules for crti/crtn · 2a831786
      Nicholas Wilson committed
      notes by maintainer:
      
      commit 2f853dd6 added these rules
      because the new system for handling arch-provided replacement files
      introduced for out-of-tree builds did not apply to the crt tree.
      
      commit 63bcda4d later adapted the
      makefile logic so that the crt and ldso trees go through the same
      replacement logic as everything else, but failed to remove the
      explicit rules that assumed the arch would always provide asm
      replacements.
      
      in addition to cleaning things up, removing these spurious rules
      allows crti/crtn asm to be omitted by an arch (thereby using the empty
      C files instead) if they are not needed.
    • use the name UTC instead of GMT for UTC timezone · eb7f93c4
      Natanael Copa committed
      notes by maintainer:
      
      both C and POSIX use the term UTC to specify related functionality,
      despite POSIX defining it as something more like UT1 or historical
      (pre-UTC) GMT without leap seconds. neither specifies the associated
      string for %Z. old choice of "GMT" violated principle of least
      surprise for users and some applications/tests. use "UTC" instead.
    • fix sysconf for infinite rlimits · 3ec82877
      Natanael Copa committed
      sysconf should return -1 for infinity, not LONG_MAX.
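The mapping from a resource limit to a sysconf result can be sketched like this. The helper name `rlimit_to_sysconf` is hypothetical; the clamping of large finite limits to LONG_MAX is an assumption made so the value fits in a long.

```c
#include <limits.h>
#include <sys/resource.h>

/* Converts a soft resource limit to the value a sysconf-style
 * interface would report. */
static long rlimit_to_sysconf(rlim_t cur)
{
	if (cur == RLIM_INFINITY) return -1;   /* "no limit", not LONG_MAX */
	/* Assumption: finite limits too large for a long are clamped. */
	return cur > LONG_MAX ? LONG_MAX : (long)cur;
}
```

A caller would typically obtain `cur` from `getrlimit` and pass `rlim_cur` to this helper.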
    • fix x32 unistd macros to report as ILP32 not LP64 · 13127680
      Nicholas Wilson committed
    • fix data race in at_quick_exit · 64303156
      Rich Felker committed
      aside from theoretical arbitrary results due to UB, this could
      practically cause unbounded overflow of the static array if hit, but
      hitting it depends on having more than 32 calls to at_quick_exit and
      having them sufficiently often.
  5. 13 Dec, 2017 1 commit
  6. 12 Dec, 2017 1 commit
    • implement strftime padding specifier extensions · 8a6bd730
      Timo Teräs committed
      notes added by maintainer:
      
      the '-' specifier allows default padding to be suppressed, and '_'
      allows padding with spaces instead of the default (zeros).
      
      these extensions seem to be included in several other implementations
      including FreeBSD and derivatives, and Solaris. while portable
      software should not depend on them, time format strings are often
      exposed to the user for configurable time display. reportedly some
      python programs also use and depend on them.
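A short illustration of the two flags, assuming a C library that implements these extensions (they are not required by the C standard). The helper `format_day` and the fixed date are hypothetical and exist only for the example.

```c
#include <stddef.h>
#include <time.h>

/* Formats a fixed date in December 2017 with the given day of month,
 * using the caller's format string. */
static size_t format_day(char *buf, size_t n, const char *fmt, int mday)
{
	struct tm tm = {0};
	tm.tm_year = 117;   /* years since 1900 */
	tm.tm_mon = 11;     /* December */
	tm.tm_mday = mday;
	return strftime(buf, n, fmt, &tm);
}
```

With day 7, the format `"%d"` uses the default zero padding, `"%-d"` suppresses the padding, and `"%_d"` pads with a space instead.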
  7. 07 Dec, 2017 2 commits
    • adjust fopencookie structure tag for ABI-compat · 2488d31f
      Rich Felker committed
      stdio types use the struct tag names from glibc libio to match C++
      ABI.
    • implement the fopencookie extension to stdio · 06184334
      William Pitcock committed
      notes added by maintainer:
      
      this function is a GNU extension. it was chosen over the similar BSD
      function funopen because the latter depends on fpos_t being an
      arithmetic type as part of its public API, conflicting with our
      definition of fpos_t and with the intent that it be an opaque type. it
      was accepted for inclusion because, despite not being widely used, it
      is usually very difficult to extricate software using it from the
      dependency on it.
      
      calling pattern for the read and write callbacks is not likely to
      match glibc or other implementations, but should work with any
      reasonable callbacks. in particular the read function is never called
      without at least one byte being needed to satisfy its caller, so that
      spurious blocking is not introduced.
      
      contracts for what callbacks called from inside libc/stdio can do are
      always complicated, and at some point still need to be specified
      explicitly. at the very least, the callbacks must return or block
      indefinitely (they cannot perform nonlocal exits) and they should not
      make calls to stdio using their own FILE as an argument.
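A minimal usage sketch of fopencookie with a write-only stream that appends to a fixed buffer. The `struct membuf` cookie and the helper names are hypothetical. In line with the contract described above, the callback always returns and does not call stdio on its own FILE.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* fopencookie is a GNU extension */
#endif
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

struct membuf {       /* hypothetical cookie: a small in-memory sink */
	char data[64];
	size_t len;
};

/* Write callback: copies as much as fits and keeps data NUL-terminated. */
static ssize_t membuf_write(void *cookie, const char *buf, size_t size)
{
	struct membuf *m = cookie;
	size_t room = sizeof m->data - 1 - m->len;
	if (size > room) size = room;          /* short write when nearly full */
	memcpy(m->data + m->len, buf, size);
	m->len += size;
	m->data[m->len] = 0;
	return (ssize_t)size;
}

/* Opens a write-only stream backed by m; read, seek and close are unset. */
static FILE *membuf_open(struct membuf *m)
{
	cookie_io_functions_t io = { .write = membuf_write };
	m->len = 0;
	m->data[0] = 0;
	return fopencookie(m, "w", io);
}
```

After `fprintf` to the returned stream and an `fclose` (or `fflush`), the formatted text is available in `m->data`.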
  8. 21 Nov, 2017 2 commits
    • make fgetwc handling of encoding errors consistent with/without buffer · 4000b010
      Rich Felker committed
      previously, fgetwc left all but the first byte of an illegal sequence
      unread (available for subsequent calls) when reading out of the FILE
      buffer, but dropped all bytes contributing to the error when falling
      back to reading a byte at a time. neither behavior was ideal. in the
      buffered case, each malformed character produced one error per byte,
      rather than one per character. in the unbuffered case, consuming the
      last byte that caused the transition from "incomplete" to "invalid"
      state potentially dropped (and produced additional spurious encoding
      errors for) the next valid character.
      
      to handle both cases uniformly without duplicate code, revise the
      buffered case to only cover situations where a complete and valid
      character is present in the buffer, and fall back to byte-at-a-time
      for all other cases. this allows using mbtowc (stateless) instead of
      mbrtowc, which may slightly improve performance too.
      
      when an encoding error has been hit in the byte-at-a-time case, leave
      the final byte that produced the error unread (via ungetc) except in
      the case of single-byte errors (for UTF-8, bytes c0, c1, f5-ff, and
      continuation bytes with no lead byte). single-byte errors are fully
      consumed so as not to leave the caller in an infinite loop repeating
      the same error.
      
      none of these changes are distinguished from a conformance standpoint,
      since the file position is unspecified after encoding errors. they are
      intended merely as QoI/consistency improvements.
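The class of single-byte errors listed above for UTF-8 can be written down as a small predicate. The function name is hypothetical; the byte ranges are the ones named in the commit message.

```c
/* Nonzero if byte b is an encoding error on its own, i.e. it can never
 * begin a valid UTF-8 sequence. */
static int utf8_single_byte_error(unsigned char b)
{
	if (b >= 0x80 && b <= 0xBF) return 1;  /* continuation byte with no lead byte */
	if (b == 0xC0 || b == 0xC1) return 1;  /* could only start overlong forms */
	if (b >= 0xF5) return 1;               /* would encode values above U+10FFFF */
	return 0;
}
```

Such a byte has to be consumed when the error is reported. If it were pushed back, the next read would hit the same byte and report the same error again.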
    • fix treatment by fgetws of encoding errors as eof · a90d9da1
      Rich Felker committed
      fgetwc does not set the stream's error indicator on encoding errors,
      making ferror insufficient to distinguish between error and eof
      conditions. feof is also insufficient, since it will return true if
      the file ended with a partial character encoding error.
      
      whether fgetwc should be setting the error indicator itself is a
      question with conflicting answers. the POSIX text for the function
      states it as a requirement, but the ISO C text seems to require that
      it not. this may be revisited in the future based on the outcome of
      Austin Group issue #1170.
  9. 19 Nov, 2017 1 commit
  10. 15 Nov, 2017 1 commit
    • add reverse iconv mappings for JIS-based encodings · a223dbd2
      Rich Felker committed
      these encodings are still commonly used in messaging protocols and
      such. the reverse mapping is implemented as a binary search of a list
      of the jis 0208 characters in unicode order; the existing forward
      table is used to perform the comparison in the search.
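The reverse-lookup technique can be sketched on a toy table. The data below is made up for illustration and is not JIS X 0208; `fwd`, `rev` and `reverse_lookup` are hypothetical names. The reverse list stores only code indices, sorted by the Unicode value the forward table assigns to them, so each comparison goes through the forward table.

```c
#include <stddef.h>

/* Toy forward table: code index -> Unicode value (illustrative data). */
static const unsigned short fwd[] = { 0x3042, 0x00A7, 0x4E00, 0x2190, 0x30A2 };

/* Code indices listed in increasing order of fwd[index]. */
static const unsigned char rev[] = { 1, 3, 0, 4, 2 };

/* Returns the code index that maps to Unicode value c, or -1 if none. */
static int reverse_lookup(unsigned c)
{
	size_t lo = 0, hi = sizeof rev / sizeof rev[0];
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		unsigned u = fwd[rev[mid]];   /* compare via the forward table */
		if (u == c) return rev[mid];
		if (u < c) lo = mid + 1;
		else hi = mid;
	}
	return -1;
}
```

The design keeps the reverse data small: it needs one index per character rather than a second full table of Unicode values.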
  11. 14 Nov, 2017 2 commits
    • generalize iconv framework for 8-bit codepages · 105eff9d
      Rich Felker committed
      previously, 8-bit codepages could only remap the high 128 bytes; the
      low range was assumed/forced to agree with ascii. interpretation of
      codepage table headers has been changed so that it's possible to
      represent mappings for up to 256 slots (fewer if the initial portion
      of the map is elided because it coincides with unicode codepoints).
      this requires consuming a bit more of the 10-bit space of characters
      that can be represented in 8-bit codepages, but there's still plenty
      left. the size of the legacy_chars table is actually reduced now by
      eliding the first 256 entries and considering them to map implicitly
      via the identity map.
      
      before these changes, there seem to have been minor bugs/omissions in
      codepage table generation, so it's likely that some actual bug fixes
      are silently included in this commit. round-trip testing of a few
      codepages was performed on the new version of the code, but no
      differential testing against the old version was done.
    • fix malloc state corruption when ldso rejects loading a second libc · a71b46cf
      Rich Felker committed
      commit c49d3c8a added logic to detect
      attempts to load libc.so via another name and instead redirect to the
      existing libc, rather than loading two and producing dangerously
      inconsistent state. however, the check for and unmapping of the
      duplicate libc happened after reclaim_gaps was already called,
      donating the slack space around the writable segment to malloc.
      subsequent unmapping of the library then invalidated malloc's free
      lists.
      
      fix the issue by moving the call to reclaim_gaps out of map_library
      into load_library, after the duplicate libc check but before the first
      call to calloc, so that the gaps can still be used to satisfy the
      allocation of struct dso. this change also eliminates the need for an
      ugly hack (temporarily setting runtime=1) to avoid reclaim_gaps when
      loading the main program via map_library, which happens when ldso is
      invoked as a command.
      
      only programs/libraries erroneously containing a DT_NEEDED reference
      to libc.so via an absolute pathname or symlink were affected by this
      issue.
  12. 11 Nov, 2017 6 commits
    • reformat cjk iconv tables to be diff-friendly, match tool output · d060edf6
      Rich Felker committed
      the new version of the code used to generate these tables forces a
      newline every 256 entries, whereas at the time these files were
      originally generated and committed, it only wrapped them at 80
      columns. the new behavior ensures that localized changes to the
      tables, if they are ever needed, will produce localized diffs. other
      tables including hkscs were already committed in the new format.
      
      binary comparison of the generated object files was performed to
      confirm that no spurious changes slipped in.
    • prevent fork's errno from being clobbered by atfork handlers · c21051e9
      Bobby Bingham committed
      If the syscall fails, errno must be set correctly for the caller.
      There's no guarantee that the handlers registered with pthread_atfork
      won't clobber errno, so we need to ensure it gets set after they are
      called.
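The ordering issue can be shown with stand-ins. Everything here is hypothetical: `fake_fork_syscall` always fails with EAGAIN and `fake_atfork_handler` deliberately overwrites errno. The sketch only demonstrates setting errno after the handlers have run.

```c
#include <errno.h>

/* Stand-in for the raw syscall: returns a negative error code. */
static long fake_fork_syscall(void) { return -EAGAIN; }

/* Stand-in for a registered handler that happens to clobber errno. */
static void fake_atfork_handler(void) { errno = 0; }

static int fork_like(void)
{
	long ret = fake_fork_syscall();   /* keep the raw result in a local */
	fake_atfork_handler();            /* handlers run before errno is set */
	if (ret < 0) {
		errno = (int)-ret;            /* set last, so handlers cannot overwrite it */
		return -1;
	}
	return (int)ret;
}
```

If errno were assigned before the handler call, the caller would see the handler's value instead of the reason the call failed.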
    • add iso-2022-jp support (decoding only) to iconv · a39f20bf
      Rich Felker committed
      this implementation aims to match the baseline defined by rfc1468 (the
      original mime charset definition) plus the halfwidth katakana
      extension included in the whatwg definition of the charset. rejection
      of si/so controls and newlines in doublebyte state is not currently
      enforced. the jis x 0201 mode is currently interpreted as having the
      yen sign and overline character in place of backslash and tilde; ascii
      mode has the standard ascii characters in those slots.
    • add iconv framework for decoding stateful encodings · 5b546faa
      Rich Felker committed
      assuming pointers obtained from malloc have some nonzero alignment,
      repurpose the low bit of iconv_t as an indicator that the descriptor
      is a stateless value representing the source and destination character
      encodings.
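The low-bit tagging idea can be sketched as follows. The type `desc_t`, the helper names and the exact bit layout (source index shifted by 16, destination index shifted by 1) are assumptions made for the example and are not taken from the commit.

```c
#include <stdint.h>
#include <stdlib.h>

typedef void *desc_t;   /* hypothetical stand-in for iconv_t */

/* Packs two small charset indices into a pointer-sized value with the
 * low bit set. Aligned malloc results always have the low bit clear. */
static desc_t desc_pack(unsigned to, unsigned from)
{
	return (desc_t)((uintptr_t)from << 16 | (uintptr_t)to << 1 | 1);
}

static int desc_is_stateless(desc_t d) { return (uintptr_t)d & 1; }
static unsigned desc_to(desc_t d)      { return (uintptr_t)d >> 1 & 0x7fff; }
static unsigned desc_from(desc_t d)    { return (unsigned)((uintptr_t)d >> 16); }

/* Only allocated (stateful) descriptors own memory that must be freed. */
static void desc_close(desc_t d)
{
	if (!desc_is_stateless(d)) free(d);
}
```

The benefit is that stateless conversions need no allocation at all, while stateful ones can carry a heap-allocated state block behind the same handle type.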
    • simplify/optimize iconv utf-8 case · 0df5b39a
      Rich Felker committed
      the special case where mbrtowc returns 0 but consumed 1 byte of input
      does not need to be considered, because the short-circuit for low
      bytes already covered that case.
    • handle ascii range individually in each iconv case · 9eb6dd51
      Rich Felker committed
      short-circuiting low bytes before the switch precluded support for
      character encodings that don't coincide with ascii in this range. this
      limitation affected iso-2022 encodings, which use the esc byte to
      introduce a shift sequence, and things like ebcdic.
  13. 10 Nov, 2017 4 commits
    • move iconv_close to its own translation unit · bff59d13
      Rich Felker committed
      this is in preparation to support stateful conversion descriptors,
      which are necessarily allocated and thus must be freed in iconv_close.
      putting it in a separate TU will avoid pulling in free if iconv_close
      is not referenced.
    • refactor iconv conversion descriptor encoding/decoding · 79f49eff
      Rich Felker committed
      this change is made to avoid having assumptions about the encoding
      spread out across the file, and to facilitate future change to a form
      that can accommodate allocated, stateful descriptors when needed.
      
      this commit should not produce any functional changes; with the
      compiler tested the only change to code generation was minor
      reordering of local variables on stack.
    • fix getaddrinfo error code for non-numeric service with AI_NUMERICSERV · 30fdda6c
      A. Wilcox committed
      If AI_NUMERICSERV is specified and a numeric service was not provided,
      POSIX mandates getaddrinfo return EAI_NONAME. EAI_SERVICE is only for
      services that cannot be used on the specified socket type.
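The behavior can be observed with a small wrapper around getaddrinfo. The helper name `lookup_numeric_service` is hypothetical; the hint values other than AI_NUMERICSERV are choices made for the example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Resolves a service with AI_NUMERICSERV set and returns the
 * getaddrinfo result code. Any result list is freed before returning. */
static int lookup_numeric_service(const char *service)
{
	struct addrinfo hints, *res = NULL;
	memset(&hints, 0, sizeof hints);
	hints.ai_family = AF_UNSPEC;
	hints.ai_socktype = SOCK_STREAM;
	hints.ai_flags = AI_NUMERICSERV | AI_PASSIVE;
	int rc = getaddrinfo(NULL, service, &hints, &res);
	if (rc == 0) freeaddrinfo(res);
	return rc;
}
```

On an implementation that follows the POSIX wording quoted above, passing a name such as `"http"` should yield EAI_NONAME rather than EAI_SERVICE, because the flag forbids name lookup of the service.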
    • fix mismatched type of __pthread_tsd_run_dtors weak definition · 67b29947
      Rich Felker committed
      commit a6054e3c changed this function
      not to take an argument, but the weak definition used by timer_create
      was not updated to match.
      
      reported by Pascal Cuoq.
  14. 06 Nov, 2017 3 commits