1. 27 May, 2015 2 commits
    • overhaul locale internals to treat categories roughly uniformly · 61a3364d
      Rich Felker committed
      previously, LC_MESSAGES was treated specially as the only category
      which could be set to a locale name without a definition file, in
      order to facilitate gettext message translations when no libc locale
      was available. LC_NUMERIC was completely un-settable, and LC_CTYPE
      stored a flag intended to be used for a possible future byte-based C
      locale, instead of storing a __locale_map pointer like the other
      categories use.
      
      this patch changes all categories to be represented by pointers to
      __locale_map structures, and allows locale names without definition
      files to be treated as valid locales with trivial definition when used
      in any category. outwardly visible functional changes should be minor,
      limited mainly to the strings read back from setlocale and the way
      gettext handles translations in categories other than LC_MESSAGES.
      
      various internal refactoring has also been performed, and improvements
      in const correctness have been made.
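The uniform representation described in this commit can be pictured with a small sketch. The type and function names below (`locale_map_sketch`, `locale_sketch`, `category_name`), the 15-byte name limit, and the convention that a null pointer stands for the builtin "C" locale are all assumptions made for illustration; they are not taken from the actual libc sources.

```c
#include <stddef.h>

#define NAME_MAX_SKETCH 15  /* assumed maximum locale name length */
#define NCATS 6             /* assumed number of locale categories */

/* Hypothetical per-category record: a name plus an optional mapped file. */
struct locale_map_sketch {
    const void *map;        /* mapped definition file, or NULL if the name has none */
    size_t map_size;        /* size of the mapping in bytes, 0 if there is none */
    char name[NAME_MAX_SKETCH + 1];
};

/* Hypothetical locale object: one pointer per category, all treated alike. */
struct locale_sketch {
    const struct locale_map_sketch *cat[NCATS];
};

/* Name to report for a category; a null pointer is assumed to mean "C". */
static const char *category_name(const struct locale_sketch *loc, int cat)
{
    if (cat < 0 || cat >= NCATS) return NULL;
    return loc->cat[cat] ? loc->cat[cat]->name : "C";
}
```

With this shape, a name that has no definition file is simply a record whose `map` is NULL, so it can be stored in any category without special cases.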
    • replace atomics with locks in locale-setting code · 63c188ec
      Rich Felker committed
      this is part of a general program of removing direct use of atomics
      where they are not necessary to meet correctness or performance needs,
      but in this case it's also an optimization. only the global locale
      needs synchronization; allocated locales referenced with locale_t
      handles are immutable during their lifetimes, and using atomics to
      initialize them increases their cost of setup.
  2. 04 Mar, 2015 1 commit
    • make all objects used with atomic operations volatile · 56fbaa3b
      Rich Felker committed
      the memory model we use internally for atomics permits plain loads of
      values which may be subject to concurrent modification without
      requiring that a special load function be used. since a compiler is
      free to make transformations that alter the number of loads or the way
      in which loads are performed, the compiler is theoretically free to
      break this usage. the most obvious concern is with atomic cas
      constructs: something of the form tmp=*p;a_cas(p,tmp,f(tmp)); could be
      transformed to a_cas(p,*p,f(*p)); where the latter is intended to show
      multiple loads of *p whose resulting values might fail to be equal;
      this would break the atomicity of the whole operation. but even more
      fundamental breakage is possible.
      
      with the changes being made now, objects that may be modified by
      atomics are modeled as volatile, and the atomic operations performed
      on them by other threads are modeled as asynchronous stores by
      hardware which happens to be acting on the request of another thread.
      such modeling of course does not itself address memory synchronization
      between cores/cpus, but that aspect was already handled. this all
      seems less than ideal, but it's the best we can do without mandating a
      C11 compiler and using the C11 model for atomics.
      
      in the case of pthread_once_t, the ABI type of the underlying object
      is not volatile-qualified. so we are assuming that accessing the
      object through a volatile-qualified lvalue via casts yields volatile
      access semantics. the language of the C standard is somewhat unclear
      on this matter, but this is an assumption the linux kernel also makes,
      and seems to be the correct interpretation of the standard.
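The cas pattern from this commit message can be sketched as follows. The `a_cas` wrapper here is a hypothetical stand-in built on the GCC/Clang builtin `__sync_val_compare_and_swap`, not the libc-internal primitive, and `next_value` is an arbitrary example of the function f. The point is that the volatile qualifier forces exactly one load of `*p` per iteration, so the value compared and the value passed to f are the same.

```c
/* Hypothetical stand-in for an internal a_cas: stores desired in *p if
 * *p equals expected, and returns the value *p held before the call. */
static int a_cas(volatile int *p, int expected, int desired)
{
    return __sync_val_compare_and_swap(p, expected, desired);
}

/* Arbitrary example of f: any pure function of the old value works. */
static int next_value(int v)
{
    return v * 2 + 1;
}

/* Atomically replace *p with next_value(*p); returns the value stored. */
static int atomic_update(volatile int *p)
{
    int old;
    do {
        /* One load through a volatile lvalue: the compiler may not
         * re-read *p when computing the arguments to a_cas below. */
        old = *p;
    } while (a_cas(p, old, next_value(old)) != old);
    return next_value(old);
}
```

If another thread changes `*p` between the load and the cas, the cas returns a value different from `old` and the loop retries with a fresh load.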
  3. 01 Aug, 2014 1 commit
    • harden locale name handling and prevent slashes in LC_MESSAGES · 5059deb1
      Rich Felker committed
      the code which loads locale files was already rejecting locale names
      containing slashes. however, LC_MESSAGES records a locale name even if
      libc does not have a matching locale file, so that gettext or
      application code can use the recorded locale name for message
      translations to languages that libc does not support. this recorded
      name was not being checked for slashes, meaning that such code could
      potentially be tricked into directory traversal.
      
      in addition, since the value of a locale category is sometimes used as
      a pathname component by callers, the improved code rejects any value
      beginning with a dot. this prevents traversal to the parent directory
      via "..", use of the top-level locale directory via ".", and also
      avoids "hidden" directories as a side effect.
      
      finally, overly long locale names are now rejected (treated as an
      unrecognized name and thus as an alias for C.UTF-8) rather than being
      truncated.
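The three rejection rules above (no slashes, no leading dot, no overlong names) can be sketched as one check. The function name `locale_name_ok` and the 15-byte limit are assumptions for illustration; the limit is borrowed from the "currently 15 bytes" figure in the locale framework commit and may not match the value in effect at this commit.

```c
#include <stddef.h>
#include <string.h>

#define LOCALE_NAME_MAX_SKETCH 15  /* assumed limit, see the note above */

/* Returns 1 if name passes the three checks, 0 if it must be rejected. */
static int locale_name_ok(const char *name)
{
    if (name[0] == '.') return 0;         /* blocks ".", "..", hidden dirs */
    if (strchr(name, '/')) return 0;      /* blocks directory traversal */
    if (strlen(name) > LOCALE_NAME_MAX_SKETCH)
        return 0;                         /* reject rather than truncate */
    return 1;
}
```

Per the commit message, a caller would treat a rejected name as unrecognized and thus as an alias for C.UTF-8, rather than passing it on as a pathname component.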
  4. 26 Jul, 2014 1 commit
    • implement mo file string lookup for translations · 41421d6b
      Rich Felker committed
      the core is based on a binary search; hash table is not used. both
      native and reverse-endian mo files are supported. all offsets read
      from the mapped mo file are checked against the mapping size to
      prevent the possibility of reads outside the mapping.
      
      this commit has no observable effects since there are not yet any
      callers to the message translation code.
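A bounds-checked binary search over a mapped mo file might look roughly like the sketch below. It is an independent illustration rather than the code from this commit: the names `mo_lookup_sketch`, `word`, and `entry` are hypothetical, and the layout assumed is the GNU mo format (magic 0x950412de, string count in word 2, offsets of the original and translation tables in words 3 and 4, each table entry a length/offset pair).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MO_MAGIC 0x950412deu

static uint32_t bswap32(uint32_t x)
{
    return x >> 24 | (x >> 8 & 0xff00) | (x << 8 & 0xff0000) | x << 24;
}

/* Read 32-bit word number i of the file; the caller guarantees it is in range. */
static uint32_t word(const unsigned char *p, size_t i, int swap)
{
    uint32_t v;
    memcpy(&v, p + 4 * i, 4);
    return swap ? bswap32(v) : v;
}

/* Fetch entry i of a (length, offset) table, or NULL if the string would
 * extend past the mapping or is not NUL-terminated at its stated length. */
static const char *entry(const unsigned char *p, size_t size,
                         uint32_t table, uint32_t i, int swap)
{
    uint32_t len = word(p, (size_t)table / 4 + 2 * (size_t)i, swap);
    uint32_t off = word(p, (size_t)table / 4 + 2 * (size_t)i + 1, swap);
    if (off >= size || len >= size - off || p[off + len] != 0) return NULL;
    return (const char *)p + off;
}

const char *mo_lookup_sketch(const void *map, size_t size, const char *msgid)
{
    const unsigned char *p = map;
    int swap;
    if (size < 28) return NULL;                    /* header is 7 words */
    uint32_t magic = word(p, 0, 0);
    if (magic == MO_MAGIC) swap = 0;               /* native byte order */
    else if (magic == bswap32(MO_MAGIC)) swap = 1; /* reverse-endian file */
    else return NULL;
    uint32_t n = word(p, 2, swap);
    uint32_t orig = word(p, 3, swap), trans = word(p, 4, swap);
    /* Both tables (8 bytes per entry) must fit in the mapping, 4-byte aligned. */
    if (n > size / 8 || orig > size - 8 * (size_t)n
        || trans > size - 8 * (size_t)n || (orig | trans) % 4)
        return NULL;
    uint32_t lo = 0, hi = n;
    while (lo < hi) {                              /* originals are sorted */
        uint32_t mid = lo + (hi - lo) / 2;
        const char *key = entry(p, size, orig, mid, swap);
        if (!key) return NULL;
        int c = strcmp(msgid, key);
        if (c == 0) return entry(p, size, trans, mid, swap);
        if (c < 0) hi = mid; else lo = mid + 1;
    }
    return NULL;
}
```

Every offset is compared against `size` before it is dereferenced, so a truncated or corrupt file yields NULL instead of a read outside the mapping.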
  5. 24 Jul, 2014 2 commits
  6. 03 Jul, 2014 1 commit
    • add locale framework · 0bc03091
      Rich Felker committed
      this commit adds non-stub implementations of setlocale, duplocale,
      newlocale, and uselocale, along with the data structures and minimal
      code needed for representing the active locale on a per-thread basis
      and optimizing the common case where thread-local locale settings are
      not in use.
      
      at this point, the data structures only contain what is necessary to
      represent LC_CTYPE (a single flag) and LC_MESSAGES (a name for use in
      finding message translation files). representation for the other
      categories will be added later; the expectation is that a single
      pointer will suffice for each.
      
      for LC_CTYPE, the strings "C" and "POSIX" are treated as special; any
      other string is accepted and treated as "C.UTF-8". for other
      categories, any string is accepted after being truncated to a maximum
      supported length (currently 15 bytes). for LC_MESSAGES, the name is
      kept regardless of whether libc itself can use such a message
      translation locale, since applications using catgets or gettext should
      be able to use message locales libc is not aware of. for other
      categories, names which are not successfully loaded as locales (which,
      at present, means all names) are treated as aliases for "C". setlocale
      never fails.
      
      locale settings are not yet used anywhere, so this commit should have
      no visible effects except for the contents of the string returned by
      setlocale.
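From the caller's side, the per-thread locale interfaces named in this commit are used roughly as in the sketch below. It relies only on the standard POSIX functions `newlocale`, `uselocale`, and `freelocale`; the helper name `thread_locale_roundtrip` is hypothetical, and nothing here reflects how libc implements these functions internally.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>

/* Install a locale for the calling thread only, confirm that it is the
 * active one, then restore the previous setting.
 * Returns 1 if the new locale was active, 0 if not, -1 on failure. */
static int thread_locale_roundtrip(const char *name)
{
    locale_t loc = newlocale(LC_ALL_MASK, name, (locale_t)0);
    if (loc == (locale_t)0) return -1;

    locale_t prev = uselocale(loc);      /* affects this thread only */

    /* Passing (locale_t)0 queries the current setting without changing it. */
    int active = (uselocale((locale_t)0) == loc);

    uselocale(prev);                     /* restore before freeing loc */
    freelocale(loc);
    return active;
}
```

A thread that never calls `uselocale` keeps using the global locale set by `setlocale`, which is the common case the commit message says it optimizes for.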