1. 26 May, 2015 1 commit
    • move call to dynamic linker stage-3 into stage-2 function · 768b82c6
      Committed by Rich Felker
      this move eliminates a duplicate "by-hand" symbol lookup loop from the
      stage-1 code and replaces it with a call to find_sym, which can be
      used once we're in stage 2. it reduces the size of the stage 1 code,
      which is helpful because stage 1 will become the crt start file for
      static-PIE executables, and it will allow stage 3 to access stage 2's
      automatic storage, which will be important in an upcoming commit.
  2. 19 May, 2015 1 commit
    • reprocess libc/ldso RELA relocations in stage 3 of dynamic linking · c093e2e8
      Committed by Rich Felker
      this fixes a regression on powerpc that was introduced in commit
      f3ddd173. global data accesses on
      powerpc seem to be using a translation-unit-local GOT filled via
      R_PPC_ADDR32 relocations rather than R_PPC_GLOB_DAT. being a non-GOT
      relocation type, these were not reprocessed after adding the main
      application and its libraries to the chain, causing libc code not to
      see copy relocations in the main program, and therefore to use the
      pre-copy-relocation addresses for global data objects (like environ).
      
      the motivation for the dynamic linker only reprocessing GOT/PLT
      relocation types in stage 3 is that these types always have a zero
      addend, making them safe to process again even if the storage for the
      addend has been clobbered. other relocation types which can be used
      for address constants in initialized data objects may have non-zero
      addends which will be clobbered during the first pass of relocation
      processing if they're stored inline (REL form) rather than out-of-line
      (RELA form).
      
      powerpc generally uses only RELA, so this patch is sufficient to fix
      the regression in practice, but is not fully general, and would not
      suffice if an alternate toolchain generated REL for powerpc.
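
The addend problem described above can be illustrated with a small model. This is a sketch, not musl's code: `struct reloc`, `apply_reloc`, and the word-indexed `image` array are hypothetical names invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified relocation record (not musl's actual types). */
struct reloc {
    size_t offset;     /* index of the word in the image to patch */
    int has_addend;    /* 1 = RELA (addend in the record), 0 = REL (addend stored in place) */
    uintptr_t addend;  /* only meaningful for RELA */
};

/* Apply one "symbol value + addend" relocation to the image. */
static void apply_reloc(uintptr_t *image, const struct reloc *r, uintptr_t sym_val)
{
    /* REL reads its addend from the very word it is about to overwrite. */
    uintptr_t addend = r->has_addend ? r->addend : image[r->offset];
    image[r->offset] = sym_val + addend;
}
```

Under this model, applying a RELA record a second time with a new symbol value gives the correct result, while a second pass over a REL record adds the new symbol value to the already-relocated word, because the original addend has been overwritten.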
  3. 22 April, 2015 2 commits
  4. 19 April, 2015 1 commit
    • make dlerror state and message thread-local and dynamically-allocated · 01d42747
      Committed by Rich Felker
      this fixes truncation of error messages containing long pathnames or
      symbol names.
      
      the dlerror state was previously required by POSIX to be global. the
      resolution of bug 97 relaxed the requirements to allow thread-safe
      implementations of dlerror with thread-local state and message buffer.
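
A minimal sketch of the idea, assuming a C11 compiler with `_Thread_local`: the message is sized with a first `vsnprintf` pass and then allocated, so long names are not truncated. The names `set_error`, `get_error`, `err_msg`, and `err_pending` are hypothetical and do not come from musl.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-thread error state. */
static _Thread_local char *err_msg;
static _Thread_local int err_pending;

/* Record a formatted error message of arbitrary length for this thread. */
static void set_error(const char *fmt, ...)
{
    va_list ap, ap2;
    va_start(ap, fmt);
    va_copy(ap2, ap);
    int len = vsnprintf(NULL, 0, fmt, ap);  /* first pass: measure only */
    va_end(ap);
    free(err_msg);
    err_msg = len < 0 ? NULL : malloc((size_t)len + 1);
    if (err_msg) vsnprintf(err_msg, (size_t)len + 1, fmt, ap2);
    va_end(ap2);
    err_pending = 1;
}

/* Return the pending message once, then NULL until the next error. */
static const char *get_error(void)
{
    if (!err_pending) return NULL;
    err_pending = 0;
    return err_msg ? err_msg : "out of memory";
}
```

Because both variables are thread-local, an error recorded in one thread is not visible to, or overwritten by, another thread.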
  5. 18 April, 2015 2 commits
  6. 14 April, 2015 6 commits
    • 72b25ddb
    • fix inconsistent visibility for internal __tls_get_new function · bc081f62
      Committed by Rich Felker
      at the point of call it was declared hidden, but the definition was
      not hidden. for some toolchains this inconsistency produced textrels
      without ld-time binding.
    • remove initializers for decoded aux/dyn arrays in dynamic linker · f4f9562c
      Committed by Rich Felker
      the zero initialization is redundant since decode_vec does its own
      clearing, and it increases the risk that buggy compilers will generate
      calls to memset. as long as symbols are bound at ld time, such a call
      will not break anything, but it may be desirable to turn off ld-time
      binding in the future.
    • remove remnants of support for running in no-thread-pointer mode · 19a1fe67
      Committed by Rich Felker
      since 1.1.0, musl has nominally required a thread pointer to be set up.
      most of the remaining code that was checking for its availability was
      doing so for the sake of being usable by the dynamic linker. as of
      commit 71f099cb, this is no longer
      necessary; the thread pointer is now valid before any libc code
      (outside of dynamic linker bootstrap functions) runs.
      
      this commit essentially concludes "phase 3" of the "transition path
      for removing lazy init of thread pointer" project that began during
      the 1.1.0 release cycle.
    • move thread pointer setup to beginning of dynamic linker stage 3 · 71f099cb
      Committed by Rich Felker
      this allows the dynamic linker itself to run with a valid thread
      pointer, which is a prerequisite for stack protector on archs where
      the ssp canary is stored in TLS. it will also allow us to remove some
      remaining runtime checks for whether the thread pointer is valid.
      
      as long as the application and its libraries do not require additional
      size or alignment, this early thread pointer will be kept and reused
      at runtime. otherwise, a new static TLS block is allocated after
      library loading has finished and the thread pointer is switched over.
    • stabilize dynamic linker's layout of static TLS · 0f66fcec
      Committed by Rich Felker
      previously, the layout of the static TLS block was perturbed by the
      size of the dtv; dtv size increasing from 0 to 1 perturbed both TLS
      arch types, and the TLS-above-TP type's layout was perturbed by the
      specific number of dtv slots (libraries with TLS). this behavior made
      it virtually impossible to set up a tentative thread pointer address
      before loading libraries and keep it unchanged as long as the
      libraries' TLS size/alignment requirements fit.
      
      the new code fixes the location of the dtv and pthread structure at
      opposite ends of the static TLS block so that they will not move
      unless size or alignment changes.
  7. 13 April, 2015 1 commit
    • dynamic linker bootstrap overhaul · f3ddd173
      Committed by Rich Felker
      this overhaul further reduces the amount of arch-specific code needed
      by the dynamic linker and removes a number of assumptions, including:
      
      - that symbolic function references inside libc are bound at link time
        via the linker option -Bsymbolic-functions.
      
      - that libc functions used by the dynamic linker do not require
        access to data symbols.
      
      - that static/internal function calls and data accesses can be made
        without performing any relocations, or that arch-specific startup
        code handled any such relocations needed.
      
      removing these assumptions paves the way for allowing libc.so itself
      to be built with stack protector (among other things), and is achieved
      by a three-stage bootstrap process:
      
      1. relative relocations are processed with a flat function.
      2. symbolic relocations are processed with no external calls/data.
      3. main program and dependency libs are processed with a
         fully-functional libc/ldso.
      
      reduction in arch-specific code is achieved through the following:
      
      - crt_arch.h, used for generating crt1.o, now provides the entry point
        for the dynamic linker too.
      
      - asm is no longer responsible for skipping the beginning of argv[]
        when ldso is invoked as a command.
      
      - the functionality previously provided by __reloc_self for heavily
        GOT-dependent RISC archs is now the arch-agnostic stage-1.
      
      - arch-specific relocation type codes are mapped directly as macros
        rather than via an inline translation function/switch statement.
  8. 04 April, 2015 2 commits
    • fix rpath string memory leak on failed dlopen · 07709625
      Committed by Rich Felker
      when dlopen fails, all partially-loaded libraries need to be unmapped
      and freed. any of these libraries using an rpath with $ORIGIN
      expansion may have an allocated string for the expanded rpath;
      previously, this string was not freed when freeing the library data
      structures.
    • halt dynamic linker library search on errors resolving $ORIGIN in rpath · 2963a9f7
      Committed by Rich Felker
      this change hardens the dynamic linker against the possibility of
      loading the wrong library due to inability to expand $ORIGIN in rpath.
      hard failures such as excessively long paths or absence of /proc (when
      resolving /proc/self/exe for the main executable's origin) do not stop
      the path search, but memory allocation failures and any other
      potentially transient failures do.
      
      to implement this change, the meaning of the return value of
      fixup_rpath function is changed. returning zero no longer indicates
      that the dso's rpath string pointer is non-null; instead, the caller
      needs to check. a return value of -1 indicates a failure that should
      stop further path search.
  9. 02 April, 2015 1 commit
    • harden dynamic linker library path search · 5d1c8c99
      Committed by Rich Felker
      transient errors during the path search should not allow the search to
      continue and possibly open the wrong file. this patch eliminates most
      conditions where that could happen, but there is still a possibility
      that $ORIGIN-based rpath processing will have an allocation failure,
      causing the search to skip such a path. fixing this is left as a
      separate task.
      
      a small bug where overly-long path components caused an infinite loop
      rather than being skipped/ignored is also fixed.
  10. 12 March, 2015 1 commit
    • copy the dtv pointer to the end of the pthread struct for TLS_ABOVE_TP archs · 204a69d2
      Committed by Szabolcs Nagy
      There are two main abi variants for thread local storage layout:
      
       (1) TLS is above the thread pointer at a fixed offset and the pthread
       struct is below that. So the end of the struct is at known offset.
      
       (2) the thread pointer points to the pthread struct and TLS starts
       below it. So the start of the struct is at known (zero) offset.
      
      Assembly code for the dynamic TLSDESC callback needs to access the
      dynamic thread vector (dtv) pointer which is currently at the front
      of the pthread struct. So in case of (1) the asm code needs to hard
      code the offset from the end of the struct which can easily break if
      the struct changes.
      
      This commit adds a copy of the dtv at the end of the struct. New members
      must not be added after dtv_copy, only before it. The size of the struct
      is increased a bit, but there is opportunity for size optimizations.
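
The layout constraint can be sketched with a toy struct. `struct thread_model` and `dtv_from_end` are hypothetical stand-ins, not the real `struct pthread`; the point is only that a member kept last sits at a fixed offset from the end regardless of what is added before it.

```c
#include <stddef.h>

/* Hypothetical stand-in for the thread structure. */
struct thread_model {
    void **dtv;        /* at the front: known offset from the start (variant 2) */
    int tid;           /* placeholder members that may grow or change */
    char scratch[24];
    void **dtv_copy;   /* must stay last: known offset from the end (variant 1) */
};

/* Compile-time check that nothing follows dtv_copy. */
_Static_assert(offsetof(struct thread_model, dtv_copy) + sizeof(void **)
               == sizeof(struct thread_model),
               "dtv_copy must be the last member");

/* Fetch the dtv given only a pointer to the end of the struct,
 * which is all that code on a TLS-above-TP arch can rely on. */
static void **dtv_from_end(void *end)
{
    return ((void ***)end)[-1];
}
```

With this arrangement the lookup from the end of the struct needs no knowledge of the struct's size or of the members in the middle.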
  11. 07 March, 2015 1 commit
    • fix over-alignment of TLS, insufficient builtin TLS on 64-bit archs · bd67959f
      Committed by Rich Felker
      a conservative estimate of 4*sizeof(size_t) was used as the minimum
      alignment for thread-local storage, despite the only requirements
      being alignment suitable for struct pthread and void* (which struct
      pthread already contains). additional alignment required by the
      application or libraries is encoded in their headers and is already
      applied.
      
      over-alignment prevented the builtin_tls array from ever being used in
      dynamic-linked programs on 64-bit archs, thereby requiring allocation
      at startup even in programs with no TLS of their own.
  12. 04 March, 2015 1 commit
    • make all objects used with atomic operations volatile · 56fbaa3b
      Committed by Rich Felker
      the memory model we use internally for atomics permits plain loads of
      values which may be subject to concurrent modification without
      requiring that a special load function be used. since a compiler is
      free to make transformations that alter the number of loads or the way
      in which loads are performed, the compiler is theoretically free to
      break this usage. the most obvious concern is with atomic cas
      constructs: something of the form tmp=*p;a_cas(p,tmp,f(tmp)); could be
      transformed to a_cas(p,*p,f(*p)); where the latter is intended to show
      multiple loads of *p whose resulting values might fail to be equal;
      this would break the atomicity of the whole operation. but even more
      fundamental breakage is possible.
      
      with the changes being made now, objects that may be modified by
      atomics are modeled as volatile, and the atomic operations performed
      on them by other threads are modeled as asynchronous stores by
      hardware which happens to be acting on the request of another thread.
      such modeling of course does not itself address memory synchronization
      between cores/cpus, but that aspect was already handled. this all
      seems less than ideal, but it's the best we can do without mandating a
      C11 compiler and using the C11 model for atomics.
      
      in the case of pthread_once_t, the ABI type of the underlying object
      is not volatile-qualified. so we are assuming that accessing the
      object through a volatile-qualified lvalue via casts yields volatile
      access semantics. the language of the C standard is somewhat unclear
      on this matter, but this is an assumption the linux kernel also makes,
      and seems to be the correct interpretation of the standard.
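
The cas pattern from the message can be sketched as follows. This is an illustration only: `a_cas` is modeled here with the GCC/Clang `__sync_val_compare_and_swap` builtin rather than musl's per-arch implementation, and `fetch_and_double` and `load_once_state` are hypothetical helpers.

```c
/* Assumed model of a_cas: returns the value *p held before the operation. */
static int a_cas(volatile int *p, int t, int s)
{
    return __sync_val_compare_and_swap(p, t, s);
}

/* tmp=*p; a_cas(p,tmp,f(tmp)); with f(x) = 2*x, retried until it succeeds.
 * Since *p is volatile, the compiler must perform exactly one load per
 * iteration and cannot re-read *p in place of old. */
static int fetch_and_double(volatile int *p)
{
    int old;
    do old = *p;
    while (a_cas(p, old, old * 2) != old);
    return old;
}

/* An object whose ABI type is not volatile-qualified, read through a
 * volatile-qualified lvalue via a cast, as described for pthread_once_t. */
static int load_once_state(int *control)
{
    return *(volatile int *)control;
}
```

As the message notes, volatile only constrains how the compiler performs the loads; memory synchronization between cores is a separate concern handled by the atomic primitives themselves.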
  13. 24 November, 2014 1 commit
    • adapt dynamic linker for new binutils versions that omit DT_RPATH · d8dc2b7c
      Committed by Rich Felker
      the new DT_RUNPATH semantics for search order are always used, and
      since binutils had always set both DT_RPATH and DT_RUNPATH when the
      latter was used, processing only DT_RPATH worked fine. however, recent
      binutils has stopped generating DT_RPATH when DT_RUNPATH is used,
      which broke support for this feature completely.
  14. 19 November, 2014 1 commit
  15. 08 August, 2014 2 commits
  16. 11 July, 2014 2 commits
  17. 30 June, 2014 2 commits
    • fix regression in mips dynamic linker · 2d8cc92a
      Committed by Rich Felker
      this issue caused the address of functions in shared libraries to
      resolve to their PLT thunks in the main program rather than their
      correct addresses. it was observed causing crashes, though the
      mechanism of the crash was not thoroughly investigated. since the
      issue is very subtle, it calls for some explanation:
      
      on all well-behaved archs, GOT entries that belong to the PLT use a
      special relocation type, typically called JMP_SLOT, so that the
      dynamic linker can avoid having the jump destinations for the PLT
      resolve to PLT thunks themselves (they also provide a definition for
      the symbol, which must be used whenever the address of the function is
      taken so that all DSOs see the same address).
      
      however, the traditional mips PIC ABI lacked such a JMP_SLOT
      relocation type, presumably because, due to the way PIC works, the
      address of the PLT thunk was never needed and could always be ignored.
      
      prior to commit adf94c19, the mips
      version of reloc.h contained a hack that caused all symbol lookups to
      be treated like JMP_SLOT, inhibiting undefined symbols from ever being
      used to resolve symbolic relocations. this hack goes all the way back
      to commit babf8201, when the mips
      dynamic linker was first made usable.
      
      during the recent refactoring to eliminate arch-specific relocation
      processing (commit adf94c19), this
      hack was overlooked and no equivalent functionality was provided in
      the new code.
      
      fixing the problem is not as simple as adding back an equivalent hack,
      since there is now also a "non-PIC ABI" that can be used for the main
      executable, which actually does use a PLT. the closest thing to
      official documentation I could find for this ABI is nonpic.txt,
      attached to Message-ID: 20080701202236.GA1534@caradoc.them.org, which
      can be found in the gcc mailing list archives and elsewhere. per this
      document, undefined symbols corresponding to PLT thunks have the
      STO_MIPS_PLT bit set in the symbol's st_other field. thus, I have
      added an arch-specific rule for mips, applied at the find_sym level
      rather than the relocation level, to reject undefined symbols with the
      STO_MIPS_PLT bit clear.
      
      the previous hack of treating all mips relocations as JMP_SLOT-like,
      rather than rejecting the unwanted symbols in find_sym, probably also
      caused dlsym to wrongly return PLT thunks in place of the correct
      address of a function under at least some conditions. this should now
      be fixed, at least for global-scope symbol lookups.
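
The mips-specific rule can be sketched as a small predicate. `sym_acceptable` and its parameters are hypothetical; only `STO_MIPS_PLT` (0x8 in the usual `elf.h` definitions) and the `st_shndx`/`st_other` field names come from ELF. The actual check in find_sym is structured differently.

```c
/* Value from the common elf.h definition; redefined here to stay self-contained. */
#define STO_MIPS_PLT 0x8

/* Decide whether a symbol table entry may satisfy a lookup on mips.
 * st_shndx == 0 means the symbol is undefined in this object. */
static int sym_acceptable(unsigned st_shndx, unsigned char st_other)
{
    if (st_shndx != 0)
        return 1;                          /* defined symbol: usable */
    return (st_other & STO_MIPS_PLT) != 0; /* undefined: only if it names a PLT thunk */
}
```

An undefined symbol without the bit is rejected, so the lookup continues to the next object in the chain instead of resolving to it.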
    • fix regression in dynamic linker error reporting · 9a4ad022
      Committed by Rich Felker
      due to a mistake when refactoring the error printing for the dynamic
      linker (commit 7c73cacd), all messages
      were suppressed and replaced by blank lines.
  18. 19 June, 2014 3 commits
    • separate __tls_get_addr implementation from dynamic linker/init_tls · 5ba238e1
      Committed by Rich Felker
      such separation serves multiple purposes:
      
      - by having the common path for __tls_get_addr alone in its own
        function with a tail call to the slow case, code generation is
        greatly improved.
      
      - by having __tls_get_addr in its own file, it can be replaced on a
        per-arch basis as needed, for optimization or ABI-specific purposes.
      
      - by removing __tls_get_addr from __init_tls.c, a few bytes of code
        are shaved off of static binaries (which are unlikely to use this
        function unless the linker messed up).
    • change dynamic TLS installation strategy to optimize access · e75b16cf
      Committed by Rich Felker
      previously, accesses to dynamic TLS had to check two conditions before
      being able to use a dtv slot: (1) that the module index was within the
      bounds of the current dtv size, and (2) that the dynamic tls for the
      requested module index was already installed in the dtv.
      
      this commit changes the installation strategy so that, whenever an
      attempt is made to access dynamic TLS that's not yet installed in the
      dtv, the dynamic TLS for all lower-index modules is also installed.
      thus it provides a new invariant: if a given module index is within
      the bounds of the current dtv size, we automatically know that its TLS
      is installed and directly available. the requirement that the second
      condition (above) be checked is eliminated.
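
The new invariant can be sketched with a toy model. Everything here is hypothetical (`struct thread`, `tls_get_addr`, `tls_get_new`, the fixed module sizes); it assumes `dtv[0]` holds the number of installed slots and that module indices are in the range 1..MAX_MODULES.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_MODULES 3

/* Hypothetical per-thread state: dtv[0] is the count of installed slots,
 * dtv[1..n] point at each module's TLS block. */
struct thread {
    void *dtv[MAX_MODULES + 1];
};

/* Hypothetical TLS sizes for modules 1..3 (index 0 unused). */
static const size_t module_tls_size[MAX_MODULES + 1] = { 0, 16, 32, 8 };

/* Slow path: install TLS for every not-yet-installed module up to idx. */
static void *tls_get_new(struct thread *t, size_t idx, size_t off)
{
    for (size_t i = (size_t)(uintptr_t)t->dtv[0] + 1; i <= idx; i++) {
        t->dtv[i] = calloc(1, module_tls_size[i]);
        if (!t->dtv[i]) abort();
    }
    t->dtv[0] = (void *)(uintptr_t)idx;
    return (char *)t->dtv[idx] + off;
}

/* Fast path: a single bounds check implies the slot is installed. */
static void *tls_get_addr(struct thread *t, size_t idx, size_t off)
{
    if (idx <= (size_t)(uintptr_t)t->dtv[0])
        return (char *)t->dtv[idx] + off;
    return tls_get_new(t, idx, off);
}
```

Requesting module 3 first also installs modules 1 and 2, so later accesses to any of them take the one-comparison fast path.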
    • add arch-generic support for tlsdesc relocations to dynamic linker · 9d15d5e7
      Committed by Rich Felker
      this code is non-functional without further changes to link up the
      arch-specific reloc types for tlsdesc and add asm implementations of
      __tlsdesc_static and __tlsdesc_dynamic.
  19. 18 June, 2014 2 commits
    • reduce code duplication in dynamic linker error paths · 7c73cacd
      Committed by Rich Felker
      eventually this should help make dlerror thread-safe too.
    • refactor to remove arch-specific relocation code from dynamic linker · adf94c19
      Committed by Rich Felker
      this was one of the main instances of ugly code duplication: all archs
      use basically the same types of relocations, but roughly equivalent
      logic was duplicated for each arch to account for the different naming
      and numbering of relocation types and variation in whether REL or RELA
      records are used.
      
      as an added bonus, both REL and RELA are now supported on all archs,
      regardless of which is used by the standard toolchain.
  20. 17 April, 2014 1 commit
    • add options when explicitly invoking dynamic loader · de45164e
      Committed by Rich Felker
      so far the options are --library-path and --preload which override the
      corresponding environment variables, and --list which forces the
      behavior of ldd even if the invocation name is not ldd. both the
      two-arg form and the one-arg form using an equals sign are supported.
      
      based loosely on a patch proposed by Rune.
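
Accepting both the two-arg form and the one-arg form with an equals sign can be sketched with a small matcher. `match_opt` is a hypothetical helper written for illustration and is not taken from the dynamic linker's source.

```c
#include <stddef.h>
#include <string.h>

/* If (*argv)[0] is `name value` or `name=value`, return the value and
 * advance *argv past the consumed arguments; otherwise return NULL and
 * leave *argv unchanged. */
static const char *match_opt(char ***argv, const char *name)
{
    char **a = *argv;
    size_t n = strlen(name);
    if (!a[0] || strncmp(a[0], name, n) != 0)
        return NULL;
    if (a[0][n] == '=') {            /* one-arg form: --opt=value */
        *argv = a + 1;
        return a[0] + n + 1;
    }
    if (a[0][n] == '\0' && a[1]) {   /* two-arg form: --opt value */
        *argv = a + 2;
        return a[1];
    }
    return NULL;                     /* longer option name, or value missing */
}
```

A caller would try each known option name in turn against the current argument and stop at the first argument that matches none of them, treating it as the program to run.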
  21. 26 March, 2014 5 commits
  22. 25 March, 2014 1 commit
    • always initialize thread pointer at program start · dab441ae
      Committed by Rich Felker
      this is the first step in an overhaul aimed at greatly simplifying and
      optimizing everything dealing with thread-local state.
      
      previously, the thread pointer was initialized lazily on first access,
      or at program startup if stack protector was in use, or at certain
      random places where inconsistent state could be reached if it were not
      initialized early. while believed to be fully correct, the logic was
      fragile and non-obvious.
      
      in the first phase of the thread pointer overhaul, support is retained
      (and in some cases improved) for systems/situations where loading the
      thread pointer fails, e.g. old kernels.
      
      some notes on specific changes:
      
      - the confusing use of libc.main_thread as an indicator that the
        thread pointer is initialized is eliminated in favor of an explicit
        has_thread_pointer predicate.
      
      - sigaction no longer needs to ensure that the thread pointer is
        initialized before installing a signal handler (this was needed to
        prevent a situation where the signal handler caused the thread
        pointer to be initialized and the subsequent sigreturn cleared it
        again) but it still needs to ensure that implementation-internal
        thread-related signals are not blocked.
      
      - pthread tsd initialization for the main thread is deferred in a new
        manner to minimize bloat in the static-linked __init_tp code.
      
      - pthread_setcancelstate no longer needs special handling for the
        situation before the thread pointer is initialized. it simply fails
        on systems that cannot support a thread pointer, which are
        non-conforming anyway.
      
      - pthread_cleanup_push/pop now check for missing thread pointer and
        nop themselves out in this case, so stdio no longer needs to avoid
        the cancellable path when the thread pointer is not available.
      
      a number of cases remain where certain interfaces may crash if the
      system does not support a thread pointer. at this point, these should
      be limited to pthread interfaces, and the number of such cases should
      be fewer than before.