1. 10 Apr, 2015 1 commit
    • redesign and simplify vmlock system · f08ab9e6
      Rich Felker committed
      this global lock allows certain unlock-type primitives to exclude
      mmap/munmap operations which could change the identity of virtual
      addresses while references to them still exist.
      
      the original design mistakenly assumed mmap/munmap would conversely
      need to exclude the same operations which exclude mmap/munmap, so the
      vmlock was implemented as a sort of 'symmetric recursive rwlock'. this
      turned out to be unnecessary.
      
      commit 25d12fc0 already shortened the
      interval during which mmap/munmap held their side of the lock, but
      left the inappropriate lock design and some inefficiency.
      
      the new design uses a separate function, __vm_wait, which does not
      hold any lock itself and only waits for lock users which were already
      present when it was called to release the lock. this is sufficient
      because of the way operations that need to be excluded are sequenced:
      the "unlock-type" operations using the vmlock need only block
      mmap/munmap operations that are precipitated by (and thus sequenced
      after) the atomic-unlock they perform while holding the vmlock.
      
      this allows for a spectacular lack of synchronization in the __vm_wait
      function itself.
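
The design described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: it assumes C11 atomics and a `sched_yield` polling loop in place of the futex-based waiting a real implementation would likely use, and every `*_sketch` name is hypothetical.

```c
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical count of threads currently inside a vmlock-protected
 * region. The name is an assumption, not an identifier from the source. */
static atomic_int vm_users_sketch;

/* Taken by an "unlock-type" operation before it performs its atomic unlock. */
void vm_lock_sketch(void)
{
    atomic_fetch_add(&vm_users_sketch, 1);
}

/* Dropped once the operation no longer touches the object's memory. */
void vm_unlock_sketch(void)
{
    atomic_fetch_sub(&vm_users_sketch, 1);
}

/* Called by mmap/munmap-like code. It holds no lock of its own; it only
 * waits until the user count is observed at zero, which implies that every
 * user present at the time of the call has released the lock. */
void vm_wait_sketch(void)
{
    while (atomic_load(&vm_users_sketch) != 0)
        sched_yield();
}

/* Helper for inspecting the counter in examples and tests. */
int vm_users_count_sketch(void)
{
    return atomic_load(&vm_users_sketch);
}
```

In this sketch the waiter never excludes new lock users, which matches the asymmetry the message describes: only the unlock-type operations need to hold anything.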
  2. 04 Mar, 2015 1 commit
    • make all objects used with atomic operations volatile · 56fbaa3b
      Rich Felker committed
      the memory model we use internally for atomics permits plain loads of
      values which may be subject to concurrent modification without
      requiring that a special load function be used. since a compiler is
      free to make transformations that alter the number of loads or the way
      in which loads are performed, the compiler is theoretically free to
      break this usage. the most obvious concern is with atomic cas
      constructs: something of the form tmp=*p;a_cas(p,tmp,f(tmp)); could be
      transformed to a_cas(p,*p,f(*p)); where the latter is intended to show
      multiple loads of *p whose resulting values might fail to be equal;
      this would break the atomicity of the whole operation. but even more
      fundamental breakage is possible.
      
      with the changes being made now, objects that may be modified by
      atomics are modeled as volatile, and the atomic operations performed
      on them by other threads are modeled as asynchronous stores by
      hardware which happens to be acting on the request of another thread.
      such modeling of course does not itself address memory synchronization
      between cores/cpus, but that aspect was already handled. this all
      seems less than ideal, but it's the best we can do without mandating a
      C11 compiler and using the C11 model for atomics.
      
      in the case of pthread_once_t, the ABI type of the underlying object
      is not volatile-qualified. so we are assuming that accessing the
      object through a volatile-qualified lvalue via casts yields volatile
      access semantics. the language of the C standard is somewhat unclear
      on this matter, but this is an assumption the linux kernel also makes,
      and seems to be the correct interpretation of the standard.
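
The cas pattern discussed above might look like the following sketch. It assumes the GCC/Clang `__sync_val_compare_and_swap` builtin as a stand-in for `a_cas`; the `*_sketch` names are hypothetical and not taken from the source.

```c
/* Stand-in for a_cas: returns the value that was in *p before the attempt.
 * Uses a GCC/Clang builtin; this is an assumption made for the sketch. */
int cas_sketch(volatile int *p, int expected, int desired)
{
    return __sync_val_compare_and_swap(p, expected, desired);
}

/* Example transformation f used below. */
int add_one_sketch(int v)
{
    return v + 1;
}

/* tmp=*p; a_cas(p,tmp,f(tmp)); written as a retry loop. Because *p is
 * volatile-qualified, each iteration performs exactly one load of *p, so
 * the compared value and the value passed to f are the same. */
int update_sketch(volatile int *p, int (*f)(int))
{
    int old;
    do {
        old = *p;
    } while (cas_sketch(p, old, f(old)) != old);
    return old;
}

/* The pthread_once_t case: the object itself is not volatile-qualified,
 * so it is accessed through a volatile-qualified lvalue via a cast. */
int load_volatile_sketch(int *obj)
{
    return *(volatile int *)obj;
}
```

Without the qualifier, nothing in the language would stop a compiler from reloading `*p` for each use of `old`, which is the breakage the message describes.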
  3. 26 Aug, 2014 1 commit
    • sanitize number of spins in userspace before futex wait · b8a9c90e
      Rich Felker committed
      the previous spin limit of 10000 was utterly unreasonable.
      empirically, it could consume up to 200000 cycles, whereas a failed
      futex wait (EAGAIN) typically takes 1000 cycles or less, and even a
      true wait/wake round seems much less expensive.
      
      the new counts (100 for general wait, 200 in barrier) were simply
      chosen to be in the range of what's reasonable without having adverse
      effects on casual micro-benchmark tests I have been running. they may
      still be too high, from a standpoint of not wasting cpu cycles, but at
      least they're a lot better than before. rigorous testing across
      different archs and cpu models should be performed at some point to
      determine whether further adjustments should be made.
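
A bounded spin followed by a futex wait could be sketched like this. It is Linux-specific and makes the raw futex syscall through `syscall(2)`; the function name, constant name, and return convention are assumptions made for illustration, and only the spin count of 100 comes from the message.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* General-wait spin limit mentioned in the message. */
enum { SPIN_LIMIT_SKETCH = 100 };

/* Waits until *addr no longer equals val.
 * Returns 1 if the change was seen while spinning, 0 if the thread had to
 * go through at least one futex wait. */
int wait_sketch(volatile int *addr, int val)
{
    int spins = SPIN_LIMIT_SKETCH;

    /* Cheap phase: poll in userspace a bounded number of times. */
    while (spins--) {
        if (*addr != val)
            return 1;
    }

    /* Expensive phase: sleep in the kernel until woken or until the
     * value is found to have changed (the wait fails with EAGAIN). */
    while (*addr == val)
        syscall(SYS_futex, (int *)addr, FUTEX_WAIT, val,
                (void *)0, (void *)0, 0);
    return 0;
}
```

The point of the small limit is that the spin phase should cost less than the syscall it is trying to avoid.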
  4. 23 Aug, 2014 1 commit
  5. 16 Aug, 2014 1 commit
    • make futex operations use private-futex mode when possible · bc09d58c
      Rich Felker committed
      private-futex uses the virtual address of the futex int directly as
      the hash key rather than requiring the kernel to resolve the address
      to an underlying backing for the mapping in which it lies. for certain
      usage patterns it improves performance significantly.
      
      in many places, the code using futex __wake and __wait operations was
      already passing a correct fixed zero or nonzero flag for the priv
      argument, so no change was needed at the site of the call, only in the
      __wake and __wait functions themselves. in other places, especially
      where the process-shared attribute for a synchronization object was
      not previously tracked, additional new code is needed. for mutexes,
      the only place to store the flag is in the type field, so additional
      bit masking logic is needed for accessing the type.
      
      for non-process-shared condition variable broadcasts, the futex
      requeue operation is unable to requeue from a private futex to a
      process-shared one in the mutex structure, so requeue is simply
      disabled in this case by waking all waiters.
      
      for robust mutexes, the kernel always performs a non-private wake when
      the owner dies. in order not to introduce a behavioral regression in
      non-process-shared robust mutexes (when the owning thread dies), they
      are simply forced to be treated as process-shared for now, giving
      correct behavior at the expense of performance. this can be fixed by
      adding explicit code to pthread_exit to do the right thing for
      non-shared robust mutexes in userspace rather than relying on the
      kernel to do it, and will be fixed in this way later.
      
      since not all supported kernels have private futex support, the new
      code detects EINVAL from the futex syscall and falls back to making
      the call without the private flag. no attempt to cache the result is
      made; caching it and using the cached value efficiently is somewhat
      difficult, and not worth the complexity when the benefits would be
      seen only on ancient kernels which have numerous other limitations and
      bugs anyway.
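
The uncached fallback described in the last paragraph might be sketched as below. It is Linux-specific; `futex_wake_sketch` is a hypothetical name, and the check for EINVAL follows the wording of the message rather than any verified kernel behavior.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Wakes up to count waiters on addr. If priv is nonzero, the private flag
 * is tried first; when the kernel rejects it with EINVAL (the error named
 * in the message), the call is repeated without the flag. The result is
 * not cached, so every call on such a kernel pays for two syscalls. */
long futex_wake_sketch(int *addr, int count, int priv)
{
    int op = FUTEX_WAKE | (priv ? FUTEX_PRIVATE_FLAG : 0);
    long r = syscall(SYS_futex, addr, op, count, (void *)0, (void *)0, 0);

    if (r == -1 && errno == EINVAL && priv)
        r = syscall(SYS_futex, addr, FUTEX_WAKE, count,
                    (void *)0, (void *)0, 0);
    return r;
}
```

On a kernel with private futex support the second syscall is never reached, so the common path stays a single call.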
  6. 18 Aug, 2012 1 commit
    • fix extremely rare but dangerous race condition in robust mutexes · da8d0fc4
      Rich Felker committed
      if new shared mappings of files/devices/shared memory can be made
      between the time a robust mutex is unlocked and its subsequent removal
      from the pending slot in the robustlist header, the kernel can
      inadvertently corrupt data in the newly-mapped pages when the process
      terminates. i am fixing the bug by using the same global vm lock
      mechanism that was used to fix the race condition with unmapping
      barriers after pthread_barrier_wait returns.
  7. 29 Sep, 2011 3 commits
    • fix excessive/insufficient wakes in __vm_unlock · de543b05
      Rich Felker committed
      there is no need to send a wake when the lock count does not hit zero,
      but when it does, all waiters must be woken (since all with the same
      sign are eligible to obtain the lock).
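
The wake rule stated above can be illustrated with a signed counter, where each side of the lock moves the count in its own direction. This is a sketch under assumptions: it uses C11 atomics, records the wake request in a variable instead of calling the futex syscall, and all `*_sketch` names are hypothetical.

```c
#include <limits.h>
#include <stdatomic.h>

/* >0: held by one side, <0: held by the other side, 0: free. */
static atomic_int lock_count_sketch;

/* Records how many waiters the last unlock asked to wake (0 = no wake). */
static int last_wake_sketch;

static void wake_sketch(int n)
{
    last_wake_sketch = n;
}

/* side is +1 or -1. Succeeds only if the lock is free or already held by
 * the same side; a real implementation would wait instead of failing. */
int try_lock_sketch(int side)
{
    int cur = atomic_load(&lock_count_sketch);
    while (cur * side >= 0) {
        if (atomic_compare_exchange_weak(&lock_count_sketch, &cur, cur + side))
            return 1;
    }
    return 0;
}

/* No wake unless the count hits zero; when it does, wake everyone, since
 * all waiters of the same sign can take the lock together. */
void unlock_sketch(void)
{
    int step = atomic_load(&lock_count_sketch) > 0 ? -1 : 1;
    last_wake_sketch = 0;
    if (atomic_fetch_add(&lock_count_sketch, step) == -step)
        wake_sketch(INT_MAX);
}

int last_wake_request_sketch(void)
{
    return last_wake_sketch;
}
```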
    • improve pshared barriers · 9cee9307
      Rich Felker committed
      eliminate the sequence number field and instead use the counter as the
      futex. because of the way the lock is held, sequence numbers are
      completely useless, and this frees up a field in the barrier structure
      to be used as a waiter count for the count futex, which lets us avoid
      some syscalls in the best case.
      
      as of now, self-synchronized destruction and unmapping should be fully
      safe. before any thread can return from the barrier, all threads in
      the barrier have obtained the vm lock, and each holds a shared lock on
      the barrier. the barrier memory is not inspected after the shared lock
      count reaches 0, nor after the vm lock is released.
    • next step making barrier self-sync'd destruction safe · 95b14796
      Rich Felker committed
      i think this works, but it can be simplified. (next step)
  8. 28 Sep, 2011 3 commits
  9. 07 May, 2011 2 commits
    • remove debug code that was missed in barrier commit · 9dd6399c
      Rich Felker committed
    • completely new barrier implementation, addressing major correctness issues · f16a3089
      Rich Felker committed
      the previous implementation had at least 2 problems:
      
      1. the case where additional threads reached the barrier before the
      first wave was finished leaving the barrier was untested and seemed
      not to be working.
      
      2. threads leaving the barrier continued to access memory within the
      barrier object after other threads had successfully returned from
      pthread_barrier_wait. this could lead to memory corruption or crashes
      if the barrier object had automatic storage in one of the waiting
      threads and went out of scope before all threads finished returning,
      or if one thread unmapped the memory in which the barrier object
      lived.
      
      the new implementation avoids both problems by making the barrier
      state essentially local to the first thread which enters the barrier
      wait, and forces that thread to be the last to return.
  10. 18 Feb, 2011 1 commit
    • reorganize pthread data structures and move the definitions to alltypes.h · e8827563
      Rich Felker committed
      this allows sys/types.h to provide the pthread types, as required by
      POSIX. this design also facilitates forcing ABI-compatible sizes in
      the arch-specific alltypes.h, while eliminating the need for
      developers changing the internals of the pthread types to poke around
      with arch-specific headers they may not be able to test.
  11. 12 Feb, 2011 1 commit