1. 24 2月, 2016 1 次提交
  2. 08 12月, 2015 3 次提交
  3. 05 12月, 2015 1 次提交
    • P
      rcu: Add rcu_normal kernel parameter to suppress expediting · 5a9be7c6
      Paul E. McKenney 提交于
      Although expedited grace periods can be quite useful, and although their
      OS jitter has been greatly reduced, they can still pose problems for
      extreme real-time workloads.  This commit therefore adds a rcu_normal
      kernel boot parameter (which can also be manipulated via sysfs)
      to suppress expedited grace periods, that is, to treat requests for
      expedited grace periods as if they were requests for normal grace periods.
      If both rcu_expedited and rcu_normal are specified, rcu_normal wins.
      This means that if you are relying on expedited grace periods to speed up
      boot, you will want to specify rcu_expedited on the kernel command line,
      and then specify rcu_normal via sysfs once boot completes.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      5a9be7c6
  4. 07 10月, 2015 4 次提交
    • P
      rcu: Remove deprecated rcu_lockdep_assert() · e62e3f62
      Paul E. McKenney 提交于
      The old rcu_lockdep_assert() was retained to ease handling of incoming
      patches, but any use will result in deprecated warnings.  However, its
      replacement, RCU_LOCKDEP_WARN(), is now upstream.  It is therefore
      time to remove rcu_lockdep_assert(), which this commit does.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      e62e3f62
    • P
      rcu: Add rcu_pointer_handoff() · c3ac7cf1
      Paul E. McKenney 提交于
      This commit adds an rcu_pointer_handoff() that is intended to mark
      situations where a structure's protection transitions from RCU to some
      other mechanism (locking, reference counting, whatever).  These markings
      should allow external tools to more easily spot bugs involving leaking
      pointers out of RCU read-side critical sections.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      c3ac7cf1
    • B
      rcu: Don't disable preemption for Tiny and Tree RCU readers · bb73c52b
      Boqun Feng 提交于
      Because preempt_disable() maps to barrier() for non-debug builds,
      it forces the compiler to spill and reload registers.  Because Tree
      RCU and Tiny RCU now only appear in CONFIG_PREEMPT=n builds, these
      barrier() instances generate needless extra code for each instance of
      rcu_read_lock() and rcu_read_unlock().  This extra code slows down Tree
      RCU and bloats Tiny RCU.
      
      This commit therefore removes the preempt_disable() and preempt_enable()
      from the non-preemptible implementations of __rcu_read_lock() and
      __rcu_read_unlock(), respectively.  However, for debug purposes,
      preempt_disable() and preempt_enable() are still invoked if
      CONFIG_PREEMPT_COUNT=y, because this allows detection of sleeping inside
      atomic sections in non-preemptible kernels.
      
      However, Tiny and Tree RCU operates by coalescing all RCU read-side
      critical sections on a given CPU that lie between successive quiescent
      states.  It is therefore necessary to compensate for removing barriers
      from __rcu_read_lock() and __rcu_read_unlock() by adding them to a
      couple of the RCU functions invoked during quiescent states, namely to
      rcu_all_qs() and rcu_note_context_switch().  However, note that the latter
      is more paranoia than necessity, at least until link-time optimizations
      become more aggressive.
      
      This is based on an earlier patch by Paul E. McKenney, fixing
      a bug encountered in kernels built with CONFIG_PREEMPT=n and
      CONFIG_PREEMPT_COUNT=y.
      Signed-off-by: NBoqun Feng <boqun.feng@gmail.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      bb73c52b
    • B
      rcu: Use rcu_callback_t in call_rcu*() and friends · b6a4ae76
      Boqun Feng 提交于
      As we now have rcu_callback_t typedefs as the type of rcu callbacks, we
      should use it in call_rcu*() and friends as the type of parameters. This
      could save us a few lines of code and make it clear which function
      requires an rcu callbacks rather than other callbacks as its argument.
      
      Besides, this can also help cscope to generate a better database for
      code reading.
      Signed-off-by: NBoqun Feng <boqun.feng@gmail.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      b6a4ae76
  5. 21 9月, 2015 1 次提交
  6. 23 7月, 2015 3 次提交
  7. 16 7月, 2015 1 次提交
    • D
      rcu: Deinline rcu_read_lock_sched_held() if DEBUG_LOCK_ALLOC · d5671f6b
      Denys Vlasenko 提交于
      DEBUG_LOCK_ALLOC=y is not a production setting, but it is
      not very unusual either. Many developers routinely
      use kernels built with it enabled.
      
      Apart from being selected by hand, it is also auto-selected by
      PROVE_LOCKING "Lock debugging: prove locking correctness" and
      LOCK_STAT "Lock usage statistics" config options.
      LOCK STAT is necessary for "perf lock" to work.
      
      I wouldn't spend too much time optimizing it, but this particular
      function has a very large cost in code size: when it is deinlined,
      code size decreases by 830,000 bytes:
      
          text     data      bss       dec     hex filename
      85674192 22294776 20627456 128596424 7aa39c8 vmlinux.before
      84837612 22294424 20627456 127759492 79d7484 vmlinux
      
      (with this config: http://busybox.net/~vda/kernel_config)
      Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com>
      CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      CC: Josh Triplett <josh@joshtriplett.org>
      CC: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      CC: Lai Jiangshan <laijs@cn.fujitsu.com>
      CC: Tejun Heo <tj@kernel.org>
      CC: Oleg Nesterov <oleg@redhat.com>
      CC: linux-kernel@vger.kernel.org
      Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      d5671f6b
  8. 07 7月, 2015 1 次提交
  9. 28 5月, 2015 5 次提交
  10. 22 4月, 2015 1 次提交
  11. 13 3月, 2015 1 次提交
    • P
      rcu: Handle outgoing CPUs on exit from idle loop · 88428cc5
      Paul E. McKenney 提交于
      This commit informs RCU of an outgoing CPU just before that CPU invokes
      arch_cpu_idle_dead() during its last pass through the idle loop (via a
      new CPU_DYING_IDLE notifier value).  This change means that RCU need not
      deal with outgoing CPUs passing through the scheduler after informing
      RCU that they are no longer online.  Note that removing the CPU from
      the rcu_node ->qsmaskinit bit masks is done at CPU_DYING_IDLE time,
      and orphaning callbacks is still done at CPU_DEAD time, the reason being
      that at CPU_DEAD time we have another CPU that can adopt them.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      88428cc5
  12. 04 3月, 2015 2 次提交
    • P
      rcu: Reverse rcu_dereference_check() conditions · b826565a
      Paul E. McKenney 提交于
      The rcu_dereference_check() family of primitives evaluates the RCU
      lockdep expression first, and only then evaluates the expression passed
      in.  This works fine normally, but can potentially fail in environments
      (such as NMI handlers) where lockdep cannot be invoked.  The problem is
      that even if the expression passed in is "1", the compiler would need to
      prove that the RCU lockdep expression (rcu_read_lock_held(), for example)
      is free of side effects in order to be able to elide it.  Given that
      rcu_read_lock_held() is sometimes separately compiled, the compiler cannot
      always use this optimization.
      
      This commit therefore reverse the order of evaluation, so that the
      expression passed in is evaluated first, and the RCU lockdep expression is
      evaluated only if the passed-in expression evaluated to false, courtesy
      of the C-language short-circuit boolean evaluation rules.  This compells
      the compiler to forego executing the RCU lockdep expression in cases
      where the passed-in expression evaluates to "1" at compile time, so that
      (for example) rcu_dereference_raw() can be guaranteed to execute safely
      within an NMI handler.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      b826565a
    • P
      rcu: Improve diagnostics for blocked critical sections in irq · d24209bb
      Paul E. McKenney 提交于
      If an RCU read-side critical section occurs within an interrupt handler
      or a softirq handler, it cannot have been preempted.  Therefore, there is
      a check in rcu_read_unlock_special() checking for this error.  However,
      when this check triggers, it lacks diagnostic information.  This commit
      therefore moves rcu_read_unlock()'s lockdep annotation to follow the
      call to __rcu_read_unlock() and changes rcu_read_unlock_special()'s
      WARN_ON_ONCE() to an lockdep_rcu_suspicious() in order to locate where
      the offending RCU read-side critical section began.  In addition, the
      value of the ->rcu_read_unlock_special field is printed.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      d24209bb
  13. 27 2月, 2015 2 次提交
    • P
      rcu: Add Kconfig option to expedite grace periods during boot · ee42571f
      Paul E. McKenney 提交于
      This commit adds a CONFIG_RCU_EXPEDITE_BOOT Kconfig parameter
      that emulates a very early boot rcu_expedite_gp().  A late-boot
      call to rcu_end_inkernel_boot() will provide the corresponding
      rcu_unexpedite_gp().  The late-boot call to rcu_end_inkernel_boot()
      should be made just before init is spawned.
      
      According to Arjan:
      
      > To show the boot time, I'm using the timestamp of the "Write protecting"
      > line, that's pretty much the last thing we print prior to ring 3 execution.
      >
      > A kernel with default RCU behavior (inside KVM, only virtual devices)
      > looks like this:
      >
      > [    0.038724] Write protecting the kernel read-only data: 10240k
      >
      > a kernel with expedited RCU (using the command line option, so that I
      > don't have to recompile between measurements and thus am completely
      > oranges-to-oranges)
      >
      > [    0.031768] Write protecting the kernel read-only data: 10240k
      >
      > which, in percentage, is an 18% improvement.
      Reported-by: NArjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: NArjan van de Ven <arjan@linux.intel.com>
      ee42571f
    • P
      rcu: Provide rcu_expedite_gp() and rcu_unexpedite_gp() · 0d39482c
      Paul E. McKenney 提交于
      Currently, expediting of normal synchronous grace-period primitives
      (synchronize_rcu() and friends) is controlled by the rcu_expedited()
      boot/sysfs parameter.  This works well, but does not handle nesting.
      This commit therefore provides rcu_expedite_gp() to enable expediting
      and rcu_unexpedite_gp() to cancel a prior rcu_expedite_gp(), both of
      which support nesting.
      Reported-by: NArjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      0d39482c
  14. 26 2月, 2015 1 次提交
  15. 16 1月, 2015 1 次提交
    • P
      rcu: Make cond_resched_rcu_qs() apply to normal RCU flavors · 5cd37193
      Paul E. McKenney 提交于
      Although cond_resched_rcu_qs() only applies to TASKS_RCU, it is used
      in places where it would be useful for it to apply to the normal RCU
      flavors, rcu_preempt, rcu_sched, and rcu_bh.  This is especially the
      case for workloads that aggressively overload the system, particularly
      those that generate large numbers of RCU updates on systems running
      NO_HZ_FULL CPUs.  This commit therefore communicates quiescent states
      from cond_resched_rcu_qs() to the normal RCU flavors.
      
      Note that it is unfortunately necessary to leave the old ->passed_quiesce
      mechanism in place to allow quiescent states that apply to only one
      flavor to be recorded.  (Yes, we could decrement ->rcu_qs_ctr_snap in
      that case, but that is not so good for debugging of RCU internals.)
      In addition, if one of the RCU flavor's grace period has stalled, this
      will invoke rcu_momentary_dyntick_idle(), resulting in a heavy-weight
      quiescent state visible from other CPUs.
      Reported-by: NSasha Levin <sasha.levin@oracle.com>
      Reported-by: NDave Jones <davej@redhat.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      [ paulmck: Merge commit from Sasha Levin fixing a bug where __this_cpu()
        was used in preemptible code. ]
      5cd37193
  16. 07 1月, 2015 1 次提交
  17. 14 11月, 2014 3 次提交
  18. 04 11月, 2014 2 次提交
  19. 30 10月, 2014 1 次提交
  20. 29 10月, 2014 1 次提交
  21. 17 9月, 2014 2 次提交
  22. 08 9月, 2014 2 次提交
    • P
      rcu: Per-CPU operation cleanups to rcu_*_qs() functions · 284a8c93
      Paul E. McKenney 提交于
      The rcu_bh_qs(), rcu_preempt_qs(), and rcu_sched_qs() functions use
      old-style per-CPU variable access and write to ->passed_quiesce even
      if it is already set.  This commit therefore updates to use the new-style
      per-CPU variable access functions and avoids the spurious writes.
      This commit also eliminates the "cpu" argument to these functions because
      they are always invoked on the indicated CPU.
      Reported-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      284a8c93
    • P
      rcu: Remove redundant preempt_disable() from rcu_note_voluntary_context_switch() · 01a81330
      Paul E. McKenney 提交于
      In theory, synchronize_sched() requires a read-side critical section
      to order against.  In practice, preemption can be thought of as
      being disabled across every machine instruction, at least for those
      machine instructions that are not in the idle loop and not on offline
      CPUs.  So this commit removes the redundant preempt_disable() from
      rcu_note_voluntary_context_switch().
      
      Please note that the single instruction in question is the store of
      zero to ->rcu_tasks_holdout.  The "if" is simply a performance optimization
      that avoids unnecessary stores.  To see this, keep in mind that both
      the "if" condition and the store are in a quiescent state.  Therefore,
      even if the task is preempted for a full grace period (presumably due
      to its having done a context switch beforehand), the store will be
      recording a legitimate quiescent state.
      Reported-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      
      Conflicts:
      	include/linux/rcupdate.h
      01a81330