1. 20 August 2010 (1 commit)
  2. 12 May 2010 (1 commit)
  3. 11 May 2010 (5 commits)
    • d822ed10
    • rcu: RCU_FAST_NO_HZ must check RCU dyntick state · 77e38ed3
      Committed by Paul E. McKenney
      The current version of RCU_FAST_NO_HZ reproduces the old CLASSIC_RCU
      dyntick-idle bug, as it fails to detect CPUs that have interrupted
      or NMIed out of dyntick-idle mode.  Fix this by making rcu_needs_cpu()
      check the state in the per-CPU rcu_dynticks variables, thus correctly
      detecting the dyntick-idle state from an RCU perspective.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
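      A minimal sketch of the check being described (the helper name is hypothetical,
      and the exact rcu_dynticks layout is an assumption; in rcutree of this era an
      odd per-CPU ->dynticks value means the CPU is not in dyntick-idle mode):

          /* Hedged sketch: does RCU see this CPU as outside dyntick-idle mode? */
          static int rcu_cpu_seen_nonidle(int cpu)        /* hypothetical helper */
          {
                  struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu);

                  /* Interrupts and NMIs flip ->dynticks to an odd value, marking
                   * the CPU as non-idle from RCU's perspective even though the
                   * tick is stopped; rcu_needs_cpu() must honor this state. */
                  return rdtp->dynticks & 0x1;
          }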
    • rcu: print boot-time console messages if RCU configs out of ordinary · 26845c28
      Committed by Paul E. McKenney
      Print boot-time messages if tracing is enabled, if fanout is set
      to non-default values, if exact fanout is specified, if accelerated
      dyntick-idle grace periods have been enabled, if RCU-lockdep is enabled,
      if rcutorture has been boot-time enabled, if the CPU stall detector has
      been disabled, or if four-level hierarchy has been enabled.
      
      This is all for TREE_RCU and TREE_PREEMPT_RCU.  TINY_RCU will be handled
      separately, if at all.
      Suggested-by: Josh Triplett <josh@joshtriplett.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
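      A hedged illustration of this kind of boot-time announcement (the function name
      and message text are approximations, not the exact patch):

          static void __init rcu_bootup_announce_oddness(void)
          {
          #ifdef CONFIG_RCU_TRACE
                  printk(KERN_INFO "\tRCU debugfs-based tracing is enabled.\n");
          #endif
          #if NUM_RCU_LVL_4 != 0
                  printk(KERN_INFO "\tFour-level hierarchy is enabled.\n");
          #endif
          #ifndef CONFIG_RCU_CPU_STALL_DETECTOR
                  printk(KERN_INFO "\tRCU-based detection of stalled CPUs is disabled.\n");
          #endif
          }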
    • rcu: refactor RCU's context-switch handling · 25502a6c
      Committed by Paul E. McKenney
      The addition of preemptible RCU to treercu resulted in a bit of
      confusion and inefficiency surrounding the handling of context switches
      for RCU-sched and for RCU-preempt.  For RCU-sched, a context switch
      is a quiescent state, pure and simple, just like it always has been.
      For RCU-preempt, a context switch is in no way a quiescent state, but
      special handling is required when a task blocks in an RCU read-side
      critical section.
      
      However, the callout from the scheduler and the outer loop in ksoftirqd
      still call something named rcu_sched_qs(), whose name is no longer
      accurate.  Furthermore, when rcu_check_callbacks() notes an RCU-sched
      quiescent state, it ends up unnecessarily (though harmlessly, aside
      from the performance hit) enqueuing the current task if it happens to
      be running in an RCU-preempt read-side critical section.  This not only
      increases the maximum latency of scheduler_tick(), it also needlessly
      increases the overhead of the next outermost rcu_read_unlock() invocation.
      
      This patch addresses this situation by separating the notion of RCU's
      context-switch handling from that of RCU-sched's quiescent states.
      The context-switch handling is covered by rcu_note_context_switch() in
      general and by rcu_preempt_note_context_switch() for preemptible RCU.
      This permits rcu_sched_qs() to handle quiescent states and only quiescent
      states.  It also reduces the maximum latency of scheduler_tick(), though
      probably by much less than a microsecond.  Finally, it means that tasks
      within preemptible-RCU read-side critical sections avoid incurring the
      overhead of queuing unless there really is a context switch.
      Suggested-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
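      The resulting split can be sketched roughly as follows (a simplified reading of
      the description above, not the verbatim patch):

          /* Called from the scheduler and from the ksoftirqd outer loop. */
          void rcu_note_context_switch(int cpu)
          {
                  rcu_sched_qs(cpu);                      /* context switch is an RCU-sched QS */
                  rcu_preempt_note_context_switch(cpu);   /* RCU-preempt bookkeeping only */
          }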
    • rcu: ignore offline CPUs in last non-dyntick-idle CPU check · 5db35673
      Committed by Lai Jiangshan
      Offline CPUs are not in nohz_cpu_mask, but can be ignored when checking
      for the last non-dyntick-idle CPU.  This patch therefore only checks
      online CPUs for not being dyntick idle, allowing fast entry into
      full-system dyntick-idle state even when there are some offline CPUs.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
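      A minimal sketch of the adjusted check (the loop shape and helper name are
      approximations of the description above):

          /* Only an online CPU outside nohz_cpu_mask, other than ourselves, can
           * keep the system out of full-system dyntick-idle state. */
          for_each_cpu_not(thatcpu, nohz_cpu_mask)
                  if (cpu_online(thatcpu) && thatcpu != cpu)
                          return rcu_needs_cpu_quick_check(cpu);  /* someone else is awake */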
  4. 28 February 2010 (1 commit)
  5. 27 February 2010 (2 commits)
    • rcu: Fix accelerated GPs for last non-dynticked CPU · 71da8132
      Committed by Paul E. McKenney
      This patch disables irqs across the call to rcu_needs_cpu().  It
      also enforces a hold-off period so that the idle loop doesn't
      softirq itself to death when there are lots of RCU callbacks in
      flight on the last non-dynticked CPU.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1267231138-27856-3-git-send-email-paulmck@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
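      A rough sketch of the two ideas, assuming a jiffies-based per-CPU hold-off
      counter (the variable and helper names are illustrative):

          /* Caller side (tick-stop path): irqs stay disabled across the check. */
          local_irq_save(flags);
          stop_tick = !rcu_needs_cpu(cpu);
          local_irq_restore(flags);

          /* Inside rcu_needs_cpu(): a hold-off keeps the idle loop on the last
           * non-dynticked CPU from raising RCU_SOFTIRQ over and over. */
          if (per_cpu(rcu_dyntick_holdoff, cpu) == jiffies)
                  return rcu_needs_cpu_quick_check(cpu);
          per_cpu(rcu_dyntick_holdoff, cpu) = jiffies;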
    • rcu: Fix accelerated grace periods for last non-dynticked CPU · a47cd880
      Committed by Paul E. McKenney
      It is invalid to invoke __rcu_process_callbacks() with irqs
      disabled, so do it indirectly via raise_softirq().  This
      requires a state-machine implementation to cycle through the
      grace-period machinery the required number of times.
      Located-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1267231138-27856-1-git-send-email-paulmck@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
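      A hedged sketch of the indirection (the per-CPU counter and constant mirror the
      description; details are approximate):

          /* With irqs disabled, __rcu_process_callbacks() must not be called
           * directly; step a small per-CPU state machine and let the RCU softirq
           * do the actual work. */
          if (per_cpu(rcu_dyntick_drain, cpu) <= 0)
                  per_cpu(rcu_dyntick_drain, cpu) = RCU_NEEDS_CPU_FLUSHES;  /* start cycling */
          else if (--per_cpu(rcu_dyntick_drain, cpu) <= 0)
                  return rcu_needs_cpu_quick_check(cpu);                    /* out of passes */
          raise_softirq(RCU_SOFTIRQ);     /* safe to invoke with irqs disabled */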
  6. 25 February 2010 (4 commits)
    • rcu: Add RCU_CPU_STALL_VERBOSE to dump detailed per-task information · 1ed509a2
      Committed by Paul E. McKenney
      When RCU detects a grace-period stall, it currently just prints
      out the PID of any tasks doing the stalling.  This patch adds
      RCU_CPU_STALL_VERBOSE, which enables the more-verbose reporting
      from sched_show_task().
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1266887105-1528-21-git-send-email-paulmck@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: Fix deadlock in TREE_PREEMPT_RCU CPU stall detection · 3acd9eb3
      Committed by Paul E. McKenney
      Under TREE_PREEMPT_RCU, print_other_cpu_stall() invokes
      rcu_print_task_stall() with the root rcu_node structure's ->lock
      held, and rcu_print_task_stall() acquires that same lock for
      self-deadlock. Fix this by removing the lock acquisition from
      rcu_print_task_stall(), and making all callers acquire the lock
      instead.
      Tested-by: John Kacur <jkacur@redhat.com>
      Tested-by: Thomas Gleixner <tglx@linutronix.de>
      Located-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1266887105-1528-19-git-send-email-paulmck@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: Convert to raw_spinlocks · 1304afb2
      Committed by Paul E. McKenney
      The spinlocks in rcutree need to be real spinlocks in
      preempt-rt. Convert them to raw_spinlocks.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1266887105-1528-18-git-send-email-paulmck@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
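      Illustration of the conversion pattern (on preempt-rt an ordinary spinlock_t
      can sleep, which these code paths cannot tolerate):

          /* Before: */
          spinlock_t lock;
          spin_lock_irqsave(&rnp->lock, flags);
          spin_unlock_irqrestore(&rnp->lock, flags);

          /* After: always a true spinning lock, even on preempt-rt. */
          raw_spinlock_t lock;
          raw_spin_lock_irqsave(&rnp->lock, flags);
          raw_spin_unlock_irqrestore(&rnp->lock, flags);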
    • rcu: Accelerate grace period if last non-dynticked CPU · 8bd93a2c
      Committed by Paul E. McKenney
      Currently, rcu_needs_cpu() simply checks whether the current CPU
      has an outstanding RCU callback, which means that the last CPU
      to go into dyntick-idle mode might wait a few ticks for the
      relevant grace periods to complete.  However, if all the other
      CPUs are in dyntick-idle mode, and if this CPU is in a quiescent
      state (which it is for RCU-bh and RCU-sched any time that we are
      considering going into dyntick-idle mode), then the grace period
      is instantly complete.
      
      This patch therefore repeatedly invokes the RCU grace-period
      machinery in order to force any needed grace periods to complete
      quickly.  It does so a limited number of times in order to
      prevent starvation by an RCU callback function that might pass
      itself to call_rcu().
      
      However, if any CPU other than the current one is not in
      dyntick-idle mode, fall back to the simple callback check described
      above (with a fix for a bug noted by Lai Jiangshan).  Also take
      advantage of the last grace-period forcing pass, an opportunity
      noted by Steve Rostedt, and apply the simplified #ifdef condition
      suggested by Frederic Weisbecker.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1266887105-1528-15-git-send-email-paulmck@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
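      A simplified sketch of the bounded forcing loop for one RCU flavor (the bound
      and several names are approximations of the description above):

          /* Push pending RCU-sched callbacks through, but only a bounded number
           * of passes so a callback that re-posts itself cannot starve us. */
          for (i = 0; i < RCU_NEEDS_CPU_FLUSHES && c; i++) {
                  c = 0;
                  if (per_cpu(rcu_sched_data, cpu).nxtlist) {
                          rcu_sched_qs(cpu);              /* this CPU is quiescent here */
                          force_quiescent_state(&rcu_sched_state, 0);
                          __rcu_process_callbacks(&rcu_sched_state,
                                                  &per_cpu(rcu_sched_data, cpu));
                          c = !!per_cpu(rcu_sched_data, cpu).nxtlist;
                  }
          }
          return c;       /* nonzero: RCU still needs this CPU awake */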
  7. 13 January 2010 (2 commits)
    • rcu: Add debug check for too many rcu_read_unlock() · cba8244a
      Committed by Paul E. McKenney
      TREE_PREEMPT_RCU maintains an rcu_read_lock_nesting counter in
      the task structure, which happens to be a signed int.  So this
      patch adds a check for this counter being negative at the end of
      __rcu_read_unlock(). This check is under CONFIG_PROVE_LOCKING,
      so can be thought of as being part of lockdep.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12626498423064-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
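      Roughly, the added check sits at the end of __rcu_read_unlock() (a sketch; the
      surrounding fast path is abbreviated):

          void __rcu_read_unlock(void)
          {
                  struct task_struct *t = current;

                  barrier();      /* keep the critical section before the decrement */
                  if (--t->rcu_read_lock_nesting == 0 &&
                      unlikely(t->rcu_read_unlock_special))
                          rcu_read_unlock_special(t);
          #ifdef CONFIG_PROVE_LOCKING
                  WARN_ON_ONCE(t->rcu_read_lock_nesting < 0);  /* more unlocks than locks */
          #endif
          }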
    • rcu: Add force_quiescent_state() testing to rcutorture · bf66f18e
      Committed by Paul E. McKenney
      Add force_quiescent_state() testing to rcutorture, with a
      separate thread that repeatedly invokes force_quiescent_state()
      in bursts. This can greatly increase the probability of
      encountering certain types of race conditions.
      Suggested-by: Josh Triplett <josh@joshtriplett.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1262646551116-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
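      A hedged sketch of such a burst thread (the fqs_* knobs are assumed to be
      module parameters; the real rcutorture code differs in detail):

          static int rcu_torture_fqs(void *arg)
          {
                  do {
                          unsigned long end = jiffies + fqs_duration;     /* burst length */

                          while (time_before(jiffies, end)) {
                                  cur_ops->fqs();         /* e.g. force_quiescent_state() */
                                  udelay(fqs_holdoff);    /* spacing within the burst */
                          }
                          schedule_timeout_interruptible(fqs_stutter * HZ);  /* idle gap */
                  } while (!kthread_should_stop());
                  return 0;
          }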
  8. 03 December 2009 (2 commits)
    • rcu: Add expedited grace-period support for preemptible RCU · d9a3da06
      Committed by Paul E. McKenney
      Implement a synchronize_rcu_expedited() for preemptible RCU
      that actually is expedited.  This uses
      synchronize_sched_expedited() to force all threads currently
      running in a preemptible-RCU read-side critical section onto the
      appropriate ->blocked_tasks[] list, then takes a snapshot of all
      of these lists and waits for them to drain.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1259784616158-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: Rename "quiet" functions · d3f6bad3
      Committed by Paul E. McKenney
      The number of "quiet" functions has grown recently, and the
      names are no longer very descriptive.  The point of all of these
      functions is to do some portion of the task of reporting a
      quiescent state, so rename them accordingly:
      
      o	cpu_quiet() becomes rcu_report_qs_rdp(), which reports a
      	quiescent state to the per-CPU rcu_data structure.  If this
      	turns out to be a new quiescent state for this grace period,
      	then rcu_report_qs_rnp() will be invoked to propagate the
      	quiescent state up the rcu_node hierarchy.
      
      o	cpu_quiet_msk() becomes rcu_report_qs_rnp(), which reports
      	a quiescent state for a given CPU (or possibly a set of CPUs)
      	up the rcu_node hierarchy.
      
      o	cpu_quiet_msk_finish() becomes rcu_report_qs_rsp(), which
      	reports a full set of quiescent states to the global rcu_state
      	structure.
      
      o	task_quiet() becomes rcu_report_unblock_qs_rnp(), which reports
      	a quiescent state due to a task exiting an RCU read-side critical
      	section that had previously blocked in that same critical section.
      	As indicated by the new name, this type of quiescent state is
      	reported up the rcu_node hierarchy (using rcu_report_qs_rnp()
      	to do so).
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Josh Triplett <josh@joshtriplett.org>
      Acked-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12597846163698-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  9. 23 November 2009 (2 commits)
    • rcu: Re-arrange code to reduce #ifdef pain · 6ebb237b
      Committed by Paul E. McKenney
      Remove #ifdefs from kernel/rcupdate.c and
      include/linux/rcupdate.h by moving code to
      include/linux/rcutiny.h, include/linux/rcutree.h, and
      kernel/rcutree.c.
      
      Also remove some definitions that are no longer used.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1258908830885-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: Fix grace-period-stall bug on large systems with CPU hotplug · b668c9cf
      Committed by Paul E. McKenney
      When the last CPU of a given leaf rcu_node structure goes
      offline, all of the tasks queued on that leaf rcu_node structure
      (due to having blocked in their current RCU read-side critical
      sections) are requeued onto the root rcu_node structure.  This
      requeuing is carried out by rcu_preempt_offline_tasks().
      However, it is possible that these queued tasks are the only
      thing preventing the leaf rcu_node structure from reporting a
      quiescent state up the rcu_node hierarchy.  Unfortunately, the
      old code would fail to do this reporting, resulting in a
      grace-period stall given the following sequence of events:
      
      1.	Kernel built for more than 32 CPUs on 32-bit systems or for more
      	than 64 CPUs on 64-bit systems, so that there is more than one
      	rcu_node structure.  (Or CONFIG_RCU_FANOUT is artificially set
      	to a number smaller than CONFIG_NR_CPUS.)
      
      2.	The kernel is built with CONFIG_TREE_PREEMPT_RCU.
      
      3.	A task running on a CPU associated with a given leaf rcu_node
      	structure blocks while in an RCU read-side critical section
      	-and- that CPU has not yet passed through a quiescent state
      	for the current RCU grace period.  This will cause the task
      	to be queued on the leaf rcu_node's blocked_tasks[] array, in
      	particular, on the element of this array corresponding to the
      	current grace period.
      
      4.	Each of the remaining CPUs corresponding to this same leaf rcu_node
      	structure pass through a quiescent state.  However, the task is
      	still in its RCU read-side critical section, so these quiescent
      	states cannot be reported further up the rcu_node hierarchy.
      	Nevertheless, all bits in the leaf rcu_node structure's ->qsmask
      	field are now zero.
      
      5.	Each of the remaining CPUs go offline.  (The events in step
      	#4 and #5 can happen in any order as long as each CPU passes
      	through a quiescent state before going offline.)
      
      6.	When the last CPU goes offline, __rcu_offline_cpu() will invoke
      	rcu_preempt_offline_tasks(), which will move the task to the
      	root rcu_node structure, but without reporting a quiescent state
      	up the rcu_node hierarchy (and this failure to report a quiescent
      	state is the bug).
      
      	But because this leaf rcu_node structure's ->qsmask field is
      	already zero and its ->blocked_tasks[] entries are all empty,
      	force_quiescent_state() will skip this rcu_node structure.
      
      	Therefore, grace periods are now hung.
      
      This patch abstracts some code out of rcu_read_unlock_special(),
      calling the result task_quiet() by analogy with cpu_quiet(), and
      invokes task_quiet() from both rcu_read_unlock_special() and
      __rcu_offline_cpu().  Invoking task_quiet() from
      __rcu_offline_cpu() reports the quiescent state up the rcu_node
      hierarchy, fixing the bug.  This ends up requiring a separate
      lock_class_key per level of the rcu_node hierarchy, which this
      patch also provides.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12589088301770-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  10. 12 November 2009 (1 commit)
  11. 11 November 2009 (2 commits)
    • rcu: Simplify association of quiescent states with grace periods · c64ac3ce
      Committed by Paul E. McKenney
      The rdp->passed_quiesc_completed fields are used to properly
      associate the recorded quiescent state with a grace period.  It
      is OK to wrongly associate a given quiescent state with a
      preceding grace period, but it is fatal to associate a given
      quiescent state with a grace period that begins after the
      quiescent state occurred.  Grace periods are numbered, and the
      following fields track them:
      
      o	->gpnum is the number of the grace period currently in
      	progress, or the number of the last grace period to
      	complete if no grace period is currently in progress.
      
      o	->completed is the number of the last grace period to
      	have completed.
      
      These two fields are equal if there is no grace period in
      progress, otherwise ->gpnum is one greater than ->completed.
      But the rdp->passed_quiesc_completed field is compared against
      ->completed, and if they are equal, the quiescent state is presumed
      to count against the current grace period.
      
      The earlier code copied rdp->completed to
      rdp->passed_quiesc_completed, which has been made to work, but
      is error-prone.  In contrast, copying one less than rdp->gpnum
      is guaranteed safe, because rdp->gpnum is not incremented until
      after the start of the corresponding grace period. At the end of
      the grace period, when ->completed has been incremented, any
      quiescent states recorded previously will be discarded.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12578890421011-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
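      The safe association boils down to a single assignment when the quiescent state
      is recorded (sketch):

          /* Snapshot the grace period this quiescent state may count for.  One
           * less than ->gpnum is always safe, because ->gpnum is not incremented
           * until after the corresponding grace period has actually started. */
          rdp->passed_quiesc_completed = rdp->gpnum - 1;
          rdp->passed_quiesc = 1;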
    • rcu: Remove inline from forward-referenced functions · dbe01350
      Committed by Paul E. McKenney
      Some variants of gcc are reputed to dislike forward references
      to functions declared "inline".  Remove the "inline" keyword
      from such functions.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12578890422402-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  12. 16 October 2009 (1 commit)
    • rcu: Fix TREE_PREEMPT_RCU CPU_HOTPLUG bad-luck hang · 237c80c5
      Committed by Paul E. McKenney
      If the following sequence of events occurs, then
      TREE_PREEMPT_RCU will hang waiting for a grace period to
      complete, eventually OOMing the system:
      
      o	A TREE_PREEMPT_RCU build of the kernel is booted on a system
      	with more than 64 physical CPUs present (32 on a 32-bit system).
      	Alternatively, a TREE_PREEMPT_RCU build of the kernel is booted
      	with RCU_FANOUT set to a sufficiently small value that the
      	physical CPUs populate two or more leaf rcu_node structures.
      
      o	A task is preempted in an RCU read-side critical section
      	while running on a CPU corresponding to a given leaf rcu_node
      	structure.
      
      o	All CPUs corresponding to this same leaf rcu_node structure
      	record quiescent states for the current grace period.
      
      o	All of these same CPUs go offline (hence the need for enough
      	physical CPUs to populate more than one leaf rcu_node structure).
      	This causes the preempted task to be moved to the root rcu_node
      	structure.
      
      At this point, there is nothing left to cause the quiescent
      state to be propagated up the rcu_node tree, so the current
      grace period never completes.
      
      The simplest fix, especially after considering the deadlock
      possibilities, is to detect this situation when the last CPU is
      offlined, and to set that CPU's ->qsmask bit in its leaf
      rcu_node structure.  This will cause the next invocation of
      force_quiescent_state() to end the grace period.
      
      Without this fix, this hang can be triggered in an hour or so on
      some machines with rcutorture and random CPU onlining/offlining.
      With this fix, these same machines pass a full 10 hours of this
      sort of abuse.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <20091015162614.GA19131@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  13. 15 October 2009 (1 commit)
    • rcu: Stopgap fix for synchronize_rcu_expedited() for TREE_PREEMPT_RCU · 019129d5
      Committed by Paul E. McKenney
      For the short term, map synchronize_rcu_expedited() to
      synchronize_rcu() for TREE_PREEMPT_RCU and to
      synchronize_sched_expedited() for TREE_RCU.
      
      Longer term, there needs to be a real expedited grace period for
      TREE_PREEMPT_RCU, but candidate patches to date are considerably
      more complex and intrusive.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      Cc: npiggin@suse.de
      Cc: jens.axboe@oracle.com
      LKML-Reference: <12555405592331-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
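      The stopgap amounts to wrappers along these lines (a sketch; in the real tree
      the two definitions live in different files and config branches):

          /* TREE_PREEMPT_RCU: fall back to a normal preemptible-RCU grace period. */
          void synchronize_rcu_expedited(void)
          {
                  synchronize_rcu();              /* not actually expedited yet */
          }

          /* TREE_RCU: RCU-sched expedited grace periods suffice. */
          static inline void synchronize_rcu_expedited(void)
          {
                  synchronize_sched_expedited();
          }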
  14. 07 October 2009 (1 commit)
    • rcu: Make hot-unplugged CPU relinquish its own RCU callbacks · e74f4c45
      Committed by Paul E. McKenney
      The current interaction between RCU and CPU hotplug requires that
      RCU block in CPU notifiers waiting for callbacks to drain.
      
      This can be greatly simplified by having each CPU relinquish its
      own callbacks, and for both _rcu_barrier() and CPU_DEAD notifiers
      to adopt all callbacks that were previously relinquished.
      
      This change also eliminates the possibility of certain types of
      hangs due to the previous practice of waiting for callbacks to be
      invoked from within CPU notifiers.  If you don't ever wait, you
      cannot hang.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1254890898456-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  15. 06 October 2009 (1 commit)
    • rcu: Clean up code based on review feedback from Josh Triplett, part 4 · a0b6c9a7
      Committed by Paul E. McKenney
      These issues were identified during an old-fashioned face-to-face code
      review extending over many hours.  This group improves an existing
      abstraction and introduces two new ones.  It also fixes an RCU
      stall-warning bug found while making the other changes.
      
      o	Make RCU_INIT_FLAVOR() declare its own variables, removing
      	the need to declare them at each call site.
      
      o	Create an rcu_for_each_leaf() macro that scans the leaf
      	nodes of the rcu_node tree.
      
      o	Create an rcu_for_each_node_breadth_first() macro that does
      	a breadth-first traversal of the rcu_node tree, AKA
      	stepping through the array in index-number order.
      
      o	If all CPUs corresponding to a given leaf rcu_node
      	structure go offline, then any tasks queued on that leaf
      	will be moved to the root rcu_node structure.  Therefore,
      	the stall-warning code must dump out tasks queued on the
      	root rcu_node structure as well as those queued on the leaf
      	rcu_node structures.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12541491934126-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
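      The two traversal helpers can be sketched as follows (macro names follow the
      description; NUM_RCU_NODES and NUM_RCU_LVLS are the existing rcutree sizing
      constants):

          /* The rcu_node array is laid out in breadth-first order, so a linear
           * walk of the array is exactly a breadth-first traversal. */
          #define rcu_for_each_node_breadth_first(rsp, rnp) \
                  for ((rnp) = &(rsp)->node[0]; \
                       (rnp) < &(rsp)->node[NUM_RCU_NODES]; (rnp)++)

          /* Leaves only: start at the first node of the last (leaf) level. */
          #define rcu_for_each_leaf_node(rsp, rnp) \
                  for ((rnp) = (rsp)->level[NUM_RCU_LVLS - 1]; \
                       (rnp) < &(rsp)->node[NUM_RCU_NODES]; (rnp)++)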
  16. 24 September 2009 (2 commits)
    • rcu: Clean up code based on review feedback from Josh Triplett, part 2 · 1eba8f84
      Committed by Paul E. McKenney
      These issues were identified during an old-fashioned face-to-face code
      review extending over many hours.
      
      o	Add comments for tricky parts of code, and correct comments
      	that have passed their sell-by date.
      
      o	Get rid of the vestiges of rcu_init_sched(), which is no
      	longer needed now that PREEMPT_RCU is gone.
      
      o	Move the #include of rcutree_plugin.h to the end of
      	rcutree.c, which means that, rather than having a random
      	collection of forward declarations, the new set of forward
      	declarations document the set of plugins.  The new home for
      	this #include also allows __rcu_init_preempt() to move into
      	rcutree_plugin.h.
      
      o	Fix rcu_preempt_check_callbacks() to be static.
      Suggested-by: Josh Triplett <josh@joshtriplett.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12537246443924-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Peter Zijlstra <peterz@infradead.org>
    • rcu: Clean up code based on review feedback from Josh Triplett · fc2219d4
      Committed by Paul E. McKenney
      These issues were identified during an old-fashioned face-to-face code
      review extending over many hours.
      
      o	Bury various forms of the "rsp->completed == rsp->gpnum"
      	comparison into an rcu_gp_in_progress() function, which has
      	the beneficial side-effect of forcing consistent use of
      	ACCESS_ONCE().
      
      o	Replace hand-coded arithmetic with DIV_ROUND_UP().
      
      o	Bury several "!list_empty(&rnp->blocked_tasks[rnp->gpnum & 0x01])"
      	instances into an rcu_preempted_readers() function, as this
      	expression indicates that there are readers blocked
      	within RCU read-side critical sections blocking the current
      	grace period.  (Though there might well be similar readers
      	blocking the next grace period.)
      
      o	Remove a dangling rcu_restart_cpu() declaration that has
      	been dangling for almost 20 minor releases of the kernel.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12537246442687-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
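      The first item reduces to a one-line helper (sketch):

          /* Is a grace period in progress?  Equal numbers mean "no". */
          static int rcu_gp_in_progress(struct rcu_state *rsp)
          {
                  return ACCESS_ONCE(rsp->completed) != ACCESS_ONCE(rsp->gpnum);
          }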
  17. 19 September 2009 (4 commits)
    • rcu: Fix whitespace inconsistencies · a71fca58
      Committed by Paul E. McKenney
      Fix a number of whitespace ^Ierrors in the include/linux/rcu*
      and the kernel/rcu* files.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      LKML-Reference: <20090918172819.GA24405@linux.vnet.ibm.com>
      [ did more checkpatch fixlets ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: Fix thinko, actually initialize full tree · 49e29126
      Committed by Paul E. McKenney
      Commit de078d87 ("rcu: Need to update rnp->gpnum if preemptable RCU
      is to be reliable") repeatedly and incorrectly initializes the root
      rcu_node structure's ->gpnum field rather than initializing the
      ->gpnum field of each node in the tree.  Fix this.  Also add an
      additional consistency check to catch this in the future.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      LKML-Reference: <125329262011-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: Apply results of code inspection of kernel/rcutree_plugin.h · e7d8842e
      Committed by Paul E. McKenney
      o Drop the calls to cpu_quiet() from the online/offline code.
        These are unnecessary, since force_quiescent_state() will
        clean up, and removing them simplifies the code a bit.
      
      o Add a warning to check that we don't enqueue the same blocked
        task twice onto the ->blocked_tasks[] lists.
      
      o Rework the phase computation in rcu_preempt_note_context_switch()
        to be more readable, as suggested by Josh Triplett.
      
      o Disable irqs to close a race between the scheduling clock
        interrupt and rcu_preempt_note_context_switch() WRT the
        ->rcu_read_unlock_special field.
      
      o Add comments to rnp->lock acquisition and release within
        rcu_read_unlock_special() noting that irqs are already
        disabled.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      LKML-Reference: <12532926201851-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: Add WARN_ON_ONCE() consistency checks covering state transitions · 28ecd580
      Committed by Paul E. McKenney
      o Verify that qsmask bits stay clear through GP
        initialization.
      
      o Verify that cpu_quiet_msk_finish() is never invoked unless
        there actually is an RCU grace period in progress.
      
      o Verify that all internal-node rcu_node structures have empty
        blocked_tasks[] lists.
      
      o Verify that child rcu_node structure's bits remain clear after
        acquiring parent's lock.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      LKML-Reference: <12532926191947-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  18. 18 September 2009 (2 commits)
    • rcu: Simplify rcu_read_unlock_special() quiescent-state accounting · c3422bea
      Committed by Paul E. McKenney
      The earlier approach required two scheduling-clock ticks to note a
      preemptable-RCU quiescent state in the situation in which the
      scheduling-clock interrupt is unlucky enough to always interrupt an
      RCU read-side critical section.
      
      With this change, the quiescent state is instead noted by the
      outermost rcu_read_unlock() immediately following the first
      scheduling-clock tick, or, alternatively, by the first subsequent
      context switch.  Therefore, this change also speeds up grace
      periods.
      Suggested-by: Josh Triplett <josh@joshtriplett.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      LKML-Reference: <12528585111945-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: Add debug checks to TREE_PREEMPT_RCU for premature grace periods · b0e165c0
      Committed by Paul E. McKenney
      Check to make sure that there are no blocked tasks for the previous
      grace period while initializing for the next grace period, and verify
      that rcu_preempt_qs() is given the correct CPU number and is never
      called for an offline CPU.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      LKML-Reference: <12528585111986-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  19. 29 August 2009 (2 commits)
  20. 25 August 2009 (1 commit)
    • rcu: Add CPU-offline processing for single-node configurations · 33f76148
      Committed by Paul E. McKenney
      Add preemptable-RCU plugin to handle the CPU-offline
      processing.
      
      An additional plugin is forthcoming to handle multinode RCU
      trees, but this current plugin works for configurations up to
      32 CPUs (64 CPUs for 64-bit kernels).
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josht@linux.vnet.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      LKML-Reference: <12511321213336-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  21. 23 August 2009 (1 commit)
    • rcu: Merge preemptable-RCU functionality into hierarchical RCU · f41d911f
      Committed by Paul E. McKenney
      Create a kernel/rcutree_plugin.h file that contains definitions
      for preemptable RCU (or, under the #else branch of the #ifdef,
      empty definitions for the classic non-preemptable semantics).
      These definitions fit into plugins defined in kernel/rcutree.c
      for this purpose.
      
      This variant of preemptable RCU uses a new algorithm whose
      read-side expense is roughly that of classic hierarchical RCU
      under CONFIG_PREEMPT. This new algorithm's update-side expense
      is similar to that of classic hierarchical RCU, and, in absence
      of read-side preemption or blocking, is exactly that of classic
      hierarchical RCU.  Perhaps more important, this new algorithm
      has a much simpler implementation, saving well over 1,000 lines
      of code compared to mainline's implementation of preemptable
      RCU, which will hopefully be retired in favor of this new
      algorithm.
      
      The simplifications are obtained by maintaining per-task
      nesting state for running tasks, and using a simple
      lock-protected algorithm to handle accounting when tasks block
      within RCU read-side critical sections, making use of lessons
      learned while creating numerous user-level RCU implementations
      over the past 18 months.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josht@linux.vnet.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      LKML-Reference: <12509746134003-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
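      The per-task nesting state keeps the read-side primitives cheap; roughly
      (a sketch that omits the blocked-task slow path):

          void __rcu_read_lock(void)
          {
                  ACCESS_ONCE(current->rcu_read_lock_nesting)++;
                  barrier();      /* the critical section must follow the increment */
          }

      The matching __rcu_read_unlock() decrements the counter and, only if the task
      blocked inside the critical section, takes the lock-protected slow path
      mentioned above.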