1. 03 May 2012 (1 commit)
    • rcu: Make exit_rcu() more precise and consolidate · 9dd8fb16
      Paul E. McKenney committed
      When running preemptible RCU, if a task exits in an RCU read-side
      critical section having blocked within that same RCU read-side critical
      section, the task must be removed from the list of tasks blocking a
      grace period (perhaps the current grace period, perhaps the next grace
      period, depending on timing).  The exit() path invokes exit_rcu() to
      do this cleanup.
      
      However, the current implementation of exit_rcu() needlessly does the
      cleanup even if the task did not block within the current RCU read-side
      critical section, which wastes time and needlessly increases the size
      of the state space.  Fix this by only doing the cleanup if the current
      task is actually on the list of tasks blocking some grace period.
      
      While we are at it, consolidate the two identical exit_rcu() functions
      into a single function.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
      
      Conflicts:
      
      	kernel/rcupdate.c
      9dd8fb16
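      A minimal sketch of the consolidated exit_rcu() described above,
      assuming the rcu_node_entry list field and RCU_READ_UNLOCK_BLOCKED
      flag of the preemptible-RCU task state of that era (illustrative,
      not necessarily the exact committed code):

      	void exit_rcu(void)
      	{
      		struct task_struct *t = current;

      		/* Fast path: bail unless this task actually blocked within
      		 * an RCU read-side critical section and is therefore queued
      		 * on a blocked-tasks list. */
      		if (likely(list_empty(&t->rcu_node_entry)))
      			return;

      		/* Pretend to be the outermost reader that blocked, then let
      		 * __rcu_read_unlock() dequeue the task and do the
      		 * grace-period bookkeeping. */
      		t->rcu_read_lock_nesting = 1;
      		barrier();
      		t->rcu_read_unlock_special = RCU_READ_UNLOCK_BLOCKED;
      		__rcu_read_unlock();
      	}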
  2. 25 Apr 2012 (1 commit)
    • rcu: Document why rcu_blocking_is_gp() is safe · 6d813391
      Paul E. McKenney committed
      The rcu_blocking_is_gp() function tests to see if there is only one
      online CPU, and if so, synchronize_sched() and friends become no-ops.
      However, for larger systems, num_online_cpus() scans a large vector,
      and might be preempted while doing so.  While preempted, any number
      of CPUs might come online and go offline, potentially resulting in
      num_online_cpus() returning 1 even though the system never actually
      had only one CPU online.  This could result in a too-short RCU grace
      period, which
      could in turn result in total failure, except that the only way that
      the grace period is too short is if there is an RCU read-side critical
      section spanning it.  For RCU-sched and RCU-bh (which are the only
      cases using rcu_blocking_is_gp()), RCU read-side critical sections
      have either preemption or bh disabled, which prevents CPUs from going
      offline.  This in turn prevents actual failures from occurring.
      
      This commit therefore adds a large block comment to rcu_blocking_is_gp()
      documenting why it is safe.  This commit also moves rcu_blocking_is_gp()
      into kernel/rcutree.c, which should help prevent unwary developers from
      mistaking it for a generally useful function.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      6d813391
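      The function itself is tiny; the commit's payload is the comment.
      A sketch of the kernel/rcutree.c version, with the argument above
      condensed into the comment:

      	static int rcu_blocking_is_gp(void)
      	{
      		/*
      		 * Safe even though num_online_cpus() can be preempted
      		 * mid-scan: RCU-sched and RCU-bh readers run with
      		 * preemption or bh disabled, which prevents CPUs from
      		 * going offline, so no reader that could be harmed by a
      		 * too-short grace period can be running.
      		 */
      		might_sleep();	/* Check for RCU read-side critical section. */
      		return num_online_cpus() == 1;
      	}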
  3. 22 Feb 2012 (3 commits)
  4. 29 Sep 2011 (1 commit)
  5. 06 May 2011 (2 commits)
  6. 30 Nov 2010 (1 commit)
  7. 18 Nov 2010 (1 commit)
    • rcu: move TINY_RCU from softirq to kthread · b2c0710c
      Paul E. McKenney committed
      If RCU priority boosting is to be meaningful, callback invocation must
      be boosted in addition to preempted RCU readers.  Otherwise, in the
      presence of CPU-bound real-time threads, the grace period ends, but the
      callbacks don't
      get invoked.  If the callbacks don't get invoked, the associated memory
      doesn't get freed, so the system is still subject to OOM.
      
      But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit
      moves the callback invocations to a kthread, which can be boosted easily.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      b2c0710c
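      A sketch of the shape of the change; rcu_kthread_wq,
      have_rcu_kthread_work, and rcu_process_callbacks() are illustrative
      stand-ins, and the SCHED_FIFO priority of 1 is an assumption.
      Callbacks move out of RCU_SOFTIRQ into a kthread, which, unlike a
      softirq, can be given (and later boosted to) a real-time priority:

      	static DECLARE_WAIT_QUEUE_HEAD(rcu_kthread_wq);
      	static int have_rcu_kthread_work;

      	static int rcu_kthread(void *arg)
      	{
      		for (;;) {
      			wait_event_interruptible(rcu_kthread_wq,
      						 have_rcu_kthread_work);
      			have_rcu_kthread_work = 0;
      			/* Invoke callbacks whose grace period has ended. */
      			rcu_process_callbacks();
      		}
      		return 0;	/* not reached */
      	}

      	static int __init rcu_spawn_kthread(void)
      	{
      		struct sched_param sp = { .sched_priority = 1 };
      		struct task_struct *t = kthread_run(rcu_kthread, NULL,
      						    "rcu_kthread");

      		/* A kthread can be priority-boosted; RCU_SOFTIRQ cannot. */
      		sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
      		return 0;
      	}
      	early_initcall(rcu_spawn_kthread);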
  8. 21 Aug 2010 (3 commits)
  9. 20 Aug 2010 (1 commit)
    • rcu: Add a TINY_PREEMPT_RCU · a57eb940
      Paul E. McKenney committed
      Implement a small-memory-footprint uniprocessor-only implementation of
      preemptible RCU.  This implementation uses but a single blocked-tasks
      list rather than the combinatorial number used per leaf rcu_node by
      TREE_PREEMPT_RCU, which reduces memory consumption and greatly simplifies
      processing.  This version also takes advantage of uniprocessor execution
      to accelerate grace periods in the case where there are no readers.
      
      The general design is otherwise broadly similar to that of TREE_PREEMPT_RCU.
      
      This implementation is a step towards having RCU implementation driven
      off of the SMP and PREEMPT kernel configuration variables, which can
      happen once this implementation has accumulated sufficient experience.
      
      Removed ACCESS_ONCE() from __rcu_read_unlock() and added barrier() as
      suggested by Steve Rostedt in order to avoid the compiler-reordering
      issue noted by Mathieu Desnoyers (http://lkml.org/lkml/2010/8/16/183).
      
      As can be seen below, CONFIG_TINY_PREEMPT_RCU saves almost 5 Kbytes
      compared to CONFIG_TREE_PREEMPT_RCU.  Of course, for non-real-time
      workloads, CONFIG_TINY_RCU is even better.
      
      	CONFIG_TREE_PREEMPT_RCU
      
      	   text	   data	    bss	    dec	   filename
      	     13	      0	      0	     13	   kernel/rcupdate.o
      	   6170	    825	     28	   7023	   kernel/rcutree.o
      				   ----
      				   7036    Total
      
      	CONFIG_TINY_PREEMPT_RCU
      
      	   text	   data	    bss	    dec	   filename
      	     13	      0	      0	     13	   kernel/rcupdate.o
      	   2081	     81	      8	   2170	   kernel/rcutiny.o
      				   ----
      				   2183    Total
      
      	CONFIG_TINY_RCU (non-preemptible)
      
      	   text	   data	    bss	    dec	   filename
      	     13	      0	      0	     13	   kernel/rcupdate.o
      	    719	     25	      0	    744	   kernel/rcutiny.o
      				    ---
      				    757    Total
      Requested-by: Loïc Minier <loic.minier@canonical.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      a57eb940
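      A sketch of the resulting __rcu_read_unlock() fast path, with
      barrier() rather than ACCESS_ONCE() keeping the compiler from
      reordering around the nesting update (slow-path details elided;
      treat the exact shape as illustrative):

      	void __rcu_read_unlock(void)
      	{
      		struct task_struct *t = current;

      		barrier();	/* compiler fence: see the lkml.org link above */
      		--t->rcu_read_lock_nesting;
      		barrier();	/* decrement nesting before checking ->special */
      		if (t->rcu_read_lock_nesting == 0 &&
      		    unlikely(t->rcu_read_unlock_special))
      			rcu_read_unlock_special(t);
      	}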
  10. 11 May 2010 (3 commits)
    • rcu: slim down rcutiny by removing rcu_scheduler_active and friends · bbad9379
      Paul E. McKenney committed
      TINY_RCU does not need rcu_scheduler_active unless CONFIG_DEBUG_LOCK_ALLOC.
      So conditionally compile rcu_scheduler_active in order to slim down
      rcutiny a bit more.  Also gets rid of an EXPORT_SYMBOL_GPL, which is
      responsible for most of the slimming.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      bbad9379
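      The mechanics amount to a single #ifdef: on TINY_RCU, only lockdep
      consults the variable, so both it and its EXPORT_SYMBOL_GPL() can be
      compiled out otherwise (sketch):

      	#ifdef CONFIG_DEBUG_LOCK_ALLOC
      	int rcu_scheduler_active __read_mostly;
      	EXPORT_SYMBOL_GPL(rcu_scheduler_active);
      	#endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */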
    • rcu: refactor RCU's context-switch handling · 25502a6c
      Paul E. McKenney committed
      The addition of preemptible RCU to treercu resulted in a bit of
      confusion and inefficiency surrounding the handling of context switches
      for RCU-sched and for RCU-preempt.  For RCU-sched, a context switch
      is a quiescent state, pure and simple, just like it always has been.
      For RCU-preempt, a context switch is in no way a quiescent state, but
      special handling is required when a task blocks in an RCU read-side
      critical section.
      
      However, the callout from the scheduler and the outer loop in ksoftirqd
      still call something named rcu_sched_qs(), whose name is no longer
      accurate.  Furthermore, when rcu_check_callbacks() notes an RCU-sched
      quiescent state, it ends up unnecessarily (though harmlessly, aside
      from the performance hit) enqueuing the current task if it happens to
      be running in an RCU-preempt read-side critical section.  This not only
      increases the maximum latency of scheduler_tick(), it also needlessly
      increases the overhead of the next outermost rcu_read_unlock() invocation.
      
      This patch addresses this situation by separating the notion of RCU's
      context-switch handling from that of RCU-sched's quiescent states.
      The context-switch handling is covered by rcu_note_context_switch() in
      general and by rcu_preempt_note_context_switch() for preemptible RCU.
      This permits rcu_sched_qs() to handle quiescent states and only quiescent
      states.  It also reduces the maximum latency of scheduler_tick(), though
      probably by much less than a microsecond.  Finally, it means that tasks
      within preemptible-RCU read-side critical sections avoid incurring the
      overhead of queuing unless there really is a context switch.
      Suggested-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      25502a6c
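      The resulting split is small enough to show in full (sketch):

      	/* Called by the scheduler (and ksoftirqd's outer loop) on every
      	 * context switch, replacing the misnamed direct rcu_sched_qs()
      	 * call. */
      	void rcu_note_context_switch(int cpu)
      	{
      		rcu_sched_qs(cpu);			/* RCU-sched quiescent state */
      		rcu_preempt_note_context_switch(cpu);	/* RCU-preempt bookkeeping   */
      	}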
    • rcu: shrink rcutiny by making synchronize_rcu_bh() be inline · da848c47
      Paul E. McKenney committed
      Because synchronize_rcu_bh() is identical to synchronize_sched(),
      make the former a static inline invoking the latter, saving the
      overhead of an EXPORT_SYMBOL_GPL() and the duplicate code.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      da848c47
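      The whole change fits in one static inline; a sketch of the
      include/linux/rcutiny.h side:

      	static inline void synchronize_rcu_bh(void)
      	{
      		synchronize_sched();	/* identical semantics on rcutiny */
      	}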
  11. 07 May 2010 (1 commit)
    • sched: replace migration_thread with cpu_stop · 969c7921
      Tejun Heo committed
      Currently migration_thread is serving three purposes - migration
      pusher, context to execute active_load_balance() and forced context
      switcher for expedited RCU synchronize_sched.  All three roles are
      hardcoded into migration_thread() and determining which job is
      scheduled is slightly messy.
      
      This patch kills migration_thread and replaces all three uses with
      cpu_stop.  The three different roles of migration_thread() are
      split into three separate cpu_stop callbacks -
      migration_cpu_stop(), active_load_balance_cpu_stop() and
      synchronize_sched_expedited_cpu_stop() - and each use case now simply
      asks cpu_stop to execute the callback as necessary.
      
      synchronize_sched_expedited() was implemented with private
      preallocated resources and custom multi-cpu queueing and waiting
      logic, both of which are now provided by cpu_stop.
      synchronize_sched_expedited_count is made atomic and all other shared
      resources along with the mutex are dropped.
      
      synchronize_sched_expedited() also implemented a check to detect
      cases where not all the callbacks got executed on their assigned
      cpus, falling back to synchronize_sched().  If called with cpu
      hotplug blocked, cpu_stop already guarantees execution on every
      assigned cpu, so the condition cannot happen; otherwise,
      stop_machine() would break anyway.  However, this patch preserves
      the paranoid check, using a cpumask to record on which cpus the
      stopper ran, so that it can serve as a bisection point if something
      actually goes wrong.
      
      Because the internal execution state is no longer visible,
      rcu_expedited_torture_stats() is removed.
      
      This patch also renames the cpu_stop threads from "stopper/%d" to
      "migration/%d".  The names of these threads ultimately don't matter
      and there's no reason to make unnecessary userland visible changes.
      
      With this patch applied, stop_machine() and sched now share the same
      resources.  stop_machine() is faster without wasting any resources, and
      the sched migration users are much cleaner.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: Josh Triplett <josh@freedesktop.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      969c7921
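      A sketch of the expedited-RCU side of the conversion (the retry
      loop, the atomic counter, and the paranoid cpumask check are
      omitted here):

      	static int synchronize_sched_expedited_cpu_stop(void *data)
      	{
      		/* Merely running as the per-CPU stopper thread forces a
      		 * context switch, i.e. an RCU-sched quiescent state. */
      		smp_mb();	/* order against the caller's bookkeeping */
      		return 0;
      	}

      	void synchronize_sched_expedited(void)
      	{
      		/* cpu_stop supplies the multi-CPU queueing and waiting
      		 * logic that migration_thread used to hand-roll. */
      		stop_cpus(cpu_online_mask,
      			  synchronize_sched_expedited_cpu_stop, NULL);
      	}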
  12. 26 Feb 2010 (1 commit)
    • rcu: Make rcu_read_lock_sched_held() take boot time into account · d9f1bb6a
      Paul E. McKenney committed
      Before the scheduler starts, all tasks are non-preemptible by
      definition. So, during that time, rcu_read_lock_sched_held()
      needs to always return "true".  This patch makes it so.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1267135607-7056-2-git-send-email-paulmck@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      d9f1bb6a
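      The fix amounts to one extra disjunct; a sketch, with lockdep's own
      opinion elided:

      	int rcu_read_lock_sched_held(void)
      	{
      		/* Before rcu_scheduler_active is set, all tasks are
      		 * non-preemptible by definition, so answer "yes". */
      		return preempt_count() != 0 || !rcu_scheduler_active;
      	}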
  13. 13 Jan 2010 (1 commit)
  14. 17 Dec 2009 (1 commit)
    • sched: Teach might_sleep() about preemptible RCU · 234da7bc
      Frederic Weisbecker committed
      In practice, it is harmless to voluntarily sleep in an
      rcu_read_lock() section if we are running under preemptible rcu, but
      it is illegal if we build a kernel running non-preemptible rcu.
      
      Currently, might_sleep() doesn't notice sleepable operations
      under rcu_read_lock() sections if we are running under
      preemptible rcu because preempt_count() is left untouched after
      rcu_read_lock() in this case.  But we want developers who test
      their changes under such a config to notice the "sleeping while
      atomic" issues.
      
      So we add rcu_read_lock_nesting to the preempt_count() check in
      might_sleep().
      
      [ v2: Handle rcu-tiny ]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <1260991265-8451-1-git-send-regression-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      234da7bc
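      A simplified sketch of the resulting check, where rcu_preempt_depth()
      is current's rcu_read_lock_nesting on preemptible RCU and constant
      zero elsewhere (the base-offset bookkeeping of the real helper is
      omitted):

      	static int preempt_count_equals(int preempt_offset)
      	{
      		int nested = (preempt_count() & ~PREEMPT_ACTIVE)
      			     + rcu_preempt_depth();

      		return nested == preempt_offset;
      	}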
  15. 23 Nov 2009 (2 commits)
    • rcu: Re-arrange code to reduce #ifdef pain · 6ebb237b
      Paul E. McKenney committed
      Remove #ifdefs from kernel/rcupdate.c and
      include/linux/rcupdate.h by moving code to
      include/linux/rcutiny.h, include/linux/rcutree.h, and
      kernel/rcutree.c.
      
      Also remove some definitions that are no longer used.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1258908830885-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      6ebb237b
    • rcu: Eliminate unneeded function wrapping · 9f680ab4
      Paul E. McKenney committed
      The function rcu_init() is a wrapper for __rcu_init(), and also
      sets up the CPU-hotplug notifier for rcu_barrier_cpu_hotplug().
      But TINY_RCU doesn't need CPU-hotplug notification, and
      rcu_barrier_cpu_hotplug() is a simple wrapper for
      rcu_cpu_notify().
      
      So push rcu_init() out to kernel/rcutree.c and kernel/rcutiny.c
      and get rid of the wrapper function rcu_barrier_cpu_hotplug().
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12589088302320-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      9f680ab4
  16. 15 Oct 2009 (1 commit)
    • rcu: Stopgap fix for synchronize_rcu_expedited() for TREE_PREEMPT_RCU · 019129d5
      Paul E. McKenney committed
      For the short term, map synchronize_rcu_expedited() to
      synchronize_rcu() for TREE_PREEMPT_RCU and to
      synchronize_sched_expedited() for TREE_RCU.
      
      Longer term, there needs to be a real expedited grace period for
      TREE_PREEMPT_RCU, but candidate patches to date are considerably
      more complex and intrusive.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      Cc: npiggin@suse.de
      Cc: jens.axboe@oracle.com
      LKML-Reference: <12555405592331-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      019129d5
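      The short-term mapping, sketched as the two config-dependent
      definitions:

      	#ifdef CONFIG_TREE_PREEMPT_RCU
      	/* No true expedited grace period yet: use the normal one. */
      	static inline void synchronize_rcu_expedited(void)
      	{
      		synchronize_rcu();
      	}
      	#else /* TREE_RCU: synchronize_rcu() is synchronize_sched() anyway. */
      	static inline void synchronize_rcu_expedited(void)
      	{
      		synchronize_sched_expedited();
      	}
      	#endif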
  17. 24 Sep 2009 (3 commits)
    • rcu: Clean up code to address Ingo's checkpatch feedback · 9b2619af
      Paul E. McKenney committed
      Move declarations and update storage classes to make checkpatch happy.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12537246441701-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      9b2619af
    • rcu: Clean up code based on review feedback from Josh Triplett, part 2 · 1eba8f84
      Paul E. McKenney committed
      These issues were identified during an old-fashioned face-to-face code
      review extending over many hours.
      
      o	Add comments for tricky parts of code, and correct comments
      	that have passed their sell-by date.
      
      o	Get rid of the vestiges of rcu_init_sched(), which is no
      	longer needed now that PREEMPT_RCU is gone.
      
      o	Move the #include of rcutree_plugin.h to the end of
      	rcutree.c, which means that, rather than having a random
      	collection of forward declarations, the new set of forward
      	declarations document the set of plugins.  The new home for
      	this #include also allows __rcu_init_preempt() to move into
      	rcutree_plugin.h.
      
      o	Fix rcu_preempt_check_callbacks() to be static.
      Suggested-by: Josh Triplett <josh@joshtriplett.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12537246443924-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      1eba8f84
    • rcu: Clean up code based on review feedback from Josh Triplett · fc2219d4
      Paul E. McKenney committed
      These issues were identified during an old-fashioned face-to-face code
      review extending over many hours.
      
      o	Bury various forms of the "rsp->completed == rsp->gpnum"
      	comparison into an rcu_gp_in_progress() function, which has
      	the beneficial side-effect of forcing consistent use of
      	ACCESS_ONCE().
      
      o	Replace hand-coded arithmetic with DIV_ROUND_UP().
      
      o	Bury several "!list_empty(&rnp->blocked_tasks[rnp->gpnum & 0x01])"
      	instances into an rcu_preempted_readers() function, as this
      	expression indicates that there are readers blocked
      	within RCU read-side critical sections blocking the current
      	grace period.  (Though there might well be similar readers
      	blocking the next grace period.)
      
      o	Remove a dangling rcu_restart_cpu() declaration that has
      	been dangling for almost 20 minor releases of the kernel.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12537246442687-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      fc2219d4
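      Sketches of the two helpers this review pass introduced:

      	/* One home for the grace-period-in-progress test, forcing
      	 * consistent use of ACCESS_ONCE(). */
      	static int rcu_gp_in_progress(struct rcu_state *rsp)
      	{
      		return ACCESS_ONCE(rsp->completed) != ACCESS_ONCE(rsp->gpnum);
      	}

      	/* True iff readers are blocking the current grace period. */
      	static int rcu_preempted_readers(struct rcu_node *rnp)
      	{
      		return !list_empty(&rnp->blocked_tasks[rnp->gpnum & 0x1]);
      	}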
  18. 19 Sep 2009 (1 commit)
    • rcu: Fix whitespace inconsistencies · a71fca58
      Paul E. McKenney committed
      Fix a number of whitespace ^Ierrors in the include/linux/rcu*
      and the kernel/rcu* files.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      LKML-Reference: <20090918172819.GA24405@linux.vnet.ibm.com>
      [ did more checkpatch fixlets ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a71fca58
  19. 18 Sep 2009 (1 commit)
    • rcu: Fix synchronize_rcu() for TREE_PREEMPT_RCU · 16e30811
      Paul E. McKenney committed
      The redirection of synchronize_sched() to synchronize_rcu() was
      appropriate for TREE_RCU, but not for TREE_PREEMPT_RCU.
      
      Fix this by creating an underlying synchronize_sched().  TREE_RCU
      then redirects synchronize_rcu() to synchronize_sched(), while
      TREE_PREEMPT_RCU has its own version of synchronize_rcu().
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      LKML-Reference: <12528585111916-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      16e30811
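      The resulting arrangement, sketched (whether it is an inline or a
      macro is a detail):

      	#ifdef CONFIG_TREE_PREEMPT_RCU
      	extern void synchronize_rcu(void);	/* real preemptible version */
      	#else
      	/* TREE_RCU: readers disable preemption, so waiting for all CPUs
      	 * to context-switch suffices. */
      	static inline void synchronize_rcu(void)
      	{
      		synchronize_sched();
      	}
      	#endif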
  20. 23 Aug 2009 (5 commits)
    • rcu: Merge preemptable-RCU functionality into hierarchical RCU · f41d911f
      Paul E. McKenney committed
      Create a kernel/rcutree_plugin.h file that contains definitions
      for preemptable RCU (or, under the #else branch of the #ifdef,
      empty definitions for the classic non-preemptable semantics).
      These definitions fit into plugins defined in kernel/rcutree.c
      for this purpose.
      
      This variant of preemptable RCU uses a new algorithm whose
      read-side expense is roughly that of classic hierarchical RCU
      under CONFIG_PREEMPT. This new algorithm's update-side expense
      is similar to that of classic hierarchical RCU, and, in the absence
      of read-side preemption or blocking, is exactly that of classic
      hierarchical RCU.  Perhaps more important, this new algorithm
      has a much simpler implementation, saving well over 1,000 lines
      of code compared to mainline's implementation of preemptable
      RCU, which will hopefully be retired in favor of this new
      algorithm.
      
      The simplifications are obtained by maintaining per-task
      nesting state for running tasks, and using a simple
      lock-protected algorithm to handle accounting when tasks block
      within RCU read-side critical sections, making use of lessons
      learned while creating numerous user-level RCU implementations
      over the past 18 months.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josht@linux.vnet.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      LKML-Reference: <12509746134003-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f41d911f
    • rcu: Simplify rcu_pending()/rcu_check_callbacks() API · a157229c
      Paul E. McKenney committed
      All calls from outside RCU are of the form:
      
      	if (rcu_pending(cpu))
      		rcu_check_callbacks(cpu, user);
      
      This is silly; instead, we put a call to rcu_pending() in
      rcu_check_callbacks(), and then make the outside calls be to
      rcu_check_callbacks().  This cuts down on the code a bit and
      also gives the compiler a better chance of optimizing.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josht@linux.vnet.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      LKML-Reference: <125097461311-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a157229c
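      A sketch of the simplified calling convention; the gate moves inside
      the callee:

      	/* Before: callers did "if (rcu_pending(cpu))
      	 *                          rcu_check_callbacks(cpu, user);".
      	 * After: they call unconditionally, and the check lives here. */
      	void rcu_check_callbacks(int cpu, int user)
      	{
      		if (!rcu_pending(cpu))
      			return;
      		/* ... quiescent-state accounting as before, then ... */
      		raise_softirq(RCU_SOFTIRQ);
      	}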
    • rcu: Consolidate sparse and lockdep declarations in include/linux/rcupdate.h · bc33f24b
      Paul E. McKenney committed
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josht@linux.vnet.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      LKML-Reference: <12509746132349-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      bc33f24b
    • rcu: Renamings to increase RCU clarity · d6714c22
      Paul E. McKenney committed
      Make RCU-sched, RCU-bh, and RCU-preempt be underlying
      implementations, with "RCU" defined in terms of one of the
      three.  Update the outdated rcu_qsctr_inc() names, as these
      functions no longer increment anything.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josht@linux.vnet.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      LKML-Reference: <12509746132696-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      d6714c22
    • rcu: Move private definitions from include/linux/rcutree.h to kernel/rcutree.h · 9f77da9f
      Paul E. McKenney committed
      Some information hiding that makes it easier to merge
      preemptability into rcutree without descending into #include
      hell.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josht@linux.vnet.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      LKML-Reference: <1250974613373-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      9f77da9f
  21. 03 Jul 2009 (1 commit)
    • rcu: Add synchronize_sched_expedited() primitive · 03b042bf
      Paul E. McKenney committed
      This adds the synchronize_sched_expedited() primitive that
      implements the "big hammer" expedited RCU grace periods.
      
      This primitive is placed in kernel/sched.c rather than
      kernel/rcupdate.c due to its need to interact closely with the
      migration_thread() kthread.
      
      The idea is to wake up this kthread with req->task set to NULL,
      in response to which the kthread reports the quiescent state
      resulting from the kthread having been scheduled.
      
      Because this patch needs to fall back to the slow versions of
      the primitives in response to some races with CPU onlining and
      offlining, a new synchronize_rcu_bh_expedited() primitive is added
      as well.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: akpm@linux-foundation.org
      Cc: torvalds@linux-foundation.org
      Cc: davem@davemloft.net
      Cc: dada1@cosmosbay.com
      Cc: zbr@ioremap.net
      Cc: jeff.chua.linux@gmail.com
      Cc: paulus@samba.org
      Cc: laijs@cn.fujitsu.com
      Cc: jengelh@medozas.de
      Cc: r000n@r000n.net
      Cc: benh@kernel.crashing.org
      Cc: mathieu.desnoyers@polymtl.ca
      LKML-Reference: <12459460982947-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      03b042bf
  22. 14 Apr 2009 (2 commits)
    • rcu: Add __rcu_pending tracing to hierarchical RCU · 7ba5c840
      Paul E. McKenney committed
      Add tracing to __rcu_pending() to provide information on why RCU
      processing was kicked off.  This is helpful for debugging hierarchical
      RCU, and might also be helpful in learning how hierarchical RCU operates.
      Located-by: Anton Blanchard <anton@au1.ibm.com>
      Tested-by: Anton Blanchard <anton@au1.ibm.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: anton@samba.org
      Cc: akpm@linux-foundation.org
      Cc: dipankar@in.ibm.com
      Cc: manfred@colorfullife.com
      Cc: cl@linux-foundation.org
      Cc: josht@linux.vnet.ibm.com
      Cc: schamp@sgi.com
      Cc: niv@us.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: ego@in.ibm.com
      Cc: laijs@cn.fujitsu.com
      Cc: rostedt@goodmis.org
      Cc: peterz@infradead.org
      Cc: penberg@cs.helsinki.fi
      Cc: andi@firstfloor.org
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      LKML-Reference: <1239683479943-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      7ba5c840
    • rcu: Make hierarchical RCU less IPI-happy · ef631b0c
      Paul E. McKenney committed
      This patch fixes a hierarchical-RCU performance bug located by Anton
      Blanchard.  The problem stems from a misguided attempt to provide a
      work-around for jiffies-counter failure.  This work-around uses a per-CPU
      n_rcu_pending counter, which is incremented on each call to rcu_pending(),
      which in turn is called from each scheduling-clock interrupt.  Each CPU
      then treats this counter as a surrogate for the jiffies counter, so
      that if the jiffies counter fails to advance, the per-CPU n_rcu_pending
      counter will cause RCU to invoke force_quiescent_state(), which in turn
      will (among other things) send resched IPIs to CPUs that have thus far
      failed to pass through an RCU quiescent state.
      
      Unfortunately, each CPU resets only its own counter after sending a
      batch of IPIs.  This means that the other CPUs will also (needlessly)
      send -another- round of IPIs, for a full N-squared set of IPIs in the
      worst case every three scheduler-clock ticks until the grace period
      finally ends.  It is not reasonable for a given CPU to reset each and
      every n_rcu_pending for all the other CPUs, so this patch instead simply
      disables the jiffies-counter "training wheels", thus eliminating the
      excessive IPIs.
      
      Note that the jiffies-counter IPIs do not have this problem due to
      the fact that the jiffies counter is global, so that the CPU sending
      the IPIs can easily reset things, thus preventing the other CPUs from
      sending redundant IPIs.
      
      Note also that the n_rcu_pending counter remains, as it will continue to
      be used for tracing.  It may also see use to update the jiffies counter,
      should an appropriate kick-the-jiffies-counter API appear.
      Located-by: Anton Blanchard <anton@au1.ibm.com>
      Tested-by: Anton Blanchard <anton@au1.ibm.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: anton@samba.org
      Cc: akpm@linux-foundation.org
      Cc: dipankar@in.ibm.com
      Cc: manfred@colorfullife.com
      Cc: cl@linux-foundation.org
      Cc: josht@linux.vnet.ibm.com
      Cc: schamp@sgi.com
      Cc: niv@us.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: ego@in.ibm.com
      Cc: laijs@cn.fujitsu.com
      Cc: rostedt@goodmis.org
      Cc: peterz@infradead.org
      Cc: penberg@cs.helsinki.fi
      Cc: andi@firstfloor.org
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      LKML-Reference: <12396834793575-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ef631b0c
  23. 03 Apr 2009 (2 commits)
    • kmemtrace, rcu: don't include unnecessary headers, allow kmemtrace w/ tracepoints · ac44021f
      Eduard - Gabriel Munteanu committed
      Impact: cleanup
      
      linux/percpu.h includes linux/slab.h, which generates circular inclusion
      dependencies when trying to switch kmemtrace to use tracepoints instead
      of markers.
      
      This patch allows tracing within slab headers' inline functions.
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: paulmck@linux.vnet.ibm.com
      LKML-Reference: <1237898630.25315.83.camel@penberg-laptop>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ac44021f
    • kmemtrace, rcu: fix linux/rcutree.h and linux/rcuclassic.h dependencies · b1f77b05
      Ingo Molnar committed
      Impact: build fix for all non-x86 architectures
      
      We want to remove percpu.h from rcuclassic.h/rcutree.h (for upcoming
      kmemtrace changes) but that would break the DECLARE_PER_CPU based
      declarations in these files.
      
      Move the quiescent counter management functions to their respective
      RCU implementation .c files - they were slightly above the inlining
      limit anyway.
      
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: paulmck@linux.vnet.ibm.com
      LKML-Reference: <1237898630.25315.83.camel@penberg-laptop>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b1f77b05
  24. 26 Feb 2009 (1 commit)
    • rcu: Teach RCU that idle task is not quiescent state at boot · a6826048
      Paul E. McKenney committed
      This patch fixes a bug located by Vegard Nossum with the aid of
      kmemcheck, updated based on review comments from Nick Piggin,
      Ingo Molnar, and Andrew Morton.  And cleans up the variable-name
      and function-name language.  ;-)
      
      The boot CPU runs in the context of its idle thread during boot-up.
      During this time, idle_cpu(0) will always return nonzero, which will
      fool Classic and Hierarchical RCU into deciding that a large chunk of
      the boot-up sequence is a big long quiescent state.  This in turn causes
      RCU to prematurely end grace periods during this time.
      
      This patch changes the rcutree.c and rcuclassic.c rcu_check_callbacks()
      function to ignore the idle task as a quiescent state until the
      system has started up the scheduler in rest_init(), introducing a
      new non-API function rcu_idle_now_means_idle() to inform RCU of this
      transition.  RCU maintains an internal rcu_idle_cpu_truthful variable
      to track this state, which is then used by rcu_check_callbacks() to
      determine if it should believe idle_cpu().
      
      Because this patch has the effect of disallowing RCU grace periods
      during long stretches of the boot-up sequence, this patch also introduces
      Josh Triplett's UP-only optimization that makes synchronize_rcu() be a
      no-op if num_online_cpus() returns 1.  This allows boot-time code that
      calls synchronize_rcu() to proceed normally.  Note, however, that RCU
      callbacks registered by call_rcu() will likely queue up until later in
      the boot sequence.  Although rcuclassic and rcutree can also use this
      same optimization after boot completes, rcupreempt must restrict its
      use of this optimization to the portion of the boot sequence before the
      scheduler starts up, given that an rcupreempt RCU read-side critical
      section may be preempted.
      
      In addition, this patch takes Nick Piggin's suggestion to make the
      system_state global variable be __read_mostly.
      
      Changes since v4:
      
      o	Changes the name of the introduced function and variable to
      	be less emotional.  ;-)
      
      Changes since v3:
      
      o	WARN_ON(nr_context_switches() > 0) to verify that RCU
      	switches out of boot-time mode before the first context
      	switch, as suggested by Nick Piggin.
      
      Changes since v2:
      
      o	Created rcu_blocking_is_gp() internal-to-RCU API that
      	determines whether a call to synchronize_rcu() is itself
      	a grace period.
      
      o	The definition of rcu_blocking_is_gp() for rcuclassic and
      	rcutree checks to see if but a single CPU is online.
      
      o	The definition of rcu_blocking_is_gp() for rcupreempt
      	checks to see both if but a single CPU is online and if
      	the system is still in early boot.
      
      	This allows rcupreempt to again work correctly if running
      	on a single CPU after booting is complete.
      
      o	Added check to rcupreempt's synchronize_sched() for there
      	being but one online CPU.
      
      Tested all three variants both SMP and !SMP, booted fine, passed a short
      rcutorture test on both x86 and Power.
      Located-by: Vegard Nossum <vegard.nossum@gmail.com>
      Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
      Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a6826048
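      A sketch of the transition hook, using the post-rename identifiers
      from the message above and the v3-suggested WARN_ONs (illustrative):

      	int rcu_idle_cpu_truthful __read_mostly;

      	/* Called from rest_init() once the scheduler is up; from this
      	 * point on, idle_cpu() may again be trusted as a quiescent-state
      	 * indicator. */
      	void rcu_idle_now_means_idle(void)
      	{
      		WARN_ON(num_online_cpus() != 1);
      		WARN_ON(nr_context_switches() > 0);
      		rcu_idle_cpu_truthful = 1;
      	}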