1. 01 May 2012 · 5 commits
    • rcu: Implement per-domain single-threaded call_srcu() state machine · 931ea9d1
      Authored by Lai Jiangshan
      This commit implements an SRCU state machine in support of call_srcu().
      The state machine is preemptible, light-weight, and single-threaded,
      minimizing synchronization overhead.  In particular, there is no longer
      any need for synchronize_srcu() to be guarded by a mutex.
      
      Expedited processing is handled, at least in the absence of concurrent
      grace-period operations on that same srcu_struct structure, by having
      the synchronize_srcu_expedited() thread take on the role of the
      workqueue thread for one iteration.
      
      There is a reasonable probability that a given SRCU callback will
      be invoked on the same CPU that registered it; however, there is no
      guarantee.  Concurrent SRCU grace-period primitives can cause callbacks
      to be executed elsewhere, even in the absence of CPU-hotplug operations.
      
      Callbacks execute in process context, but under the influence of
      local_bh_disable(), so it is illegal to sleep in an SRCU callback
      function.
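      
      As a minimal usage sketch of what this enables (the structure, callback,
      and domain names below are hypothetical; only call_srcu() itself is the
      API this series adds):
      
      	#include <linux/srcu.h>
      	#include <linux/slab.h>
      
      	struct my_data {			/* hypothetical structure */
      		struct rcu_head rh;		/* storage for call_srcu() */
      		int value;
      	};
      
      	static struct srcu_struct my_srcu;	/* init_srcu_struct() at boot */
      
      	static void my_free_cb(struct rcu_head *rh)
      	{
      		/* Process context under local_bh_disable(): must not sleep,
      		 * so no mutexes and no GFP_KERNEL allocations here. */
      		kfree(container_of(rh, struct my_data, rh));
      	}
      
      	static void my_retire(struct my_data *p)
      	{
      		/* Free p once all pre-existing my_srcu readers finish. */
      		call_srcu(&my_srcu, &p->rh, my_free_cb);
      	}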
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      931ea9d1
    • rcu: Remove unused srcu_barrier() · 966f58c2
      Authored by Lai Jiangshan
      The old srcu_barrier() macro is now unused.  This commit removes it so
      that the name may be reused for the SRCU flavor of rcu_barrier(), which
      will in turn be needed to allow the upcoming call_srcu() to be used from
      within modules.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      966f58c2
    • rcu: Implement a variant of Peter's SRCU algorithm · b52ce066
      Authored by Lai Jiangshan
      This commit implements a variant of Peter's algorithm, which may be found
      at https://lkml.org/lkml/2012/2/1/119.
      
      o	Make the checking lock-free to enable parallel checking.
      	Parallel checking is required when (1) the original checking
      	task is preempted for a long time, (2) synchronize_srcu_expedited()
      	starts during an ongoing SRCU grace period, or (3) we wish to
      	avoid acquiring a lock.  (A sketch of the lock-free summation
      	appears after this list.)
      
      o	Since the checking is lock-free, we avoid a mutex in the state
      	machine for call_srcu().
      
      o	Remove the SRCU_REF_MASK and remove the coupling with the flipping.
      	This might allow us to remove the preempt_disable() in future
      	versions, though such removal will need great care because it
      	rescinds the one-old-reader-per-CPU guarantee.
      
      o	Remove a smp_mb(), simplify the comments and make the smp_mb() pairs
      	more intuitive.
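      
      A rough sketch of the lock-free checking pattern mentioned in the first
      item, using hypothetical names and a simplified counter layout (the real
      algorithm also folds in the usage bits and memory barriers):
      
      	struct my_srcu_array {
      		unsigned long c[2];	/* one slot per grace-period index */
      	};
      
      	/* Sum the reader counts for the old index across all CPUs.  No
      	 * lock is taken, so any number of tasks (a long-preempted checker
      	 * and a concurrent synchronize_srcu_expedited(), for example) can
      	 * run this check in parallel. */
      	static bool my_readers_gone(struct my_srcu_array __percpu *ref, int idx)
      	{
      		unsigned long sum = 0;
      		int cpu;
      
      		for_each_possible_cpu(cpu)
      			sum += ACCESS_ONCE(per_cpu_ptr(ref, cpu)->c[idx]);
      		return sum == 0;
      	}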
      Inspired-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      b52ce066
    • rcu: Increment upper bit only for srcu_read_lock() · 440253c1
      Authored by Lai Jiangshan
      The purpose of the upper bit of SRCU's per-CPU counters is to guarantee
      that no reasonable series of srcu_read_lock() and srcu_read_unlock()
      operations can return the value of the counter to its original value.
      This guarantee is required only after the index has been switched to
      the other set of counters, so at most one srcu_read_lock() can affect
      a given CPU's counter.  The number of srcu_read_unlock() operations
      on a given counter is limited to the number of tasks in the system,
      which given the Linux kernel's current structure is limited to far less
      than 2^30 on 32-bit systems and far less than 2^62 on 64-bit systems.
      (Something about a limited number of bytes in the kernel's address space.)
      
      Therefore, if srcu_read_lock() increments the upper bits, then
      srcu_read_unlock() need not do so.  In this case, an srcu_read_lock() and
      an srcu_read_unlock() will flip the lower bit of the upper field of the
      counter.  An unreasonably large additional number of srcu_read_unlock()
      operations would be required to return the counter to its initial value,
      thus preserving the guarantee.
      
      This commit takes this approach, which further allows it to shrink
      the size of the upper field to one bit, making the number of
      srcu_read_unlock() operations required to return the counter to its
      initial value even more unreasonable than before.
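      
      Illustrating the arithmetic (symbolic names and values chosen to mirror
      the description above, not quoted from kernel/srcu.c):
      
      	#define MY_USAGE_BITS	1
      	#define MY_REF_MASK	(ULONG_MAX >> MY_USAGE_BITS)	/* reader count */
      	#define MY_USAGE_COUNT	(MY_REF_MASK + 1)	/* the one upper bit */
      
      	static unsigned long ctr;	/* stand-in for one per-CPU counter slot */
      
      	static void my_read_lock(void)
      	{
      		ctr += MY_USAGE_COUNT + 1;	/* flip upper bit, add a reader */
      	}
      
      	static void my_read_unlock(void)
      	{
      		ctr -= 1;	/* drop a reader; upper bit untouched */
      	}
      
      	/* A lock/unlock pair therefore changes ctr by MY_USAGE_COUNT, so
      	 * only an unreasonably long series of unlocks could ever return
      	 * ctr to a previously observed value. */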
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      440253c1
    • rcu: Direct algorithmic SRCU implementation · cef50120
      Authored by Paul E. McKenney
      The current implementation of synchronize_srcu_expedited() can cause
      severe OS jitter due to its use of synchronize_sched(), which in turn
      invokes try_stop_cpus(), which causes each CPU to be sent an IPI.
      This can result in severe performance degradation for real-time workloads
      and especially for short-iteration-length HPC workloads.  Furthermore,
      because only one instance of try_stop_cpus() can be making forward progress
      at a given time, only one instance of synchronize_srcu_expedited() can
      make forward progress at a time, even if they are all operating on
      distinct srcu_struct structures.
      
      This commit, inspired by an earlier implementation by Peter Zijlstra
      (https://lkml.org/lkml/2012/1/31/211) and by further offline discussions,
      takes a strictly algorithmic bits-in-memory approach.  This has the
      disadvantage of requiring one explicit memory-barrier instruction in
      each of srcu_read_lock() and srcu_read_unlock(), but on the other hand
      completely dispenses with OS jitter and furthermore allows SRCU to be
      used freely by CPUs that RCU believes to be idle or offline.
      
      The update-side implementation handles the single read-side memory
      barrier by rechecking the per-CPU counters after summing them and
      by running through the update-side state machine twice.
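      
      A simplified reader-side sketch of this approach (hypothetical types and
      names; the in-kernel version also maintains the upper usage bits):
      
      	struct my_srcu_array { unsigned long c[2]; };
      	struct my_srcu {
      		unsigned completed;
      		struct my_srcu_array __percpu *per_cpu_ref;
      	};
      
      	static int my_srcu_read_lock(struct my_srcu *sp)
      	{
      		int idx;
      
      		preempt_disable();
      		idx = ACCESS_ONCE(sp->completed) & 0x1;
      		this_cpu_inc(sp->per_cpu_ref->c[idx]);
      		smp_mb();	/* the one explicit read-side barrier: order
      				 * the increment before the reader's accesses */
      		preempt_enable();
      		return idx;
      	}
      
      	static void my_srcu_read_unlock(struct my_srcu *sp, int idx)
      	{
      		smp_mb();	/* order reader's accesses before decrement */
      		this_cpu_dec(sp->per_cpu_ref->c[idx]);
      	}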
      
      This implementation has passed moderate rcutorture testing on both
      x86 and Power.  Also updated to use this_cpu_ptr() instead of per_cpu_ptr(),
      as suggested by Peter Zijlstra.
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      cef50120
  2. 22 Feb 2012 · 2 commits
  3. 12 Dec 2011 · 4 commits
  4. 21 Aug 2010 · 1 commit
  5. 20 Aug 2010 · 1 commit
    • rcu: define __rcu address space modifier for sparse · ca5ecddf
      Authored by Paul E. McKenney
      This commit provides definitions for the __rcu annotation defined earlier.
      This annotation permits sparse to check for correct use of RCU-protected
      pointers.  If a pointer that is annotated with __rcu is accessed
      directly (as opposed to via rcu_dereference(), rcu_assign_pointer(),
      or one of their variants), sparse can be made to complain.  To enable
      such complaints, use the new default-disabled CONFIG_SPARSE_RCU_POINTER
      kernel configuration option.  Please note that these sparse complaints are
      intended to be a debugging aid, -not- a code-style-enforcement mechanism.
      
      There are special rcu_dereference_protected() and rcu_access_pointer()
      accessors for use when RCU read-side protection is not required, for
      example, when no other CPU has access to the data structure in question
      or while the current CPU holds the update-side lock.
      
      This patch also updates a number of docbook comments that were showing
      their age.
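      
      For example (a minimal sketch; foo and gp are placeholders):
      
      	struct foo {
      		int a;
      	};
      	struct foo __rcu *gp;	/* sparse now tracks this pointer */
      
      	int read_a(void)	/* reader */
      	{
      		int ret;
      
      		rcu_read_lock();
      		ret = rcu_dereference(gp)->a;	/* OK: proper accessor */
      		rcu_read_unlock();
      		return ret;
      	}
      
      	void publish(struct foo *p)	/* updater */
      	{
      		rcu_assign_pointer(gp, p);	/* OK; a bare "gp = p" would
      						 * draw a sparse complaint under
      						 * CONFIG_SPARSE_RCU_POINTER */
      	}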
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Christopher Li <sparse@chrisli.org>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      ca5ecddf
  6. 11 May 2010 · 2 commits
  7. 25 Feb 2010 · 2 commits
    • rcu: Add lockdep-enabled variants of rcu_dereference() · c26d34a5
      Authored by Paul E. McKenney
      Make rcu_dereference() check for being in an RCU read-side
      critical section, and create rcu_dereference_bh(),
      rcu_dereference_sched(), and srcu_dereference() to check for the
      other flavors of RCU.  Also create rcu_dereference_raw() to
      avoid checking, and make rcu_dereference_check() use
      rcu_dereference_raw().
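      
      A sketch of where each variant applies (foo, the global pointers, and
      my_srcu are placeholders):
      
      	struct foo { int a; };
      	struct foo *gp;			/* protected by RCU */
      	struct foo *gbp;		/* protected by RCU-bh */
      	struct foo *gdp;		/* protected by RCU-sched */
      	struct foo *gsp;		/* protected by my_srcu */
      	static struct srcu_struct my_srcu;
      
      	void examples(void)
      	{
      		struct foo *p;
      		int idx;
      
      		rcu_read_lock();
      		p = rcu_dereference(gp);		/* checks RCU */
      		rcu_read_unlock();
      
      		rcu_read_lock_bh();
      		p = rcu_dereference_bh(gbp);		/* checks RCU-bh */
      		rcu_read_unlock_bh();
      
      		rcu_read_lock_sched();
      		p = rcu_dereference_sched(gdp);		/* checks RCU-sched */
      		rcu_read_unlock_sched();
      
      		idx = srcu_read_lock(&my_srcu);
      		p = srcu_dereference(gsp, &my_srcu);	/* checks this domain */
      		srcu_read_unlock(&my_srcu, idx);
      
      		p = rcu_dereference_raw(gp);		/* no checking */
      		(void)p;
      	}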
      Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1266887105-1528-2-git-send-email-paulmck@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      c26d34a5
    • rcu: Introduce lockdep-based checking to RCU read-side primitives · 632ee200
      Authored by Paul E. McKenney
      Inspection is proving insufficient to catch all RCU misuses,
      which is understandable given that rcu_dereference() might be
      protected by any of four different flavors of RCU (RCU, RCU-bh,
      RCU-sched, and SRCU), and might also/instead be protected by any
      of a number of locking primitives. It is therefore time to
      enlist the aid of lockdep.
      
      This set of patches is inspired by earlier work by Peter
      Zijlstra and Thomas Gleixner, and takes the following approach:
      
      o	Set up separate lockdep classes for RCU, RCU-bh, and RCU-sched.
      
      o	Set up separate lockdep classes for each instance of SRCU.
      
      o	Create primitives that check for being in an RCU read-side
      	critical section.  These return exact answers if lockdep is
      	fully enabled, but if unsure, report being in an RCU read-side
      	critical section.  (We want to avoid false positives!)
      	The primitives are:
      
      	For RCU: rcu_read_lock_held(void)
      
      	For RCU-bh: rcu_read_lock_bh_held(void)
      
      	For RCU-sched: rcu_read_lock_sched_held(void)
      
      	For SRCU: srcu_read_lock_held(struct srcu_struct *sp)
      
      o	Add rcu_dereference_check(), which takes a second argument
      	in which one places a boolean expression based on the above
      	primitives and/or lockdep_is_held().
      
      o	A new kernel configuration parameter, CONFIG_PROVE_RCU, enables
      	rcu_dereference_check().  This depends on CONFIG_PROVE_LOCKING,
      	and should be quite helpful during the transition period while
      	CONFIG_PROVE_RCU-unaware patches are in flight.
      
      The existing rcu_dereference() primitive does no checking, but
      upcoming patches will change that.
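      
      For instance, a pointer that may legitimately be accessed either under
      RCU or with a hypothetical update-side lock held could be fetched as
      follows (a sketch; foo, gp, and my_lock are placeholders):
      
      	struct foo { int a; };
      	struct foo *gp;			/* protected by RCU or by my_lock */
      	static DEFINE_SPINLOCK(my_lock);
      
      	int get_a(void)
      	{
      		struct foo *p;
      
      		/* Under CONFIG_PROVE_RCU this complains unless the caller
      		 * is in an RCU read-side critical section or holds my_lock. */
      		p = rcu_dereference_check(gp,
      					  rcu_read_lock_held() ||
      					  lockdep_is_held(&my_lock));
      		return p ? p->a : -1;
      	}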
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1266887105-1528-1-git-send-email-paulmck@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      632ee200
  8. 17 Feb 2010 · 1 commit
  9. 26 Oct 2009 · 1 commit
  10. 04 Oct 2006 · 3 commits
    • [PATCH] SRCU: report out-of-memory errors · e6a92013
      Authored by Alan Stern
      Currently the init_srcu_struct() routine has no way to report out-of-memory
      errors.  This patch (as761) makes it return -ENOMEM when the per-cpu data
      allocation fails.
      
      The patch also makes srcu_init_notifier_head() report a BUG if a notifier
      head can't be initialized.  Perhaps it should return -ENOMEM instead, but
      in the most likely cases where this might occur I don't think any recovery
      is possible.  Notifier chains generally are not created dynamically.
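      
      Callers should therefore check the return value; a minimal sketch
      (my_srcu and my_init are placeholders):
      
      	static struct srcu_struct my_srcu;
      
      	static int __init my_init(void)
      	{
      		int err = init_srcu_struct(&my_srcu);
      
      		if (err)
      			return err;	/* -ENOMEM: per-CPU allocation failed */
      		/* ... */
      		return 0;
      	}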
      
      [akpm@osdl.org: avoid statement-with-side-effect in macro]
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      e6a92013
    • [PATCH] Add SRCU-based notifier chains · eabc0694
      Authored by Alan Stern
      This patch (as751) adds a new type of notifier chain, based on the SRCU
      (Sleepable Read-Copy Update) primitives recently added to the kernel.  An
      SRCU notifier chain is much like a blocking notifier chain, in that it must
      be called in process context and its callout routines are allowed to sleep.
      The difference is that the chain's links are protected by the SRCU
      mechanism rather than by an rw-semaphore, so calling the chain has
      extremely low overhead: no memory barriers and no cache-line bouncing.  On
      the other hand, unregistering from the chain is expensive and the chain
      head requires special runtime initialization (plus cleanup if it is to be
      deallocated).
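      
      A usage sketch (the chain, callback, and helper names are hypothetical;
      the srcu_notifier_* calls are the API this patch adds):
      
      	static struct srcu_notifier_head my_chain;
      
      	static int my_event(struct notifier_block *nb,
      			    unsigned long action, void *data)
      	{
      		/* Process context; sleeping is allowed here. */
      		return NOTIFY_OK;
      	}
      
      	static struct notifier_block my_nb = {
      		.notifier_call = my_event,
      	};
      
      	static int __init my_init(void)
      	{
      		srcu_init_notifier_head(&my_chain);	/* mandatory runtime init */
      		return srcu_notifier_chain_register(&my_chain, &my_nb);
      	}
      
      	/* The frequent, cheap operation: */
      	static void my_notify(unsigned long action, void *data)
      	{
      		srcu_notifier_call_chain(&my_chain, action, data);
      	}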
      
      SRCU notifiers are appropriate for notifiers that will be called very
      frequently and for which unregistration occurs very seldom.  The proposed
      "task notifier" scheme qualifies, as may some of the network notifiers.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
      Acked-by: Chandra Seetharaman <sekharan@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      eabc0694
    • [PATCH] srcu-3: RCU variant permitting read-side blocking · 621934ee
      Authored by Paul E. McKenney
      Updated patch adding a variant of RCU that permits sleeping in read-side
      critical sections; a usage sketch appears after the list below.  SRCU is
      as follows:
      
      o	Each use of SRCU creates its own srcu_struct, and each
      	srcu_struct has its own set of grace periods.  This is
      	critical, as it prevents one subsystem with a blocking
      	reader from holding up SRCU grace periods for other
      	subsystems.
      
      o	The SRCU primitives (srcu_read_lock(), srcu_read_unlock(),
      	and synchronize_srcu()) all take a pointer to a srcu_struct.
      
      o	The SRCU primitives must be called from process context.
      
      o	srcu_read_lock() returns an int that must be passed to
      	the matching srcu_read_unlock().  Realtime RCU avoids the
      	need for this by storing the state in the task struct,
      	but SRCU needs to allow a given code path to pass through
      	multiple SRCU domains -- storing state in the task struct
      	would therefore require either arbitrary space in the
      	task struct or arbitrary limits on SRCU nesting.  So I
      	kicked the state-storage problem up to the caller.
      
      	Of course, it is not permitted to call synchronize_srcu()
      	while in an SRCU read-side critical section.
      
      o	There is no call_srcu().  It would not be hard to implement
      	one, but it seems like too easy a way to OOM the system.
      	(Hey, we have enough trouble with call_rcu(), which does
      	-not- permit readers to sleep!!!)  So, if you want it,
      	please tell me why...
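      
      The usage sketch promised above (foo, gp, and the function names are
      placeholders; publication and lookup details are simplified):
      
      	static struct srcu_struct my_srcu;	/* init_srcu_struct() at boot */
      	struct foo { int a; };
      	struct foo *gp;
      
      	int reader(void)
      	{
      		int idx, val;
      
      		idx = srcu_read_lock(&my_srcu);	/* may sleep in here */
      		val = gp ? gp->a : 0;
      		srcu_read_unlock(&my_srcu, idx);	/* hand the index back */
      		return val;
      	}
      
      	void updater(struct foo *newp)
      	{
      		struct foo *old = gp;
      
      		rcu_assign_pointer(gp, newp);
      		synchronize_srcu(&my_srcu);	/* waits only for my_srcu readers */
      		kfree(old);
      	}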
      
      [josht@us.ibm.com: sparse notation]
      Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
      Signed-off-by: Josh Triplett <josh@freedesktop.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      621934ee