1. 08 Feb, 2013 6 commits
  2. 24 Oct, 2012 3 commits
  3. 21 Aug, 2012 1 commit
    • workqueue: deprecate system_nrt[_freezable]_wq · 3b07e9ca
      Authored by Tejun Heo
      system_nrt[_freezable]_wq are now spurious.  Mark them deprecated and
      convert all users to system[_freezable]_wq.
      
      If you're cc'd and wondering what's going on: Now all workqueues are
      non-reentrant, so there's no reason to use system_nrt[_freezable]_wq.
      Please use system[_freezable]_wq instead.
      
      This patch doesn't make any functional difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: David Howells <dhowells@redhat.com>
      3b07e9ca
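      For illustration, the conversion requested of the cc'd subsystems is
      mechanical.  A hedged before/after sketch (the work items my_work and
      my_dwork are hypothetical):

          /* Before: queued on the dedicated non-reentrant workqueues. */
          queue_work(system_nrt_wq, &my_work);
          queue_delayed_work(system_nrt_freezable_wq, &my_dwork, HZ);

          /* After: all workqueues are non-reentrant, so the plain
           * system-wide workqueues suffice. */
          queue_work(system_wq, &my_work);
          queue_delayed_work(system_freezable_wq, &my_dwork, HZ);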
  4. 01 May, 2012 9 commits
    • rcu: Implement per-domain single-threaded call_srcu() state machine · 931ea9d1
      Authored by Lai Jiangshan
      This commit implements an SRCU state machine in support of call_srcu().
      The state machine is preemptible, light-weight, and single-threaded,
      minimizing synchronization overhead.  In particular, there is no longer
      any need for synchronize_srcu() to be guarded by a mutex.
      
      Expedited processing is handled, at least in the absence of concurrent
      grace-period operations on that same srcu_struct structure, by having
      the synchronize_srcu_expedited() thread take on the role of the
      workqueue thread for one iteration.
      
      There is a reasonable probability that a given SRCU callback will
      be invoked on the same CPU that registered it; however, there is
      no guarantee.  Concurrent SRCU grace-period primitives can cause
      callbacks to be executed elsewhere, even in the absence of
      CPU-hotplug operations.
      
      Callbacks execute in process context, but under the influence of
      local_bh_disable(), so it is illegal to sleep in an SRCU callback
      function.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      931ea9d1
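      A hedged usage sketch of the new primitive (struct my_data, my_srcu,
      and the callback are hypothetical; note the callback must not sleep,
      per the paragraph above):

          struct my_data {
                  struct rcu_head rcu;
                  /* ... payload ... */
          };

          static void my_free_cb(struct rcu_head *head)
          {
                  /* Runs in process context under local_bh_disable(),
                   * so sleeping here is illegal. */
                  kfree(container_of(head, struct my_data, rcu));
          }

          /* After unpublishing p from all reader-visible structures: */
          call_srcu(&my_srcu, &p->rcu, my_free_cb);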
    • rcu: Use single value to handle expedited SRCU grace periods · d9792edd
      Authored by Lai Jiangshan
      The earlier algorithm used an "expedited" flag combined with a "trycount"
      counter to differentiate between normal and expedited SRCU grace periods.
      However, the difference can be encoded into a single counter with a cutoff
      value and different initial values for expedited and normal SRCU grace
      periods.  This commit makes that change.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      
      Conflicts:
      
      	kernel/srcu.c
      d9792edd
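      A hedged sketch of the encoding (constant names, values, and the
      helper are assumed, not the exact kernel source): one trycount
      serves both cases, with the expedited path simply starting from a
      larger initial value.

          #define SYNCHRONIZE_SRCU_TRYCOUNT      2   /* assumed: normal GP */
          #define SYNCHRONIZE_SRCU_EXP_TRYCOUNT  12  /* assumed: expedited GP */

          static void wait_idx_sketch(struct srcu_struct *sp, int idx,
                                      int trycount)
          {
                  while (!srcu_readers_active_idx_check(sp, idx)) {
                          if (--trycount > 0)
                                  udelay(10);     /* still in budget: spin */
                          else
                                  schedule_timeout_interruptible(1);
                  }
          }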
    • rcu: Improve srcu_readers_active_idx()'s cache locality · dc879175
      Authored by Lai Jiangshan
      Inline the calls to srcu_readers_active_idx() made from
      srcu_readers_active().  This change improves cache locality by
      iterating over the CPUs once rather than twice.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      dc879175
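      A hedged sketch of the resulting single-pass summation (field names
      assumed from the neighboring commits, not the exact kernel source):

          static unsigned long srcu_readers_active_sketch(struct srcu_struct *sp)
          {
                  int cpu;
                  unsigned long sum = 0;

                  /* One pass over the CPUs, summing both counter sets,
                   * rather than one pass per index. */
                  for_each_possible_cpu(cpu) {
                          sum += ACCESS_ONCE(per_cpu_ptr(sp->per_cpu_ref, cpu)->c[0]);
                          sum += ACCESS_ONCE(per_cpu_ptr(sp->per_cpu_ref, cpu)->c[1]);
                  }
                  return sum;
          }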
    • rcu: Implement a variant of Peter's SRCU algorithm · b52ce066
      Authored by Lai Jiangshan
      This commit implements a variant of Peter's algorithm, which may be found
      at https://lkml.org/lkml/2012/2/1/119.
      
      o	Make the checking lock-free to enable parallel checking.
      	Parallel checking is required when (1) the original checking
      	task is preempted for a long time, (2) synchronize_srcu_expedited()
      	starts during an ongoing SRCU grace period, or (3) we wish to
      	avoid acquiring a lock.
      
      o	Since the checking is lock-free, we avoid a mutex in the state
      	machine for call_srcu().
      
      o	Remove the SRCU_REF_MASK and remove the coupling with the flipping.
      	This might allow us to remove the preempt_disable() in future
      	versions, though such removal will need great care because it
      	rescinds the one-old-reader-per-CPU guarantee.
      
      o	Remove a smp_mb(), simplify the comments and make the smp_mb() pairs
      	more intuitive.
      Inspired-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      b52ce066
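      A hedged sketch of the lock-free check in the first bullet (function
      and field names assumed, not the exact kernel source): sample a
      per-CPU sequence sum, verify the active counters have drained, then
      confirm the sequence sum is unchanged.

          static bool readers_gone_sketch(struct srcu_struct *sp, int idx)
          {
                  unsigned long seq = srcu_readers_seq_idx(sp, idx);

                  smp_mb();  /* order the seq sample before the scan */
                  if (srcu_readers_active_idx(sp, idx) != 0)
                          return false;  /* readers still present */
                  smp_mb();  /* order the scan before the recheck */
                  /* Unchanged seq: no srcu_read_lock() raced with the
                   * scan, so any checker can safely declare quiescence. */
                  return srcu_readers_seq_idx(sp, idx) == seq;
          }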
    • rcu: Improve SRCU's wait_idx() comments · 18108ebf
      Authored by Lai Jiangshan
      The safety of SRCU is provided by wait_idx() rather than by the
      flipping; the flipping actually serves to prevent starvation.
      
      This commit therefore updates the comments to more accurately and
      precisely describe what is going on.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      18108ebf
    • rcu: Flip ->completed only once per SRCU grace period · 944ce9af
      Authored by Lai Jiangshan
      This is an optimization of the SRCU grace period.  To guard against
      preempted readers with old values of the counter, it suffices to scan the
      old counters once more, then flip ->completed only one time.  The reason
      this works is that the old readers must have incremented the old set of
      counters (if they have not yet incremented, then their critical section
      starts after this grace period, so they may be safely ignored).
      
      This commit therefore optimizes the second flip out in favor of a simple
      rescan.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      944ce9af
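      In hedged pseudo-C (helper name assumed, not the exact kernel
      source), the grace-period core then has the shape:

          /* Rescan the counters retired by the previous grace period,
           * catching readers that fetched the old index but were
           * preempted before incrementing their counter. */
          wait_idx(sp, (sp->completed - 1) & 0x1);
          idx = sp->completed++ & 0x1;   /* flip ->completed only once */
          wait_idx(sp, idx);             /* drain readers on the old index */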
    • rcu: Increment upper bit only for srcu_read_lock() · 440253c1
      Authored by Lai Jiangshan
      The purpose of the upper bit of SRCU's per-CPU counters is to guarantee
      that no reasonable series of srcu_read_lock() and srcu_read_unlock()
      operations can return the value of the counter to its original value.
      This guarantee is required only after the index has been switched to
      the other set of counters, so at most one srcu_read_lock() can affect
      a given CPU's counter.  The number of srcu_read_unlock() operations
      on a given counter is limited to the number of tasks in the system,
      which given the Linux kernel's current structure is limited to far less
      than 2^30 on 32-bit systems and far less than 2^62 on 64-bit systems.
      (Something about a limited number of bytes in the kernel's address space.)
      
      Therefore, if srcu_read_lock() increments the upper bits, then
      srcu_read_unlock() need not do so.  In this case, an srcu_read_lock() and
      an srcu_read_unlock() will flip the lower bit of the upper field of the
      counter.  An unreasonably large additional number of srcu_read_unlock()
      operations would be required to return the counter to its initial value,
      thus preserving the guarantee.
      
      This commit takes this approach, which further allows it to shrink
      the size of the upper field to one bit, making the number of
      srcu_read_unlock() operations required to return the counter to its
      initial value even more unreasonable than before.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      440253c1
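      A hedged sketch of the resulting counter layout and operations
      (macro and field names assumed, not the exact kernel source):

          #define SRCU_USAGE_BITS   1     /* upper field shrunk to one bit */
          #define SRCU_REF_MASK     (ULONG_MAX >> SRCU_USAGE_BITS)
          #define SRCU_USAGE_COUNT  (SRCU_REF_MASK + 1)

          /* srcu_read_lock(): bump the count and toggle the upper bit. */
          ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->c[idx]) +=
                  SRCU_USAGE_COUNT + 1;

          /* srcu_read_unlock(): decrement the count only. */
          ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->c[idx]) -= 1;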
    • rcu: Remove fast check path from __synchronize_srcu() · 4b7a3e9e
      Authored by Lai Jiangshan
      The fastpath in __synchronize_srcu() is designed to handle cases where
      there are a large number of concurrent calls for the same srcu_struct
      structure.  However, the Linux kernel currently does not use SRCU in
      this manner, so remove the fastpath checks for simplicity.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      4b7a3e9e
    • rcu: Direct algorithmic SRCU implementation · cef50120
      Authored by Paul E. McKenney
      The current implementation of synchronize_srcu_expedited() can cause
      severe OS jitter due to its use of synchronize_sched(), which in turn
      invokes try_stop_cpus(), which causes each CPU to be sent an IPI.
      This can result in severe performance degradation for real-time workloads
      and especially for short-iteration-length HPC workloads.  Furthermore,
      because only one instance of try_stop_cpus() can be making forward progress
      at a given time, only one instance of synchronize_srcu_expedited() can
      make forward progress at a time, even if they are all operating on
      distinct srcu_struct structures.
      
      This commit, inspired by an earlier implementation by Peter Zijlstra
      (https://lkml.org/lkml/2012/1/31/211) and by further offline discussions,
      takes a strictly algorithmic bits-in-memory approach.  This has the
      disadvantage of requiring one explicit memory-barrier instruction in
      each of srcu_read_lock() and srcu_read_unlock(), but on the other hand
      completely dispenses with OS jitter and furthermore allows SRCU to be
      used freely by CPUs that RCU believes to be idle or offline.
      
      The update-side implementation handles the single read-side memory
      barrier by rechecking the per-CPU counters after summing them and
      by running through the update-side state machine twice.
      
      This implementation has passed moderate rcutorture testing on both
      x86 and Power.  Also updated to use this_cpu_ptr() instead of per_cpu_ptr(),
      as suggested by Peter Zijlstra.
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      cef50120
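      A simplified, hedged sketch of the read side this commit describes
      (not the exact kernel source); the one added cost is the explicit
      full memory barrier:

          static int srcu_read_lock_sketch(struct srcu_struct *sp)
          {
                  int idx;

                  preempt_disable();
                  idx = ACCESS_ONCE(sp->completed) & 0x1;
                  ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->c[idx]) += 1;
                  smp_mb();  /* keep the critical section from leaking
                              * out past the counter increment */
                  preempt_enable();
                  return idx;
          }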
  5. 22 Feb, 2012 2 commits
  6. 31 Oct, 2011 1 commit
  7. 14 Jan, 2011 1 commit
  8. 30 Nov, 2010 1 commit
  9. 24 Sep, 2010 1 commit
  10. 30 Mar, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Authored by Tejun Heo
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h, and thus ends up being
      included when building most .c files.  percpu.h includes slab.h, which
      in turn includes gfp.h, making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      The percpu.h -> slab.h dependency is about to be removed.  Prepare
      for this change by updating users of gfp and slab facilities to
      include those headers directly instead of assuming their
      availability.  As this conversion needs to touch a large number of
      source files, the following script was used as the basis of the
      conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there, i.e. gfp.h if only gfp is
        used and slab.h if slab is used.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order
        conforms to its surroundings.  It is put in the include block that
        contains core kernel includes, in the same order as the rest
        (alphabetical, Christmas tree, or reverse-Christmas-tree), or at
        the end if there doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have a fitting include block), it prints
        out an error message indicating which .h file needs to be added
        to the file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some files didn't need the
         inclusion, some needed manual addition, and for others it was more
         appropriate to add the include to an implementation .h or an
         embedding .c file.  This step added inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from the tests in
      step 6, I'm fairly confident about the coverage of this conversion
      patch.  If there is a breakage, it's likely to be something in one of
      the arch headers, which should be easily discoverable on most builds
      of the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
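      The per-file change itself is trivial; a representative hedged
      example (the file and symbols are hypothetical):

          /* foo.c called kmalloc()/kfree() but included only percpu.h,
           * relying on the implicit percpu.h -> slab.h -> gfp.h chain. */
          #include <linux/percpu.h>
          #include <linux/slab.h>    /* added: kmalloc(), kfree() */
          #include <linux/gfp.h>     /* added where gfp flags are used alone */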
  11. 25 Feb, 2010 1 commit
    • rcu: Introduce lockdep-based checking to RCU read-side primitives · 632ee200
      Authored by Paul E. McKenney
      Inspection is proving insufficient to catch all RCU misuses,
      which is understandable given that rcu_dereference() might be
      protected by any of four different flavors of RCU (RCU, RCU-bh,
      RCU-sched, and SRCU), and might also/instead be protected by any
      of a number of locking primitives. It is therefore time to
      enlist the aid of lockdep.
      
      This set of patches is inspired by earlier work by Peter
      Zijlstra and Thomas Gleixner, and takes the following approach:
      
      o	Set up separate lockdep classes for RCU, RCU-bh, and RCU-sched.
      
      o	Set up separate lockdep classes for each instance of SRCU.
      
      o	Create primitives that check for being in an RCU read-side
      	critical section.  These return exact answers if lockdep is
      	fully enabled, but if unsure, report being in an RCU read-side
      	critical section.  (We want to avoid false positives!)
      	The primitives are:
      
      	For RCU: rcu_read_lock_held(void)
      
      	For RCU-bh: rcu_read_lock_bh_held(void)
      
      	For RCU-sched: rcu_read_lock_sched_held(void)
      
      	For SRCU: srcu_read_lock_held(struct srcu_struct *sp)
      
      o	Add rcu_dereference_check(), which takes a second argument
      	in which one places a boolean expression based on the above
      	primitives and/or lockdep_is_held().
      
      o	A new kernel configuration parameter, CONFIG_PROVE_RCU, enables
      	rcu_dereference_check().  This depends on CONFIG_PROVE_LOCKING,
      	and should be quite helpful during the transition period while
      	CONFIG_PROVE_RCU-unaware patches are in flight.
      
      The existing rcu_dereference() primitive does no checking, but
      upcoming patches will change that.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1266887105-1528-1-git-send-email-paulmck@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      632ee200
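      A usage example of the new checking primitive (the pointer gp and
      the lock gp_lock are hypothetical):

          /* Legal when called under rcu_read_lock() or with gp_lock held;
           * with CONFIG_PROVE_RCU, any other context is flagged. */
          p = rcu_dereference_check(gp,
                                    rcu_read_lock_held() ||
                                    lockdep_is_held(&gp_lock));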
  12. 16 Jan, 2010 1 commit
    • rcu: Fix sparse warnings · 017c4261
      Authored by Paul E. McKenney
      Rename local variable "i" in rcu_init() to avoid conflict with
      RCU_INIT_FLAVOR(), restrict the scope of RCU_TREE_NONCORE, and
      make __synchronize_srcu() static.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12635142581560-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      017c4261
  13. 26 Oct, 2009 1 commit
  14. 07 Feb, 2008 1 commit
  15. 04 Oct, 2006 2 commits
    • [PATCH] SRCU: report out-of-memory errors · e6a92013
      Authored by Alan Stern
      Currently the init_srcu_struct() routine has no way to report out-of-memory
      errors.  This patch (as761) makes it return -ENOMEM when the per-cpu data
      allocation fails.
      
      The patch also makes srcu_init_notifier_head() report a BUG if a notifier
      head can't be initialized.  Perhaps it should return -ENOMEM instead, but
      in the most likely cases where this might occur I don't think any recovery
      is possible.  Notifier chains generally are not created dynamically.
      
      [akpm@osdl.org: avoid statement-with-side-effect in macro]
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      e6a92013
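      With this change callers can check the result; a hedged sketch
      (my_srcu and the init function are hypothetical):

          static struct srcu_struct my_srcu;

          static int __init my_module_init(void)
          {
                  int err = init_srcu_struct(&my_srcu);

                  if (err)        /* -ENOMEM: per-CPU allocation failed */
                          return err;
                  /* ... */
                  return 0;
          }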
    • [PATCH] srcu-3: RCU variant permitting read-side blocking · 621934ee
      Authored by Paul E. McKenney
      Updated patch adding a variant of RCU that permits sleeping in read-side
      critical sections.  SRCU is as follows:
      
      o	Each use of SRCU creates its own srcu_struct, and each
      	srcu_struct has its own set of grace periods.  This is
      	critical, as it prevents one subsystem with a blocking
      	reader from holding up SRCU grace periods for other
      	subsystems.
      
      o	The SRCU primitives (srcu_read_lock(), srcu_read_unlock(),
      	and synchronize_srcu()) all take a pointer to a srcu_struct.
      
      o	The SRCU primitives must be called from process context.
      
      o	srcu_read_lock() returns an int that must be passed to
      	the matching srcu_read_unlock().  Realtime RCU avoids the
      	need for this by storing the state in the task struct,
      	but SRCU needs to allow a given code path to pass through
      	multiple SRCU domains -- storing state in the task struct
      	would therefore require either arbitrary space in the
      	task struct or arbitrary limits on SRCU nesting.  So I
      	kicked the state-storage problem up to the caller.
      
      	Of course, it is not permitted to call synchronize_srcu()
      	while in an SRCU read-side critical section.
      
      o	There is no call_srcu().  It would not be hard to implement
      	one, but it seems like too easy a way to OOM the system.
      	(Hey, we have enough trouble with call_rcu(), which does
      	-not- permit readers to sleep!!!)  So, if you want it,
      	please tell me why...
      
      [josht@us.ibm.com: sparse notation]
      Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
      Signed-off-by: Josh Triplett <josh@freedesktop.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      621934ee
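      The read-side pattern implied by the points above, as a short sketch
      (my_srcu is a hypothetical domain):

          int idx;

          idx = srcu_read_lock(&my_srcu);
          /* Read-side critical section: sleeping is permitted here, but
           * calling synchronize_srcu(&my_srcu) is not. */
          srcu_read_unlock(&my_srcu, idx);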