1. 22 Feb, 2012 3 commits
    • rcu: Note that rcu_access_pointer() can be used for teardown · 5e1ee6e1
      Committed by Paul E. McKenney
      There is no convenient expression for rcu_dereference_protected()
      when it is used in tearing down multilinked structures following
      a grace period.  For example, suppose that an element containing an
      RCU-protected pointer to a second element is removed from an enclosing
      RCU-protected data structure, then the write-side lock is released,
      and finally synchronize_rcu() is invoked to wait for a grace period.
      Then it is necessary to traverse the pointer in order to free up the
      second element.  But we are not in an RCU read-side critical section
      and we are holding no locks, so the usual rcu_dereference_check() and
      rcu_dereference_protected() primitives are not appropriate.  Neither
      is rcu_dereference_raw(), as it is intended for use in data structures
      where the user defines the locking design (for example, list_head).
      
      So this responsibility is added to rcu_access_pointer()'s list, and
      this commit updates rcu_access_pointer()'s header comment accordingly.
      Suggested-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: David Howells <dhowells@redhat.com>
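The teardown scenario described in this commit message can be sketched in userspace C. This is only an illustration: the `rcu_access_pointer()` macro and `synchronize_rcu()` function below are trivial stand-ins defined for the sketch, not the kernel definitions, and `struct outer`, `struct inner`, `teardown_outer()`, and the `freed` counter are hypothetical names.

```c
#include <stdlib.h>

/* Userspace stand-ins for illustration only, NOT the kernel definitions. */
#define rcu_access_pointer(p) (p)
static void synchronize_rcu(void) { /* pretend a full grace period elapsed */ }

struct inner { int payload; };
struct outer { struct inner *in; /* would be annotated __rcu in kernel code */ };

static int freed; /* hypothetical counter so the sketch can be checked */

/*
 * Called after 'o' has been removed from its enclosing structure and the
 * update-side lock has been released.
 */
static void teardown_outer(struct outer *o)
{
	struct inner *in;

	synchronize_rcu();              /* readers can no longer reach o or o->in */
	in = rcu_access_pointer(o->in); /* no lock held, no read-side section */
	free(in);
	freed++;
	free(o);
	freed++;
}
```

The point of the sketch is the fetch of `o->in`: at that moment the caller holds no lock and is not a reader, which is the case the commit assigns to rcu_access_pointer().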
    • rcu: Make rcu_sleep_check() also check rcu_lock_map · 50406b98
      Committed by Paul E. McKenney
      Although it is OK to be preempted in an RCU read-side critical section
      for TREE_PREEMPT_RCU, it is definitely not OK to be preempted, block,
      or might_sleep() within an RCU read-side critical section for TREE_RCU.
      Unfortunately, rcu_sleep_check() currently only checks for RCU-bh and
      RCU-sched read-side critical sections.  This commit therefore makes
      rcu_sleep_check() check for RCU read-side critical sections, but only
      in TREE_RCU builds.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Avoid waking up CPUs having only kfree_rcu() callbacks · 486e2593
      Committed by Paul E. McKenney
      When CONFIG_RCU_FAST_NO_HZ is enabled, RCU will allow a given CPU to
      enter dyntick-idle mode even if it still has RCU callbacks queued.
      RCU avoids system hangs in this case by scheduling a timer for several
      jiffies in the future.  However, if all of the callbacks on that CPU
      are from kfree_rcu(), there is no reason to wake the CPU up, as it is
      not a problem to defer freeing of memory.
      
      This commit therefore tracks the number of callbacks on a given CPU
      that are from kfree_rcu(), and avoids scheduling the timer if all of
      a given CPU's callbacks are from kfree_rcu().
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
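The bookkeeping described in this commit message might look roughly like the following userspace sketch. The structure, field, and function names are assumptions made for illustration and are not taken from the kernel source: one count covers every queued callback, a second covers those that only free memory, and the wakeup timer is needed only when the two differ.

```c
#include <stdbool.h>

/* Hypothetical per-CPU callback accounting. */
struct cpu_callbacks {
	long qlen;      /* all queued callbacks */
	long qlen_lazy; /* subset that only frees memory, kfree_rcu()-style */
};

/* Queue one callback; 'lazy' marks a callback that merely frees memory. */
static void enqueue_callback(struct cpu_callbacks *c, bool lazy)
{
	c->qlen++;
	if (lazy)
		c->qlen_lazy++;
}

/* A timer is needed only if at least one callback is not lazy. */
static bool needs_wakeup_timer(const struct cpu_callbacks *c)
{
	return c->qlen != c->qlen_lazy;
}
```

Comparing the two counters keeps the check constant-time, so the idle-entry path never has to walk the callback list.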
  2. 12 Dec, 2011 6 commits
    • rcu: Document same-context read-side constraints · 3842a083
      Committed by Paul E. McKenney
      The intent is that a given RCU read-side critical section be confined
      to a single context.  For example, it is illegal to invoke rcu_read_lock()
      in an exception handler and then invoke rcu_read_unlock() from the
      context of the task that received the exception.
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Remove one layer of abstraction from PROVE_RCU checking · d8ab29f8
      Committed by Paul E. McKenney
      Simplify things a bit by substituting the definitions of the single-line
      rcu_read_acquire(), rcu_read_release(), rcu_read_acquire_bh(),
      rcu_read_release_bh(), rcu_read_acquire_sched(), and
      rcu_read_release_sched() functions at their call points.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Warn when rcu_read_lock() is used in extended quiescent state · 00f49e57
      Committed by Frederic Weisbecker
      We are currently able to detect uses of rcu_dereference_check() inside
      extended quiescent states (such as the RCU-free window in idle).
      But rcu_read_lock() and friends can be used without rcu_dereference(),
      so that the earlier commit checking for use of rcu_dereference() and
      friends while in RCU idle mode misses some error conditions.  This commit
      therefore adds extended quiescent state checking to rcu_read_lock() and
      friends.
      
      Uses of RCU from within RCU-idle mode are totally ignored by
      RCU, hence the importance of these checks.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Detect illegal rcu dereference in extended quiescent state · e6b80a3b
      Committed by Frederic Weisbecker
      Report that none of the rcu read lock maps are held while in an RCU
      extended quiescent state (the section between rcu_idle_enter()
      and rcu_idle_exit()). This helps detect any use of rcu_dereference()
      and friends from within the section in idle where RCU is not allowed.
      
      This way we can guarantee an extended quiescent window where the CPU
      can be put in dyntick idle mode or can simply avoid being part of any
      global grace period completion while in the idle loop.
      
      Uses of RCU from such mode are totally ignored by RCU, hence the
      importance of these checks.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Add failure tracing to rcutorture · 91afaf30
      Committed by Paul E. McKenney
      Trace the rcutorture RCU accesses and dump the trace buffer when the
      first failure is detected.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Track idleness independent of idle tasks · 9b2e4f18
      Committed by Paul E. McKenney
      Earlier versions of RCU used the scheduling-clock tick to detect idleness
      by checking for the idle task, but handled idleness differently for
      CONFIG_NO_HZ=y.  But there are now a number of uses of RCU read-side
      critical sections in the idle task, for example, for tracing.  A more
      fine-grained detection of idleness is therefore required.
      
      This commit presses the old dyntick-idle code into full-time service,
      so that rcu_idle_enter(), previously known as rcu_enter_nohz(), is
      always invoked at the beginning of an idle loop iteration.  Similarly,
      rcu_idle_exit(), previously known as rcu_exit_nohz(), is always invoked
      at the end of an idle-loop iteration.  This allows the idle task to
      use RCU everywhere except between consecutive rcu_idle_enter() and
      rcu_idle_exit() calls, in turn allowing architecture maintainers to
      specify exactly where in the idle loop RCU may be used.
      
      Because some of the userspace upcall uses can result in what looks
      to RCU like half of an interrupt, it is not possible to expect that
      the irq_enter() and irq_exit() hooks will give exact counts.  This
      patch therefore expands the ->dynticks_nesting counter to 64 bits
      and uses two separate bitfields to count process/idle transitions
      and interrupt entry/exit transitions.  It is presumed that userspace
      upcalls do not happen in the idle loop or from usermode execution
      (though usermode might do a system call that results in an upcall).
      The counter is hard-reset on each process/idle transition, which
      prevents interrupt entry/exit errors from accumulating.  Overflow
      is avoided by the 64-bitness of the ->dynticks_nesting counter.
      
      This commit also adds warnings if a non-idle task asks RCU to enter
      idle state (these checks will need some adjustment before applying
      Frederic's OS-jitter patches, http://lkml.org/lkml/2011/10/7/246).
      In addition, validation of ->dynticks and ->dynticks_nesting is added.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
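The two-field nesting counter with a hard reset can be illustrated with a small userspace sketch. The field split, the `SKETCH_TASK_ONE` constant, and all function names are assumptions chosen for the illustration and are not the kernel's actual values: the high part of a 64-bit counter records process-level execution, the low part counts interrupt nesting, and every process/idle transition overwrites the whole counter so that an unbalanced interrupt count cannot carry over.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical split: bit 56 and above mean "running at process level". */
#define SKETCH_TASK_ONE (INT64_C(1) << 56)

struct sketch_dynticks {
	int64_t nesting; /* 0 means idle from RCU's point of view */
};

/* Process/idle transitions hard-reset the counter. */
static void sketch_idle_enter(struct sketch_dynticks *d) { d->nesting = 0; }
static void sketch_idle_exit(struct sketch_dynticks *d)  { d->nesting = SKETCH_TASK_ONE; }

/* Interrupt transitions only adjust the low-order field. */
static void sketch_irq_enter(struct sketch_dynticks *d) { d->nesting++; }
static void sketch_irq_exit(struct sketch_dynticks *d)  { d->nesting--; }

static bool sketch_is_idle(const struct sketch_dynticks *d)
{
	return d->nesting == 0;
}
```

In this sketch, an interrupt entry with no matching exit (the "half of an interrupt" case) leaves the low field off by one, but the next call to `sketch_idle_enter()` erases the error.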
  3. 29 Sep, 2011 8 commits
  4. 10 Jun, 2011 1 commit
  5. 06 May, 2011 3 commits
  6. 01 Apr, 2011 1 commit
  7. 18 Dec, 2010 1 commit
    • rcu: increase synchronize_sched_expedited() batching · e27fc964
      Committed by Tejun Heo
      The fix in commit #6a0cc49 requires more than three concurrent instances
      of synchronize_sched_expedited() before batching is possible.  This
      patch uses a ticket-counter-like approach that is also not unrelated to
      Lai Jiangshan's Ring RCU to allow sharing of expedited grace periods even
      when there are only two concurrent instances of synchronize_sched_expedited().
      
      This commit builds on Tejun's original posting, which may be found at
      http://lkml.org/lkml/2010/11/9/204, adding memory barriers, avoiding
      overflow of signed integers (other than via atomic_t), and fixing the
      detection of batching.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
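A ticket-counter-like batching scheme of the general kind mentioned here can be sketched with C11 atomics. This is not the kernel implementation: the counter and function names are hypothetical, and the expedited grace period itself is left out. Each caller takes a ticket; whoever actually runs a grace period first snapshots the latest ticket handed out, then publishes that snapshot as done, so every caller whose ticket is at or below the snapshot can return without doing its own work.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_uint tickets_started; /* last ticket handed out */
static atomic_uint tickets_done;    /* highest ticket known to be satisfied */

/* Wrap-safe "a >= b" for unsigned ticket values. */
static bool ticket_ge(unsigned a, unsigned b)
{
	return (a - b) <= UINT_MAX / 2;
}

/* A caller announces its request and remembers its ticket. */
static unsigned take_ticket(void)
{
	return atomic_fetch_add(&tickets_started, 1) + 1;
}

/* True if some other caller's grace period already covers this ticket. */
static bool already_covered(unsigned ticket)
{
	return ticket_ge(atomic_load(&tickets_done), ticket);
}

/* After a grace period that began when 'snap' was read, record it. */
static void publish_done(unsigned snap)
{
	unsigned cur = atomic_load(&tickets_done);

	/* Only ever move tickets_done forward. */
	while (!ticket_ge(cur, snap) &&
	       !atomic_compare_exchange_weak(&tickets_done, &cur, snap))
		;
}
```

With this arrangement two concurrent callers are enough for sharing: if the first caller snapshots `tickets_started` after the second has taken its ticket, the second finds `already_covered()` true once the first publishes.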
  8. 30 Nov, 2010 1 commit
  9. 18 Nov, 2010 1 commit
    • rcu: move TINY_RCU from softirq to kthread · b2c0710c
      Committed by Paul E. McKenney
      If RCU priority boosting is to be meaningful, callback invocation must
      be boosted in addition to preempted RCU readers.  Otherwise, in the presence
      of CPU real-time threads, the grace period ends, but the callbacks don't
      get invoked.  If the callbacks don't get invoked, the associated memory
      doesn't get freed, so the system is still subject to OOM.
      
      But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit
      moves the callback invocations to a kthread, which can be boosted easily.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  10. 06 Oct, 2010 1 commit
  11. 24 Sep, 2010 1 commit
    • rcu: only one evaluation of arg in rcu_dereference_check() unless sparse · 53ecfba2
      Committed by Paul E. McKenney
      The current version of the __rcu_access_pointer(), __rcu_dereference_check(),
      and __rcu_dereference_protected() macros evaluate their "p" argument
      three times, not counting typeof()s.  This is bad news if that argument
      contains a side effect.  This commit therefore evaluates this argument
      only once in normal kernel builds.  However, the straightforward approach
      defeats sparse's RCU-pointer checking, so when __CHECKER__ is defined,
      the additional pair of evaluations of the "p" argument is performed in
      order to permit sparse to detect misuse of RCU-protected pointers.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
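The hazard of evaluating a macro argument more than once, and the usual single-evaluation fix, can be shown with a small sketch. The macros below are simplified illustrations, not the kernel's `__rcu_dereference_check()` family; `FETCH_THRICE`, `FETCH_ONCE`, `slots`, `cursor`, and `next_slot()` are hypothetical names. The fix relies on GNU C statement expressions and `__typeof__`, so it assumes GCC or Clang.

```c
/* Evaluates its argument three times, like the macros being fixed. */
#define FETCH_THRICE(p) ((void)(p), (void)(p), (p))

/* Evaluates its argument exactly once by capturing it in a local. */
#define FETCH_ONCE(p) ({ __typeof__(p) _p1 = (p); _p1; })

static int slots[4] = {10, 20, 30, 40};
static int cursor;

/* An argument expression with a side effect: each call advances cursor. */
static int *next_slot(void)
{
	return &slots[cursor++];
}
```

Passing `next_slot()` to `FETCH_THRICE` advances `cursor` three times and returns the third slot, while `FETCH_ONCE` advances it once and returns the first, which is what a caller would expect.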
  12. 23 Sep, 2010 1 commit
  13. 21 Aug, 2010 3 commits
  14. 20 Aug, 2010 5 commits
    • rcu: Add a TINY_PREEMPT_RCU · a57eb940
      Committed by Paul E. McKenney
      Implement a small-memory-footprint uniprocessor-only implementation of
      preemptible RCU.  This implementation uses but a single blocked-tasks
      list rather than the combinatorial number used per leaf rcu_node by
      TREE_PREEMPT_RCU, which reduces memory consumption and greatly simplifies
      processing.  This version also takes advantage of uniprocessor execution
      to accelerate grace periods in the case where there are no readers.
      
      The general design is otherwise broadly similar to that of TREE_PREEMPT_RCU.
      
      This implementation is a step towards having RCU implementation driven
      off of the SMP and PREEMPT kernel configuration variables, which can
      happen once this implementation has accumulated sufficient experience.
      
      Removed ACCESS_ONCE() from __rcu_read_unlock() and added barrier() as
      suggested by Steve Rostedt in order to avoid the compiler-reordering
      issue noted by Mathieu Desnoyers (http://lkml.org/lkml/2010/8/16/183).
      
      As can be seen below, CONFIG_TINY_PREEMPT_RCU represents almost 5Kbyte
      savings compared to CONFIG_TREE_PREEMPT_RCU.  Of course, for non-real-time
      workloads, CONFIG_TINY_RCU is even better.
      
      	CONFIG_TREE_PREEMPT_RCU
      
      	   text	   data	    bss	    dec	   filename
      	     13	      0	      0	     13	   kernel/rcupdate.o
      	   6170	    825	     28	   7023	   kernel/rcutree.o
      				   ----
      				   7036    Total
      
      	CONFIG_TINY_PREEMPT_RCU
      
      	   text	   data	    bss	    dec	   filename
      	     13	      0	      0	     13	   kernel/rcupdate.o
      	   2081	     81	      8	   2170	   kernel/rcutiny.o
      				   ----
      				   2183    Total
      
      	CONFIG_TINY_RCU (non-preemptible)
      
      	   text	   data	    bss	    dec	   filename
      	     13	      0	      0	     13	   kernel/rcupdate.o
      	    719	     25	      0	    744	   kernel/rcutiny.o
      				    ---
      				    757    Total
      Requested-by: Loïc Minier <loic.minier@canonical.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu head remove init · 5e8067ad
      Committed by Mathieu Desnoyers
      RCU heads really don't need to be initialized. Their state before call_rcu()
      really does not matter.
      
      We need to keep init/destroy_rcu_head_on_stack() though, since we want
      debugobjects to be able to keep track of these objects.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      CC: David S. Miller <davem@davemloft.net>
      CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      CC: akpm@linux-foundation.org
      CC: mingo@elte.hu
      CC: laijs@cn.fujitsu.com
      CC: dipankar@in.ibm.com
      CC: josh@joshtriplett.org
      CC: dvhltc@us.ibm.com
      CC: niv@us.ibm.com
      CC: tglx@linutronix.de
      CC: peterz@infradead.org
      CC: rostedt@goodmis.org
      CC: Valdis.Kletnieks@vt.edu
      CC: dhowells@redhat.com
      CC: eric.dumazet@gmail.com
      CC: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: improve kerneldoc for rcu_read_lock(), call_rcu(), and synchronize_rcu() · 77d8485a
      Committed by Paul E. McKenney
      Make it explicit that new RCU read-side critical sections that start
      after call_rcu() and synchronize_rcu() start might still be running
      after the end of the relevant grace period.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • Add RCU check for find_task_by_vpid(). · 4221a991
      Committed by Tetsuo Handa
      find_task_by_vpid() says "Must be called under rcu_read_lock().". But due to
      commit 3120438a "rcu: Disable lockdep checking in RCU list-traversal primitives",
      we are currently unable to catch "find_task_by_vpid() with tasklist_lock held
      but RCU lock not held" errors due to the RCU-lockdep checks being
      suppressed in the RCU variants of the struct list_head traversals.
      This commit therefore places an explicit check for being in an RCU
      read-side critical section in find_task_by_pid_ns().
      
        ===================================================
        [ INFO: suspicious rcu_dereference_check() usage. ]
        ---------------------------------------------------
        kernel/pid.c:386 invoked rcu_dereference_check() without protection!
      
        other info that might help us debug this:
      
        rcu_scheduler_active = 1, debug_locks = 1
        1 lock held by rc.sysinit/1102:
         #0:  (tasklist_lock){.+.+..}, at: [<c1048340>] sys_setpgid+0x40/0x160
      
        stack backtrace:
        Pid: 1102, comm: rc.sysinit Not tainted 2.6.35-rc3-dirty #1
        Call Trace:
         [<c105e714>] lockdep_rcu_dereference+0x94/0xb0
         [<c104b4cd>] find_task_by_pid_ns+0x6d/0x70
         [<c104b4e8>] find_task_by_vpid+0x18/0x20
         [<c1048347>] sys_setpgid+0x47/0x160
         [<c1002b50>] sysenter_do_call+0x12/0x36
      
      Commit updated to use a new rcu_lockdep_assert() exported API rather than
      the old internal __do_rcu_dereference().
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: define __rcu address space modifier for sparse · ca5ecddf
      Committed by Paul E. McKenney
      This commit provides definitions for the __rcu annotation defined earlier.
      This annotation permits sparse to check for correct use of RCU-protected
      pointers.  If a pointer that is annotated with __rcu is accessed
      directly (as opposed to via rcu_dereference(), rcu_assign_pointer(),
      or one of their variants), sparse can be made to complain.  To enable
      such complaints, use the new default-disabled CONFIG_SPARSE_RCU_POINTER
      kernel configuration option.  Please note that these sparse complaints are
      intended to be a debugging aid, -not- a code-style-enforcement mechanism.
      
      There are special rcu_dereference_protected() and rcu_access_pointer()
      accessors for use when RCU read-side protection is not required, for
      example, when no other CPU has access to the data structure in question
      or while the current CPU holds the update-side lock.
      
      This patch also updates a number of docbook comments that were showing
      their age.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Christopher Li <sparse@chrisli.org>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
  15. 15 Jun, 2010 2 commits
    • rcu: add an rcu_dereference_index_check() · f5155b33
      Committed by Paul E. McKenney
      The sparse RCU-pointer checking relies on type magic that dereferences
      the pointer in question.  This does not work if the pointer is in fact
      an array index.  This commit therefore supplies a new RCU API that
      omits the sparse checking to continue to support rcu_dereference()
      on integers.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • tree/tiny rcu: Add debug RCU head objects · 551d55a9
      Committed by Mathieu Desnoyers
      Helps find racy users of call_rcu(), which result in hangs because list
      entries are overwritten and/or skipped.
      
      Changelog since v4:
      - Bisectability is now OK
      - Now generate a WARN_ON_ONCE() for non-initialized rcu_head passed to
        call_rcu(). Statically initialized objects are detected with
        object_is_static().
      - Rename rcu_head_init_on_stack to init_rcu_head_on_stack.
      - Remove init_rcu_head() completely.
      
      Changelog since v3:
      - Include comments from Lai Jiangshan
      
      This new patch version is based on the debugobjects with the newly introduced
      "active state" tracker.
      
      Non-initialized entries are all considered as "statically initialized". An
      activation fixup (triggered by call_rcu()) takes care of performing the debug
      object initialization without issuing any warning. Since we cannot increase the
      size of struct rcu_head, I don't see much room to put an identifier for
      statically initialized rcu_head structures. So for now, we have to live without
      "activation without explicit init" detection. But the main purpose of this debug
      option is to detect double-activations (double call_rcu() use of a rcu_head
      before the callback is executed), which is correctly addressed here.
      
      This also detects potential internal RCU callback corruption, which would cause
      the callbacks to be executed twice.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      CC: David S. Miller <davem@davemloft.net>
      CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      CC: akpm@linux-foundation.org
      CC: mingo@elte.hu
      CC: laijs@cn.fujitsu.com
      CC: dipankar@in.ibm.com
      CC: josh@joshtriplett.org
      CC: dvhltc@us.ibm.com
      CC: niv@us.ibm.com
      CC: tglx@linutronix.de
      CC: peterz@infradead.org
      CC: rostedt@goodmis.org
      CC: Valdis.Kletnieks@vt.edu
      CC: dhowells@redhat.com
      CC: eric.dumazet@gmail.com
      CC: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
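The double-activation check that this commit message describes can be approximated in userspace by a two-state tracker. The names below are hypothetical and the sketch does not use the kernel's debugobjects infrastructure; it only shows the state logic. A zero-filled head counts as ready, mirroring the statement that non-initialized entries are treated as statically initialized, and queuing a head that is already queued is reported as an error.

```c
/* Hypothetical per-head tracking state; zero-initialized means READY. */
enum head_state { HEAD_READY = 0, HEAD_QUEUED };

struct tracked_head {
	enum head_state state;
};

/*
 * Model of handing a head to call_rcu(): returns 0 on success, or -1 if
 * the head is still queued from an earlier call (double activation).
 */
static int queue_head(struct tracked_head *h)
{
	if (h->state == HEAD_QUEUED)
		return -1;
	h->state = HEAD_QUEUED;
	return 0;
}

/* Model of the callback being invoked: the head may be reused afterwards. */
static void invoke_head(struct tracked_head *h)
{
	h->state = HEAD_READY;
}
```

Because an uninitialized head is indistinguishable from a ready one in this model, it cannot flag "activation without explicit init", which matches the limitation the commit message acknowledges.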
  16. 11 May, 2010 2 commits
    • rcu head introduce rcu head init on stack · 4376030a
      Committed by Mathieu Desnoyers
      PEM:
      o     Would it be possible to make this bisectable as follows?
      
            a.      Insert a new patch after current patch 4/6 that
                    defines destroy_rcu_head_on_stack(),
                    init_rcu_head_on_stack(), and init_rcu_head() with
                    their !CONFIG_DEBUG_OBJECTS_RCU_HEAD definitions.
      
      This patch performs this transition.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      CC: David S. Miller <davem@davemloft.net>
      CC: akpm@linux-foundation.org
      CC: mingo@elte.hu
      CC: laijs@cn.fujitsu.com
      CC: dipankar@in.ibm.com
      CC: josh@joshtriplett.org
      CC: dvhltc@us.ibm.com
      CC: niv@us.ibm.com
      CC: tglx@linutronix.de
      CC: peterz@infradead.org
      CC: rostedt@goodmis.org
      CC: Valdis.Kletnieks@vt.edu
      CC: dhowells@redhat.com
      CC: eric.dumazet@gmail.com
      CC: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: slim down rcutiny by removing rcu_scheduler_active and friends · bbad9379
      Committed by Paul E. McKenney
      TINY_RCU does not need rcu_scheduler_active unless CONFIG_DEBUG_LOCK_ALLOC.
      So conditionally compile rcu_scheduler_active in order to slim down
      rcutiny a bit more.  Also gets rid of an EXPORT_SYMBOL_GPL, which is
      responsible for most of the slimming.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>