1. 03 April 2009, 2 commits
    • kmemtrace, rcu: don't include unnecessary headers, allow kmemtrace w/ tracepoints · ac44021f
      Committed by Eduard - Gabriel Munteanu
      Impact: cleanup
      
      linux/percpu.h includes linux/slab.h, which generates circular inclusion
      dependencies when trying to switch kmemtrace to use tracepoints instead
      of markers.
      
      This patch allows tracing within slab headers' inline functions.
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: paulmck@linux.vnet.ibm.com
      LKML-Reference: <1237898630.25315.83.camel@penberg-laptop>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • kmemtrace, rcu: fix linux/rcutree.h and linux/rcuclassic.h dependencies · b1f77b05
      Committed by Ingo Molnar
      Impact: build fix for all non-x86 architectures
      
      We want to remove percpu.h from rcuclassic.h/rcutree.h (for upcoming
      kmemtrace changes) but that would break the DECLARE_PER_CPU based
      declarations in these files.
      
      Move the quiescent counter management functions to their respective
      RCU implementation .c files - they were slightly above the inlining
      limit anyway.
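
      As a rough sketch of the shape of that move (the function and field
      names below follow the classic-RCU code of that era, but this is
      illustrative rather than quoted from the patch):

      /* Before: an inline in linux/rcuclassic.h, which forces the header
       * to pull in percpu.h for per_cpu(). */
      static inline void rcu_qsctr_inc(int cpu)
      {
      	struct rcu_data *rdp = &per_cpu(rcu_data, cpu);
      	rdp->passed_quiesc = 1;
      }

      /* After: the header carries only a declaration ... */
      extern void rcu_qsctr_inc(int cpu);

      /* ... and kernel/rcuclassic.c holds the out-of-line definition, so
       * the DECLARE_PER_CPU/DEFINE_PER_CPU machinery stays out of the
       * public header. */
      void rcu_qsctr_inc(int cpu)
      {
      	struct rcu_data *rdp = &per_cpu(rcu_data, cpu);
      	rdp->passed_quiesc = 1;
      }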
      
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: paulmck@linux.vnet.ibm.com
      LKML-Reference: <1237898630.25315.83.camel@penberg-laptop>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 26 February 2009, 1 commit
    • rcu: Teach RCU that idle task is not quiescent state at boot · a6826048
      Committed by Paul E. McKenney
      This patch fixes a bug located by Vegard Nossum with the aid of
      kmemcheck, updated based on review comments from Nick Piggin,
      Ingo Molnar, and Andrew Morton.  And cleans up the variable-name
      and function-name language.  ;-)
      
      The boot CPU runs in the context of its idle thread during boot-up.
      During this time, idle_cpu(0) will always return nonzero, which will
      fool Classic and Hierarchical RCU into deciding that a large chunk of
      the boot-up sequence is a big long quiescent state.  This in turn causes
      RCU to prematurely end grace periods during this time.
      
      This patch changes the rcutree.c and rcuclassic.c rcu_check_callbacks()
      function to ignore the idle task as a quiescent state until the
      system has started up the scheduler in rest_init(), introducing a
      new non-API function rcu_idle_now_means_idle() to inform RCU of this
      transition.  RCU maintains an internal rcu_idle_cpu_truthful variable
      to track this state, which is then used by rcu_check_callbacks() to
      determine if it should believe idle_cpu().
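
      A hedged sketch of that check, reusing the helper and variable names
      given in this changelog (everything else here is illustrative, not
      the exact patch):

      static int rcu_idle_cpu_truthful __read_mostly;	/* 0 until rest_init() */

      void rcu_idle_now_means_idle(void)	/* called once the scheduler runs */
      {
      	rcu_idle_cpu_truthful = 1;
      }

      void rcu_check_callbacks(int cpu, int user)
      {
      	if (user ||
      	    (rcu_idle_cpu_truthful && idle_cpu(cpu) &&
      	     !in_softirq() && hardirq_count() <= (1 << HARDIRQ_SHIFT))) {
      		/* Only a post-boot idle loop counts as a quiescent state. */
      		rcu_qsctr_inc(cpu);
      	}
      	raise_softirq(RCU_SOFTIRQ);
      }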
      
      Because this patch has the effect of disallowing RCU grace periods
      during long stretches of the boot-up sequence, this patch also introduces
      Josh Triplett's UP-only optimization that makes synchronize_rcu() be a
      no-op if num_online_cpus() returns 1.  This allows boot-time code that
      calls synchronize_rcu() to proceed normally.  Note, however, that RCU
      callbacks registered by call_rcu() will likely queue up until later in
      the boot sequence.  Although rcuclassic and rcutree can also use this
      same optimization after boot completes, rcupreempt must restrict its
      use of this optimization to the portion of the boot sequence before the
      scheduler starts up, given that an rcupreempt RCU read-side critical
      section may be preempted.
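
      A rough sketch of that UP short-circuit (rcu_blocking_is_gp() is
      named later in this changelog; the synchronize_rcu() body shown is
      the usual wait-for-callback pattern and is illustrative):

      static inline int rcu_blocking_is_gp(void)
      {
      	/* With a single online CPU, blocking in the caller is itself
      	 * a grace period. */
      	return num_online_cpus() == 1;
      }

      void synchronize_rcu(void)
      {
      	struct rcu_synchronize rcu;

      	if (rcu_blocking_is_gp())
      		return;

      	init_completion(&rcu.completion);
      	call_rcu(&rcu.head, wakeme_after_rcu);	/* wakes us after a GP */
      	wait_for_completion(&rcu.completion);
      }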
      
      In addition, this patch takes Nick Piggin's suggestion to make the
      system_state global variable be __read_mostly.
      
      Changes since v4:
      
      o	Changes the name of the introduced function and variable to
      	be less emotional.  ;-)
      
      Changes since v3:
      
      o	WARN_ON(nr_context_switches() > 0) to verify that RCU
      	switches out of boot-time mode before the first context
      	switch, as suggested by Nick Piggin.
      
      Changes since v2:
      
      o	Created rcu_blocking_is_gp() internal-to-RCU API that
      	determines whether a call to synchronize_rcu() is itself
      	a grace period.
      
      o	The definition of rcu_blocking_is_gp() for rcuclassic and
      	rcutree checks to see if but a single CPU is online.
      
      o	The definition of rcu_blocking_is_gp() for rcupreempt
      	checks to see both if but a single CPU is online and if
      	the system is still in early boot.
      
      	This allows rcupreempt to again work correctly if running
      	on a single CPU after booting is complete.
      
      o	Added check to rcupreempt's synchronize_sched() for there
      	being but one online CPU.
      
      Tested all three variants both SMP and !SMP, booted fine, passed a short
      rcutorture test on both x86 and Power.
      Located-by: Vegard Nossum <vegard.nossum@gmail.com>
      Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
      Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 01 January 2009, 1 commit
    • cpumask: convert RCU implementations · bd232f97
      Committed by Rusty Russell
      Impact: use new cpumask API.
      
      rcu_ctrlblk contains a cpumask, and it's highly optimized so I don't want
      a cpumask_var_t (ie. a pointer) for the CONFIG_CPUMASK_OFFSTACK case.  It
      could use a dangling bitmap, and be allocated in __rcu_init to save memory,
      but for the moment we use a bitmap.
      
      (Eventually 'struct cpumask' will be undefined for CONFIG_CPUMASK_OFFSTACK,
      so we use a bitmap here to show we really mean it).
      
      We remove on-stack cpumasks, using cpumask_var_t for
      rcu_torture_shuffle_tasks() and for_each_cpu_and in force_quiescent_state().
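
      A rough illustration of the on-stack to cpumask_var_t pattern this
      commit applies (the function body below is a sketch, not the actual
      rcutorture code):

      static void rcu_torture_shuffle_tasks(void)
      {
      	cpumask_var_t tmp_mask;

      	/* Off-stack when CONFIG_CPUMASK_OFFSTACK=y, so it can fail. */
      	if (!alloc_cpumask_var(&tmp_mask, GFP_KERNEL))
      		return;

      	cpumask_setall(tmp_mask);
      	/* ... shuffle the torture tasks using tmp_mask instead of an
      	 * on-stack cpumask_t ... */

      	free_cpumask_var(tmp_mask);
      }
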
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  4. 04 November 2008, 1 commit
  5. 03 October 2008, 1 commit
    • rcu: RCU-based detection of stalled CPUs for Classic RCU · 2133b5d7
      Committed by Paul E. McKenney
      This patch adds stalled-CPU detection to Classic RCU.  This capability
      is enabled by a new config variable CONFIG_RCU_CPU_STALL_DETECTOR, which
      defaults to disabled.
      
      This is a debugging feature to detect infinite loops in kernel code, not
      something that non-kernel-hackers would be expected to care about.
      
      This feature can detect looping CPUs in !PREEMPT builds and looping CPUs
      with preemption disabled in PREEMPT builds.  This is essentially a port of
      this functionality from the treercu patch, replacing the stall debug patch
      that is already in tip/core/rcu (commit 67182ae1).
      
      The changes from the patch in tip/core/rcu include making the config
      variable name match that in treercu, changing from seconds to jiffies to
      avoid spurious warnings, and printing a boot message when this feature
      is enabled.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 11 August 2008, 2 commits
    • rcu, debug: detect stalled grace periods · 67182ae1
      Committed by Paul E. McKenney
      This is a diagnostic patch for Classic RCU.
      
      The approach is to record a timestamp at the beginning
      of the grace period (in rcu_start_batch()), then have
      rcu_check_callbacks() complain if:
      
       1.	it is running on a CPU that has been holding up grace periods for
       	a long time (say one second).  This will identify the culprit
       	assuming that the culprit has not disabled hardware irqs,
       	instruction execution, or some such.
      
       2.	it is running on a CPU that is not holding up grace periods,
       	but grace periods have been held up for an even longer time
       	(say two seconds).
      
      It is enabled via the default-off CONFIG_DEBUG_RCU_STALL config option.
      
      Rather than exponential backoff, it backs off to once per 30 seconds.
      My feeling upon thinking on it was that if you have stalled RCU grace
      periods for that long, a few extra printk() messages are probably the
      least of your worries...
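
      A hedged sketch of that two-level check; cpu_is_holding_up_gp() and
      the gp_start field are placeholder names, not the ones used in the
      patch:

      static void check_cpu_stall(struct rcu_ctrlblk *rcp, int cpu)
      {
      	unsigned long now = jiffies;

      	if (cpu_is_holding_up_gp(rcp, cpu) &&
      	    time_after(now, rcp->gp_start + HZ)) {
      		/* Case 1: this CPU is the culprit (about one second). */
      		printk(KERN_ERR "RCU detected CPU %d stall\n", cpu);
      		dump_stack();
      	} else if (time_after(now, rcp->gp_start + 2 * HZ)) {
      		/* Case 2: the grace period is stalled, but not on this
      		 * CPU (about two seconds). */
      		printk(KERN_ERR "RCU detected stalled grace period\n");
      	}
      	/* The patch also rate-limits these complaints to one per
      	 * 30 seconds rather than backing off exponentially. */
      }
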
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Yinghai Lu <yhlu.kernel@gmail.com>
      Cc: David Witbrodt <dawitbro@sbcglobal.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • lockdep: lock protection locks · 7531e2f3
      Committed by Peter Zijlstra
      On Fri, 2008-08-01 at 16:26 -0700, Linus Torvalds wrote:
      
      > On Fri, 1 Aug 2008, David Miller wrote:
      > >
      > > Taking more than a few locks of the same class at once is bad
      > > news and it's better to find an alternative method.
      >
      > It's not always wrong.
      >
      > If you can guarantee that anybody that takes more than one lock of a
      > particular class will always take a single top-level lock _first_, then
      > that's all good. You can obviously screw up and take the same lock _twice_
      > (which will deadlock), but at least you cannot get into ABBA situations.
      >
      > So maybe the right thing to do is to just teach lockdep about "lock
      > protection locks". That would have solved the multi-queue issues for
      > networking too - all the actual network drivers would still have taken
      > just their single queue lock, but the one case that needs to take all of
      > them would have taken a separate top-level lock first.
      >
      > Never mind that the multi-queue locks were always taken in the same order:
      > it's never wrong to just have some top-level serialization, and anybody
      > who needs to take <n> locks might as well do <n+1>, because they sure as
      > hell aren't going to be on _any_ fastpaths.
      >
      > So the simplest solution really sounds like just teaching lockdep about
      > that one special case. It's not "nesting" exactly, although it's obviously
      > related to it.
      
      Do as Linus suggested. The lock protection lock is called nest_lock.
      
      Note that we still have the MAX_LOCK_DEPTH (48) limit to consider, so anything
      that spills past that is still up shit creek.
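
      A minimal sketch of what this lets callers express; the queue
      structure and function below are made up for illustration, and
      spin_lock_nest_lock() is the style of annotation being described:

      struct queue {
      	spinlock_t lock;
      };

      static DEFINE_SPINLOCK(all_queues_lock);	/* the lock protection lock */

      static void lock_all_queues(struct queue *q, int n)
      {
      	int i;

      	spin_lock(&all_queues_lock);
      	for (i = 0; i < n; i++)
      		/* Many locks of one class are acceptable to lockdep as
      		 * long as all_queues_lock is held as the nest lock
      		 * (still subject to the MAX_LOCK_DEPTH limit above). */
      		spin_lock_nest_lock(&q[i].lock, &all_queues_lock);
      }
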
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  7. 18 July 2008, 2 commits
    • rcu classic: new algorithm for callbacks-processing (v2) · 5127bed5
      Committed by Lai Jiangshan
      This is v2; it differs a little from the v1 that I
      had sent to lkml:
      use ACCESS_ONCE
      use rcu_batch_after/rcu_batch_before for batch # comparison.
      
      rcutorture test result:
      (hotplugs: do cpu-online/offline once per second)
      
      No CONFIG_NO_HZ:           OK, 12hours
      No CONFIG_NO_HZ, hotplugs: OK, 12hours
      CONFIG_NO_HZ=y:            OK, 24hours
      CONFIG_NO_HZ=y, hotplugs:  Failed.
      (Failed also without my patch applied, exactly the same bug occurred,
      http://lkml.org/lkml/2008/7/3/24)
      
      v1's email thread:
      http://lkml.org/lkml/2008/6/2/539
      
      v1's description:
      
      The code/algorithm of the current callbacks-processing implementation
      is very efficient and technical. But when I studied it, I found
      a disadvantage:
      
      In multi-CPU systems, when a new RCU callback is queued
      (call_rcu[_bh]), the current implementation will very likely invoke
      it only after the grace period for the batch with batch
      number = rcp->cur+2 has completed. Actually, this callback can be
      invoked after the grace period for the batch with
      batch number = rcp->cur+1 has completed. The delayed invocation means
      that the latency of synchronize_rcu() is extended. But the more
      important thing is that the callbacks usually free memory, and that
      work is delayed too! It is necessary for the reclaimer to free memory
      as soon as possible when little memory is left.
      
      A very simple way can solve this problem:
      a field (struct rcu_head::batch) is added to record the batch number
      for the RCU callback. When a new RCU callback is queued, we
      determine the batch number for this callback (head->batch = rcp->cur+1),
      and when we process callbacks we move this callback to rdp->donelist
      if we find that head->batch <= rcp->completed.
      This simple way reduces the wait time before invocation a lot (from
      about 2.5 grace periods to 1.5 grace periods on average in multi-CPU systems).
      
      This is my algorithm, but I do not add any field to struct rcu_head
      in my implementation. We just need to remember the last 2 batches and
      their batch numbers, because these 2 batches include all entries for
      which the grace period has not yet completed. So we use a special
      linked list rather than adding a field.
      Please see the comment of struct rcu_data.
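
      To make the batch arithmetic concrete, a hedged sketch: the two
      comparison helpers follow the classic-RCU code, while
      callback_batch_done() is an illustrative name:

      /* Wrap-safe batch-number comparisons. */
      static inline int rcu_batch_before(long a, long b)
      {
      	return (a - b) < 0;
      }

      static inline int rcu_batch_after(long a, long b)
      {
      	return (a - b) > 0;
      }

      /* A callback queued now belongs to batch rcp->cur + 1; it can be
       * moved to rdp->donelist as soon as rcp->completed reaches that
       * batch, one grace period sooner than waiting for rcp->cur + 2. */
      static int callback_batch_done(long cb_batch, long completed)
      {
      	return !rcu_batch_after(cb_batch, completed);
      }
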
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: Gautham Shenoy <ego@in.ibm.com>
      Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu classic: simplify the next pending batch · 3cac97cb
      Committed by Lai Jiangshan
      Use a batch number (rcp->pending) instead of a flag (rcp->next_pending).

      Previously, rcu_start_batch() needed to change this flag, so mb()s were
      needed for memory-access safety.

      But (after this patch is applied) rcu_start_batch() does not change
      this batch number (rcp->pending); rcp->pending is managed by
      __rcu_process_callbacks() only, and the troublesome mb()s are eliminated.

      And the code looks simpler and clearer.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: Gautham Shenoy <ego@in.ibm.com>
      Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  8. 19 May 2008, 1 commit
    • rcu: add call_rcu_sched() · 4446a36f
      Committed by Paul E. McKenney
      Fourth cut of the patch to provide call_rcu_sched().  This is to
      synchronize_sched() as call_rcu() is to synchronize_rcu().
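
      A brief usage sketch of that relationship (the structure and
      callback below are illustrative):

      struct foo {
      	int data;
      	struct rcu_head rcu;
      };

      static void foo_reclaim(struct rcu_head *head)
      {
      	kfree(container_of(head, struct foo, rcu));
      }

      static void foo_retire(struct foo *fp)
      {
      	/* Defers the free past the same readers that synchronize_sched()
      	 * waits for (preemption-disabled and hardirq sections), without
      	 * blocking the caller. */
      	call_rcu_sched(&fp->rcu, foo_reclaim);
      }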
      
      Should be fine for experimental and -rt use, but not ready for inclusion.
      With some luck, I will be able to tell Andrew to come out of hiding on
      the next round.
      
      Passes multi-day rcutorture sessions with concurrent CPU hotplugging.
      
      Fixes since the first version include a bug that could result in
      indefinite blocking (spotted by Gautham Shenoy), better resiliency
      against CPU-hotplug operations, and other minor fixes.
      
      Fixes since the second version include reworking grace-period detection
      to avoid deadlocks that could happen when running concurrently with
      CPU hotplug, adding Mathieu's fix to avoid the softlockup messages,
      as well as Mathieu's fix to allow use earlier in boot.
      
      Fixes since the third version include a wrong-CPU bug spotted by
      Andrew, getting rid of the obsolete synchronize_kernel API that somehow
      snuck back in, merging spin_unlock() and local_irq_restore() in a
      few places, commenting the code that checks for quiescent states based
      on interrupting from user-mode execution or the idle loop, removing
      some inline attributes, and some code-style changes.
      
      Known/suspected shortcomings:
      
      o	I still do not entirely trust the sleep/wakeup logic.  Next step
      	will be to use a private snapshot of the CPU online mask in
      	rcu_sched_grace_period() -- if the CPU wasn't there at the start
      	of the grace period, we don't need to hear from it.  And the
      	bit about accounting for changes in online CPUs inside of
      	rcu_sched_grace_period() is ugly anyway.
      
      o	It might be good for rcu_sched_grace_period() to invoke
      	resched_cpu() when a given CPU wasn't responding quickly,
      	but resched_cpu() is declared static...
      
      This patch also fixes a long-standing bug in the earlier preemptable-RCU
      implementation of synchronize_rcu() that could result in loss of
      concurrent external changes to a task's CPU affinity mask.  I still cannot
      remember who reported this...
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  9. 30 April 2008, 1 commit
  10. 01 March 2008, 1 commit
  11. 26 January 2008, 2 commits