1. 23 Aug 2009, 3 commits
    • rcu: Merge preemptable-RCU functionality into hierarchical RCU · f41d911f
      Paul E. McKenney authored
      Create a kernel/rcutree_plugin.h file that contains definitions
      for preemptable RCU (or, under the #else branch of the #ifdef,
      empty definitions for the classic non-preemptable semantics).
      These definitions fit into plugins defined in kernel/rcutree.c
      for this purpose.
      
      This variant of preemptable RCU uses a new algorithm whose
      read-side expense is roughly that of classic hierarchical RCU
      under CONFIG_PREEMPT. This new algorithm's update-side expense
      is similar to that of classic hierarchical RCU, and, in the absence
      of read-side preemption or blocking, is exactly that of classic
      hierarchical RCU.  Perhaps more important, this new algorithm
      has a much simpler implementation, saving well over 1,000 lines
      of code compared to mainline's implementation of preemptable
      RCU, which will hopefully be retired in favor of this new
      algorithm.
      
      The simplifications are obtained by maintaining per-task
      nesting state for running tasks, and using a simple
      lock-protected algorithm to handle accounting when tasks block
      within RCU read-side critical sections, making use of lessons
      learned while creating numerous user-level RCU implementations
      over the past 18 months.
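
      A rough illustration of that read-side bookkeeping may help.  The
      structure and function names below are simplified stand-ins, not the
      actual kernel/rcutree_plugin.h code; kernel context is assumed for
      barrier():

        #include <linux/compiler.h>

        /* Sketch of the per-task read-side nesting state described above. */
        struct task_rcu_state {
            int rcu_read_lock_nesting;   /* depth of nested rcu_read_lock() calls */
            int rcu_read_unlock_special; /* did the task block in a critical section? */
        };

        static inline void sketch_rcu_read_lock(struct task_rcu_state *t)
        {
            t->rcu_read_lock_nesting++;  /* cheap fast path: no locks, no atomics */
            barrier();
        }

        static inline void sketch_rcu_read_unlock(struct task_rcu_state *t)
        {
            barrier();
            if (--t->rcu_read_lock_nesting == 0 && t->rcu_read_unlock_special) {
                /* slow path: under a simple spinlock, remove the task from the
                 * per-rcu_node list of readers that blocked in the section */
            }
        }
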
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josht@linux.vnet.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      LKML-Reference: <12509746134003-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: Consolidate sparse and lockdep declarations in include/linux/rcupdate.h · bc33f24b
      Paul E. McKenney authored
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josht@linux.vnet.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      LKML-Reference: <12509746132349-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: Renamings to increase RCU clarity · d6714c22
      Paul E. McKenney authored
      Make RCU-sched, RCU-bh, and RCU-preempt be underlying
      implementations, with "RCU" defined in terms of one of the
      three.  Update the outdated rcu_qsctr_inc() names, as these
      functions no longer increment anything.
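
      As a reader-side illustration of the three underlying flavors (kernel
      context assumed; this sketch is not code from the patch itself):

        #include <linux/rcupdate.h>
        #include <linux/preempt.h>

        static void reader_examples(void)
        {
            rcu_read_lock();        /* "RCU": backed by RCU-preempt or RCU-sched */
            rcu_read_unlock();

            rcu_read_lock_bh();     /* RCU-bh: softirq-disabled readers */
            rcu_read_unlock_bh();

            preempt_disable();      /* RCU-sched: any preemption-disabled region */
            preempt_enable();       /* is a read-side critical section */
        }
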
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josht@linux.vnet.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      LKML-Reference: <12509746132696-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 03 Jul 2009, 1 commit
    • rcu: Add synchronize_sched_expedited() primitive · 03b042bf
      Paul E. McKenney authored
      This adds the synchronize_sched_expedited() primitive that
      implements the "big hammer" expedited RCU grace periods.
      
      This primitive is placed in kernel/sched.c rather than
      kernel/rcupdate.c due to its need to interact closely with the
      migration_thread() kthread.
      
      The idea is to wake up this kthread with req->task set to NULL,
      in response to which the kthread reports the quiescent state
      resulting from the kthread having been scheduled.
      
      Because this patch needs to fall back to the slow versions of
      the primitives in response to some races with CPU onlining and
      offlining, a new synchronize_rcu_bh() primitive is added as
      well.
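
      For callers, the new primitive is a latency-optimized substitute for
      the normal grace-period wait.  A hedged usage sketch follows; the
      demo_config structure and globals are invented for illustration:

        #include <linux/rcupdate.h>
        #include <linux/slab.h>

        struct demo_config { int value; };
        static struct demo_config *cur_config;  /* readers use rcu_read_lock_sched() */

        static void demo_update(struct demo_config *newc)
        {
            struct demo_config *oldc = cur_config;

            rcu_assign_pointer(cur_config, newc);
            synchronize_sched_expedited();  /* "big hammer": forced, fast grace period */
            kfree(oldc);                    /* no reader can still hold oldc */
        }
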
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: akpm@linux-foundation.org
      Cc: torvalds@linux-foundation.org
      Cc: davem@davemloft.net
      Cc: dada1@cosmosbay.com
      Cc: zbr@ioremap.net
      Cc: jeff.chua.linux@gmail.com
      Cc: paulus@samba.org
      Cc: laijs@cn.fujitsu.com
      Cc: jengelh@medozas.de
      Cc: r000n@r000n.net
      Cc: benh@kernel.crashing.org
      Cc: mathieu.desnoyers@polymtl.ca
      LKML-Reference: <12459460982947-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 24 Jun 2009, 1 commit
    • rcu: Remove Classic RCU · c17ef453
      Paul E. McKenney authored
      Remove Classic RCU, given that the combination of Tree RCU and
      the proposed Bloatwatch RCU do everything that Classic RCU can
      with fewer bugs.
      
      Tree RCU has been default in x86 builds for almost six months,
      and seems to be quite reliable, so there does not seem to be
      much justification for keeping the Classic RCU code and config
      complexity around anymore.
      Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
      Cc: akpm@linux-foundation.org
      Cc: niv@us.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: dipankar@in.ibm.com
      Cc: dhowells@redhat.com
      Cc: lethal@linux-sh.org
      Cc: kernel@wantstofly.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  4. 03 Apr 2009, 1 commit
  5. 26 Feb 2009, 1 commit
    • rcu: Teach RCU that idle task is not quiescent state at boot · a6826048
      Paul E. McKenney authored
      This patch fixes a bug located by Vegard Nossum with the aid of
      kmemcheck, updated based on review comments from Nick Piggin,
      Ingo Molnar, and Andrew Morton.  It also cleans up the variable-name
      and function-name language.  ;-)
      
      The boot CPU runs in the context of its idle thread during boot-up.
      During this time, idle_cpu(0) will always return nonzero, which will
      fool Classic and Hierarchical RCU into deciding that a large chunk of
      the boot-up sequence is a big long quiescent state.  This in turn causes
      RCU to prematurely end grace periods during this time.
      
      This patch changes the rcutree.c and rcuclassic.c rcu_check_callbacks()
      function to ignore the idle task as a quiescent state until the
      system has started up the scheduler in rest_init(), introducing a
      new non-API function rcu_idle_now_means_idle() to inform RCU of this
      transition.  RCU maintains an internal rcu_idle_cpu_truthful variable
      to track this state, which is then used by rcu_check_callbacks() to
      determine if it should believe idle_cpu().
      
      Because this patch has the effect of disallowing RCU grace periods
      during long stretches of the boot-up sequence, this patch also introduces
      Josh Triplett's UP-only optimization that makes synchronize_rcu() be a
      no-op if num_online_cpus() returns 1.  This allows boot-time code that
      calls synchronize_rcu() to proceed normally.  Note, however, that RCU
      callbacks registered by call_rcu() will likely queue up until later in
      the boot sequence.  Although rcuclassic and rcutree can also use this
      same optimization after boot completes, rcupreempt must restrict its
      use of this optimization to the portion of the boot sequence before the
      scheduler starts up, given that an rcupreempt RCU read-side critical
      section may be preempted.
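
      In outline, the UP-only optimization amounts to the check sketched
      below (simplified; the real patch handles rcuclassic, rcutree, and
      rcupreempt separately):

        #include <linux/cpumask.h>

        static int rcu_blocking_is_gp(void)
        {
            return num_online_cpus() == 1;  /* one CPU: no other readers to wait for */
        }

        void sketch_synchronize_rcu(void)
        {
            if (rcu_blocking_is_gp())
                return;  /* early-boot and UP callers return immediately */
            /* otherwise wait for a real grace period, e.g. call_rcu()
             * plus a completion, as in the stock implementation */
        }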
      
      In addition, this patch takes Nick Piggin's suggestion to make the
      system_state global variable be __read_mostly.
      
      Changes since v4:
      
      o	Changes the name of the introduced function and variable to
      	be less emotional.  ;-)
      
      Changes since v3:
      
      o	WARN_ON(nr_context_switches() > 0) to verify that RCU
      	switches out of boot-time mode before the first context
      	switch, as suggested by Nick Piggin.
      
      Changes since v2:
      
      o	Created rcu_blocking_is_gp() internal-to-RCU API that
      	determines whether a call to synchronize_rcu() is itself
      	a grace period.
      
      o	The definition of rcu_blocking_is_gp() for rcuclassic and
      	rcutree checks to see if but a single CPU is online.
      
      o	The definition of rcu_blocking_is_gp() for rcupreempt
      	checks to see both if but a single CPU is online and if
      	the system is still in early boot.
      
      	This allows rcupreempt to again work correctly if running
      	on a single CPU after booting is complete.
      
      o	Added check to rcupreempt's synchronize_sched() for there
      	being but one online CPU.
      
      Tested all three variants both SMP and !SMP, booted fine, passed a short
      rcutorture test on both x86 and Power.
      Located-by: Vegard Nossum <vegard.nossum@gmail.com>
      Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
      Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 05 Jan 2009, 1 commit
  7. 19 Dec 2008, 1 commit
    • "Tree RCU": scalable classic RCU implementation · 64db4cff
      Paul E. McKenney authored
      This patch fixes a long-standing performance bug in classic RCU that
      results in massive internal-to-RCU lock contention on systems with
      more than a few hundred CPUs.  Although this patch creates a separate
      flavor of RCU for ease of review and patch maintenance, it is intended
      to replace classic RCU.
      
      This patch still handles stress better than does mainline, so I am still
      calling it ready for inclusion.  This patch is against the -tip tree.
      Nevertheless, experience on an actual 1000+ CPU machine would still be
      most welcome.
      
      Most of the changes noted below were found while creating an rcutiny
      (which should permit ejecting the current rcuclassic) and while doing
      detailed line-by-line documentation.
      
      Updates from v9 (http://lkml.org/lkml/2008/12/2/334):
      
      o	Fixes from remainder of line-by-line code walkthrough,
      	including comment spelling, initialization, undesirable
      	narrowing due to type conversion, removing redundant memory
      	barriers, removing redundant local-variable initialization,
      	and removing redundant local variables.
      
      	I do not believe that any of these fixes address the CPU-hotplug
      	issues that Andi Kleen was seeing, but please do give it a whirl
      	in case the machine is smarter than I am.
      
      	A writeup from the walkthrough may be found at the following
      	URL, in case you are suffering from terminal insomnia or
      	masochism:
      
      	http://www.kernel.org/pub/linux/kernel/people/paulmck/tmp/rcutree-walkthrough.2008.12.16a.pdf
      
      o	Made rcutree tracing use seq_file, as suggested some time
      	ago by Lai Jiangshan.
      
      o	Added a .csv variant of the rcudata debugfs trace file, to allow
      	people having thousands of CPUs to drop the data into
      	a spreadsheet.	Tested with oocalc and gnumeric.  Updated
      	documentation to suit.
      
      Updates from v8 (http://lkml.org/lkml/2008/11/15/139):
      
      o	Fix a theoretical race between grace-period initialization and
      	force_quiescent_state() that could occur if more than three
      	jiffies were required to carry out the grace-period
      	initialization.  Which it might, if you had enough CPUs.
      
      o	Apply Ingo's printk-standardization patch.
      
      o	Substitute local variables for repeated accesses to global
      	variables.
      
      o	Fix comment misspellings and redundant (but harmless) increments
      	of ->n_rcu_pending (this latter after having explicitly added it).
      
      o	Apply checkpatch fixes.
      
      Updates from v7 (http://lkml.org/lkml/2008/10/10/291):
      
      o	Fixed a number of problems noted by Gautham Shenoy, including
      	the cpu-stall-detection bug that he was having difficulty
      	convincing me was real.  ;-)
      
      o	Changed cpu-stall detection to wait for ten seconds rather than
      	three in order to reduce false positives, as suggested by Ingo
      	Molnar.
      
      o	Produced a design document (http://lwn.net/Articles/305782/).
      	The act of writing this document uncovered a number of both
      	theoretical and "here and now" bugs as noted below.
      
      o	Fix dynticks_nesting accounting confusion, simplify WARN_ON()
      	condition, fix kerneldoc comments, and add memory barriers
      	in dynticks interface functions.
      
      o	Add more data to tracing.
      
      o	Remove unused "rcu_barrier" field from rcu_data structure.
      
      o	Count calls to rcu_pending() from scheduling-clock interrupt
      	to use as a surrogate timebase should jiffies stop counting.
      
      o	Fix a theoretical race between force_quiescent_state() and
      	grace-period initialization.  Yes, initialization does have to
      	go on for some jiffies for this race to occur, but given enough
      	CPUs...
      
      Updates from v6 (http://lkml.org/lkml/2008/9/23/448):
      
      o	Fix a number of checkpatch.pl complaints.
      
      o	Apply review comments from Ingo Molnar and Lai Jiangshan
      	on the stall-detection code.
      
      o	Fix several bugs in !CONFIG_SMP builds.
      
      o	Fix a misspelled config-parameter name so that RCU now announces
      	at boot time if stall detection is configured.
      
      o	Run tests on numerous combinations of configuration parameters,
      	which, after the fixes above, now build and run correctly.
      
      Updates from v5 (http://lkml.org/lkml/2008/9/15/92, bad subject line):
      
      o	Fix a compiler error in the !CONFIG_RCU_FANOUT_EXACT case (blew a
      	changeset some time ago, and finally got around to retesting
      	this option).
      
      o	Fix some tracing bugs in rcupreempt that caused incorrect
      	totals to be printed.
      
      o	I now test with a more brutal random-selection online/offline
      	script (attached).  Probably more brutal than it needs to be
      	on the people reading it as well, but so it goes.
      
      o	A number of optimizations and usability improvements:
      
      	o	Make rcu_pending() ignore the grace-period timeout when
      		there is no grace period in progress.
      
      	o	Make force_quiescent_state() avoid going for a global
      		lock in the case where there is no grace period in
      		progress.
      
      	o	Rearrange struct fields to improve struct layout.
      
      	o	Make call_rcu() initiate a grace period if RCU was
      		idle, rather than waiting for the next scheduling
      		clock interrupt.
      
      	o	Invoke rcu_irq_enter() and rcu_irq_exit() only when
      		idle, as suggested by Andi Kleen.  I still don't
      		completely trust this change, and might back it out.
      
      	o	Make CONFIG_RCU_TRACE be the single config variable
      		manipulated for all forms of RCU, instead of the prior
      		confusion.
      
      	o	Document tracing files and formats for both rcupreempt
      		and rcutree.
      
      Updates from v4 for those missing v5 given its bad subject line:
      
      o	Separated dynticks interface so that NMIs and irqs call separate
      	functions, greatly simplifying it.  In particular, this code
      	no longer requires a proof of correctness.  ;-)
      
      o	Separated dynticks state out into its own per-CPU structure,
      	avoiding the duplicated accounting.
      
      o	The case where a dynticks-idle CPU runs an irq handler that
      	invokes call_rcu() is now correctly handled, forcing that CPU
      	out of dynticks-idle mode.
      
      o	Review comments have been applied (thank you all!!!).
      	For but one example, fixed the dynticks-ordering issue that
      	Manfred pointed out, saving me much debugging.  ;-)
      
      o	Adjusted rcuclassic and rcupreempt to handle dynticks changes.
      
      Attached is an updated patch to Classic RCU that applies a hierarchy,
      greatly reducing the contention on the top-level lock for large machines.
      This passes 10-hour concurrent rcutorture and online-offline testing on
      128-CPU ppc64 without dynticks enabled, and exposes some timekeeping
      bugs in the presence of dynticks (it is exciting working on a system where
      "sleep 1" hangs until interrupted...), which were fixed in the
      2.6.27 kernel.  It is getting more reliable than mainline by some
      measures, so the next version will be against -tip for inclusion.
      See also Manfred Spraul's recent patches (or his earlier work from
      2004 at http://marc.info/?l=linux-kernel&m=108546384711797&w=2).
      We will converge onto a common patch in the fullness of time, but are
      currently exploring different regions of the design space.  That said,
      I have already gratefully stolen quite a few of Manfred's ideas.
      
      This patch provides CONFIG_RCU_FANOUT, which controls the bushiness
      of the RCU hierarchy.  Defaults to 32 on 32-bit machines and 64 on
      64-bit machines.  If CONFIG_NR_CPUS is less than CONFIG_RCU_FANOUT,
      there is no hierarchy.  By default, the RCU initialization code will
      adjust CONFIG_RCU_FANOUT to balance the hierarchy, so strongly NUMA
      architectures may choose to set CONFIG_RCU_FANOUT_EXACT to disable
      this balancing, allowing the hierarchy to be exactly aligned to the
      underlying hardware.  Up to two levels of hierarchy are permitted
      (in addition to the root node), allowing up to 16,384 CPUs on 32-bit
      systems and up to 262,144 CPUs on 64-bit systems.  I just know that I
      am going to regret saying this, but this seems more than sufficient
      for the foreseeable future.  (Some architectures might wish to set
      CONFIG_RCU_FANOUT=4, which would limit such architectures to 64 CPUs.
      If this becomes a real problem, additional levels can be added, but I
      doubt that it will make a significant difference on real hardware.)
      
      In the common case, a given CPU will manipulate its private rcu_data
      structure and the rcu_node structure that it shares with its immediate
      neighbors.  This can reduce both lock and memory contention by multiple
      orders of magnitude, which should eliminate the need for the strange
      manipulations that are reported to be required when running Linux on
      very large systems.
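
      The shape of that hierarchy is roughly as follows; the field names are
      simplified stand-ins for the real rcutree.h definitions:

        #include <linux/spinlock.h>
        #include <linux/rcupdate.h>

        struct rcu_node {
            spinlock_t lock;          /* protects qsmask below */
            unsigned long qsmask;     /* CPUs/children still owing a quiescent state */
            struct rcu_node *parent;  /* NULL at the root */
        };

        struct rcu_data {             /* one per CPU, lock-free in the fast path */
            long completed;           /* last grace period this CPU saw end */
            struct rcu_node *mynode;  /* leaf node shared with nearby CPUs */
            struct rcu_head *nxtlist; /* callbacks awaiting a grace period */
        };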
      
      Some shortcomings:
      
      o	More bugs will probably surface as a result of an ongoing
      	line-by-line code inspection.
      
      	Patches will be provided as required.
      
      o	There are probably hangs, rcutorture failures, &c.  Seems
      	quite stable on a 128-CPU machine, but that is kind of small
      	compared to 4096 CPUs.  However, seems to do better than
      	mainline.
      
      	Patches will be provided as required.
      
      o	The memory footprint of this version is several KB larger
      	than rcuclassic.
      
      	A separate UP-only rcutiny patch will be provided, which will
      	reduce the memory footprint significantly, even compared
      	to the old rcuclassic.  One such patch passes light testing,
      	and has a memory footprint smaller even than rcuclassic.
      	Initial reaction from various embedded guys was "it is not
      	worth it", so I am putting it aside.
      
      Credits:
      
      o	Manfred Spraul for ideas, review comments, and bugs spotted,
      	as well as some good friendly competition.  ;-)
      
      o	Josh Triplett, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers,
      	Lai Jiangshan, Andi Kleen, Andy Whitcroft, and Andrew Morton
      	for reviews and comments.
      
      o	Thomas Gleixner for much-needed help with some timer issues
      	(see patches below).
      
      o	Jon M. Tollefson, Tim Pepper, Andrew Theurer, Jose R. Santos,
      	Andy Whitcroft, Darrick Wong, Nishanth Aravamudan, Anton
      	Blanchard, Dave Kleikamp, and Nathan Lynch for keeping machines
      	alive despite my heavy abuse^Wtesting.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  8. 16 Nov 2008, 1 commit
  9. 30 Sep 2008, 1 commit
  10. 19 May 2008, 3 commits
    • rcu: add rcu_barrier_sched() and rcu_barrier_bh() · 70f12f84
      Paul E. McKenney authored
      Add rcu_barrier_sched() and rcu_barrier_bh().  With these in place,
      rcutorture no longer gives the occasional oops when repeatedly starting
      and stopping torturing rcu_bh.  Also adds the API needed to flush out
      pre-existing call_rcu_sched() callbacks.
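
      A typical caller is module-unload code that must not return while
      call_rcu_sched() or call_rcu_bh() callbacks into the module are still
      pending.  A sketch; demo_exit() is illustrative, not from the patch:

        #include <linux/module.h>
        #include <linux/rcupdate.h>

        static void __exit demo_exit(void)
        {
            /* stop queueing new callbacks first (not shown), then: */
            rcu_barrier_sched();  /* wait for all pending call_rcu_sched() callbacks */
            rcu_barrier_bh();     /* likewise for call_rcu_bh() callbacks */
        }
        module_exit(demo_exit);
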
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • rcu: add call_rcu_sched() · 4446a36f
      Paul E. McKenney authored
      Fourth cut of the patch to provide call_rcu_sched().  This is to
      synchronize_sched() as call_rcu() is to synchronize_rcu().
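
      In other words, an updater that previously had to block in
      synchronize_sched() can defer its cleanup asynchronously.  A sketch,
      with an invented element type:

        #include <linux/kernel.h>
        #include <linux/rcupdate.h>
        #include <linux/slab.h>

        struct demo_elem {
            int key;
            struct rcu_head rcu;
        };

        static void demo_elem_free(struct rcu_head *head)
        {
            kfree(container_of(head, struct demo_elem, rcu));
        }

        static void demo_elem_retire(struct demo_elem *e)
        {
            /* demo_elem_free() runs only after every CPU has passed through a
             * context switch, i.e. after all preempt-disabled readers finish */
            call_rcu_sched(&e->rcu, demo_elem_free);
        }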
      
      Should be fine for experimental and -rt use, but not ready for inclusion.
      With some luck, I will be able to tell Andrew to come out of hiding on
      the next round.
      
      Passes multi-day rcutorture sessions with concurrent CPU hotplugging.
      
      Fixes since the first version include a bug that could result in
      indefinite blocking (spotted by Gautham Shenoy), better resiliency
      against CPU-hotplug operations, and other minor fixes.
      
      Fixes since the second version include reworking grace-period detection
      to avoid deadlocks that could happen when running concurrently with
      CPU hotplug, adding Mathieu's fix to avoid the softlockup messages,
      as well as Mathieu's fix to allow use earlier in boot.
      
      Fixes since the third version include a wrong-CPU bug spotted by
      Andrew, getting rid of the obsolete synchronize_kernel API that somehow
      snuck back in, merging spin_unlock() and local_irq_restore() in a
      few places, commenting the code that checks for quiescent states based
      on interrupting from user-mode execution or the idle loop, removing
      some inline attributes, and some code-style changes.
      
      Known/suspected shortcomings:
      
      o	I still do not entirely trust the sleep/wakeup logic.  Next step
      	will be to use a private snapshot of the CPU online mask in
      	rcu_sched_grace_period() -- if the CPU wasn't there at the start
      	of the grace period, we don't need to hear from it.  And the
      	bit about accounting for changes in online CPUs inside of
      	rcu_sched_grace_period() is ugly anyway.
      
      o	It might be good for rcu_sched_grace_period() to invoke
      	resched_cpu() when a given CPU wasn't responding quickly,
      	but resched_cpu() is declared static...
      
      This patch also fixes a long-standing bug in the earlier preemptable-RCU
      implementation of synchronize_rcu() that could result in loss of
      concurrent external changes to a task's CPU affinity mask.  I still cannot
      remember who reported this...
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • rcupreempt: remove duplicate prototypes · 8b09dee6
      Steven Rostedt authored
      rcu_batches_completed and rcu_batches_completed_bh are both declared
      in rcuclassic.h and rcupreempt.h. This patch removes the extra
      prototypes for them from rcupdate.h.
      
      rcu_batches_completed_bh is defined as a static inline in the rcupreempt.h
      header file. Trying to export it with EXPORT_SYMBOL_GPL causes linking problems
      with the powerpc linker. There's no need to export a static inline function.
      
      Modules must be compiled with the same type of RCU implementation as the
      kernel they are for.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  11. 11 May 2008, 1 commit
  12. 30 Apr 2008, 1 commit
  13. 07 Feb 2008, 1 commit
  14. 26 Jan 2008, 2 commits
  15. 17 Oct 2007, 1 commit
  16. 12 Oct 2007, 1 commit
  17. 12 Aug 2007, 1 commit
  18. 04 Oct 2006, 2 commits
    • [PATCH] RCU: CREDITS and MAINTAINERS · 595182bc
      Josh Triplett authored
      Add MAINTAINERS entry for Read-Copy Update (RCU), listing Dipankar Sarma as
      maintainer, and giving the URL for Paul McKenney's RCU site.  Add
      MAINTAINERS entry for rcutorture, listing myself as maintainer.  Add
      CREDITS entries for developers of RCU, RCU variants, and rcutorture.  Use
      Paul McKenney's preferred email address in include/linux/rcupdate.h .
      Signed-off-by: Josh Triplett <josh@freedesktop.org>
      Cc: Paul McKenney <paulmck@us.ibm.com>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] rcu: simplify/improve batch tuning · 20e9751b
      Oleg Nesterov authored
      Kill the hard-to-calculate 'rsinterval' boot parameter and the per-cpu
      rcu_data.last_rs_qlen.  Instead, add a flag, rcu_ctrlblk.signaled,
      which records the fact that one of the CPUs has sent a resched IPI since the
      last rcu_start_batch().
      
      Roughly speaking, we need two rcu_start_batch()s in order to move callbacks
      from ->nxtlist to ->donelist.  This means that when ->qlen exceeds qhimark
      and continues to grow, we should send a resched IPI, and then do it again
      after we have gone through a quiescent state.
      
      On the other hand, if it was already sent, we don't need to do it again
      when another CPU detects overflow of the queue.
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  19. 01 Jul 2006, 1 commit
  20. 28 Jun 2006, 1 commit
  21. 23 Jun 2006, 1 commit
  22. 16 May 2006, 1 commit
  23. 23 Mar 2006, 1 commit
  24. 09 Mar 2006, 1 commit
    • [PATCH] rcu batch tuning · 21a1ea9e
      Dipankar Sarma authored
      This patch adds new tunables for RCU queue and finished batches.  There are
      two types of controls: the number of completed RCU updates invoked in a batch
      (blimit), and monitoring for a high rate of incoming RCU callbacks on a CPU (qhimark,
      qlowmark).
      
      By default, the per-cpu batch limit is set to a small value.  If the input
      RCU rate exceeds the high watermark, we do two things: force a quiescent
      state on all CPUs and set the batch limit of the CPU to INTMAX.  Setting
      the batch limit to INTMAX forces all finished RCUs to be processed in one shot.
      If we have more than INTMAX RCUs queued up, then we have bigger problems
      anyway.  Once the incoming queued RCUs fall below the low watermark, the
      batch limit is set to the default.
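
      Expressed against the names used above (blimit, qhimark, qlowmark),
      the policy is approximately the following; the exact rcuclassic.c code
      differs in detail:

        static void sketch_tune_batch(struct rcu_data *rdp)
        {
            if (rdp->qlen > qhimark) {
                rdp->blimit = INT_MAX;  /* drain all finished callbacks in one shot */
                /* ...and force a quiescent state on all CPUs (resched IPIs) */
            } else if (rdp->qlen < qlowmark) {
                rdp->blimit = blimit;   /* back to the small default batch */
            }
        }
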
      Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  25. 04 Feb 2006, 1 commit
  26. 11 Jan 2006, 1 commit
  27. 10 Jan 2006, 1 commit
  28. 09 Jan 2006, 1 commit
  29. 13 Dec 2005, 1 commit
    • [PATCH] add rcu_barrier() synchronization point · ab4720ec
      Dipankar Sarma authored
      This introduces a new interface, rcu_barrier(), which waits until all
      the RCU callbacks queued before this call have completed.
      
      Reiser4 needs this, because we do more than just free a memory object
      in our RCU callback: we also remove it from the list hanging off the
      super-block.  This means that before freeing the reiser4-specific portion
      of the super-block (during umount) we have to wait until all pending RCU
      callbacks are executed.
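
      The umount-time pattern that motivates the new primitive looks roughly
      like this; demo_put_super() and demo_fs_info are invented for
      illustration:

        #include <linux/rcupdate.h>
        #include <linux/slab.h>

        static void demo_put_super(void *demo_fs_info)
        {
            /* Previously queued call_rcu() callbacks may still reference
             * demo_fs_info (e.g. to unlink objects from its lists), so wait
             * for every callback queued before this point to finish... */
            rcu_barrier();
            /* ...and only then free the super-block-private data. */
            kfree(demo_fs_info);
        }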
      
      The only reiser4-specific change made to the original patch is the export
      of rcu_barrier().
      
      Cc: Hans Reiser <reiser@namesys.com>
      Cc: Vladimir V. Saveliev <vs@namesys.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  30. 31 Oct 2005, 1 commit
  31. 18 Oct 2005, 1 commit
  32. 10 Sep 2005, 1 commit
    • [PATCH] files: fix rcu initializers · 8b6490e5
      Dipankar Sarma authored
      First of a number of files_lock scalability patches.
      
       Here are the x86 numbers -
      
       tiobench on a 4(8)-way (HT) P4 system on ramdisk :
      
                                               (lockfree)
       Test            2.6.10-vanilla  Stdev   2.6.10-fd       Stdev
       -------------------------------------------------------------
       Seqread         1400.8          11.52   1465.4          34.27
       Randread        1594            8.86    2397.2          29.21
       Seqwrite        242.72          3.47    238.46          6.53
       Randwrite       445.74          9.15    446.4           9.75
      
       The performance improvement is very significant.
       We are getting killed by the cacheline bouncing of the files_struct
       lock here. Writes on ramdisk (ext2) seem to vary just too
       much to get any meaningful number.
      
       Also, With Tridge's thread_perf test on a 4(8)-way (HT) P4 xeon system :
      
       2.6.12-rc5-vanilla :
      
       Running test 'readwrite' with 8 tasks
       Threads     0.34 +/- 0.01 seconds
       Processes   0.16 +/- 0.00 seconds
      
       2.6.12-rc5-fd :
      
       Running test 'readwrite' with 8 tasks
       Threads     0.17 +/- 0.02 seconds
       Processes   0.17 +/- 0.02 seconds
      
       I repeated the measurements on ramfs (as opposed to ext2 on ramdisk in
       the earlier measurement) and I got more consistent results from tiobench :
      
       4(8) way xeon P4
       -----------------
                                               (lock-free)
       Test            2.6.12-rc5      Stdev   2.6.12-rc5-fd   Stdev
       -------------------------------------------------------------
       Seqread         1282            18.59   1343.6          26.37
       Randread        1517            7       2415            34.27
       Seqwrite        702.2           5.27    709.46           5.9
       Randwrite       846.86          15.15   919.68          21.4
      
       4-way ppc64
       ------------
                                               (lock-free)
       Test            2.6.12-rc5      Stdev   2.6.12-rc5-fd   Stdev
       -------------------------------------------------------------
       Seqread         1549            91.16   1569.6          47.2
       Randread        1473.6          25.11   1585.4          69.99
       Seqwrite        1096.8          20.03   1136            29.61
       Randwrite       1189.6           4.04   1275.2          32.96
      
       Also running Tridge's thread_perf test on ppc64 :
      
       2.6.12-rc5-vanilla
       --------------------
       Running test 'readwrite' with 4 tasks
       Threads     0.20 +/- 0.02 seconds
       Processes   0.16 +/- 0.01 seconds
      
       2.6.12-rc5-fd
       --------------------
       Running test 'readwrite' with 4 tasks
       Threads     0.18 +/- 0.04 seconds
       Processes   0.16 +/- 0.01 seconds
      
       The benefits are huge (up to ~60%) in some cases on x86, primarily
       due to the atomic operations during acquisition of ->file_lock
       and cache line bouncing in the fast path. ppc64 benefits are modest
       due to LL/SC based locking, but still statistically significant.
      
      This patch:
      
      The RCU head initializer no longer needs the head variable name since we
      don't use list.h lists anymore.
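
      The before/after shape of that initializer change is approximately the
      following (paraphrased, not copied from the patch):

        /* before: the initializer took the variable name, list.h-style:
         *     #define RCU_HEAD_INIT(head) { .next = NULL, .func = NULL }
         * after: no name is needed, since struct rcu_head is not a list head:
         */
        #define RCU_HEAD_INIT { .next = NULL, .func = NULL }
        #define RCU_HEAD(head) struct rcu_head head = RCU_HEAD_INIT
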
      Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  33. 01 May 2005, 1 commit
    • [PATCH] Deprecate synchronize_kernel, GPL replacement · 9b06e818
      Paul E. McKenney authored
      The synchronize_kernel() primitive is used for quite a few different purposes:
      waiting for RCU readers, waiting for NMIs, waiting for interrupts, and so on.
      This makes RCU code harder to read, since synchronize_kernel() might or might
      not have matching rcu_read_lock()s.  This patch creates a new
      synchronize_rcu() that is to be used for RCU readers and a new
      synchronize_sched() that is used for the rest.  These two new primitives
      currently have the same implementation, but this might well change with
      additional real-time support.  Both new primitives are GPL-only; the old
      primitive is deprecated.
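
      On the read side, the split pairs up as in the sketch below
      (illustrative, not code from the patch):

        #include <linux/rcupdate.h>
        #include <linux/preempt.h>

        static void rcu_style_reader(void)
        {
            rcu_read_lock();    /* waited for by synchronize_rcu() */
            /* ... dereference RCU-protected pointers ... */
            rcu_read_unlock();
        }

        static void sched_style_reader(void)
        {
            preempt_disable();  /* waited for by synchronize_sched(), as are */
            /* ... code that must not be preempted or migrated ... */
            preempt_enable();   /* interrupt handlers and similar contexts */
        }
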
      Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  34. 17 Apr 2005, 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds authored
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!