1. 28 7月, 2010 1 次提交
  2. 23 7月, 2010 1 次提交
  3. 21 5月, 2010 1 次提交
  4. 13 5月, 2010 1 次提交
    • D
      lockup_detector: Combine nmi_watchdog and softlockup detector · 58687acb
      Don Zickus 提交于
      The new nmi_watchdog (which uses the perf event subsystem) is very
      similar in structure to the softlockup detector.  Using Ingo's
      suggestion, I combined the two functionalities into one file:
      kernel/watchdog.c.
      
      Now both the nmi_watchdog (or hardlockup detector) and softlockup
      detector sit on top of the perf event subsystem, which is run every
      60 seconds or so to see if there are any lockups.
      
      To detect hardlockups, cpus not responding to interrupts, I
      implemented an hrtimer that runs 5 times for every perf event
      overflow event.  If that stops counting on a cpu, then the cpu is
      most likely in trouble.
      
      To detect softlockups, tasks not yielding to the scheduler, I used the
      previous kthread idea that now gets kicked every time the hrtimer fires.
      If the kthread isn't being scheduled neither is anyone else and the
      warning is printed to the console.
      
      I tested this on x86_64 and both the softlockup and hardlockup paths
      work.
      
      V2:
      - cleaned up the Kconfig and softlockup combination
      - surrounded hardlockup cases with #ifdef CONFIG_PERF_EVENTS_NMI
      - seperated out the softlockup case from perf event subsystem
      - re-arranged the enabling/disabling nmi watchdog from proc space
      - added cpumasks for hardlockup failure cases
      - removed fallback to soft events if no PMU exists for hard events
      
      V3:
      - comment cleanups
      - drop support for older softlockup code
      - per_cpu cleanups
      - completely remove software clock base hardlockup detector
      - use per_cpu masking on hard/soft lockup detection
      - #ifdef cleanups
      - rename config option NMI_WATCHDOG to LOCKUP_DETECTOR
      - documentation additions
      
      V4:
      - documentation fixes
      - convert per_cpu to __get_cpu_var
      - powerpc compile fixes
      
      V5:
      - split apart warn flags for hard and soft lockups
      
      TODO:
      - figure out how to make an arch-agnostic clock2cycles call
        (if possible) to feed into perf events as a sample period
      
      [fweisbec: merged conflict patch]
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      LKML-Reference: <1273266711-18706-2-git-send-email-dzickus@redhat.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      58687acb
  5. 08 5月, 2010 1 次提交
    • T
      cpu_stop: add dummy implementation for UP · bbf1bb3e
      Tejun Heo 提交于
      When !CONFIG_SMP, cpu_stop functions weren't defined at all which
      could lead to build failures if UP code uses cpu_stop facility.  Add
      dummy cpu_stop implementation for UP.  The waiting variants execute
      the work function directly with preempt disabled and
      stop_one_cpu_nowait() schedules a workqueue work.
      
      Makefile and ifdefs around stop_machine implementation are updated to
      accomodate CONFIG_SMP && !CONFIG_STOP_MACHINE case.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NIngo Molnar <mingo@elte.hu>
      bbf1bb3e
  6. 07 3月, 2010 1 次提交
    • D
      elf coredump: replace ELF_CORE_EXTRA_* macros by functions · 1fcccbac
      Daisuke HATAYAMA 提交于
      elf_core_dump() and elf_fdpic_core_dump() use #ifdef and the corresponding
      macro for hiding _multiline_ logics in functions.  This patch removes
      #ifdef and replaces ELF_CORE_EXTRA_* by corresponding functions.  For
      architectures not implemeonting ELF_CORE_EXTRA_*, we use weak functions in
      order to reduce a range of modification.
      
      This cleanup is for my next patches, but I think this cleanup itself is
      worth doing regardless of my firnal purpose.
      Signed-off-by: NDaisuke HATAYAMA <d.hatayama@jp.fujitsu.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1fcccbac
  7. 17 2月, 2010 1 次提交
  8. 11 2月, 2010 1 次提交
    • Y
      x86: Move range related operation to one file · 27811d8c
      Yinghai Lu 提交于
      We have almost the same code for mtrr cleanup and amd_bus checkup, and
      this code  will also be used in replacing bootmem with early_res,
      so try to move them together and reuse it from different parts.
      
      Also rename update_range to subtract_range as that is what the
      function is actually doing.
      
      -v2: update comments as Christoph requested
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      LKML-Reference: <1265793639-15071-4-git-send-email-yinghai@kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      27811d8c
  9. 08 2月, 2010 1 次提交
    • D
      nmi_watchdog: Config option to enable new nmi_watchdog · 84e478c6
      Don Zickus 提交于
      These are the bits that enable the new nmi_watchdog and safely
      isolate the old nmi_watchdog.  Only one or the other can run,
      not both at the same time.
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: gorcunov@gmail.com
      Cc: aris@redhat.com
      Cc: peterz@infradead.org
      LKML-Reference: <1265424425-31562-4-git-send-email-dzickus@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      84e478c6
  10. 06 1月, 2010 1 次提交
  11. 02 12月, 2009 1 次提交
  12. 23 11月, 2009 1 次提交
  13. 20 11月, 2009 1 次提交
    • D
      SLOW_WORK: Allow the work items to be viewed through a /proc file · 8fba10a4
      David Howells 提交于
      Allow the executing and queued work items to be viewed through a /proc file
      for debugging purposes.  The contents look something like the following:
      
          THR PID   ITEM ADDR        FL MARK  DESC
          === ===== ================ == ===== ==========
            0  3005 ffff880023f52348  a 952ms FSC: OBJ17d3: LOOK
            1  3006 ffff880024e33668  2 160ms FSC: OBJ17e5 OP60d3b: Write1/Store fl=2
            2  3165 ffff8800296dd180  a 424ms FSC: OBJ17e4: LOOK
            3  4089 ffff8800262c8d78  a 212ms FSC: OBJ17ea: CRTN
            4  4090 ffff88002792bed8  2 388ms FSC: OBJ17e8 OP60d36: Write1/Store fl=2
            5  4092 ffff88002a0ef308  2 388ms FSC: OBJ17e7 OP60d2e: Write1/Store fl=2
            6  4094 ffff88002abaf4b8  2 132ms FSC: OBJ17e2 OP60d4e: Write1/Store fl=2
            7  4095 ffff88002bb188e0  a 388ms FSC: OBJ17e9: CRTN
          vsq     - ffff880023d99668  1 308ms FSC: OBJ17e0 OP60f91: Write1/EnQ fl=2
          vsq     - ffff8800295d1740  1 212ms FSC: OBJ16be OP4d4b6: Write1/EnQ fl=2
          vsq     - ffff880025ba3308  1 160ms FSC: OBJ179a OP58dec: Write1/EnQ fl=2
          vsq     - ffff880024ec83e0  1 160ms FSC: OBJ17ae OP599f2: Write1/EnQ fl=2
          vsq     - ffff880026618e00  1 160ms FSC: OBJ17e6 OP60d33: Write1/EnQ fl=2
          vsq     - ffff880025a2a4b8  1 132ms FSC: OBJ16a2 OP4d583: Write1/EnQ fl=2
          vsq     - ffff880023cbe6d8  9 212ms FSC: OBJ17eb: LOOK
          vsq     - ffff880024d37590  9 212ms FSC: OBJ17ec: LOOK
          vsq     - ffff880027746cb0  9 212ms FSC: OBJ17ed: LOOK
          vsq     - ffff880024d37ae8  9 212ms FSC: OBJ17ee: LOOK
          vsq     - ffff880024d37cb0  9 212ms FSC: OBJ17ef: LOOK
          vsq     - ffff880025036550  9 212ms FSC: OBJ17f0: LOOK
          vsq     - ffff8800250368e0  9 212ms FSC: OBJ17f1: LOOK
          vsq     - ffff880025036aa8  9 212ms FSC: OBJ17f2: LOOK
      
      In the 'THR' column, executing items show the thread they're occupying and
      queued threads indicate which queue they're on.  'PID' shows the process ID of
      a slow-work thread that's executing something.  'FL' shows the work item flags.
      'MARK' indicates how long since an item was queued or began executing.  Lastly,
      the 'DESC' column permits the owner of an item to give some information.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      8fba10a4
  14. 06 11月, 2009 1 次提交
  15. 26 10月, 2009 2 次提交
    • P
      rcu: "Tiny RCU", The Bloatwatch Edition · 9b1d82fa
      Paul E. McKenney 提交于
      This patch is a version of RCU designed for !SMP provided for a
      small-footprint RCU implementation.  In particular, the
      implementation of synchronize_rcu() is extremely lightweight and
      high performance. It passes rcutorture testing in each of the
      four relevant configurations (combinations of NO_HZ and PREEMPT)
      on x86.  This saves about 1K bytes compared to old Classic RCU
      (which is no longer in mainline), and more than three kilobytes
      compared to Hierarchical RCU (updated to 2.6.30):
      
      	CONFIG_TREE_RCU:
      
      	   text	   data	    bss	    dec	    filename
      	    183       4       0     187     kernel/rcupdate.o
      	   2783     520      36    3339     kernel/rcutree.o
      				   3526 Total (vs 4565 for v7)
      
      	CONFIG_TREE_PREEMPT_RCU:
      
      	   text	   data	    bss	    dec	    filename
      	    263       4       0     267     kernel/rcupdate.o
      	   4594     776      52    5422     kernel/rcutree.o
      	   			   5689 Total (6155 for v7)
      
      	CONFIG_TINY_RCU:
      
      	   text	   data	    bss	    dec	    filename
      	     96       4       0     100     kernel/rcupdate.o
      	    734      24       0     758     kernel/rcutiny.o
      	    			    858 Total (vs 848 for v7)
      
      The above is for x86.  Your mileage may vary on other platforms.
      Further compression is possible, but is being procrastinated.
      
      Changes from v7 (http://lkml.org/lkml/2009/10/9/388)
      
      o	Apply Lai Jiangshan's review comments (aside from
      might_sleep() 	in synchronize_sched(), which is covered by SMP builds).
      
      o	Fix up expedited primitives.
      
      Changes from v6 (http://lkml.org/lkml/2009/9/23/293).
      
      o	Forward ported to put it into the 2.6.33 stream.
      
      o	Added lockdep support.
      
      o	Make lightweight rcu_barrier.
      
      Changes from v5 (http://lkml.org/lkml/2009/6/23/12).
      
      o	Ported to latest pre-2.6.32 merge window kernel.
      
      	- Renamed rcu_qsctr_inc() to rcu_sched_qs().
      	- Renamed rcu_bh_qsctr_inc() to rcu_bh_qs().
      	- Provided trivial rcu_cpu_notify().
      	- Provided trivial exit_rcu().
      	- Provided trivial rcu_needs_cpu().
      	- Fixed up the rcu_*_enter/exit() functions in linux/hardirq.h.
      
      o	Removed the dependence on EMBEDDED, with a view to making
      	TINY_RCU default for !SMP at some time in the future.
      
      o	Added (trivial) support for expedited grace periods.
      
      Changes from v4 (http://lkml.org/lkml/2009/5/2/91) include:
      
      o	Squeeze the size down a bit further by removing the
      	->completed field from struct rcu_ctrlblk.
      
      o	This permits synchronize_rcu() to become the empty function.
      	Previous concerns about rcutorture were unfounded, as
      	rcutorture correctly handles a constant value from
      	rcu_batches_completed() and rcu_batches_completed_bh().
      
      Changes from v3 (http://lkml.org/lkml/2009/3/29/221) include:
      
      o	Changed rcu_batches_completed(), rcu_batches_completed_bh()
      	rcu_enter_nohz(), rcu_exit_nohz(), rcu_nmi_enter(), and
      	rcu_nmi_exit(), to be static inlines, as suggested by David
      	Howells.  Doing this saves about 100 bytes from rcutiny.o.
      	(The numbers between v3 and this v4 of the patch are not directly
      	comparable, since they are against different versions of Linux.)
      
      Changes from v2 (http://lkml.org/lkml/2009/2/3/333) include:
      
      o	Fix whitespace issues.
      
      o	Change short-circuit "||" operator to instead be "+" in order
      to 	fix performance bug noted by "kraai" on LWN.
      
      		(http://lwn.net/Articles/324348/)
      
      Changes from v1 (http://lkml.org/lkml/2009/1/13/440) include:
      
      o	This version depends on EMBEDDED as well as !SMP, as suggested
      	by Ingo.
      
      o	Updated rcu_needs_cpu() to unconditionally return zero,
      	permitting the CPU to enter dynticks-idle mode at any time.
      	This works because callbacks can be invoked upon entry to
      	dynticks-idle mode.
      
      o	Paul is now OK with this being included, based on a poll at
      the 	Kernel Miniconf at linux.conf.au, where about ten people said
      	that they cared about saving 900 bytes on single-CPU systems.
      
      o	Applies to both mainline and tip/core/rcu.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NJosh Triplett <josh@joshtriplett.org>
      Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: avi@redhat.com
      Cc: mtosatti@redhat.com
      LKML-Reference: <12565226351355-git-send-email->
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9b1d82fa
    • A
      x86: Fix user return notifier build · 7a041097
      Avi Kivity 提交于
      When CONFIG_USER_RETURN_NOTIFIER is set, we need to link
      kernel/user-return-notifier.o.
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      LKML-Reference: <1256473485-23109-1-git-send-email-avi@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7a041097
  16. 24 9月, 2009 1 次提交
  17. 21 9月, 2009 1 次提交
    • I
      perf: Do the big rename: Performance Counters -> Performance Events · cdd6c482
      Ingo Molnar 提交于
      Bye-bye Performance Counters, welcome Performance Events!
      
      In the past few months the perfcounters subsystem has grown out its
      initial role of counting hardware events, and has become (and is
      becoming) a much broader generic event enumeration, reporting, logging,
      monitoring, analysis facility.
      
      Naming its core object 'perf_counter' and naming the subsystem
      'perfcounters' has become more and more of a misnomer. With pending
      code like hw-breakpoints support the 'counter' name is less and
      less appropriate.
      
      All in one, we've decided to rename the subsystem to 'performance
      events' and to propagate this rename through all fields, variables
      and API names. (in an ABI compatible fashion)
      
      The word 'event' is also a bit shorter than 'counter' - which makes
      it slightly more convenient to write/handle as well.
      
      Thanks goes to Stephane Eranian who first observed this misnomer and
      suggested a rename.
      
      User-space tooling and ABI compatibility is not affected - this patch
      should be function-invariant. (Also, defconfigs were not touched to
      keep the size down.)
      
      This patch has been generated via the following script:
      
        FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
      
        sed -i \
          -e 's/PERF_EVENT_/PERF_RECORD_/g' \
          -e 's/PERF_COUNTER/PERF_EVENT/g' \
          -e 's/perf_counter/perf_event/g' \
          -e 's/nb_counters/nb_events/g' \
          -e 's/swcounter/swevent/g' \
          -e 's/tpcounter_event/tp_event/g' \
          $FILES
      
        for N in $(find . -name perf_counter.[ch]); do
          M=$(echo $N | sed 's/perf_counter/perf_event/g')
          mv $N $M
        done
      
        FILES=$(find . -name perf_event.*)
      
        sed -i \
          -e 's/COUNTER_MASK/REG_MASK/g' \
          -e 's/COUNTER/EVENT/g' \
          -e 's/\<event\>/event_id/g' \
          -e 's/counter/event/g' \
          -e 's/Counter/Event/g' \
          $FILES
      
      ... to keep it as correct as possible. This script can also be
      used by anyone who has pending perfcounters patches - it converts
      a Linux kernel tree over to the new naming. We tried to time this
      change to the point in time where the amount of pending patches
      is the smallest: the end of the merge window.
      
      Namespace clashes were fixed up in a preparatory patch - and some
      stylistic fallout will be fixed up in a subsequent patch.
      
      ( NOTE: 'counters' are still the proper terminology when we deal
        with hardware registers - and these sed scripts are a bit
        over-eager in renaming them. I've undone some of that, but
        in case there's something left where 'counter' would be
        better than 'event' we can undo that on an individual basis
        instead of touching an otherwise nicely automated patch. )
      Suggested-by: NStephane Eranian <eranian@google.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Reviewed-by: NArjan van de Ven <arjan@linux.intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cdd6c482
  18. 19 9月, 2009 1 次提交
  19. 16 9月, 2009 1 次提交
  20. 23 8月, 2009 2 次提交
    • P
      rcu: Remove CONFIG_PREEMPT_RCU · 6b3ef48a
      Paul E. McKenney 提交于
      Now that CONFIG_TREE_PREEMPT_RCU is in place, there is no
      further need for CONFIG_PREEMPT_RCU.  Remove it, along with
      whatever subtle bugs it may (or may not) contain.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josht@linux.vnet.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      LKML-Reference: <125097461396-git-send-email->
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6b3ef48a
    • P
      rcu: Merge preemptable-RCU functionality into hierarchical RCU · f41d911f
      Paul E. McKenney 提交于
      Create a kernel/rcutree_plugin.h file that contains definitions
      for preemptable RCU (or, under the #else branch of the #ifdef,
      empty definitions for the classic non-preemptable semantics).
      These definitions fit into plugins defined in kernel/rcutree.c
      for this purpose.
      
      This variant of preemptable RCU uses a new algorithm whose
      read-side expense is roughly that of classic hierarchical RCU
      under CONFIG_PREEMPT. This new algorithm's update-side expense
      is similar to that of classic hierarchical RCU, and, in absence
      of read-side preemption or blocking, is exactly that of classic
      hierarchical RCU.  Perhaps more important, this new algorithm
      has a much simpler implementation, saving well over 1,000 lines
      of code compared to mainline's implementation of preemptable
      RCU, which will hopefully be retired in favor of this new
      algorithm.
      
      The simplifications are obtained by maintaining per-task
      nesting state for running tasks, and using a simple
      lock-protected algorithm to handle accounting when tasks block
      within RCU read-side critical sections, making use of lessons
      learned while creating numerous user-level RCU implementations
      over the past 18 months.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josht@linux.vnet.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      LKML-Reference: <12509746134003-git-send-email->
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f41d911f
  21. 19 8月, 2009 1 次提交
  22. 25 6月, 2009 2 次提交
  23. 24 6月, 2009 2 次提交
    • P
      rcu: Remove Classic RCU · c17ef453
      Paul E. McKenney 提交于
      Remove Classic RCU, given that the combination of Tree RCU and
      the proposed Bloatwatch RCU do everything that Classic RCU can
      with fewer bugs.
      
      Tree RCU has been default in x86 builds for almost six months,
      and seems to be quite reliable, so there does not seem to be
      much justification for keeping the Classic RCU code and config
      complexity around anymore.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.ibm.com>
      Cc: akpm@linux-foundation.org
      Cc: niv@us.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: dipankar@in.ibm.com
      Cc: dhowells@redhat.com
      Cc: lethal@linux-sh.org
      Cc: kernel@wantstofly.org
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c17ef453
    • E
      audit: seperate audit inode watches into a subfile · cfcad62c
      Eric Paris 提交于
      In preparation for converting audit to use fsnotify instead of inotify we
      seperate the inode watching code into it's own file.  This is similar to
      how the audit tree watching code is already seperated into audit_tree.c
      Signed-off-by: NEric Paris <eparis@redhat.com>
      cfcad62c
  24. 19 6月, 2009 1 次提交
    • P
      gcov: add gcov profiling infrastructure · 2521f2c2
      Peter Oberparleiter 提交于
      Enable the use of GCC's coverage testing tool gcov [1] with the Linux
      kernel.  gcov may be useful for:
      
       * debugging (has this code been reached at all?)
       * test improvement (how do I change my test to cover these lines?)
       * minimizing kernel configurations (do I need this option if the
         associated code is never run?)
      
      The profiling patch incorporates the following changes:
      
       * change kbuild to include profiling flags
       * provide functions needed by profiling code
       * present profiling data as files in debugfs
      
      Note that on some architectures, enabling gcc's profiling option
      "-fprofile-arcs" for the entire kernel may trigger compile/link/
      run-time problems, some of which are caused by toolchain bugs and
      others which require adjustment of architecture code.
      
      For this reason profiling the entire kernel is initially restricted
      to those architectures for which it is known to work without changes.
      This restriction can be lifted once an architecture has been tested
      and found compatible with gcc's profiling. Profiling of single files
      or directories is still available on all platforms (see config help
      text).
      
      [1] http://gcc.gnu.org/onlinedocs/gcc/Gcov.htmlSigned-off-by: NPeter Oberparleiter <oberpar@linux.vnet.ibm.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Li Wei <W.Li@Sun.COM>
      Cc: Michael Ellerman <michaele@au1.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Heiko Carstens <heicars2@linux.vnet.ibm.com>
      Cc: Martin Schwidefsky <mschwid2@linux.vnet.ibm.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: WANG Cong <xiyou.wangcong@gmail.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2521f2c2
  25. 17 6月, 2009 1 次提交
  26. 03 6月, 2009 1 次提交
  27. 15 4月, 2009 1 次提交
    • I
      tracing: make the trace clocks available generally · 56449f43
      Ingo Molnar 提交于
      Jeremy Fitzhardinge reported this build failure:
      
       LD	 .tmp_vmlinux1
       arch/x86/kernel/built-in.o: In function `ds_take_timestamp':
       git/linux/arch/x86/kernel/ds.c:1380: undefined reference to `trace_clock_global'
       git/linux/arch/x86/kernel/ds.c:1380: undefined reference to `trace_clock_global'
      
      Which is due to !CONFIG_TRACING && CONFIG_X86_DS=y.
      
      Expose the trace clock code to CONFIG_X86_DS as well.
      
      [ Unfortunately librarizing doesnt work well - ancient architectures
        with no raw_local_irq_save() primitive break the build. ]
      Reported-by: NJeremy Fitzhardinge <jeremy@goop.org>
      LKML-Reference: <49E4413F.7070700@goop.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      56449f43
  28. 03 4月, 2009 1 次提交
  29. 22 2月, 2009 1 次提交
  30. 19 2月, 2009 1 次提交
  31. 16 1月, 2009 1 次提交
  32. 15 1月, 2009 1 次提交
  33. 11 1月, 2009 1 次提交
  34. 08 1月, 2009 1 次提交
    • A
      async: Asynchronous function calls to speed up kernel boot · 22a9d645
      Arjan van de Ven 提交于
      Right now, most of the kernel boot is strictly synchronous, such that
      various hardware delays are done sequentially.
      
      In order to make the kernel boot faster, this patch introduces
      infrastructure to allow doing some of the initialization steps
      asynchronously, which will hide significant portions of the hardware delays
      in practice.
      
      In order to not change device order and other similar observables, this
      patch does NOT do full parallel initialization.
      
      Rather, it operates more in the way an out of order CPU does; the work may
      be done out of order and asynchronous, but the observable effects
      (instruction retiring for the CPU) are still done in the original sequence.
      Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
      22a9d645
  35. 19 12月, 2008 1 次提交
    • P
      "Tree RCU": scalable classic RCU implementation · 64db4cff
      Paul E. McKenney 提交于
      This patch fixes a long-standing performance bug in classic RCU that
      results in massive internal-to-RCU lock contention on systems with
      more than a few hundred CPUs.  Although this patch creates a separate
      flavor of RCU for ease of review and patch maintenance, it is intended
      to replace classic RCU.
      
      This patch still handles stress better than does mainline, so I am still
      calling it ready for inclusion.  This patch is against the -tip tree.
      Nevertheless, experience on an actual 1000+ CPU machine would still be
      most welcome.
      
      Most of the changes noted below were found while creating an rcutiny
      (which should permit ejecting the current rcuclassic) and while doing
      detailed line-by-line documentation.
      
      Updates from v9 (http://lkml.org/lkml/2008/12/2/334):
      
      o	Fixes from remainder of line-by-line code walkthrough,
      	including comment spelling, initialization, undesirable
      	narrowing due to type conversion, removing redundant memory
      	barriers, removing redundant local-variable initialization,
      	and removing redundant local variables.
      
      	I do not believe that any of these fixes address the CPU-hotplug
      	issues that Andi Kleen was seeing, but please do give it a whirl
      	in case the machine is smarter than I am.
      
      	A writeup from the walkthrough may be found at the following
      	URL, in case you are suffering from terminal insomnia or
      	masochism:
      
      	http://www.kernel.org/pub/linux/kernel/people/paulmck/tmp/rcutree-walkthrough.2008.12.16a.pdf
      
      o	Made rcutree tracing use seq_file, as suggested some time
      	ago by Lai Jiangshan.
      
      o	Added a .csv variant of the rcudata debugfs trace file, to allow
      	people having thousands of CPUs to drop the data into
      	a spreadsheet.	Tested with oocalc and gnumeric.  Updated
      	documentation to suit.
      
      Updates from v8 (http://lkml.org/lkml/2008/11/15/139):
      
      o	Fix a theoretical race between grace-period initialization and
      	force_quiescent_state() that could occur if more than three
      	jiffies were required to carry out the grace-period
      	initialization.  Which it might, if you had enough CPUs.
      
      o	Apply Ingo's printk-standardization patch.
      
      o	Substitute local variables for repeated accesses to global
      	variables.
      
      o	Fix comment misspellings and redundant (but harmless) increments
      	of ->n_rcu_pending (this latter after having explicitly added it).
      
      o	Apply checkpatch fixes.
      
      Updates from v7 (http://lkml.org/lkml/2008/10/10/291):
      
      o	Fixed a number of problems noted by Gautham Shenoy, including
      	the cpu-stall-detection bug that he was having difficulty
      	convincing me was real.  ;-)
      
      o	Changed cpu-stall detection to wait for ten seconds rather than
      	three in order to reduce false positive, as suggested by Ingo
      	Molnar.
      
      o	Produced a design document (http://lwn.net/Articles/305782/).
      	The act of writing this document uncovered a number of both
      	theoretical and "here and now" bugs as noted below.
      
      o	Fix dynticks_nesting accounting confusion, simplify WARN_ON()
      	condition, fix kerneldoc comments, and add memory barriers
      	in dynticks interface functions.
      
      o	Add more data to tracing.
      
      o	Remove unused "rcu_barrier" field from rcu_data structure.
      
      o	Count calls to rcu_pending() from scheduling-clock interrupt
      	to use as a surrogate timebase should jiffies stop counting.
      
      o	Fix a theoretical race between force_quiescent_state() and
      	grace-period initialization.  Yes, initialization does have to
      	go on for some jiffies for this race to occur, but given enough
      	CPUs...
      
      Updates from v6 (http://lkml.org/lkml/2008/9/23/448):
      
      o	Fix a number of checkpatch.pl complaints.
      
      o	Apply review comments from Ingo Molnar and Lai Jiangshan
      	on the stall-detection code.
      
      o	Fix several bugs in !CONFIG_SMP builds.
      
      o	Fix a misspelled config-parameter name so that RCU now announces
      	at boot time if stall detection is configured.
      
      o	Run tests on numerous combinations of configurations parameters,
      	which after the fixes above, now build and run correctly.
      
      Updates from v5 (http://lkml.org/lkml/2008/9/15/92, bad subject line):
      
      o	Fix a compiler error in the !CONFIG_FANOUT_EXACT case (blew a
      	changeset some time ago, and finally got around to retesting
      	this option).
      
      o	Fix some tracing bugs in rcupreempt that caused incorrect
      	totals to be printed.
      
      o	I now test with a more brutal random-selection online/offline
      	script (attached).  Probably more brutal than it needs to be
      	on the people reading it as well, but so it goes.
      
      o	A number of optimizations and usability improvements:
      
      	o	Make rcu_pending() ignore the grace-period timeout when
      		there is no grace period in progress.
      
      	o	Make force_quiescent_state() avoid going for a global
      		lock in the case where there is no grace period in
      		progress.
      
      	o	Rearrange struct fields to improve struct layout.
      
      	o	Make call_rcu() initiate a grace period if RCU was
      		idle, rather than waiting for the next scheduling
      		clock interrupt.
      
      	o	Invoke rcu_irq_enter() and rcu_irq_exit() only when
      		idle, as suggested by Andi Kleen.  I still don't
      		completely trust this change, and might back it out.
      
      	o	Make CONFIG_RCU_TRACE be the single config variable
      		manipulated for all forms of RCU, instead of the prior
      		confusion.
      
      	o	Document tracing files and formats for both rcupreempt
      		and rcutree.
      
      Updates from v4 for those missing v5 given its bad subject line:
      
      o	Separated dynticks interface so that NMIs and irqs call separate
      	functions, greatly simplifying it.  In particular, this code
      	no longer requires a proof of correctness.  ;-)
      
      o	Separated dynticks state out into its own per-CPU structure,
      	avoiding the duplicated accounting.
      
      o	The case where a dynticks-idle CPU runs an irq handler that
      	invokes call_rcu() is now correctly handled, forcing that CPU
      	out of dynticks-idle mode.
      
      o	Review comments have been applied (thank you all!!!).
      	For but one example, fixed the dynticks-ordering issue that
      	Manfred pointed out, saving me much debugging.  ;-)
      
      o	Adjusted rcuclassic and rcupreempt to handle dynticks changes.
      
      Attached is an updated patch to Classic RCU that applies a hierarchy,
      greatly reducing the contention on the top-level lock for large machines.
      This passes 10-hour concurrent rcutorture and online-offline testing on
      128-CPU ppc64 without dynticks enabled, and exposes some timekeeping
      bugs in presence of dynticks (exciting working on a system where
      "sleep 1" hangs until interrupted...), which were fixed in the
      2.6.27 kernel.  It is getting more reliable than mainline by some
      measures, so the next version will be against -tip for inclusion.
      See also Manfred Spraul's recent patches (or his earlier work from
      2004 at http://marc.info/?l=linux-kernel&m=108546384711797&w=2).
      We will converge onto a common patch in the fullness of time, but are
      currently exploring different regions of the design space.  That said,
      I have already gratefully stolen quite a few of Manfred's ideas.
      
      This patch provides CONFIG_RCU_FANOUT, which controls the bushiness
      of the RCU hierarchy.  Defaults to 32 on 32-bit machines and 64 on
      64-bit machines.  If CONFIG_NR_CPUS is less than CONFIG_RCU_FANOUT,
      there is no hierarchy.  By default, the RCU initialization code will
      adjust CONFIG_RCU_FANOUT to balance the hierarchy, so strongly NUMA
      architectures may choose to set CONFIG_RCU_FANOUT_EXACT to disable
      this balancing, allowing the hierarchy to be exactly aligned to the
      underlying hardware.  Up to two levels of hierarchy are permitted
      (in addition to the root node), allowing up to 16,384 CPUs on 32-bit
      systems and up to 262,144 CPUs on 64-bit systems.  I just know that I
      am going to regret saying this, but this seems more than sufficient
      for the foreseeable future.  (Some architectures might wish to set
      CONFIG_RCU_FANOUT=4, which would limit such architectures to 64 CPUs.
      If this becomes a real problem, additional levels can be added, but I
      doubt that it will make a significant difference on real hardware.)
      
      In the common case, a given CPU will manipulate its private rcu_data
      structure and the rcu_node structure that it shares with its immediate
      neighbors.  This can reduce both lock and memory contention by multiple
      orders of magnitude, which should eliminate the need for the strange
      manipulations that are reported to be required when running Linux on
      very large systems.
      
      Some shortcomings:
      
      o	More bugs will probably surface as a result of an ongoing
      	line-by-line code inspection.
      
      	Patches will be provided as required.
      
      o	There are probably hangs, rcutorture failures, &c.  Seems
      	quite stable on a 128-CPU machine, but that is kind of small
      	compared to 4096 CPUs.  However, seems to do better than
      	mainline.
      
      	Patches will be provided as required.
      
      o	The memory footprint of this version is several KB larger
      	than rcuclassic.
      
      	A separate UP-only rcutiny patch will be provided, which will
      	reduce the memory footprint significantly, even compared
      	to the old rcuclassic.  One such patch passes light testing,
      	and has a memory footprint smaller even than rcuclassic.
      	Initial reaction from various embedded guys was "it is not
      	worth it", so am putting it aside.
      
      Credits:
      
      o	Manfred Spraul for ideas, review comments, and bugs spotted,
      	as well as some good friendly competition.  ;-)
      
      o	Josh Triplett, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers,
      	Lai Jiangshan, Andi Kleen, Andy Whitcroft, and Andrew Morton
      	for reviews and comments.
      
      o	Thomas Gleixner for much-needed help with some timer issues
      	(see patches below).
      
      o	Jon M. Tollefson, Tim Pepper, Andrew Theurer, Jose R. Santos,
      	Andy Whitcroft, Darrick Wong, Nishanth Aravamudan, Anton
      	Blanchard, Dave Kleikamp, and Nathan Lynch for keeping machines
      	alive despite my heavy abuse^Wtesting.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      64db4cff
  36. 08 12月, 2008 1 次提交
    • T
      performance counters: core code · 0793a61d
      Thomas Gleixner 提交于
      Implement the core kernel bits of Performance Counters subsystem.
      
      The Linux Performance Counter subsystem provides an abstraction of
      performance counter hardware capabilities. It provides per task and per
      CPU counters, and it provides event capabilities on top of those.
      
      Performance counters are accessed via special file descriptors.
      There's one file descriptor per virtual counter used.
      
      The special file descriptor is opened via the perf_counter_open()
      system call:
      
       int
       perf_counter_open(u32 hw_event_type,
                         u32 hw_event_period,
                         u32 record_type,
                         pid_t pid,
                         int cpu);
      
      The syscall returns the new fd. The fd can be used via the normal
      VFS system calls: read() can be used to read the counter, fcntl()
      can be used to set the blocking mode, etc.
      
      Multiple counters can be kept open at a time, and the counters
      can be poll()ed.
      
      See more details in Documentation/perf-counters.txt.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0793a61d