1. 24 Nov 2009 (2 commits)
    • sched: Optimize branch hint in context_switch() · 710390d9
      Committed by Tim Blechmann
      Branch hint profiling on my Nehalem machine showed over 90%
      incorrect branch hints:
      
        10420275 170645395  94 context_switch                 sched.c
         3043
        10408421 171098521  94 context_switch                 sched.c
         3050
      Signed-off-by: Tim Blechmann <tim@klingt.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0BBB9F.6080304@klingt.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      710390d9
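      For readers unfamiliar with branch hints: the kernel's likely()/unlikely()
      macros wrap GCC's __builtin_expect(), and a hint that contradicts the
      measured behaviour (as in the >90% figure above) makes the compiler lay
      out the hot path as if it were cold. The toy program below only
      illustrates that mechanism; the function, the data, and the assumption
      that the offending hints were unlikely() annotations are ours, not
      quoted from the patch.

        #include <stdio.h>

        #define likely(x)   __builtin_expect(!!(x), 1)
        #define unlikely(x) __builtin_expect(!!(x), 0)

        /* Counts "kernel threads" (entries without an mm), the kind of check
         * context_switch() performs on every switch. */
        static long count_kernel_threads(const int *has_mm, int n)
        {
                long kthreads = 0;

                for (int i = 0; i < n; i++) {
                        /* If !has_mm[i] is in fact the common case, this hint
                         * is wrong; the fix is simply `if (!has_mm[i])`. */
                        if (unlikely(!has_mm[i]))
                                kthreads++;
                }
                return kthreads;
        }

        int main(void)
        {
                int has_mm[8] = { 0, 0, 1, 0, 0, 0, 1, 0 };

                printf("kernel threads: %ld\n", count_kernel_threads(has_mm, 8));
                return 0;
        }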
    • sched_feat_write(): Update ppos instead of file->f_pos · 42994724
      Committed by Jan Blunck
      sched_feat_write() should update ppos instead of file->f_pos.
      
      (This reduces some BKL dependencies of this code.)
      Signed-off-by: Jan Blunck <jblunck@suse.de>
      Cc: jkacur@redhat.com
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jamie Lokier <jamie@shareable.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      LKML-Reference: <1258735245-25826-8-git-send-email-jblunck@suse.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      42994724
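      The shape of the fix, as a reconstructed sketch rather than a quote of
      the patch: a .write handler is handed ppos by the VFS and should advance
      that, instead of reaching into file->f_pos (which is what the BKL note
      above refers to). The parsing details are elided here.

        /* Sketch only: the handler advances the ppos it was given. */
        static ssize_t
        sched_feat_write(struct file *filp, const char __user *ubuf,
                         size_t cnt, loff_t *ppos)
        {
                char buf[64];

                if (cnt > sizeof(buf) - 1)
                        cnt = sizeof(buf) - 1;
                if (copy_from_user(buf, ubuf, cnt))
                        return -EFAULT;
                buf[cnt] = 0;

                /* ... parse buf and enable/disable the named feature ... */

                *ppos += cnt;           /* previously: filp->f_pos += cnt; */
                return cnt;
        }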
  2. 12 Nov 2009 (1 commit)
    • sched: Fix granularity of task_u/stime() · 761b1d26
      Committed by Hidetoshi Seto
      Originally, task_s/utime() were designed to return clock_t, but
      they were later changed to return cputime_t by the following commit:
      
        commit efe567fc
        Author: Christian Borntraeger <borntraeger@de.ibm.com>
        Date:   Thu Aug 23 15:18:02 2007 +0200
      
      It only changed the type of the return value, not the
      implementation. As a result, the granularity of task_s/utime()
      is still that of clock_t, not that of cputime_t.

      So using task_s/utime() in __exit_signal() causes the values
      accumulated into the signal struct to be rounded and
      coarse-grained.
      
      This patch removes the casts to clock_t in task_u/stime(), to keep
      the granularity of cputime_t throughout the calculation.
      
      v2:
        Use div_u64() to avoid error "undefined reference to `__udivdi3`"
        on some 32bit systems.
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: xiyou.wangcong@gmail.com
      Cc: Spencer Candland <spencer@bluehost.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      LKML-Reference: <4AFB9029.9000208@jp.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      761b1d26
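      To make the granularity point concrete, here is a self-contained toy
      (made-up numbers and a userspace stand-in for div_u64(); the real code
      scales p->se.sum_exec_runtime, not these constants): rounding to coarse
      clock_t-sized ticks before scaling throws away precision that doing the
      arithmetic at full resolution, with one 64-bit division at the end,
      preserves.

        #include <stdint.h>
        #include <stdio.h>

        /* Userspace stand-in for the kernel's div_u64(), which exists so that
         * 64-bit divisions don't emit __udivdi3 calls on 32-bit targets. */
        static uint64_t div_u64(uint64_t dividend, uint32_t divisor)
        {
                return dividend / divisor;
        }

        int main(void)
        {
                const uint64_t tick_ns = 10000000ULL;   /* a 10 ms "clock_t" tick */
                uint64_t rtime_ns = 1234567890ULL;      /* precise total runtime */
                uint64_t utime_ns =  900000123ULL;      /* sampled user time */
                uint64_t stime_ns =  334567767ULL;      /* sampled system time */
                uint64_t total_ns = utime_ns + stime_ns;

                /* coarse: round to ticks first, then scale (the old behaviour) */
                uint64_t coarse = div_u64((rtime_ns / tick_ns) * (utime_ns / tick_ns),
                                          (uint32_t)(total_ns / tick_ns)) * tick_ns;

                /* fine: scale at full resolution, divide once (the fix, in spirit) */
                uint64_t fine = div_u64(rtime_ns * utime_ns, (uint32_t)total_ns);

                printf("coarse utime: %llu ns\n", (unsigned long long)coarse);
                printf("fine   utime: %llu ns\n", (unsigned long long)fine);
                return 0;
        }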
  3. 11 Nov 2009 (1 commit)
  4. 10 Nov 2009 (1 commit)
  5. 08 Nov 2009 (1 commit)
  6. 05 Nov 2009 (1 commit)
  7. 04 Nov 2009 (2 commits)
  8. 26 Oct 2009 (1 commit)
  9. 12 Oct 2009 (1 commit)
  10. 09 Oct 2009 (2 commits)
  11. 06 Oct 2009 (1 commit)
  12. 05 Oct 2009 (1 commit)
  13. 02 Oct 2009 (1 commit)
  14. 24 Sep 2009 (2 commits)
  15. 22 Sep 2009 (1 commit)
    • cpuidle: fix the menu governor to boost IO performance · 69d25870
      Committed by Arjan van de Ven
      Fix the menu idle governor which balances power savings, energy efficiency
      and performance impact.
      
      The reason for a reworked governor is that there have been serious
      performance issues reported with the existing code on Nehalem server
      systems.
      
      To show this, here are the benchmark results (which I'm sure Andrew
      wants to see); the benchmark is "fio", and "no cstates" means
      running with "idle=poll":

                  no cstates      current linux   new algorithm
      1 disk      107 Mb/s        85 Mb/s         105 Mb/s
      2 disks     215 Mb/s        123 Mb/s        209 Mb/s
      12 disks    590 Mb/s        320 Mb/s        585 Mb/s
      
      In various power benchmark measurements, no degradation was found by
      our measurement & diagnostics team. Obviously a small percentage more
      power was used in the "fio" benchmark, due to the much higher
      performance.
      
      While it would be a novel idea to describe the new algorithm in this
      commit message, I cheaped out and described it in comments in the code
      instead.
      
      [changes since first post: spelling fixes from akpm, review feedback,
      folded menu-tng into menu.c]
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      69d25870
  16. 21 Sep 2009 (4 commits)
    • perf: Do the big rename: Performance Counters -> Performance Events · cdd6c482
      Committed by Ingo Molnar
      Bye-bye Performance Counters, welcome Performance Events!
      
      In the past few months the perfcounters subsystem has grown out of
      its initial role of counting hardware events, and has become (and
      is becoming) a much broader generic event enumeration, reporting,
      logging, monitoring and analysis facility.
      
      Naming its core object 'perf_counter' and naming the subsystem
      'perfcounters' has become more and more of a misnomer. With pending
      code like hw-breakpoints support the 'counter' name is less and
      less appropriate.
      
      All in all, we've decided to rename the subsystem to 'performance
      events' and to propagate this rename through all fields, variables
      and API names (in an ABI-compatible fashion).
      
      The word 'event' is also a bit shorter than 'counter' - which makes
      it slightly more convenient to write/handle as well.
      
      Thanks goes to Stephane Eranian who first observed this misnomer and
      suggested a rename.
      
      User-space tooling and ABI compatibility are not affected; this
      patch should be function-invariant. (Also, defconfigs were not
      touched to keep the size down.)
      
      This patch has been generated via the following script:
      
        FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
      
        sed -i \
          -e 's/PERF_EVENT_/PERF_RECORD_/g' \
          -e 's/PERF_COUNTER/PERF_EVENT/g' \
          -e 's/perf_counter/perf_event/g' \
          -e 's/nb_counters/nb_events/g' \
          -e 's/swcounter/swevent/g' \
          -e 's/tpcounter_event/tp_event/g' \
          $FILES
      
        for N in $(find . -name perf_counter.[ch]); do
          M=$(echo $N | sed 's/perf_counter/perf_event/g')
          mv $N $M
        done
      
        FILES=$(find . -name perf_event.*)
      
        sed -i \
          -e 's/COUNTER_MASK/REG_MASK/g' \
          -e 's/COUNTER/EVENT/g' \
          -e 's/\<event\>/event_id/g' \
          -e 's/counter/event/g' \
          -e 's/Counter/Event/g' \
          $FILES
      
      ... to keep it as correct as possible. This script can also be
      used by anyone who has pending perfcounters patches - it converts
      a Linux kernel tree over to the new naming. We tried to time this
      change to the point in time when the number of pending patches
      is smallest: the end of the merge window.
      
      Namespace clashes were fixed up in a preparatory patch - and some
      stylistic fallout will be fixed up in a subsequent patch.
      
      ( NOTE: 'counters' are still the proper terminology when we deal
        with hardware registers - and these sed scripts are a bit
        over-eager in renaming them. I've undone some of that, but
        in case there's something left where 'counter' would be
        better than 'event' we can undo that on an individual basis
        instead of touching an otherwise nicely automated patch. )
      Suggested-by: Stephane Eranian <eranian@google.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      cdd6c482
    • sched: Simplify sys_sched_rr_get_interval() system call · 0d721cea
      Committed by Peter Williams
      Simplify the system call by removing the need for it to know the
      details of scheduling classes.
      
      This allows PlugSched to define orthogonal scheduling classes.
      Signed-off-by: Peter Williams <pwil3058@bigpond.net.au>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <06d1b89ee15a0eef82d7.1253496713@mudlark.pw.nest>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0d721cea
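      The simplification follows a familiar pattern: rather than having the
      syscall inspect each scheduling class, every class exposes a hook that
      reports its own round-robin interval. The standalone example below shows
      that pattern with invented names and values; it is not the kernel code.

        #include <stdio.h>

        struct task;                                    /* opaque to the "syscall" */

        struct sched_class_ops {
                const char *name;
                unsigned int (*get_rr_interval)(const struct task *t);
        };

        struct task {
                const struct sched_class_ops *class;
                unsigned int timeslice_ms;
        };

        static unsigned int rr_interval(const struct task *t)
        {
                return t->timeslice_ms;
        }

        static unsigned int fifo_interval(const struct task *t)
        {
                (void)t;
                return 0;       /* FIFO tasks have no timeslice */
        }

        static const struct sched_class_ops rr_class   = { "rr",   rr_interval };
        static const struct sched_class_ops fifo_class = { "fifo", fifo_interval };

        /* The "syscall": no knowledge of class internals, it just asks the class. */
        static unsigned int sys_get_interval(const struct task *t)
        {
                return t->class->get_rr_interval(t);
        }

        int main(void)
        {
                struct task a = { &rr_class, 100 };
                struct task b = { &fifo_class, 0 };

                printf("%s: %u ms\n", a.class->name, sys_get_interval(&a));
                printf("%s: %u ms\n", b.class->name, sys_get_interval(&b));
                return 0;
        }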
    • sched: Fix potential NULL dereference of doms_cur · cb5fd13f
      Committed by Yong Zhang
      If CONFIG_CPUMASK_OFFSTACK is enabled but the doms_cur allocation
      fails in arch_init_sched_domains(), doms_cur falls back to
      fallback_doms. At that point, however, fallback_doms has not been
      initialized yet.
      Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
      Cc: a.p.zijlstra@chello.nl
      LKML-Reference: <1252930816-7672-1-git-send-email-yong.zhang0@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      cb5fd13f
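      The bug class, illustrated with a standalone toy (names and structure
      are invented; the real objects are cpumask-based sched-domain arrays):
      when the primary allocation fails, the code falls back to a static
      object, and with CONFIG_CPUMASK_OFFSTACK that static object itself
      needs setup before it is safe to use.

        #include <stdio.h>
        #include <stdlib.h>

        #define BITS_PER_WORD (8 * sizeof(unsigned long))

        struct domain_set {
                unsigned long *mask;    /* needs allocation, like an offstack cpumask */
                size_t nbits;
        };

        static struct domain_set fallback_doms;   /* zero-initialized: mask == NULL */

        static struct domain_set *init_domains(size_t nbits)
        {
                struct domain_set *doms = malloc(sizeof(*doms));

                if (doms) {
                        doms->mask = calloc((nbits + BITS_PER_WORD - 1) / BITS_PER_WORD,
                                            sizeof(unsigned long));
                        doms->nbits = nbits;
                        if (doms->mask)
                                return doms;
                        free(doms);
                }

                /* The fix, in spirit: make the fallback usable before handing it
                 * out, instead of returning an object whose mask is still NULL. */
                if (!fallback_doms.mask) {
                        static unsigned long fallback_mask[1];

                        fallback_doms.mask = fallback_mask;
                        fallback_doms.nbits = BITS_PER_WORD;
                }
                return &fallback_doms;
        }

        int main(void)
        {
                struct domain_set *doms = init_domains(128);

                doms->mask[0] |= 1UL;   /* safe on both the normal and fallback path */
                printf("nbits=%zu mask0=%lx\n", doms->nbits, doms->mask[0]);
                return 0;
        }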
    • sched: Fix raciness in runqueue_is_locked() · 89f19f04
      Committed by Andrew Morton
      runqueue_is_locked() is unavoidably racy due to a poor interface design.
      It does
      
      	cpu = get_cpu()
      	ret = some_percpu_thing(cpu);
      	put_cpu(cpu);
      	return ret;
      
      Its return value is unreliable.
      
      Fix.
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <200909191855.n8JItiko022148@imap1.linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      89f19f04
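      A sketch of the interface problem and the usual cure, reconstructed from
      the description above rather than quoted from the diff: the helper should
      take the CPU from its caller, who is responsible for keeping that CPU
      meaningful, rather than picking and immediately un-pinning "the current
      CPU" itself.

        /* before (racy): the answer can be stale as soon as put_cpu() runs */
        int runqueue_is_locked(void)
        {
                int cpu = get_cpu();
                struct rq *rq = cpu_rq(cpu);
                int ret = spin_is_locked(&rq->lock);

                put_cpu();
                return ret;
        }

        /* after (in spirit): the caller names the CPU and owns its stability */
        int runqueue_is_locked(int cpu)
        {
                return spin_is_locked(&cpu_rq(cpu)->lock);
        }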
  17. 17 Sep 2009 (2 commits)
    • sched: Add new wakeup preemption mode: WAKEUP_RUNNING · ad4b78bb
      Committed by Peter Zijlstra
      Create a new wakeup preemption mode that preempts towards tasks
      that run shorter on average. It sets the next buddy to make sure
      we actually run the task we preempted for.
      
      Test results:
      
       root@twins:~# while :; do :; done &
       [1] 6537
       root@twins:~# while :; do :; done &
       [2] 6538
       root@twins:~# while :; do :; done &
       [3] 6539
       root@twins:~# while :; do :; done &
       [4] 6540
      
       root@twins:/home/peter# ./latt -c4 sleep 4
       Entries: 48 (clients=4)
      
       Averages:
       ------------------------------
              Max          4750 usec
              Avg           497 usec
              Stdev         737 usec
      
       root@twins:/home/peter# echo WAKEUP_RUNNING > /debug/sched_features
      
       root@twins:/home/peter# ./latt -c4 sleep 4
       Entries: 48 (clients=4)
      
       Averages:
       ------------------------------
              Max            14 usec
              Avg             5 usec
              Stdev           3 usec
      
      Disabled by default - needs more testing.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      LKML-Reference: <new-submission>
      ad4b78bb
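      The heuristic in a nutshell, as a hedged sketch with invented names (the
      actual implementation lives in the fair-class wakeup-preemption path):
      on wakeup, if the feature is enabled and the woken task's average
      runtime is shorter than the currently running task's, preempt and mark
      the woken task as the next buddy so it really is what runs next.

        #include <stdbool.h>
        #include <stdio.h>

        struct task {
                const char *name;
                unsigned long long avg_running_ns;   /* average run length */
        };

        /* The WAKEUP_RUNNING feature bit; defaults off in the kernel,
         * enabled here only so the demo exercises the new path. */
        static bool wakeup_running_enabled = true;
        static const struct task *next_buddy;       /* "run this one next" marker */

        /* Decide whether the woken task should preempt the current one. */
        static bool should_preempt(const struct task *curr, const struct task *woken)
        {
                if (wakeup_running_enabled &&
                    woken->avg_running_ns < curr->avg_running_ns) {
                        next_buddy = woken;          /* make sure we actually run it */
                        return true;
                }
                return false;
        }

        int main(void)
        {
                struct task spinner = { "while-true loop", 4000000ULL };
                struct task latt    = { "latt client",       50000ULL };

                printf("preempt: %s\n",
                       should_preempt(&spinner, &latt) ? "yes" : "no");
                printf("next buddy: %s\n", next_buddy ? next_buddy->name : "none");
                return 0;
        }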
    • sched: Fix TASK_WAKING & loadaverage breakage · eb24073b
      Committed by Ingo Molnar
      Fix this:
      
      top - 21:54:00 up  2:59,  1 user,  load average: 432512.33, 426421.74, 417432.74
      
      This happens because we now set TASK_WAKING before activate_task().
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      eb24073b
  18. 16 Sep 2009 (2 commits)
  19. 15 Sep 2009 (11 commits)
  20. 04 Sep 2009 (2 commits)
    • sched: Fix dynamic power-balancing crash · d7ea17a7
      Committed by Ingo Molnar
      This crash:
      
      [ 1774.088275] divide error: 0000 [#1] SMP
      [ 1774.100355] CPU 13
      [ 1774.102498] Modules linked in:
      [ 1774.105631] Pid: 30881, comm: hackbench Not tainted 2.6.31-rc8-tip-01308-g484d664-dirty #1629 X8DTN
      [ 1774.114807] RIP: 0010:[<ffffffff81041c38>]  [<ffffffff81041c38>]
      sched_balance_self+0x19b/0x2d4
      
      It triggers because update_group_power() modifies the sd tree and
      does temporary calculations there, not considering that other CPUs
      could observe intermediate values, such as the zero initial value.

      Calculate it in a temporary variable instead. (No memory barrier
      is needed, as these are all statistical values anyway.)
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20090904092742.GA11014@elte.hu>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      d7ea17a7
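      The fix pattern, shown with a standalone toy (the threading, names and
      numbers are ours, not the scheduler's): do the arithmetic in a local
      variable and publish the result with a single store, so a concurrent
      reader can never divide by a half-updated (for example zero) value.

        #include <pthread.h>
        #include <stdatomic.h>
        #include <stdio.h>

        static _Atomic unsigned long group_power = 1024;

        void *reader(void *arg)
        {
                (void)arg;
                for (int i = 0; i < 1000000; i++) {
                        /* a divide error if an intermediate 0 ever became visible */
                        volatile unsigned long x = 1000000UL / atomic_load(&group_power);
                        (void)x;
                }
                return NULL;
        }

        void update_power_buggy(unsigned long contrib)
        {
                atomic_store(&group_power, 0);            /* visible intermediate state */
                atomic_fetch_add(&group_power, contrib);  /* ...then the real value */
        }

        void update_power_fixed(unsigned long contrib)
        {
                unsigned long tmp = 0;                    /* accumulate privately */

                tmp += contrib;
                atomic_store(&group_power, tmp);          /* single publishing store */
        }

        int main(void)
        {
                pthread_t t;

                pthread_create(&t, NULL, reader, NULL);
                for (int i = 0; i < 100000; i++)
                        update_power_fixed(640);  /* swap in _buggy to risk a divide error */
                pthread_join(t, NULL);
                puts("done");
                return 0;
        }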
    • sched: Remove reciprocal for cpu_power · 18a3885f
      Committed by Peter Zijlstra
      It's a source of failures; also, now that cpu_power is dynamic,
      it's a waste of time.
      
      before:
      <idle>-0   [000]   132.877936: find_busiest_group: avg_load: 0 group_load: 8241 power: 1
      
      after:
      bash-1689  [001]   137.862151: find_busiest_group: avg_load: 10636288 group_load: 10387 power: 1
      
      [ v2: build fix from Andreas Herrmann ]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com>
      Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com>
      Acked-by: Gautham R Shenoy <ego@in.ibm.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      LKML-Reference: <20090901083826.425896304@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      18a3885f
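      For context on what is being removed, here is a standalone sketch of the
      reciprocal-divide trick (constants and the printed checks are ours; the
      kernel's helpers are reciprocal_value()/reciprocal_divide()): division
      by a fixed d is replaced by a multiply-and-shift with a precomputed
      2^32/d. The precomputation only pays off while d stays constant; once
      cpu_power changes at runtime, keeping the reciprocal in sync is pure
      overhead.

        #include <stdint.h>
        #include <stdio.h>

        /* Precompute roughly 2^32 / d once, while d is still considered constant. */
        static uint32_t reciprocal_value(uint32_t d)
        {
                return (uint32_t)(((1ULL << 32) + d - 1) / d);
        }

        /* Divide by multiplying with the precomputed reciprocal and shifting. */
        static uint32_t reciprocal_divide(uint32_t a, uint32_t r)
        {
                return (uint32_t)(((uint64_t)a * r) >> 32);
        }

        int main(void)
        {
                uint32_t cpu_power = 1024;               /* was effectively constant */
                uint32_t r = reciprocal_value(cpu_power);

                printf("12345 / %u = %u (reciprocal: %u)\n",
                       cpu_power, 12345 / cpu_power, reciprocal_divide(12345, r));

                /* Once cpu_power is recomputed dynamically, r is stale and has to be
                 * recomputed too, which defeats the point of the trick. */
                cpu_power = 640;
                r = reciprocal_value(cpu_power);
                printf("12345 / %u = %u (reciprocal: %u)\n",
                       cpu_power, 12345 / cpu_power, reciprocal_divide(12345, r));
                return 0;
        }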