1. 24 5月, 2011 1 次提交
  2. 23 5月, 2011 7 次提交
  3. 21 5月, 2011 3 次提交
  4. 20 5月, 2011 7 次提交
    • N
      sched: Increase SCHED_LOAD_SCALE resolution · c8b28116
      Nikhil Rao 提交于
      Introduce SCHED_LOAD_RESOLUTION, which scales is added to
      SCHED_LOAD_SHIFT and increases the resolution of
      SCHED_LOAD_SCALE. This patch sets the value of
      SCHED_LOAD_RESOLUTION to 10, scaling up the weights for all
      sched entities by a factor of 1024. With this extra resolution,
      we can handle deeper cgroup hiearchies and the scheduler can do
      better shares distribution and load load balancing on larger
      systems (especially for low weight task groups).
      
      This does not change the existing user interface, the scaled
      weights are only used internally. We do not modify
      prio_to_weight values or inverses, but use the original weights
      when calculating the inverse which is used to scale execution
      time delta in calc_delta_mine(). This ensures we do not lose
      accuracy when accounting time to the sched entities. Thanks to
      Nikunj Dadhania for fixing an bug in c_d_m() that broken fairness.
      
      Below is some analysis of the performance costs/improvements of
      this patch.
      
      1. Micro-arch performance costs:
      
      Experiment was to run Ingo's pipe_test_100k 200 times with the
      task pinned to one cpu. I measured instruction, cycles and
      stalled-cycles for the runs. See:
      
         http://thread.gmane.org/gmane.linux.kernel/1129232/focus=1129389
      
      for more info.
      
      -tip (baseline):
      
       Performance counter stats for '/root/load-scale/pipe-test-100k' (200 runs):
      
             964,991,769 instructions             #    0.82  insns per cycle
                                                  #    0.33  stalled cycles per insn
                                                  #    ( +-  0.05% )
           1,171,186,635 cycles                   #    0.000 GHz                      ( +-  0.08% )
             306,373,664 stalled-cycles-backend   #   26.16% backend  cycles idle     ( +-  0.28% )
             314,933,621 stalled-cycles-frontend  #   26.89% frontend cycles idle     ( +-  0.34% )
      
              1.122405684  seconds time elapsed  ( +-  0.05% )
      
      -tip+patches:
      
       Performance counter stats for './load-scale/pipe-test-100k' (200 runs):
      
             963,624,821 instructions             #    0.82  insns per cycle
                                                  #    0.33  stalled cycles per insn
                                                  #    ( +-  0.04% )
           1,175,215,649 cycles                   #    0.000 GHz                      ( +-  0.08% )
             315,321,126 stalled-cycles-backend   #   26.83% backend  cycles idle     ( +-  0.28% )
             316,835,873 stalled-cycles-frontend  #   26.96% frontend cycles idle     ( +-  0.29% )
      
              1.122238659  seconds time elapsed  ( +-  0.06% )
      
      With this patch, instructions decrease by ~0.10% and cycles
      increase by 0.27%. This doesn't look statistically significant.
      The number of stalled cycles in the backend increased from
      26.16% to 26.83%. This can be attributed to the shifts we do in
      c_d_m() and other places. The fraction of stalled cycles in the
      frontend remains about the same, at 26.96% compared to 26.89% in -tip.
      
      2. Balancing low-weight task groups
      
      Test setup: run 50 tasks with random sleep/busy times (biased
      around 100ms) in a low weight container (with cpu.shares = 2).
      Measure %idle as reported by mpstat over a 10s window.
      
      -tip (baseline):
      
      06:47:48 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle    intr/s
      06:47:49 PM  all   94.32    0.00    0.06    0.00    0.00    0.00    0.00    0.00    5.62  15888.00
      06:47:50 PM  all   94.57    0.00    0.62    0.00    0.00    0.00    0.00    0.00    4.81  16180.00
      06:47:51 PM  all   94.69    0.00    0.06    0.00    0.00    0.00    0.00    0.00    5.25  15966.00
      06:47:52 PM  all   95.81    0.00    0.00    0.00    0.00    0.00    0.00    0.00    4.19  16053.00
      06:47:53 PM  all   94.88    0.06    0.00    0.00    0.00    0.00    0.00    0.00    5.06  15984.00
      06:47:54 PM  all   93.31    0.00    0.00    0.00    0.00    0.00    0.00    0.00    6.69  15806.00
      06:47:55 PM  all   94.19    0.00    0.06    0.00    0.00    0.00    0.00    0.00    5.75  15896.00
      06:47:56 PM  all   92.87    0.00    0.00    0.00    0.00    0.00    0.00    0.00    7.13  15716.00
      06:47:57 PM  all   94.88    0.00    0.00    0.00    0.00    0.00    0.00    0.00    5.12  15982.00
      06:47:58 PM  all   95.44    0.00    0.00    0.00    0.00    0.00    0.00    0.00    4.56  16075.00
      Average:     all   94.49    0.01    0.08    0.00    0.00    0.00    0.00    0.00    5.42  15954.60
      
      -tip+patches:
      
      06:47:03 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle    intr/s
      06:47:04 PM  all  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  16630.00
      06:47:05 PM  all   99.69    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.31  16580.20
      06:47:06 PM  all   99.69    0.00    0.06    0.00    0.00    0.00    0.00    0.00    0.25  16596.00
      06:47:07 PM  all   99.20    0.00    0.74    0.00    0.00    0.06    0.00    0.00    0.00  17838.61
      06:47:08 PM  all  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  16540.00
      06:47:09 PM  all  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  16575.00
      06:47:10 PM  all  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  16614.00
      06:47:11 PM  all   99.94    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.06  16588.00
      06:47:12 PM  all   99.94    0.00    0.06    0.00    0.00    0.00    0.00    0.00    0.00  16593.00
      06:47:13 PM  all   99.94    0.00    0.06    0.00    0.00    0.00    0.00    0.00    0.00  16551.00
      Average:     all   99.84    0.00    0.09    0.00    0.00    0.01    0.00    0.00    0.06  16711.58
      
      We see an improvement in idle% on the system (drops from 5.42% on -tip to 0.06%
      with the patches).
      
      We see an improvement in idle% on the system (drops from 5.42%
      on -tip to 0.06% with the patches).
      Signed-off-by: NNikhil Rao <ncrao@google.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
      Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
      Cc: Stephan Barwolf <stephan.baerwolf@tu-ilmenau.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1305754668-18792-1-git-send-email-ncrao@google.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      c8b28116
    • N
      sched: Introduce SCHED_POWER_SCALE to scale cpu_power calculations · 1399fa78
      Nikhil Rao 提交于
      SCHED_LOAD_SCALE is used to increase nice resolution and to
      scale cpu_power calculations in the scheduler. This patch
      introduces SCHED_POWER_SCALE and converts all uses of
      SCHED_LOAD_SCALE for scaling cpu_power to use SCHED_POWER_SCALE
      instead.
      
      This is a preparatory patch for increasing the resolution of
      SCHED_LOAD_SCALE, and there is no need to increase resolution
      for cpu_power calculations.
      Signed-off-by: NNikhil Rao <ncrao@google.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
      Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
      Cc: Stephan Barwolf <stephan.baerwolf@tu-ilmenau.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Link: http://lkml.kernel.org/r/1305738580-9924-3-git-send-email-ncrao@google.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      1399fa78
    • N
      sched: Cleanup set_load_weight() · f05998d4
      Nikhil Rao 提交于
      Avoid using long repetitious names; make this simpler and nicer
      to read. No functional change introduced in this patch.
      Signed-off-by: NNikhil Rao <ncrao@google.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
      Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
      Cc: Stephan Barwolf <stephan.baerwolf@tu-ilmenau.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Link: http://lkml.kernel.org/r/1305738580-9924-2-git-send-email-ncrao@google.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      f05998d4
    • T
      clockevents/source: Use u64 to make 32bit happy · c0e299b1
      Thomas Gleixner 提交于
      unsigned long is not 64bit on 32bit machine.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      c0e299b1
    • S
      extable, core_kernel_data(): Make sure all archs define _sdata · a2d063ac
      Steven Rostedt 提交于
      A new utility function (core_kernel_data()) is used to determine if a
      passed in address is part of core kernel data or not. It may or may not
      return true for RO data, but this utility must work for RW data.
      
      Thus both _sdata and _edata must be defined and continuous,
      without .init sections that may later be freed and replaced by
      volatile memory (memory that can be freed).
      
      This utility function is used to determine if data is safe from
      ever being freed. Thus it should return true for all RW global
      data that is not in a module or has been allocated, or false
      otherwise.
      
      Also change core_kernel_data() back to the more precise _sdata condition
      and document the function.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Acked-by: NRalf Baechle <ralf@linux-mips.org>
      Acked-by: NHirokazu Takata <takata@linux-m32r.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: linux-m68k@lists.linux-m68k.org
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Helge Deller <deller@gmx.de>
      Cc: JamesE.J.Bottomley <jejb@parisc-linux.org>
      Link: http://lkml.kernel.org/r/1305855298.1465.19.camel@gandalf.stny.rr.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      ----
       arch/alpha/kernel/vmlinux.lds.S   |    1 +
       arch/m32r/kernel/vmlinux.lds.S    |    1 +
       arch/m68k/kernel/vmlinux-std.lds  |    2 ++
       arch/m68k/kernel/vmlinux-sun3.lds |    1 +
       arch/mips/kernel/vmlinux.lds.S    |    1 +
       arch/parisc/kernel/vmlinux.lds.S  |    3 +++
       kernel/extable.c                  |   12 +++++++++++-
       7 files changed, 20 insertions(+), 1 deletion(-)
      a2d063ac
    • I
      core_kernel_data(): Fix architectures that do not define _sdata · c5fc4721
      Ingo Molnar 提交于
      Some architectures such as Alpha do not define _sdata but _data:
      
        kernel/built-in.o: In function `core_kernel_data':
        kernel/extable.c:77: undefined reference to `_sdata'
      
      So expand the scope of the data range to the text addresses too,
      this might be more correct anyway because this way we can
      cover readonly variables as well.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/n/tip-i878c8a0e0g0ep4v7i6vxnhz@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      c5fc4721
    • P
      Revert "rcu: Decrease memory-barrier usage based on semi-formal proof" · 80d02085
      Paul E. McKenney 提交于
      This reverts commit e59fb312.
      
      This reversion was due to (extreme) boot-time slowdowns on SPARC seen by
      Yinghai Lu and on x86 by Ingo
      .
      This is a non-trivial reversion due to intervening commits.
      
      Conflicts:
      
      	Documentation/RCU/trace.txt
      	kernel/rcutree.c
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      80d02085
  5. 19 5月, 2011 22 次提交