1. 11 11月, 2008 1 次提交
    • B
      sched: include group statistics in /proc/sched_debug · ff9b48c3
      Bharata B Rao 提交于
      Impact: extend /proc/sched_debug info
      
      Since the statistics of a group entity isn't exported directly from the
      kernel, it becomes difficult to obtain some of the group statistics.
      For example, the current method to obtain exec time of a group entity
      is not always accurate. One has to read the exec times of all
      the tasks(/proc/<pid>/sched) in the group and add them. This method
      fails (or becomes difficult) if we want to collect stats of a group over
      a duration where tasks get created and terminated.
      
      This patch makes it easier to obtain group stats by directly including
      them in /proc/sched_debug. Stats like group exec time would help user
      programs (like LTP) to accurately measure the group fairness.
      
      An example output of group stats from /proc/sched_debug:
      
      cfs_rq[3]:/3/a/1
        .exec_clock                    : 89.598007
        .MIN_vruntime                  : 0.000001
        .min_vruntime                  : 256300.970506
        .max_vruntime                  : 0.000001
        .spread                        : 0.000000
        .spread0                       : -25373.372248
        .nr_running                    : 0
        .load                          : 0
        .yld_exp_empty                 : 0
        .yld_act_empty                 : 0
        .yld_both_empty                : 0
        .yld_count                     : 4474
        .sched_switch                  : 0
        .sched_count                   : 40507
        .sched_goidle                  : 12686
        .ttwu_count                    : 15114
        .ttwu_local                    : 11950
        .bkl_count                     : 67
        .nr_spread_over                : 0
        .shares                        : 0
        .se->exec_start                : 113676.727170
        .se->vruntime                  : 1592.612714
        .se->sum_exec_runtime          : 89.598007
        .se->wait_start                : 0.000000
        .se->sleep_start               : 0.000000
        .se->block_start               : 0.000000
        .se->sleep_max                 : 0.000000
        .se->block_max                 : 0.000000
        .se->exec_max                  : 1.000282
        .se->slice_max                 : 1.999750
        .se->wait_max                  : 54.981093
        .se->wait_sum                  : 217.610521
        .se->wait_count                : 50
        .se->load.weight               : 2
      Signed-off-by: NBharata B Rao <bharata@linux.vnet.ibm.com>
      Acked-by: NSrivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
      Acked-by: NDhaval Giani <dhaval@linux.vnet.ibm.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ff9b48c3
  2. 04 11月, 2008 1 次提交
  3. 30 10月, 2008 1 次提交
  4. 10 10月, 2008 1 次提交
    • L
      [PATCH] signal, procfs: some lock_task_sighand() users do not need rcu_read_lock() · a6bebbc8
      Lai Jiangshan 提交于
      lock_task_sighand() make sure task->sighand is being protected,
      so we do not need rcu_read_lock().
      [ exec() will get task->sighand->siglock before change task->sighand! ]
      
      But code using rcu_read_lock() _just_ to protect lock_task_sighand()
      only appear in procfs. (and some code in procfs use lock_task_sighand()
      without such redundant protection.)
      
      Other subsystem may put lock_task_sighand() into rcu_read_lock()
      critical region, but these rcu_read_lock() are used for protecting
      "for_each_process()", "find_task_by_vpid()" etc. , not for protecting
      lock_task_sighand().
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      [ok from Oleg]
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      a6bebbc8
  5. 27 6月, 2008 2 次提交
  6. 20 6月, 2008 1 次提交
  7. 29 5月, 2008 1 次提交
  8. 06 5月, 2008 1 次提交
    • P
      sched: add optional support for CONFIG_HAVE_UNSTABLE_SCHED_CLOCK · 3e51f33f
      Peter Zijlstra 提交于
      this replaces the rq->clock stuff (and possibly cpu_clock()).
      
       - architectures that have an 'imperfect' hardware clock can set
         CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
      
       - the 'jiffie' window might be superfulous when we update tick_gtod
         before the __update_sched_clock() call in sched_clock_tick()
      
       - cpu_clock() might be implemented as:
      
           sched_clock_cpu(smp_processor_id())
      
         if the accuracy proves good enough - how far can TSC drift in a
         single jiffie when considering the filtering and idle hooks?
      
      [ mingo@elte.hu: various fixes and cleanups ]
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3e51f33f
  9. 01 5月, 2008 1 次提交
    • R
      rename div64_64 to div64_u64 · 6f6d6a1a
      Roman Zippel 提交于
      Rename div64_64 to div64_u64 to make it consistent with the other divide
      functions, so it clearly includes the type of the divide.  Move its definition
      to math64.h as currently no architecture overrides the generic implementation.
       They can still override it of course, but the duplicated declarations are
      avoided.
      Signed-off-by: NRoman Zippel <zippel@linux-m68k.org>
      Cc: Avi Kivity <avi@qumranet.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Patrick McHardy <kaber@trash.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6f6d6a1a
  10. 29 4月, 2008 1 次提交
  11. 20 4月, 2008 3 次提交
  12. 19 3月, 2008 1 次提交
    • I
      sched: improve affine wakeups · 4ae7d5ce
      Ingo Molnar 提交于
      improve affine wakeups. Maintain the 'overlap' metric based on CFS's
      sum_exec_runtime - which means the amount of time a task executes
      after it wakes up some other task.
      
      Use the 'overlap' for the wakeup decisions: if the 'overlap' is short,
      it means there's strong workload coupling between this task and the
      woken up task. If the 'overlap' is large then the workload is decoupled
      and the scheduler will move them to separate CPUs more easily.
      
      ( Also slightly move the preempt_check within try_to_wake_up() - this has
        no effect on functionality but allows 'early wakeups' (for still-on-rq
        tasks) to be correctly accounted as well.)
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4ae7d5ce
  13. 26 1月, 2008 2 次提交
  14. 31 12月, 2007 1 次提交
    • I
      sched: fix gcc warnings · 90b2628f
      Ingo Molnar 提交于
      Meelis Roos reported these warnings on sparc64:
      
        CC      kernel/sched.o
        In file included from kernel/sched.c:879:
        kernel/sched_debug.c: In function 'nsec_high':
        kernel/sched_debug.c:38: warning: comparison of distinct pointer types lacks a cast
      
      the debug check in do_div() is over-eager here, because the long long
      is always positive in these places. Mark this by casting them to
      unsigned long long.
      
      no change in code output:
      
         text    data     bss     dec     hex filename
        51471    6582     376   58429    e43d sched.o.before
        51471    6582     376   58429    e43d sched.o.after
      
        md5:
         7f7729c111f185bf3ccea4d542abc049  sched.o.before.asm
         7f7729c111f185bf3ccea4d542abc049  sched.o.after.asm
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      90b2628f
  15. 28 11月, 2007 1 次提交
  16. 27 11月, 2007 1 次提交
  17. 10 11月, 2007 1 次提交
  18. 25 10月, 2007 1 次提交
    • P
      sched: fix unconditional irq lock · ab63a633
      Peter Zijlstra 提交于
      Lockdep noticed that this lock can also be taken from hardirq context, and can
      thus not unconditionally disable/enable irqs.
      
       WARNING: at kernel/lockdep.c:2033 trace_hardirqs_on()
        [show_trace_log_lvl+26/48] show_trace_log_lvl+0x1a/0x30
        [show_trace+18/32] show_trace+0x12/0x20
        [dump_stack+22/32] dump_stack+0x16/0x20
        [trace_hardirqs_on+405/416] trace_hardirqs_on+0x195/0x1a0
        [_read_unlock_irq+34/48] _read_unlock_irq+0x22/0x30
        [sched_debug_show+2615/4224] sched_debug_show+0xa37/0x1080
        [show_state_filter+326/368] show_state_filter+0x146/0x170
        [sysrq_handle_showstate+10/16] sysrq_handle_showstate+0xa/0x10
        [__handle_sysrq+123/288] __handle_sysrq+0x7b/0x120
        [handle_sysrq+40/64] handle_sysrq+0x28/0x40
        [kbd_event+1045/1680] kbd_event+0x415/0x690
        [input_pass_event+206/208] input_pass_event+0xce/0xd0
        [input_handle_event+170/928] input_handle_event+0xaa/0x3a0
        [input_event+95/112] input_event+0x5f/0x70
        [atkbd_interrupt+434/1456] atkbd_interrupt+0x1b2/0x5b0
        [serio_interrupt+59/128] serio_interrupt+0x3b/0x80
        [i8042_interrupt+263/576] i8042_interrupt+0x107/0x240
        [handle_IRQ_event+40/96] handle_IRQ_event+0x28/0x60
        [handle_edge_irq+175/320] handle_edge_irq+0xaf/0x140
        [do_IRQ+64/128] do_IRQ+0x40/0x80
        [common_interrupt+46/52] common_interrupt+0x2e/0x34
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ab63a633
  19. 19 10月, 2007 1 次提交
    • K
      sched: reduce schedstat variable overhead a bit · 480b9434
      Ken Chen 提交于
      schedstat is useful in investigating CPU scheduler behavior.  Ideally,
      I think it is beneficial to have it on all the time.  However, the
      cost of turning it on in production system is quite high, largely due
      to number of events it collects and also due to its large memory
      footprint.
      
      Most of the fields probably don't need to be full 64-bit on 64-bit
      arch.  Rolling over 4 billion events will most like take a long time
      and user space tool can be made to accommodate that.  I'm proposing
      kernel to cut back most of variable width on 64-bit system.  (note,
      the following patch doesn't affect 32-bit system).
      Signed-off-by: NKen Chen <kenchen@google.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      480b9434
  20. 15 10月, 2007 17 次提交