1. 13 11月, 2009 2 次提交
  2. 05 11月, 2009 2 次提交
    • M
      sched: Fix affinity logic in select_task_rq_fair() · fd210738
      Mike Galbraith 提交于
      Ingo Molnar reported:
      
      [   26.804000] BUG: using smp_processor_id() in preemptible [00000000] code: events/1/10
      [   26.808000] caller is vmstat_update+0x26/0x70
      [   26.812000] Pid: 10, comm: events/1 Not tainted 2.6.32-rc5 #6887
      [   26.816000] Call Trace:
      [   26.820000]  [<c1924a24>] ? printk+0x28/0x3c
      [   26.824000]  [<c13258a0>] debug_smp_processor_id+0xf0/0x110
      [   26.824000] mount used greatest stack depth: 1464 bytes left
      [   26.828000]  [<c111d086>] vmstat_update+0x26/0x70
      [   26.832000]  [<c1086418>] worker_thread+0x188/0x310
      [   26.836000]  [<c10863b7>] ? worker_thread+0x127/0x310
      [   26.840000]  [<c108d310>] ? autoremove_wake_function+0x0/0x60
      [   26.844000]  [<c1086290>] ? worker_thread+0x0/0x310
      [   26.848000]  [<c108cf0c>] kthread+0x7c/0x90
      [   26.852000]  [<c108ce90>] ? kthread+0x0/0x90
      [   26.856000]  [<c100c0a7>] kernel_thread_helper+0x7/0x10
      [   26.860000] BUG: using smp_processor_id() in preemptible [00000000] code: events/1/10
      [   26.864000] caller is vmstat_update+0x3c/0x70
      
      Because this commit:
      
        a1f84a3a: sched: Check for an idle shared cache in select_task_rq_fair()
      
      broke ->cpus_allowed.
      Signed-off-by: NMike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: arjan@infradead.org
      Cc: <stable@kernel.org>
      LKML-Reference: <1257415066.12867.1.camel@marge.simson.net>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fd210738
    • M
      sched: Check for an idle shared cache in select_task_rq_fair() · a1f84a3a
      Mike Galbraith 提交于
      When waking affine, check for an idle shared cache, and if
      found, wake to that CPU/sibling instead of the waker's CPU.
      
      This improves pgsql+oltp ramp up by roughly 8%. Possibly more
      for other loads, depending on overlap. The trade-off is a
      roughly 1% peak downturn if tasks are truly synchronous.
      Signed-off-by: NMike Galbraith <efault@gmx.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: <stable@kernel.org>
      LKML-Reference: <1256654138.17752.7.camel@marge.simson.net>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a1f84a3a
  3. 24 9月, 2009 1 次提交
  4. 21 9月, 2009 1 次提交
  5. 19 9月, 2009 1 次提交
  6. 18 9月, 2009 1 次提交
  7. 17 9月, 2009 3 次提交
    • P
      sched: Fix SD_POWERSAVING_BALANCE|SD_PREFER_LOCAL vs SD_WAKE_AFFINE · 29cd8bae
      Peter Zijlstra 提交于
      The SD_POWERSAVING_BALANCE|SD_PREFER_LOCAL code can break out of
      the domain iteration early, making us miss the SD_WAKE_AFFINE bits.
      
      Fix this by continuing iteration until there is no need for a
      larger domain.
      
      This also cleans up the cgroup stuff a bit, but not having two
      update_shares() invocations.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      29cd8bae
    • P
      sched: Stop buddies from hogging the system · de69a80b
      Peter Zijlstra 提交于
      Clear buddies more agressively.
      
      The (theoretical, haven't actually observed any of this) problem is
      that when we do not select either buddy in pick_next_entity()
      because they are too far ahead of the left-most task, we do not
      clear the buddies.
      
      This means that as soon as we service the left-most task, these
      same buddies will be tried again on the next schedule. Now if the
      left-most task was a pure hog, it wouldn't have done any wakeups
      and it wouldn't have set buddies of its own. That leads to the old
      buddies dominating, which would lead to bad latencies.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      de69a80b
    • P
      sched: Add new wakeup preemption mode: WAKEUP_RUNNING · ad4b78bb
      Peter Zijlstra 提交于
      Create a new wakeup preemption mode, preempt towards tasks that run
      shorter on avg. It sets next buddy to be sure we actually run the task
      we preempted for.
      
      Test results:
      
       root@twins:~# while :; do :; done &
       [1] 6537
       root@twins:~# while :; do :; done &
       [2] 6538
       root@twins:~# while :; do :; done &
       [3] 6539
       root@twins:~# while :; do :; done &
       [4] 6540
      
       root@twins:/home/peter# ./latt -c4 sleep 4
       Entries: 48 (clients=4)
      
       Averages:
       ------------------------------
              Max          4750 usec
              Avg           497 usec
              Stdev         737 usec
      
       root@twins:/home/peter# echo WAKEUP_RUNNING > /debug/sched_features
      
       root@twins:/home/peter# ./latt -c4 sleep 4
       Entries: 48 (clients=4)
      
       Averages:
       ------------------------------
              Max            14 usec
              Avg             5 usec
              Stdev           3 usec
      
      Disabled by default - needs more testing.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NMike Galbraith <efault@gmx.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      LKML-Reference: <new-submission>
      ad4b78bb
  8. 16 9月, 2009 6 次提交
  9. 15 9月, 2009 12 次提交
  10. 14 9月, 2009 1 次提交
    • I
      perf_counter, sched: Add sched_stat_runtime tracepoint · f977bb49
      Ingo Molnar 提交于
      This allows more precise tracking of how the scheduler accounts
      (and acts upon) a task having spent N nanoseconds of CPU time.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f977bb49
  11. 11 9月, 2009 1 次提交
    • I
      sched: Fix sched::sched_stat_wait tracepoint field · e1f84508
      Ingo Molnar 提交于
      This weird perf trace output:
      
        cc1-9943  [001]  2802.059479616: sched_stat_wait: task: as:9944 wait: 2801938766276 [ns]
      
      Is caused by setting one component field of the delta to zero
      a bit too early. Move it to later.
      
      ( Note, this does not affect the NEW_FAIR_SLEEPERS interactivity bug,
        it's just a reporting bug in essence. )
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Nikos Chantziaras <realnc@arcor.de>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <4AA93D34.8040500@arcor.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e1f84508
  12. 09 9月, 2009 2 次提交
  13. 08 9月, 2009 3 次提交
    • M
      sched: Ensure that a child can't gain time over it's parent after fork() · b5d9d734
      Mike Galbraith 提交于
      A fork/exec load is usually "pass the baton", so the child
      should never be placed behind the parent.  With START_DEBIT we
      make room for the new task, but with child_runs_first, that
      room comes out of the _parent's_ hide. There's nothing to say
      that the parent wasn't ahead of min_vruntime at fork() time,
      which means that the "baton carrier", who is essentially the
      parent in drag, can gain time and increase scheduling latencies
      for waiters.
      
      With NEW_FAIR_SLEEPERS + START_DEBIT + child_runs_first
      enabled, we essentially pass the sleeper fairness off to the
      child, which is fine, but if we don't base placement on the
      parent's updated vruntime, we can end up compounding latency
      woes if the child itself then does fork/exec.  The debit
      incurred at fork doesn't hurt the parent who is then going to
      sleep and maybe exit, but the child who acquires the error
      harms all comers.
      
      This improves latencies of make -j<n> kernel build workloads.
      Reported-by: NJens Axboe <jens.axboe@oracle.com>
      Signed-off-by: NMike Galbraith <efault@gmx.de>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b5d9d734
    • P
      sched: Deal with low-load in wake_affine() · 71a29aa7
      Peter Zijlstra 提交于
      wake_affine() would always fail under low-load situations where
      both prev and this were idle, because adding a single task will
      always be a significant imbalance, even if there's nothing
      around that could balance it.
      
      Deal with this by allowing imbalance when there's nothing you
      can do about it.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      71a29aa7
    • P
      sched: Remove short cut from select_task_rq_fair() · cdd2ab3d
      Peter Zijlstra 提交于
      select_task_rq_fair() incorrectly skips the wake_affine()
      logic, remove this.
      
      When prev_cpu == this_cpu, the code jumps straight to the
      wake_idle() logic, this doesn't give the wake_affine() logic
      the chance to pin the task to this cpu.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cdd2ab3d
  14. 02 9月, 2009 2 次提交
  15. 02 8月, 2009 2 次提交