1. 10 2月, 2014 7 次提交
  2. 28 1月, 2014 9 次提交
  3. 23 1月, 2014 1 次提交
  4. 22 1月, 2014 1 次提交
    • M
      sched: add tracepoints related to NUMA task migration · 286549dc
      Mel Gorman 提交于
      This patch adds three tracepoints
       o trace_sched_move_numa	when a task is moved to a node
       o trace_sched_swap_numa	when a task is swapped with another task
       o trace_sched_stick_numa	when a numa-related migration fails
      
      The tracepoints allow the NUMA scheduler activity to be monitored and the
      following high-level metrics can be calculated
      
       o NUMA migrated stuck	 nr trace_sched_stick_numa
       o NUMA migrated idle	 nr trace_sched_move_numa
       o NUMA migrated swapped nr trace_sched_swap_numa
       o NUMA local swapped	 trace_sched_swap_numa src_nid == dst_nid (should never happen)
       o NUMA remote swapped	 trace_sched_swap_numa src_nid != dst_nid (should == NUMA migrated swapped)
       o NUMA group swapped	 trace_sched_swap_numa src_ngid == dst_ngid
      			 Maybe a small number of these are acceptable
      			 but a high number would be a major surprise.
      			 It would be even worse if bounces are frequent.
       o NUMA avg task migs.	 Average number of migrations for tasks
       o NUMA stddev task mig	 Self-explanatory
       o NUMA max task migs.	 Maximum number of migrations for a single task
      
      In general the intent of the tracepoints is to help diagnose problems
      where automatic NUMA balancing appears to be doing an excessive amount
      of useless work.
      
      [akpm@linux-foundation.org: remove semicolon-after-if, repair coding-style]
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Reviewed-by: NRik van Riel <riel@redhat.com>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      286549dc
  5. 13 1月, 2014 8 次提交
  6. 12 1月, 2014 1 次提交
    • R
      sched: Calculate effective load even if local weight is 0 · 9722c2da
      Rik van Riel 提交于
      Thomas Hellstrom bisected a regression where erratic 3D performance is
      experienced on virtual machines as measured by glxgears. It identified
      commit 58d081b5 ("sched/numa: Avoid overloading CPUs on a preferred NUMA
      node") as the problem which had modified the behaviour of effective_load.
      
      Effective load calculates the difference to the system-wide load if a
      scheduling entity was moved to another CPU. The task group is not heavier
      as a result of the move but overall system load can increase/decrease as a
      result of the change. Commit 58d081b5 ("sched/numa: Avoid overloading CPUs
      on a preferred NUMA node") changed effective_load to make it suitable for
      calculating if a particular NUMA node was compute overloaded. To reduce
      the cost of the function, it assumed that a current sched entity weight
      of 0 was uninteresting but that is not the case.
      
      wake_affine() uses a weight of 0 for sync wakeups on the grounds that it
      is assuming the waking task will sleep and not contribute to load in the
      near future. In this case, we still want to calculate the effective load
      of the sched entity hierarchy. As effective_load is no longer used by
      task_numa_compare since commit fb13c7ee (sched/numa: Use a system-wide
      search to find swap/migration candidates), this patch simply restores the
      historical behaviour.
      Reported-and-tested-by: NThomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: NRik van Riel <riel@redhat.com>
      [ Wrote changelog]
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140106113912.GC6178@suse.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9722c2da
  7. 19 12月, 2013 1 次提交
  8. 17 12月, 2013 4 次提交
  9. 11 12月, 2013 1 次提交
    • P
      sched/fair: Rework sched_fair time accounting · 9dbdb155
      Peter Zijlstra 提交于
      Christian suffers from a bad BIOS that wrecks his i5's TSC sync. This
      results in him occasionally seeing time going backwards - which
      crashes the scheduler ...
      
      Most of our time accounting can actually handle that except the most
      common one; the tick time update of sched_fair.
      
      There is a further problem with that code; previously we assumed that
      because we get a tick every TICK_NSEC our time delta could never
      exceed 32bits and math was simpler.
      
      However, ever since Frederic managed to get NO_HZ_FULL merged; this is
      no longer the case since now a task can run for a long time indeed
      without getting a tick. It only takes about ~4.2 seconds to overflow
      our u32 in nanoseconds.
      
      This means we not only need to better deal with time going backwards;
      but also means we need to be able to deal with large deltas.
      
      This patch reworks the entire code and uses mul_u64_u32_shr() as
      proposed by Andy a long while ago.
      
      We express our virtual time scale factor in a u32 multiplier and shift
      right and the 32bit mul_u64_u32_shr() implementation reduces to a
      single 32x32->64 multiply if the time delta is still short (common
      case).
      
      For 64bit a 64x64->128 multiply can be used if ARCH_SUPPORTS_INT128.
      Reported-and-Tested-by: NChristian Engelmayer <cengelma@gmx.at>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: fweisbec@gmail.com
      Cc: Paul Turner <pjt@google.com>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20131118172706.GI3866@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9dbdb155
  10. 05 12月, 2013 1 次提交
  11. 27 11月, 2013 2 次提交
  12. 20 11月, 2013 1 次提交
  13. 13 11月, 2013 3 次提交