1. 09 12月, 2009 1 次提交
  2. 21 9月, 2009 1 次提交
  3. 15 9月, 2009 3 次提交
  4. 15 5月, 2009 1 次提交
    • T
      sched, timers: move calc_load() to scheduler · dce48a84
      Thomas Gleixner 提交于
      Dimitri Sivanich noticed that xtime_lock is held write locked across
      calc_load() which iterates over all online CPUs. That can cause long
      latencies for xtime_lock readers on large SMP systems. 
      
      The load average calculation is an rough estimate anyway so there is
      no real need to protect the readers vs. the update. It's not a problem
      when the avenrun array is updated while a reader copies the values.
      
      Instead of iterating over all online CPUs let the scheduler_tick code
      update the number of active tasks shortly before the avenrun update
      happens. The avenrun update itself is handled by the CPU which calls
      do_timer().
      
      [ Impact: reduce xtime_lock write locked section ]
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      dce48a84
  5. 22 10月, 2008 1 次提交
  6. 22 9月, 2008 1 次提交
  7. 06 5月, 2008 1 次提交
  8. 26 1月, 2008 3 次提交
    • P
      sched: high-res preemption tick · 8f4d37ec
      Peter Zijlstra 提交于
      Use HR-timers (when available) to deliver an accurate preemption tick.
      
      The regular scheduler tick that runs at 1/HZ can be too coarse when nice
      level are used. The fairness system will still keep the cpu utilisation 'fair'
      by then delaying the task that got an excessive amount of CPU time but try to
      minimize this by delivering preemption points spot-on.
      
      The average frequency of this extra interrupt is sched_latency / nr_latency.
      Which need not be higher than 1/HZ, its just that the distribution within the
      sched_latency period is important.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8f4d37ec
    • S
      sched: RT-balance, add new methods to sched_class · cb469845
      Steven Rostedt 提交于
      Dmitry Adamushko found that the current implementation of the RT
      balancing code left out changes to the sched_setscheduler and
      rt_mutex_setprio.
      
      This patch addresses this issue by adding methods to the schedule classes
      to handle being switched out of (switched_from) and being switched into
      (switched_to) a sched_class. Also a method for changing of priorities
      is also added (prio_changed).
      
      This patch also removes some duplicate logic between rt_mutex_setprio and
      sched_setscheduler.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cb469845
    • G
      sched: de-SCHED_OTHER-ize the RT path · e7693a36
      Gregory Haskins 提交于
      The current wake-up code path tries to determine if it can optimize the
      wake-up to "this_cpu" by computing load calculations.  The problem is that
      these calculations are only relevant to SCHED_OTHER tasks where load is king.
      For RT tasks, priority is king.  So the load calculation is completely wasted
      bandwidth.
      
      Therefore, we create a new sched_class interface to help with
      pre-wakeup routing decisions and move the load calculation as a function
      of CFS task's class.
      Signed-off-by: NGregory Haskins <ghaskins@novell.com>
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e7693a36
  9. 25 10月, 2007 2 次提交
    • P
      sched: isolate SMP balancing code a bit more · 681f3e68
      Peter Williams 提交于
      At the moment, a lot of load balancing code that is irrelevant to non
      SMP systems gets included during non SMP builds.
      
      This patch addresses this issue and reduces the binary size on non
      SMP systems:
      
         text    data     bss     dec     hex filename
        10983      28    1192   12203    2fab sched.o.before
        10739      28    1192   11959    2eb7 sched.o.after
      Signed-off-by: NPeter Williams <pwil3058@bigpond.net.au>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      681f3e68
    • P
      sched: reduce balance-tasks overhead · e1d1484f
      Peter Williams 提交于
      At the moment, balance_tasks() provides low level functionality for both
        move_tasks() and move_one_task() (indirectly) via the load_balance()
      function (in the sched_class interface) which also provides dual
      functionality.  This dual functionality complicates the interfaces and
      internal mechanisms and makes the run time overhead of operations that
      are called with two run queue locks held.
      
      This patch addresses this issue and reduces the overhead of these
      operations.
      Signed-off-by: NPeter Williams <pwil3058@bigpond.net.au>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e1d1484f
  10. 15 10月, 2007 4 次提交
  11. 09 8月, 2007 5 次提交
    • I
      sched: remove the 'u64 now' parameter from ->put_prev_task() · 31ee529c
      Ingo Molnar 提交于
      remove the 'u64 now' parameter from ->put_prev_task().
      
      ( identity transformation that causes no change in functionality. )
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      31ee529c
    • I
      sched: remove the 'u64 now' parameter from ->pick_next_task() · fb8d4724
      Ingo Molnar 提交于
      remove the 'u64 now' parameter from ->pick_next_task().
      
      ( identity transformation that causes no change in functionality. )
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fb8d4724
    • I
      sched: remove the 'u64 now' parameter from ->dequeue_task() · f02231e5
      Ingo Molnar 提交于
      remove the 'u64 now' parameter from ->dequeue_task().
      
      ( identity transformation that causes no change in functionality. )
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f02231e5
    • P
      sched: fix bug in balance_tasks() · a4ac01c3
      Peter Williams 提交于
      There are two problems with balance_tasks() and how it used:
      
      1. The variables best_prio and best_prio_seen (inherited from the old
      move_tasks()) were only required to handle problems caused by the
      active/expired arrays, the order in which they were processed and the
      possibility that the task with the highest priority could be on either.
        These issues are no longer present and the extra overhead associated
      with their use is unnecessary (and possibly wrong).
      
      2. In the absence of CONFIG_FAIR_GROUP_SCHED being set, the same
      this_best_prio variable needs to be used by all scheduling classes or
      there is a risk of moving too much load.  E.g. if the highest priority
      task on this at the beginning is a fairly low priority task and the rt
      class migrates a task (during its turn) then that moved task becomes the
      new highest priority task on this_rq but when the sched_fair class
      initializes its copy of this_best_prio it will get the priority of the
      original highest priority task as, due to the run queue locks being
      held, the reschedule triggered by pull_task() will not have taken place.
        This could result in inappropriate overriding of skip_for_load and
      excessive load being moved.
      
      The attached patch addresses these problems by deleting all reference to
      best_prio and best_prio_seen and making this_best_prio a reference
      parameter to the various functions involved.
      
      load_balance_fair() has also been modified so that this_best_prio is
      only reset (in the loop) if CONFIG_FAIR_GROUP_SCHED is set.  This should
      preserve the effect of helping spread groups' higher priority tasks
      around the available CPUs while improving system performance when
      CONFIG_FAIR_GROUP_SCHED isn't set.
      Signed-off-by: NPeter Williams <pwil3058@bigpond.net.au>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a4ac01c3
    • P
      sched: simplify move_tasks() · 43010659
      Peter Williams 提交于
      The move_tasks() function is currently multiplexed with two distinct
      capabilities:
      
      1. attempt to move a specified amount of weighted load from one run
      queue to another; and
      2. attempt to move a specified number of tasks from one run queue to
      another.
      
      The first of these capabilities is used in two places, load_balance()
      and load_balance_idle(), and in both of these cases the return value of
      move_tasks() is used purely to decide if tasks/load were moved and no
      notice of the actual number of tasks moved is taken.
      
      The second capability is used in exactly one place,
      active_load_balance(), to attempt to move exactly one task and, as
      before, the return value is only used as an indicator of success or failure.
      
      This multiplexing of sched_task() was introduced, by me, as part of the
      smpnice patches and was motivated by the fact that the alternative, one
      function to move specified load and one to move a single task, would
      have led to two functions of roughly the same complexity as the old
      move_tasks() (or the new balance_tasks()).  However, the new modular
      design of the new CFS scheduler allows a simpler solution to be adopted
      and this patch addresses that solution by:
      
      1. adding a new function, move_one_task(), to be used by
      active_load_balance(); and
      2. making move_tasks() a single purpose function that tries to move a
      specified weighted load and returns 1 for success and 0 for failure.
      
      One of the consequences of these changes is that neither move_one_task()
      or the new move_tasks() care how many tasks sched_class.load_balance()
      moves and this enables its interface to be simplified by returning the
      amount of load moved as its result and removing the load_moved pointer
      from the argument list.  This helps simplify the new move_tasks() and
      slightly reduces the amount of work done in each of
      sched_class.load_balance()'s implementations.
      
      Further simplification, e.g. changes to balance_tasks(), are possible
      but (slightly) complicated by the special needs of load_balance_fair()
      so I've left them to a later patch (if this one gets accepted).
      
      NB Since move_tasks() gets called with two run queue locks held even
      small reductions in overhead are worthwhile.
      
      [ mingo@elte.hu ]
      
      this change also reduces code size nicely:
      
         text    data     bss     dec     hex filename
         39216    3618      24   42858    a76a sched.o.before
         39173    3618      24   42815    a73f sched.o.after
      Signed-off-by: NPeter Williams <pwil3058@bigpond.net.au>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      43010659
  12. 10 7月, 2007 1 次提交
    • I
      sched: cfs core, kernel/sched_idletask.c · fa72e9e4
      Ingo Molnar 提交于
      add kernel/sched_idletask.c - which implements the idle thread
      scheduling class. This further simplifies sched.c (under CFS),
      for example a number of 'if (p == rq->idle)' type of special-cases
      can be removed from sched.c, and schedule() gets simpler too.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fa72e9e4