1. 29 Dec, 2008 11 commits
    • sched: create "pushable_tasks" list to limit pushing to one attempt · 917b627d
      Gregory Haskins committed
      The RT scheduler employs a "push/pull" design to actively balance tasks
      within the system (on a per disjoint cpuset basis).  When a task is
      awoken, it is immediately determined if there are any lower priority
      cpus which should be preempted.  This is opposed to the way normal
      SCHED_OTHER tasks behave, which will wait for a periodic rebalancing
      operation to occur before spreading out load.
      
      When a particular RQ has more than 1 active RT task, it is said to
      be in an "overloaded" state.  Once this occurs, the system enters
      the active balancing mode, where it will try to push the task away,
      or persuade a different cpu to pull it over.  The system will stay
      in this state until it falls back to <= 1 queued RT task per RQ.
      
      However, the current implementation suffers from a limitation in the
      push logic.  Once overloaded, all tasks (other than current) on the
      RQ are analyzed on every push operation, even if they were previously
      found to be unpushable (due to affinity, etc).  What's more, the
      operation stops at the first task that is unpushable and will not
      look at items lower in the queue.  This causes two problems:
      
      1) We can have the same tasks analyzed over and over again during each
         push, which extends out the fast path in the scheduler for no
         gain.  Consider a RQ that has dozens of tasks that are bound to a
         core.  Each one of those tasks will be encountered and skipped
         for each push operation while they are queued.
      
      2) There may be lower-priority tasks under the unpushable task that
         could have been successfully pushed, but will never be considered
         until either the unpushable task is cleared, or a pull operation
         succeeds.  The net result is a potential latency source for
         mid-priority tasks.
      
      This patch aims to rectify these two conditions by introducing a new
      priority sorted list: "pushable_tasks".  A task is added to the list
      each time a task is activated or preempted.  It is removed from the
      list any time it is deactivated, made current, or fails to push.
      
      This works because a push only needs to be attempted once per task.
      After an initial failure to push, the other cpus will eventually try to
      pull the task when the conditions are proper.  This also solves the
      problem that we don't completely analyze all tasks due to encountering
      an unpushable task.  Now every task will have a push attempted (when
      appropriate).
      
      This reduces latency both by shortening the critical section of the
      rq->lock for certain workloads, and by making sure the algorithm
      considers all eligible tasks in the system.
      
      [ rostedt: added a couple more BUG_ONs ]
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
      Acked-by: Steven Rostedt <srostedt@redhat.com>
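
The "pushable_tasks" idea described in this commit can be modelled in a few lines of user-space Python. This is only a sketch under stated assumptions, not the kernel code: the names `Task`, `RunQueue`, `add_pushable`, `remove_pushable` and `push_all` are hypothetical, a sorted Python list stands in for the kernel's priority-sorted list, and a lower `prio` value is assumed to mean higher priority.

```python
import bisect
import itertools


class Task:
    """Hypothetical stand-in for an RT task."""

    def __init__(self, name, prio, migratable=True):
        self.name = name
        self.prio = prio              # assumption: lower value = higher priority
        self.migratable = migratable  # False models a task pinned by affinity


class RunQueue:
    """Hypothetical run queue holding a priority-sorted pushable list."""

    def __init__(self):
        self._seq = itertools.count()  # tie-breaker so Task objects are never compared
        self.pushable = []             # sorted list of (prio, seq, task)

    def add_pushable(self, task):
        # Called when a task is activated or preempted.
        bisect.insort(self.pushable, (task.prio, next(self._seq), task))

    def remove_pushable(self, task):
        # Called when a task is deactivated or made current.
        self.pushable = [e for e in self.pushable if e[2] is not task]

    def push_all(self, try_push):
        """Attempt each pushable task once, highest priority first.

        A task leaves the list whether or not the push succeeds, so an
        unpushable task is never re-examined and never blocks the
        lower-priority tasks queued behind it.
        """
        pushed = []
        while self.pushable:
            _, _, task = self.pushable.pop(0)
            if try_push(task):
                pushed.append(task.name)
        return pushed
```

With a pinned high-priority task queued ahead of two migratable ones, `push_all(lambda t: t.migratable)` skips the pinned task once and still reaches the other two; a second call finds an empty list and does no work.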
    • plist: fix PLIST_NODE_INIT to work with debug enabled · 4075134e
      Gregory Haskins committed
      It seems that PLIST_NODE_INIT breaks if used while DEBUG_PI_LIST is defined.
      Since there are no current users of PLIST_NODE_INIT, this has gone
      undetected.  This patch fixes the build issue that would otherwise
      appear with DEBUG_PI_LIST enabled later in the series, when we use
      PLIST_NODE_INIT in init_task.h.
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
    • sched: add sched_class->needs_post_schedule() member · 967fc046
      Gregory Haskins committed
      We currently run class->post_schedule() outside of the rq->lock, which
      means that we need to test for the need to post_schedule outside of
      the lock to avoid a forced reacquisition.  This is currently not a problem
      as we only look at rq->rt.overloaded.  However, we want to enhance this
      going forward to look at more state to reduce the need to post_schedule to
      a bare minimum set.  Therefore, we introduce a new member function called
      needs_post_schedule() which tests for the post_schedule condition without
      actually performing the work.  It is thus safe to call this
      function before the rq->lock is released, because we are guaranteed not
      to drop the lock at an intermediate point (such as what post_schedule()
      may do).
      
      We will use this later in the series.
      
      [ rostedt: removed paranoid BUG_ON ]
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
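
The split between a cheap test made under the lock and the real work done after it is released can be illustrated with a small Python model. It is a sketch only: `RunQueue`, `RTClass` and `finish_switch` are hypothetical names, `threading.Lock` stands in for rq->lock, and the single `overloaded` flag stands in for whatever state the real check inspects.

```python
import threading


class RunQueue:
    """Hypothetical run queue; `overloaded` mirrors rq->rt.overloaded."""

    def __init__(self):
        self.lock = threading.Lock()
        self.overloaded = False
        self.post_schedule_calls = 0


class RTClass:
    """Hypothetical scheduling class with the two hooks from the commit."""

    @staticmethod
    def needs_post_schedule(rq):
        # Read-only test: safe under rq.lock because it never drops the lock.
        return rq.overloaded

    @staticmethod
    def post_schedule(rq):
        # The real work; it takes (and may drop) rq.lock itself.
        with rq.lock:
            rq.post_schedule_calls += 1
            rq.overloaded = False


def finish_switch(rq, sched_class):
    # Decide while the lock is still held ...
    with rq.lock:
        needed = sched_class.needs_post_schedule(rq)
    # ... and only pay for post_schedule() when the test said so.
    if needed:
        sched_class.post_schedule(rq)
    return needed
```

Because the decision is made before the lock is released, the common case where nothing needs doing never has to reacquire it.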
    • sched: make double-lock-balance fair · 8f45e2b5
      Gregory Haskins committed
      double_lock_balance() currently favors logically lower cpus since they
      often do not have to release their own lock to acquire a second lock.
      The result is that logically higher cpus can get starved when there is
      a lot of pressure on the RQs.  This can result in higher latencies on
      higher cpu-ids.
      
      This patch makes the algorithm more fair by forcing all paths to have
      to release both locks before acquiring them again.  Since callsites to
      double_lock_balance already consider it a potential preemption/reschedule
      point, they have the proper logic to recheck for atomicity violations.
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
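
The fairness rule described above (every caller gives up its own lock before taking the pair) can be sketched as a Python model. This is an illustration under assumptions, not the kernel implementation: `RunQueue` is a hypothetical class, `threading.Lock` stands in for the spinlock, and ordering the two locks by cpu id is an assumed way of avoiding deadlock.

```python
import threading


class RunQueue:
    """Hypothetical per-cpu run queue with its own lock."""

    def __init__(self, cpu):
        self.cpu = cpu
        self.lock = threading.Lock()


def double_lock_balance(this_rq, busiest):
    """Caller holds this_rq.lock; on return both locks are held.

    Every caller releases its own lock first, regardless of cpu id, so
    low-numbered cpus no longer get to keep their lock while
    high-numbered cpus have to drop theirs.
    """
    this_rq.lock.release()
    # Assumption: acquire in a fixed global order (by cpu id) to avoid deadlock.
    first, second = sorted((this_rq, busiest), key=lambda rq: rq.cpu)
    first.lock.acquire()
    second.lock.acquire()
    return True  # tells the caller that this_rq.lock was dropped in between
```

The `True` return value models the point made in the commit message: callers already treat this function as a place where the lock may have been dropped, so they recheck their state afterwards.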
    • sched: pull only one task during NEWIDLE balancing to limit critical section · 7e96fa58
      Gregory Haskins committed
      git-id c4acb2c0 attempted to limit the newidle critical section length
      by stopping after at least one task was moved.  Further investigation
      has shown that other paths nested further inside the algorithm still
      allow long latencies to occur with newidle balancing.  This patch
      applies the same technique inside balance_tasks() to limit the
      duration of this optional balancing operation.
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
      CC: Nick Piggin <npiggin@suse.de>
    • sched: only try to push a task on wakeup if it is migratable · 777c2f38
      Gregory Haskins committed
      There is no sense in wasting time trying to push a task away that
      cannot move anywhere else.  We gain no benefit from trying to push
      other tasks at this point, so if the task being woken up is
      non-migratable, just skip the whole operation.  This reduces overhead
      in the wakeup path for certain tasks.
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
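
The wakeup shortcut amounts to one early return. The Python sketch below assumes that "migratable" means the task may run on more than one cpu; the field name `nr_cpus_allowed` and the helpers `is_migratable` and `on_wakeup` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    nr_cpus_allowed: int  # assumption: number of cpus in the task's affinity mask


def is_migratable(task):
    # A task pinned to a single cpu cannot be pushed anywhere else.
    return task.nr_cpus_allowed > 1


def on_wakeup(task, push_tasks):
    """Run the push logic only if the woken task could actually move."""
    if not is_migratable(task):
        return False  # skip the whole push operation
    push_tasks()
    return True
```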
    • sched: use highest_prio.next to optimize pull operations · 74ab8e4f
      Gregory Haskins committed
      We currently take the rq->lock for every cpu in an overload state during
      pull_rt_task().  However, we now have enough information via the
      highest_prio.[curr|next] fields to determine if there are any tasks of
      interest to warrant the overhead of the rq->lock, before we actually take
      it.  So we use this information to reduce lock contention during the
      pull for the case where the source-rq doesn't have tasks that would
      preempt the current task.
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
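
The lock-avoidance test can be sketched as a lockless pre-check on the two priority fields. This is a Python model under assumptions, not the kernel code: `HighestPrio`, `RunQueue` and `pull_candidates` are hypothetical names, a lower number is assumed to mean higher priority, and a counter stands in for actually taking the source rq->lock.

```python
from dataclasses import dataclass


@dataclass
class HighestPrio:
    curr: int  # priority of the highest-priority RT task on the queue
    next: int  # priority of the next-highest one (the candidate for pulling)


@dataclass
class RunQueue:
    cpu: int
    highest_prio: HighestPrio
    lock_acquisitions: int = 0  # counts how often this queue's lock was taken


def pull_candidates(this_rq, overloaded_rqs):
    """Return cpus whose lock is worth taking for a pull.

    Assumption: lower value = higher priority. A source queue is skipped
    without locking when its next-highest task would not preempt what
    this_rq is already running.
    """
    worth_locking = []
    for src in overloaded_rqs:
        if src is this_rq:
            continue
        if src.highest_prio.next >= this_rq.highest_prio.curr:
            continue  # nothing of interest: avoid the lock entirely
        src.lock_acquisitions += 1  # stands in for taking src's rq->lock
        worth_locking.append(src.cpu)
    return worth_locking
```

The check uses `next` rather than `curr` on the source side on the assumption that the source's top task is the one currently running there, so the second-highest task is the one that could be pulled.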
    • sched: use highest_prio.curr for pull threshold · a8728944
      Gregory Haskins committed
      highest_prio.curr is actually a more accurate way to keep track of
      the pull_rt_task() threshold since it is always up to date, even
      if the "next" task migrates during double_lock.  Therefore, stop
      looking at the "next" task object and simply use the highest_prio.curr.
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
    • sched: track the next-highest priority on each runqueue · e864c499
      Gregory Haskins committed
      We will use this later in the series to reduce the amount of rq-lock
      contention during a pull operation.
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
    • sched: cleanup inc/dec_rt_tasks · 4d984277
      Gregory Haskins committed
      Move some common definitions up to the function prologue to simplify the
      body logic.
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
    • x86: mark get_cpu_leaves() with __cpuinit annotation · 6092848a
      Sergio Luis committed
      Impact: fix section mismatch warning
      
      Commit b2bb8554 ("x86: Remove cpumask games in
      x86/kernel/cpu/intel_cacheinfo.c") introduced get_cpu_leaves(), which
      references __cpuinit cpuid4_cache_lookup().
      
      Mark get_cpu_leaves() with a __cpuinit annotation.
      Signed-off-by: Sergio Luis <sergio@larces.uece.br>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 24 Dec, 2008 4 commits
  3. 19 Dec, 2008 10 commits
  4. 18 Dec, 2008 15 commits