1. 20 Jun, 2008 3 commits
  2. 19 Jun, 2008 2 commits
  3. 18 Jun, 2008 1 commit
    • D
      sched: rework of "prioritize non-migratable tasks over migratable ones" · 20b6331b
      Committed by Dmitry Adamushko
      regarding this commit: 45c01e82
      
      I think we can do it simpler. Please take a look at the patch below.
      
      Instead of having 2 separate arrays (which is + ~800 bytes on x86_32 and
      twice so on x86_64), let's add "exclusive" (the ones that are bound to
      this CPU) tasks to the head of the queue and "shared" ones -- to the
      end.
      
      In case of a few newly woken up "exclusive" tasks, they are 'stacked'
      (not queued as now), meaning that a task {i+1} is being placed in front
      of the previously woken up task {i}. But I don't think that this
      behavior may cause any realistic problems.
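
A minimal sketch of the head/tail queueing described above. The types `struct task` and `struct prio_queue` and the helpers `enqueue()` and `pick_next()` are simplified stand-ins invented for illustration; they are not the kernel's own structures or the code of this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's task and queue types. */
struct task {
    int id;
    int nr_cpus_allowed;        /* 1 means the task is bound to a single CPU */
    struct task *prev, *next;
};

struct prio_queue {
    struct task *head, *tail;   /* one queue per priority level */
};

/* "Exclusive" tasks are added at the head, "shared" tasks at the tail. */
static void enqueue(struct prio_queue *q, struct task *p)
{
    assert(q != NULL && p != NULL);
    p->prev = p->next = NULL;
    if (q->head == NULL) {
        q->head = q->tail = p;
    } else if (p->nr_cpus_allowed == 1) {
        /* Stacked: the newest exclusive task lands in front of older ones. */
        p->next = q->head;
        q->head->prev = p;
        q->head = p;
    } else {
        p->prev = q->tail;
        q->tail->next = p;
        q->tail = p;
    }
}

/* The next task to run at this priority is simply the head of the queue. */
static struct task *pick_next(const struct prio_queue *q)
{
    return q->head;
}
```

With one shared task queued first and two exclusive tasks woken afterwards, the second exclusive task ends up ahead of the first, which is the 'stacked' ordering the message mentions.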
      
      There are a couple of changes on top of this one.
      
      (1) in check_preempt_curr_rt()
      
      I don't think there is a need for the "pick_next_rt_entity(rq, &rq->rt)
      != &rq->curr->rt" check.
      
      enqueue_task_rt(p) and check_preempt_curr_rt() are always called one
      after another with rq->lock being held so the following check
      "p->rt.nr_cpus_allowed == 1 && rq->curr->rt.nr_cpus_allowed != 1" should
      be enough (well, just its left part) to guarantee that 'p' has been
      queued in front of the 'curr'.
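
The condition quoted above can be sketched as a small predicate. This is an illustration under stated assumptions: `struct task` and `should_preempt()` are hypothetical names, a lower `prio` value is taken to mean a higher priority (as in the kernel), and the sketch leaves out any check of whether the current task can actually find another CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified task: only the fields the check needs. */
struct task {
    int prio;               /* lower value means higher priority */
    int nr_cpus_allowed;    /* 1 means bound to a single CPU */
};

/* Should the newly queued task 'p' preempt the running task 'curr'? */
static bool should_preempt(const struct task *curr, const struct task *p)
{
    assert(curr != NULL && p != NULL);

    /* A strictly higher priority always preempts. */
    if (p->prio < curr->prio)
        return true;

    /* At equal priority, a bound task preempts a migratable one. */
    return p->prio == curr->prio &&
           p->nr_cpus_allowed == 1 &&
           curr->nr_cpus_allowed != 1;
}
```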
      
      (2) in set_cpus_allowed_rt()
      
      I don't think there is a need for requeue_task_rt() here.
      
      Perhaps, the only case when 'requeue' (+ reschedule) might be useful is
      as follows:
      
      i) weight == 1 && cpu_isset(task_cpu(p), *new_mask)
      
      (i.e. a task is being bound to this CPU);
      
      ii) 'p' != rq->curr
      
      but here, 'p' has already been on this CPU for a while and was not
      migrated, i.e. if we allowed 'rq->curr' to be preempted, it might not
      have a good chance of being migrated right at that moment (although
      it may have one over a slightly longer term).
      
      Anyway, I think we should perhaps not make it more complex by trying to
      address some rare corner cases. For instance, that's why a single queue
      approach would be preferable. Unless I'm missing something obvious, this
      approach gives us similar functionality at lower cost.
      
      Verified only compilation-wise.
      
      (Almost)-Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  4. 10 Jun, 2008 1 commit
    • P
      sched: fix hotplug cpus on ia64 · 7def2be1
      Committed by Peter Zijlstra
      Cliff Wickman wrote:
      
      > I built an ia64 kernel from Andrew's tree (2.6.26-rc2-mm1)
      > and get a very predictable hotplug cpu problem.
      > billberry1:/tmp/cpw # ./dis
      > disabled cpu 17
      > enabled cpu 17
      > billberry1:/tmp/cpw # ./dis
      > disabled cpu 17
      > enabled cpu 17
      > billberry1:/tmp/cpw # ./dis
      >
      > The script that disables the cpu always hangs (unkillable)
      > on the 3rd attempt.
      >
      > And a bit further:
      > The kstopmachine thread always sits on the run queue (real time) for about
      > 30 minutes before running.
      
      this fix solves some (but not all) issues between CPU hotplug and
      RT bandwidth throttling.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. 06 Jun, 2008 4 commits
    • I
      sched: fix cpuprio build bug · 1100ac91
      Committed by Ingo Molnar
      this patch was not built on !SMP:
      
       kernel/sched_rt.c: In function 'inc_rt_tasks':
       kernel/sched_rt.c:404: error: 'struct rq' has no member named 'online'
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • G
      sched: fix cpupri hotplug support · 1f11eb6a
      Committed by Gregory Haskins
      The RT folks over at RedHat found an issue w.r.t. hotplug support which
      was traced to problems with the cpupri infrastructure in the scheduler:
      
      https://bugzilla.redhat.com/show_bug.cgi?id=449676
      
      This bug affects 23-rt12+, 24-rtX, 25-rtX, and sched-devel.  This patch
      applies to 25.4-rt4, though it should trivially apply to most cpupri enabled
      kernels mentioned above.
      
      It turned out that the issue was that offline cpus could get inadvertently
      registered with cpupri so that they were erroneously selected during
      migration decisions.  The end result would be an OOPS as the offline cpu
      had tasks routed to it.
      
      This patch generalizes the old join/leave domain interface into an
      online/offline interface, and adjusts the root-domain/hotplug code to
      utilize it.
      
      I was able to easily reproduce the issue prior to this patch, and am no
      longer able to reproduce it after this patch.  I can offline cpus
      indefinitely and everything seems to be in working order.
      
      Thanks to Arnaldo (acme), Thomas, and Peter for doing the legwork to point
      me in the right direction.  Also thank you to Peter for reviewing the
      early iterations of this patch.
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • G
      sched: use a 2-d bitmap for searching lowest-pri CPU · 6e0534f2
      Committed by Gregory Haskins
      The current code uses a linear algorithm which causes scaling issues
      on larger SMP machines.  This patch replaces that algorithm with a
      2-dimensional bitmap to reduce latencies in the wake-up path.
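
A rough sketch of the 2-dimensional idea: one bitmap records which priority levels currently have CPUs, and a per-level CPU mask records which CPUs sit at that level. Everything here (`struct cpupri_sketch`, `cp_init()`, `cp_set()`, `cp_find()`, the sizes `NR_LEVELS` and `NR_CPUS`, and the convention that level 0 is the lowest priority) is an assumption made for illustration, not the layout used by the patch.

```c
#include <assert.h>
#include <stdint.h>

#define NR_LEVELS 8     /* hypothetical; level 0 is the lowest priority */
#define NR_CPUS   64    /* hypothetical; one bit per CPU in a uint64_t */

struct cpupri_sketch {
    uint32_t level_active;          /* bit l set: some CPU is at level l */
    uint64_t cpus_at[NR_LEVELS];    /* bit c set: CPU c is at this level */
    int      level_of[NR_CPUS];     /* current level per CPU, -1 if unset */
};

static void cp_init(struct cpupri_sketch *cp)
{
    cp->level_active = 0;
    for (int l = 0; l < NR_LEVELS; l++)
        cp->cpus_at[l] = 0;
    for (int c = 0; c < NR_CPUS; c++)
        cp->level_of[c] = -1;
}

/* Record that 'cpu' now runs at priority 'level'. */
static void cp_set(struct cpupri_sketch *cp, int cpu, int level)
{
    assert(cpu >= 0 && cpu < NR_CPUS && level >= 0 && level < NR_LEVELS);
    int old = cp->level_of[cpu];
    if (old >= 0) {
        cp->cpus_at[old] &= ~(UINT64_C(1) << cpu);
        if (cp->cpus_at[old] == 0)
            cp->level_active &= ~(UINT32_C(1) << old);
    }
    cp->cpus_at[level] |= UINT64_C(1) << cpu;
    cp->level_active |= UINT32_C(1) << level;
    cp->level_of[cpu] = level;
}

/*
 * Find an allowed CPU running below 'task_level', lowest level first.
 * Returns the CPU number, or -1 if none qualifies.
 */
static int cp_find(const struct cpupri_sketch *cp, uint64_t allowed,
                   int task_level)
{
    for (int l = 0; l < task_level && l < NR_LEVELS; l++) {
        if (!(cp->level_active & (UINT32_C(1) << l)))
            continue;               /* nobody at this level, skip it */
        uint64_t m = cp->cpus_at[l] & allowed;
        for (int c = 0; m != 0 && c < NR_CPUS; c++)
            if (m & (UINT64_C(1) << c))
                return c;
    }
    return -1;
}
```

The outer scan is bounded by the number of priority levels rather than by the number of CPUs; a real implementation would presumably use find-first-bit primitives instead of the plain loops shown here.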
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
      Acked-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • G
      sched: prioritize non-migratable tasks over migratable ones · 45c01e82
      Committed by Gregory Haskins
      Dmitry Adamushko pointed out a known flaw in the rt-balancing algorithm
      that could allow suboptimal balancing if a non-migratable task gets
      queued behind a running migratable one.  It is discussed in this thread:
      
      http://lkml.org/lkml/2008/4/22/296
      
      This issue has been further exacerbated by a recent checkin to
      sched-devel (git-id 5eee63a5ebc19a870ac40055c0be49457f3a89a3).
      
      From a pure priority standpoint, the run-queue is doing the "right"
      thing. Using Dmitry's nomenclature, if T0 is on cpu1 first, and T1
      wakes up at equal or lower priority (affined only to cpu1) later, it
      *should* wait for T0 to finish.  However, in reality that is likely
      suboptimal from a system perspective if there are other cores that
      could allow T0 and T1 to run concurrently.  Since T1 can not migrate,
      the only choice for higher concurrency is to try to move T0.  This is
      not something we addressed in the recent rt-balancing re-work.
      
      This patch tries to enhance the balancing algorithm by accommodating this
      scenario.  It accomplishes this by incorporating the migratability of a
      task into its priority calculation.  Within a numerical tsk->prio, a
      non-migratable task is logically higher than a migratable one.  We
      maintain this by introducing a new per-priority queue (xqueue, or
      exclusive-queue) for holding non-migratable tasks.  The scheduler will
      draw from the xqueue over the standard shared-queue (squeue) when
      available.
      
      There are several details for utilizing this properly.
      
      1) During task-wake-up, we not only need to check if the priority
         preempts the current task, but we also need to check for this
         non-migratable condition.  Therefore, if a non-migratable task wakes
         up and sees an equal priority migratable task already running, it
         will attempt to preempt it *if* there is a likelihood that the
         current task will find an immediate home.
      
      2) Tasks only get this non-migratable "priority boost" on wake-up.  Any
         requeuing will result in the non-migratable task being queued to the
         end of the shared queue.  This is an attempt to prevent the system
         from being completely unfair to migratable tasks during things like
         SCHED_RR timeslicing.
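
The xqueue/squeue arrangement and the wake-up-only boost can be sketched as follows. All names (`struct task`, `struct fifo`, `struct prio_level`, `enqueue_wakeup()`, `requeue()`, `take_next()`) are hypothetical, and unlike the scheduler this sketch removes a task from its queue when it is picked, purely to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types for one priority level. */
struct task {
    int id;
    int nr_cpus_allowed;    /* 1 means non-migratable */
    struct task *next;
};

struct fifo {
    struct task *head, *tail;
};

struct prio_level {
    struct fifo xqueue;     /* exclusive: non-migratable tasks */
    struct fifo squeue;     /* shared: everything else */
};

static void fifo_push(struct fifo *q, struct task *p)
{
    p->next = NULL;
    if (q->tail != NULL)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
}

static struct task *fifo_pop(struct fifo *q)
{
    struct task *p = q->head;
    if (p != NULL) {
        q->head = p->next;
        if (q->head == NULL)
            q->tail = NULL;
        p->next = NULL;
    }
    return p;
}

/* On wake-up, a non-migratable task gets the xqueue "boost". */
static void enqueue_wakeup(struct prio_level *lv, struct task *p)
{
    assert(lv != NULL && p != NULL);
    fifo_push(p->nr_cpus_allowed == 1 ? &lv->xqueue : &lv->squeue, p);
}

/* Any requeue goes to the end of the shared queue, boost or not. */
static void requeue(struct prio_level *lv, struct task *p)
{
    assert(lv != NULL && p != NULL);
    fifo_push(&lv->squeue, p);
}

/* Draw from the xqueue first, then fall back to the squeue. */
static struct task *take_next(struct prio_level *lv)
{
    struct task *p = fifo_pop(&lv->xqueue);
    return p != NULL ? p : fifo_pop(&lv->squeue);
}
```

A bound task that wakes after a migratable one is still taken first, but once it is requeued it waits behind the migratable task in the shared queue.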
      
      I am sure this patch introduces potentially "odd" behavior if you
      concoct a scenario where a bunch of non-migratable threads could starve
      migratable ones given the right pattern.  I am not yet convinced that
      this is a problem since we are talking about tasks of equal RT priority
      anyway, and there never is much in the way of guarantees against
      starvation under that scenario anyway. (e.g. you could come up with a
      similar scenario with a specific timing environment versus an affinity
      environment).  I can be convinced otherwise, but for now I think this is
      "ok".
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
      CC: Dmitry Adamushko <dmitry.adamushko@gmail.com>
      CC: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  6. 29 May, 2008 1 commit
  7. 06 May, 2008 2 commits
  8. 20 Apr, 2008 6 commits
  9. 07 Mar, 2008 1 commit
    • S
      sched: balance RT task resched only on runqueue · 6fa46fa5
      Committed by Steven Rostedt
      Sripathi Kodi reported a crash in the -rt kernel:
      
        https://bugzilla.redhat.com/show_bug.cgi?id=435674
      
      this is due to a place that can reschedule a task without holding
      the task's runqueue lock.  This was caused by the RT balancing code
      that pulls RT tasks to the current run queue and will reschedule the
      current task.
      
      There's a slight chance that the pulling of the RT tasks will release
      the current runqueue's lock and retake it (in the double_lock_balance).
      During this time that the runqueue is released, the current task can
      migrate to another runqueue.
      
      In the prio_changed_rt code, after the pull, if the current task is of
      lesser priority than one of the RT tasks pulled, resched_task is called
      on the current task. If the current task had migrated in that small
      window, resched_task will be called without holding the runqueue lock
      for the runqueue that the task is on.
      
      This race condition also exists in the mainline kernel and this patch
      adds a check to make sure the task hasn't migrated before calling
      resched_task.
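
One way to picture the added check, as a sketch only: after the pull, the reschedule is requested only if the task is still the one running on this runqueue. The types and the helper `should_resched_after_pull()` are hypothetical, and a lower `prio` value is assumed to mean a higher priority, as in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified types for illustration only. */
struct task {
    int prio;               /* lower value means higher priority */
};

struct runqueue {
    struct task *curr;      /* task currently running on this runqueue */
    int highest_prio;       /* best (lowest-valued) priority queued here */
};

/*
 * Called with rq's lock held, after a pull that may have dropped and
 * retaken that lock. Reschedule 'p' only if a higher-priority task is
 * now queued AND 'p' is still running here, i.e. it has not migrated.
 */
static bool should_resched_after_pull(const struct runqueue *rq,
                                      const struct task *p)
{
    assert(rq != NULL && p != NULL);
    return p->prio > rq->highest_prio && rq->curr == p;
}
```

If the task migrated during the window, `rq->curr` no longer points at it and the reschedule is skipped, so the other runqueue is never touched without its lock.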
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Tested-by: Sripathi Kodi <sripathik@in.ibm.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  10. 05 Mar, 2008 1 commit
    • P
      sched: revert load_balance_monitor() changes · 62fb1851
      Committed by Peter Zijlstra
      The following commits cause a number of regressions:
      
        commit 58e2d4ca
        Author: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
        Date:   Fri Jan 25 21:08:00 2008 +0100
        sched: group scheduling, change how cpu load is calculated
      
        commit 6b2d7700
        Author: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
        Date:   Fri Jan 25 21:08:00 2008 +0100
        sched: group scheduler, fix fairness of cpu bandwidth allocation for task groups
      
      Namely:
       - very frequent wakeups on SMP, reported by PowerTop users.
       - cacheline trashing on (large) SMP
       - some latencies larger than 500ms
      
      While there is a mergeable patch to fix the latter, the former issues
      are not fixable in a manner suitable for .25 (we're at -rc3 now).
      
      Hence we revert them and try again in v2.6.26.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
      Tested-by: Alexey Zaytsev <alexey.zaytsev@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  11. 13 Feb, 2008 3 commits
  12. 26 Jan, 2008 15 commits