1. 15 Oct 2007 (8 commits)
  2. 10 Oct 2007 (1 commit)
  3. 20 Sep 2007 (2 commits)
    • sched: fix invalid sched_class use · 9c95e731
      Authored by Hiroshi Shimamoto
      When using rt_mutex, a NULL pointer dereference occurs in
      enqueue_task_rt(). Here is the scenario:
      1) There are two threads: thread A is fair_sched_class and
         thread B is rt_sched_class.
      2) Thread A is boosted up to rt_sched_class because it holds an
         rt_mutex lock that thread B is waiting on.
      3) While boosted, thread A creates a new thread C, which inherits
         rt_sched_class.
      4) When wake_up_new_task() runs for thread C, thread C's priority
         is outside the RT priority range, because the normal priority
         of thread A is not an RT priority. This corrupts data by
         overflowing the rt_prio_array.
      The new thread C should be fair_sched_class.
      
      The new thread must have a valid scheduler class before it is
      queued. This patch sets the appropriate scheduler class (a sketch
      of the idea follows this entry).
      Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      9c95e731
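      A minimal sketch of the idea (the exact placement in the wake-up
      path is an assumption; the message above only says the class is
      set before queuing): derive the child's class from its effective
      priority instead of inheriting the parent's boosted class.

          /* hedged sketch: a child of a PI-boosted SCHED_NORMAL parent
           * has a non-RT normal priority, so it must go to CFS */
          p->prio = effective_prio(p);
          if (rt_prio(p->prio))
                  p->sched_class = &rt_sched_class;
          else
                  p->sched_class = &fair_sched_class;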
    • sched: add /proc/sys/kernel/sched_compat_yield · 1799e35d
      Authored by Ingo Molnar
      Add /proc/sys/kernel/sched_compat_yield to make sys_sched_yield()
      more aggressive, by moving the yielding task to the last position
      in the rbtree (a usage sketch follows this entry).
      
      with sched_compat_yield=0:
      
         PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
        2539 mingo     20   0  1576  252  204 R   50  0.0   0:02.03 loop_yield
        2541 mingo     20   0  1576  244  196 R   50  0.0   0:02.05 loop
      
      with sched_compat_yield=1:
      
         PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
        2584 mingo     20   0  1576  248  196 R   99  0.0   0:52.45 loop
        2582 mingo     20   0  1576  256  204 R    0  0.0   0:00.00 loop_yield
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      1799e35d
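      A small userspace sketch for flipping the knob; the path comes
      from the commit title, everything else is illustrative.

          #include <stdio.h>

          int main(void)
          {
                  /* "1" = aggressive yield (yielder moves to the
                   * rightmost rbtree position), "0" = default yield */
                  FILE *f = fopen("/proc/sys/kernel/sched_compat_yield", "w");

                  if (!f) {
                          perror("sched_compat_yield");
                          return 1;
                  }
                  fputs("1\n", f);
                  fclose(f);
                  return 0;
          }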
  4. 05 Sep 2007 (3 commits)
  5. 28 Aug 2007 (1 commit)
    • sched: make the scheduler converge to the ideal latency · f6cf891c
      Authored by Ingo Molnar
      De-HZ-ification of the granularity defaults unearthed a pre-existing
      property of CFS: while it correctly converges to the granularity
      goal, it does not prevent run-time fluctuations in the range
      [-gran ... 0 ... +gran].
      
      With the increase of the granularity due to the removal of HZ
      dependencies, this becomes visible in chew-max output (with 5 tasks
      running):
      
       out:  28 . 27. 32 | flu:  0 .  0 | ran:    9 .   13 | per:   37 .   40
       out:  27 . 27. 32 | flu:  0 .  0 | ran:   17 .   13 | per:   44 .   40
       out:  27 . 27. 32 | flu:  0 .  0 | ran:    9 .   13 | per:   36 .   40
       out:  29 . 27. 32 | flu:  2 .  0 | ran:   17 .   13 | per:   46 .   40
       out:  28 . 27. 32 | flu:  0 .  0 | ran:    9 .   13 | per:   37 .   40
       out:  29 . 27. 32 | flu:  0 .  0 | ran:   18 .   13 | per:   47 .   40
       out:  28 . 27. 32 | flu:  0 .  0 | ran:    9 .   13 | per:   37 .   40
      
      The average slice is the ideal 13 msecs and the period is a
      picture-perfect 40 msecs, but the 'ran' field fluctuates around
      13.33 msecs, and there is no mechanism in CFS to keep that from
      happening: it is a perfectly valid solution that CFS finds.
      
      To fix this, we add a granularity/preemption rule that knows about
      the "target latency": tasks that run longer than the ideal latency
      are made to run a bit less. The simplest approach is to decrease
      the preemption granularity when a task overruns its ideal latency.
      For this we have to track how much the task has executed since its
      last preemption (a hedged model follows this entry).
      
      (This adds a new field to task_struct, but we can eliminate that
      overhead in 2.6.24 by putting all the scheduler timestamps into an
      anonymous union.)
      
      With this change in place, chew-max output is fluctuation-free all
      around:
      
       out:  28 . 27. 39 | flu:  0 .  2 | ran:   13 .   13 | per:   41 .   40
       out:  28 . 27. 39 | flu:  0 .  2 | ran:   13 .   13 | per:   41 .   40
       out:  28 . 27. 39 | flu:  0 .  2 | ran:   13 .   13 | per:   41 .   40
       out:  28 . 27. 39 | flu:  0 .  2 | ran:   13 .   13 | per:   41 .   40
       out:  28 . 27. 39 | flu:  0 .  1 | ran:   13 .   13 | per:   41 .   40
       out:  28 . 27. 39 | flu:  0 .  1 | ran:   13 .   13 | per:   41 .   40
      
      This patch has no impact on any fastpath or on any globally
      observable scheduling property (unless you have sharp enough eyes
      to see millisecond-level ripples in glxgears smoothness :-).
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      f6cf891c
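      A hedged model of the rule (names are illustrative, not the
      patch's): track how long a task has run since its last preemption
      and shrink the effective granularity once it overruns the ideal
      latency, so the overrun is paid back instead of persisting.

          /* illustrative model, not kernel code */
          struct task_model {
                  unsigned long ran_since_preempt; /* ns since last preemption */
          };

          static unsigned long effective_granularity(const struct task_model *t,
                                                     unsigned long gran,
                                                     unsigned long ideal_latency)
          {
                  /* overran the ideal latency: preempt sooner next time */
                  if (t->ran_since_preempt > ideal_latency)
                          return gran / 2;
                  return gran;
          }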
  6. 26 Aug 2007 (3 commits)
  7. 25 Aug 2007 (2 commits)
  8. 23 Aug 2007 (5 commits)
    • sched: tweak the sched_runtime_limit tunable · 505c0efd
      Authored by Ingo Molnar
      Michael Gerdau reported CPU-usage weirdness with reniced tasks.
      Such symptoms can be caused by limit underruns, so double the
      sched_runtime_limit.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      505c0efd
    • sched: skip updating rq's next_balance under null SD · f549da84
      Authored by Suresh Siddha
      While playing with sched_smt_power_savings/sched_mc_power_savings,
      I found that although the scheduler domains are reconstructed when
      the sysfs settings change, rebalance_domains() can get triggered
      with a null domain on other cpus, which sets next_balance to
      jiffies + 60*HZ, resulting in no idle/busy balancing for 60
      seconds.

      Fix this (a sketch of the guard follows this entry).
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f549da84
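      A hedged sketch of the guard, with the shape assumed from the
      description above: only fold a freshly computed next_balance into
      the runqueue when at least one domain was actually scanned.

          /* abridged rebalance_domains() shape (assumed) */
          unsigned long next_balance = jiffies + 60*HZ;
          int update_next_balance = 0;

          for_each_domain(cpu, sd) {
                  /* ... balancing work computes 'interval' ... */
                  if (time_after(next_balance, sd->last_balance + interval)) {
                          next_balance = sd->last_balance + interval;
                          update_next_balance = 1;
                  }
          }

          /* under a null domain the loop body never runs, so don't
           * push the 60-second default into rq->next_balance */
          if (likely(update_next_balance))
                  rq->next_balance = next_balance;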
    • sched: fix broken SMT/MC optimizations · f8700df7
      Authored by Suresh Siddha
      On a four-package system with HT, the HT load-balancing
      optimizations were broken. For example, if two tasks end up
      running on two logical threads of one of the packages, the
      scheduler is not able to pull one of the tasks to a completely
      idle package.

      In this scenario, for nice-0 tasks, the imbalance calculated by
      the scheduler will be 512, and find_busiest_queue() will return 0
      (as each cpu's load of 1024 is greater than the imbalance, and
      each cpu has only one task running); the arithmetic is worked
      through after this entry.

      The MC scheduler optimizations get fixed by this patch in the
      same way.
      
      [ mingo@elte.hu: restored fair balancing by increasing the fuzz and
                       adding it back to the power decision, without the /2
                       factor. ]
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f8700df7
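      To make the arithmetic above concrete, a standalone model (the
      constants follow the 2.6.23 convention that one nice-0 task
      weighs 1024; the check's shape is an assumption):

          #include <stdio.h>

          int main(void)
          {
                  unsigned long cpu_load  = 1024; /* one nice-0 task per CPU */
                  unsigned long imbalance = 512;  /* balancer asks for half a task */

                  /* broken behaviour: a queue whose single task's load
                   * exceeds the requested imbalance is never picked as
                   * busiest, so the idle package stays idle */
                  if (cpu_load > imbalance)
                          printf("queue skipped: load %lu > imbalance %lu\n",
                                 cpu_load, imbalance);
                  return 0;
          }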
    • sched: fix sysctl directory permissions · c57baf1e
      Authored by Eric W. Biederman
      There are two remaining gotchas:

      - The directories have impossible permissions (writable).

      - The ctl_name for the kernel directory is inconsistent with
        everything else. It should be CTL_KERN.

      A hedged sketch of the corrected table entry follows this entry.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      c57baf1e
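      A hedged sketch of what the corrected entry would look like in
      the 2.6.23-era sysctl table format (the table and child names are
      assumptions): mode 0555 keeps the directory readable and
      searchable but never writable, and CTL_KERN is the consistent
      ctl_name.

          static struct ctl_table root_table[] = {
                  {
                          .ctl_name = CTL_KERN,   /* consistent with the rest */
                          .procname = "kernel",
                          .mode     = 0555,       /* r-xr-xr-x, not writable */
                          .child    = kern_table,
                  },
                  { .ctl_name = 0 }               /* terminator */
          };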
    • sched: sched_clock_idle_[sleep|wakeup]_event() · 2aa44d05
      Authored by Ingo Molnar
      Construct a more or less wall-clock time out of sched_clock(),
      using ACPI idle's existing knowledge of how much time we spent
      idling. This allows the rq clock to work around TSC-stops-in-C2
      and TSC-gets-corrupted-in-C3 types of problems.
      
      (Besides the scheduler's statistics, this also benefits blktrace
      and printk timestamps.)
      
      Furthermore, the precise before-C2/C3-sleep and after-C2/C3-wakeup
      callbacks allow the scheduler to get the most out of the period
      where the CPU has a reliable TSC. This results in slightly more
      precise task statistics. A usage sketch follows this entry.

      The ACPI bits were acked by Len.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: Len Brown <len.brown@intel.com>
      2aa44d05
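      A hedged sketch of how an idle driver would bracket a deep
      C-state with the two hooks from the title; the platform entry
      helper and the timestamping are hypothetical.

          /* the TSC is about to stop (C2) or be corrupted (C3):
           * checkpoint the rq clock first */
          sched_clock_idle_sleep_event();

          t1 = ktime_get();
          enter_deep_c_state();   /* hypothetical platform idle entry */
          t2 = ktime_get();

          /* on wakeup, report how long we idled so the rq clock
           * advances by roughly wall-clock time despite the TSC */
          sched_clock_idle_wakeup_event(ktime_to_ns(ktime_sub(t2, t1)));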
  9. 13 Aug 2007 (2 commits)
  10. 11 Aug 2007 (1 commit)
  11. 09 Aug 2007 (12 commits)