1. 28 Jun 2006, 4 commits
  2. 27 Jun 2006, 2 commits
    • [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status · 495ab9c0
      Committed by Andi Kleen
      During some profiling I noticed that default_idle causes a lot of
      memory traffic. I think that is caused by the atomic operations
      to clear/set the polling flag in thread_info. There is actually
      no reason to make this atomic - only the idle thread does it
      to itself, other CPUs only read it. So I moved it into ti->status.
      
      Converted i386/x86-64/ia64 for now because that was the easiest
      way to fix ACPI which also manipulates these flags in its idle
      function.
      
      Cc: Nick Piggin <npiggin@novell.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Len Brown <len.brown@intel.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      495ab9c0
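      A minimal sketch of the change described above, using simplified stand-ins (the struct layout, field names and the TS_POLLING bit value are illustrative assumptions, not the kernel's exact definitions): the atomic read-modify-write on a shared flags word becomes a plain store to a status word that only the owning idle thread ever writes.

      struct thread_info_sketch {
              unsigned long flags;    /* shared; cleared/set with atomic ops */
              unsigned long status;   /* written only by the owning idle thread */
      };

      #define TS_POLLING 0x0004       /* illustrative bit value */

      /* Before: atomic bit update on a word other CPUs also write -> cache traffic. */
      static inline void poll_set_atomic(struct thread_info_sketch *ti)
      {
              __sync_fetch_and_or(&ti->flags, TS_POLLING);    /* like set_bit() */
      }

      /* After: the idle thread updates its own status word; remote CPUs only
       * read it, so a plain non-atomic store is enough. */
      static inline void poll_set_plain(struct thread_info_sketch *ti)
      {
              ti->status |= TS_POLLING;
      }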
    • [PATCH] sched: fix SCHED_FIFO bug in sys_sched_rr_get_interval() · b78709cf
      Committed by Peter Williams
      The introduction of the SCHED_BATCH scheduling class, with a value of 3, means
      that the expression (p->policy & SCHED_FIFO) will return true if the policy
      is SCHED_BATCH or SCHED_FIFO.
      
      Unfortunately, this expression is used in sys_sched_rr_get_interval()
      and in the absence of a comment to say that this is intentional I
      presume that it is unintentional and erroneous.
      
      The fix is to change the expression to (p->policy == SCHED_FIFO).
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      b78709cf
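      A small self-contained demonstration of the bug above. The policy constants match the 2.6-era values (SCHED_FIFO = 1, SCHED_BATCH = 3), so the low bit of SCHED_BATCH overlaps SCHED_FIFO and the bitwise AND misclassifies batch tasks, while the equality test does not:

      #include <stdio.h>

      #define SCHED_NORMAL 0
      #define SCHED_FIFO   1
      #define SCHED_RR     2
      #define SCHED_BATCH  3

      int main(void)
      {
              int policy = SCHED_BATCH;

              /* Buggy test: 3 & 1 == 1, so a SCHED_BATCH task is treated as FIFO. */
              printf("policy & SCHED_FIFO  = %d\n", policy & SCHED_FIFO);
              /* Fixed test: true only when the policy really is SCHED_FIFO. */
              printf("policy == SCHED_FIFO = %d\n", policy == SCHED_FIFO);
              return 0;
      }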
  3. 26 Jun 2006, 2 commits
  4. 23 Jun 2006, 2 commits
  5. 22 May 2006, 1 commit
  6. 26 Apr 2006, 1 commit
  7. 11 Apr 2006, 2 commits
  8. 01 Apr 2006, 7 commits
  9. 29 Mar 2006, 1 commit
  10. 28 Mar 2006, 4 commits
    • [PATCH] sched: fix group power for allnodes_domains · 08069033
      Committed by Siddha, Suresh B
      The current sched group power calculation for allnodes_domains is wrong.  We
      should really be using the cumulative power of the physical packages in that
      group (similar to the calculation in node_domains).
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      08069033
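      A hedged sketch of the intended calculation; the struct and helper here are simplified stand-ins for the kernel's sched_group, not the actual code. The power of an allnodes-level group is accumulated from the package-level groups it contains, mirroring the node_domains calculation:

      /* Simplified stand-in for a sched group linked into a circular list. */
      struct group_sketch {
              unsigned int power;
              struct group_sketch *next;      /* next sibling group */
      };

      /* Sum each physical package's power once, rather than using a flat
       * per-CPU default for the whole allnodes-level group. */
      static unsigned int allnodes_group_power(struct group_sketch *pkg_groups)
      {
              unsigned int power = 0;
              struct group_sketch *sg = pkg_groups;

              do {
                      power += sg->power;
                      sg = sg->next;
              } while (sg != pkg_groups);
              return power;
      }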
    • [PATCH] sched: new sched domain for representing multi-core · 1e9f28fa
      Committed by Siddha, Suresh B
      Add a new sched domain for representing multi-core with shared caches
      between cores.  Consider a dual package system, each package containing two
      cores and with the last level cache shared between cores within a package.  If
      there are two runnable processes, with this appended patch those two
      processes will be scheduled on different packages.
      
      On such systems, with this patch we have observed an 8% performance improvement
      with the specJBB (2 warehouse) benchmark and a 35% improvement with CFP2000 rate
      (with 2 users).
      
      This new domain will come into play only on multi-core systems with shared
      caches.  On other systems, this sched domain will be removed by domain
      degeneration code.  This new domain can also be used for implementing a power
      savings policy (see the OLS 2005 CMP kernel scheduler paper for more details;
      I will post another patch for power savings policy soon).
      
      Most of the arch/* file changes are for the cpu_coregroup_map() implementation.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      1e9f28fa
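      A toy illustration of what a cpu_coregroup_map()-style helper computes; the topology table and mask representation are assumptions made up for the example. Each CPU maps to the set of CPUs it shares a last-level cache with, and those sets become the groups of the new multi-core domain:

      #define NR_CPUS_SKETCH 4

      /* Assumed topology: CPUs 0 and 1 share one package's cache,
       * CPUs 2 and 3 share the other's. */
      static const unsigned long core_siblings[NR_CPUS_SKETCH] = {
              0x3, 0x3, 0xc, 0xc
      };

      /* CPUs sharing a cache with 'cpu' form one sched group in the multi-core
       * domain; on single-core packages the map collapses to the CPU itself and
       * the domain degeneration code removes the extra level. */
      static unsigned long cpu_coregroup_map_sketch(int cpu)
      {
              return core_siblings[cpu];
      }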
    • [PATCH] Small schedule() optimization · 77e4bfbc
      Committed by Andreas Mohr
      Small schedule() micro-optimization.
      Signed-off-by: Andreas Mohr <andi@lisas.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      77e4bfbc
    • [PATCH] sched: fix task interactivity calculation · 013d3868
      Committed by Martin Andersson
      A truncation error in kernel/sched.c is triggered when the nice value is
      negative.  The affected code is used in the TASK_INTERACTIVE macro.
      
      The code is:
      #define SCALE(v1,v1_max,v2_max) \
      	(v1) * (v2_max) / (v1_max)
      
      which is used in this way:
      SCALE(TASK_NICE(p), 40, MAX_BONUS)
      
      The comments in the code say:
        * This part scales the interactivity limit depending on niceness.
        *
        * We scale it linearly, offset by the INTERACTIVE_DELTA delta.
        * Here are a few examples of different nice levels:
        *
        *  TASK_INTERACTIVE(-20): [1,1,1,1,1,1,1,1,1,0,0]
        *  TASK_INTERACTIVE(-10): [1,1,1,1,1,1,1,0,0,0,0]
        *  TASK_INTERACTIVE(  0): [1,1,1,1,0,0,0,0,0,0,0]
        *  TASK_INTERACTIVE( 10): [1,1,0,0,0,0,0,0,0,0,0]
        *  TASK_INTERACTIVE( 19): [0,0,0,0,0,0,0,0,0,0,0]
        *
        * (the X axis represents the possible -5 ... 0 ... +5 dynamic
        *  priority range a task can explore, a value of '1' means the
        *  task is rated interactive.)
      
      However, the current code does not scale it linearly and the result differs
      from the given examples.  If the mathematical function "floor" is used when
      the nice value is negative instead of the truncation one gets when using
      integer division, the result conforms to the documentation.
      
      Output of TASK_INTERACTIVE when using the kernel code:
      nice    dynamic priorities
      -20     1     1     1     1     1     1     1     1     1     0     0
      -19     1     1     1     1     1     1     1     1     0     0     0
      -18     1     1     1     1     1     1     1     1     0     0     0
      -17     1     1     1     1     1     1     1     1     0     0     0
      -16     1     1     1     1     1     1     1     1     0     0     0
      -15     1     1     1     1     1     1     1     0     0     0     0
      -14     1     1     1     1     1     1     1     0     0     0     0
      -13     1     1     1     1     1     1     1     0     0     0     0
      -12     1     1     1     1     1     1     1     0     0     0     0
      -11     1     1     1     1     1     1     0     0     0     0     0
      -10     1     1     1     1     1     1     0     0     0     0     0
        -9     1     1     1     1     1     1     0     0     0     0     0
        -8     1     1     1     1     1     1     0     0     0     0     0
        -7     1     1     1     1     1     0     0     0     0     0     0
        -6     1     1     1     1     1     0     0     0     0     0     0
        -5     1     1     1     1     1     0     0     0     0     0     0
        -4     1     1     1     1     1     0     0     0     0     0     0
        -3     1     1     1     1     0     0     0     0     0     0     0
        -2     1     1     1     1     0     0     0     0     0     0     0
        -1     1     1     1     1     0     0     0     0     0     0     0
        0      1     1     1     1     0     0     0     0     0     0     0
        1      1     1     1     1     0     0     0     0     0     0     0
        2      1     1     1     1     0     0     0     0     0     0     0
        3      1     1     1     1     0     0     0     0     0     0     0
        4      1     1     1     0     0     0     0     0     0     0     0
        5      1     1     1     0     0     0     0     0     0     0     0
        6      1     1     1     0     0     0     0     0     0     0     0
        7      1     1     1     0     0     0     0     0     0     0     0
        8      1     1     0     0     0     0     0     0     0     0     0
        9      1     1     0     0     0     0     0     0     0     0     0
      10      1     1     0     0     0     0     0     0     0     0     0
      11      1     1     0     0     0     0     0     0     0     0     0
      12      1     0     0     0     0     0     0     0     0     0     0
      13      1     0     0     0     0     0     0     0     0     0     0
      14      1     0     0     0     0     0     0     0     0     0     0
      15      1     0     0     0     0     0     0     0     0     0     0
      16      0     0     0     0     0     0     0     0     0     0     0
      17      0     0     0     0     0     0     0     0     0     0     0
      18      0     0     0     0     0     0     0     0     0     0     0
      19      0     0     0     0     0     0     0     0     0     0     0
      
      Output of TASK_INTERACTIVE when using "floor"
      nice    dynamic priorities
      -20     1     1     1     1     1     1     1     1     1     0     0
      -19     1     1     1     1     1     1     1     1     1     0     0
      -18     1     1     1     1     1     1     1     1     1     0     0
      -17     1     1     1     1     1     1     1     1     1     0     0
      -16     1     1     1     1     1     1     1     1     0     0     0
      -15     1     1     1     1     1     1     1     1     0     0     0
      -14     1     1     1     1     1     1     1     1     0     0     0
      -13     1     1     1     1     1     1     1     1     0     0     0
      -12     1     1     1     1     1     1     1     0     0     0     0
      -11     1     1     1     1     1     1     1     0     0     0     0
      -10     1     1     1     1     1     1     1     0     0     0     0
        -9     1     1     1     1     1     1     1     0     0     0     0
        -8     1     1     1     1     1     1     0     0     0     0     0
        -7     1     1     1     1     1     1     0     0     0     0     0
        -6     1     1     1     1     1     1     0     0     0     0     0
        -5     1     1     1     1     1     1     0     0     0     0     0
        -4     1     1     1     1     1     0     0     0     0     0     0
        -3     1     1     1     1     1     0     0     0     0     0     0
        -2     1     1     1     1     1     0     0     0     0     0     0
        -1     1     1     1     1     1     0     0     0     0     0     0
         0     1     1     1     1     0     0     0     0     0     0     0
         1     1     1     1     1     0     0     0     0     0     0     0
         2     1     1     1     1     0     0     0     0     0     0     0
         3     1     1     1     1     0     0     0     0     0     0     0
         4     1     1     1     0     0     0     0     0     0     0     0
         5     1     1     1     0     0     0     0     0     0     0     0
         6     1     1     1     0     0     0     0     0     0     0     0
         7     1     1     1     0     0     0     0     0     0     0     0
         8     1     1     0     0     0     0     0     0     0     0     0
         9     1     1     0     0     0     0     0     0     0     0     0
        10     1     1     0     0     0     0     0     0     0     0     0
        11     1     1     0     0     0     0     0     0     0     0     0
        12     1     0     0     0     0     0     0     0     0     0     0
        13     1     0     0     0     0     0     0     0     0     0     0
        14     1     0     0     0     0     0     0     0     0     0     0
        15     1     0     0     0     0     0     0     0     0     0     0
        16     0     0     0     0     0     0     0     0     0     0     0
        17     0     0     0     0     0     0     0     0     0     0     0
        18     0     0     0     0     0     0     0     0     0     0     0
        19     0     0     0     0     0     0     0     0     0     0     0
      Signed-off-by: Martin Andersson <martin.andersson@control.lth.se>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Williams <pwil3058@bigpond.net.au>
      Cc: Con Kolivas <kernel@kolivas.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      013d3868
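      A standalone illustration of the truncation issue (MAX_BONUS = 10 is assumed for the example): C integer division truncates toward zero, which rounds negative scaled nice values up, whereas a floor-style division rounds them down and reproduces the documented linear behaviour.

      #include <stdio.h>

      #define MAX_BONUS 10    /* assumed value for the example */

      /* Truncating scale, as in the original SCALE() macro: integer division
       * rounds toward zero, i.e. negative results are rounded up. */
      static int scale_trunc(int v1, int v1_max, int v2_max)
      {
              return v1 * v2_max / v1_max;
      }

      /* Floor-based scale: always round toward negative infinity, matching the
       * linear behaviour described in the comment block. */
      static int scale_floor(int v1, int v1_max, int v2_max)
      {
              int num = v1 * v2_max;
              int q = num / v1_max;

              if ((num % v1_max) != 0 && ((num < 0) != (v1_max < 0)))
                      q--;    /* push the truncated quotient down to the floor */
              return q;
      }

      int main(void)
      {
              for (int nice_val = -20; nice_val <= 19; nice_val += 13)
                      printf("nice %3d: trunc=%d floor=%d\n", nice_val,
                             scale_trunc(nice_val, 40, MAX_BONUS),
                             scale_floor(nice_val, 40, MAX_BONUS));
              return 0;
      }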
  11. 27 Mar 2006, 1 commit
    • [PATCH] kretprobe instance recycled by parent process · c6fd91f0
      Committed by bibo mao
      When kretprobe probes the schedule() function, if the probed process exits
      then schedule() will never return, so some kretprobe instances will never
      be recycled.
      
      In this patch the parent process will recycle kretprobe instances of the
      probed function and there will be no memory leak of kretprobe instances.
      Signed-off-by: bibo mao <bibo.mao@intel.com>
      Cc: Masami Hiramatsu <hiramatu@sdl.hitachi.co.jp>
      Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      c6fd91f0
  12. 23 Mar 2006, 2 commits
    • [PATCH] make bug messages more consistent · 91368d73
      Committed by Ingo Molnar
      Consolidate all kernel bug printouts to begin with the "BUG: " string.
      Makes it easier to find them in large bootup logs.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      91368d73
    • [PATCH] fix scheduler deadlock · e9028b0f
      Committed by Anton Blanchard
      We have noticed lockups during boot when stress testing kexec on ppc64.
      Two cpus would deadlock in scheduler code trying to grab already taken
      spinlocks.
      
      The double_rq_lock code uses the address of the runqueue to order the
      taking of multiple locks.  This address is a per cpu variable:
      
      	if (rq1 < rq2) {
      		spin_lock(&rq1->lock);
      		spin_lock(&rq2->lock);
      	} else {
      		spin_lock(&rq2->lock);
      		spin_lock(&rq1->lock);
      	}
      
      On the other hand, the code in wake_sleeping_dependent uses the cpu id
      order to grab locks:
      
      	for_each_cpu_mask(i, sibling_map)
      		spin_lock(&cpu_rq(i)->lock);
      
      This means we rely on the address of per cpu data increasing as cpu ids
      increase.  While this will be true for the generic percpu implementation it
      may not be true for arch specific implementations.
      
      One way to solve this is to always take runqueues in cpu id order. To do
      this we add a cpu variable to the runqueue and check it in the
      double runqueue locking functions.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      e9028b0f
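      A hedged user-space sketch of the fix; pthread spinlocks and a cut-down runqueue stand in for the kernel's types. Once both lock paths order by the cpu id stored in the runqueue, they can no longer take the same pair of locks in opposite orders:

      #include <pthread.h>

      struct rq_sketch {
              int cpu;                        /* cpu id recorded in the runqueue */
              pthread_spinlock_t lock;
      };

      /* Take two runqueue locks in cpu-id order, matching the order used by the
       * for_each_cpu_mask() loop, instead of ordering by pointer address. */
      static void double_rq_lock_sketch(struct rq_sketch *rq1, struct rq_sketch *rq2)
      {
              if (rq1 == rq2) {
                      pthread_spin_lock(&rq1->lock);
              } else if (rq1->cpu < rq2->cpu) {
                      pthread_spin_lock(&rq1->lock);
                      pthread_spin_lock(&rq2->lock);
              } else {
                      pthread_spin_lock(&rq2->lock);
                      pthread_spin_lock(&rq1->lock);
              }
      }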
  13. 22 Mar 2006, 1 commit
  14. 12 Mar 2006, 1 commit
  15. 09 Mar 2006, 1 commit
  16. 07 Mar 2006, 1 commit
    • Add early-boot-safety check to cond_resched() · 8ba7b0a1
      Committed by Linus Torvalds
      Just to be safe, we should not trigger a conditional reschedule during
      the early boot sequence.  We've historically done some questionable things
      early on, and the safety warnings in __might_sleep() are generally
      turned off during that period, so there might be problems lurking.
      
      This affects CONFIG_PREEMPT_VOLUNTARY, which takes over might_sleep() to
      cause a voluntary conditional reschedule.
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      8ba7b0a1
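      A minimal sketch of the guard, with simplified stand-ins (the helper name and the system-state values are assumptions, not the kernel's exact code): before the system reaches its normal running state, the voluntary reschedule path simply does nothing.

      enum system_states_sketch { SYSTEM_BOOTING, SYSTEM_RUNNING };
      static enum system_states_sketch system_state_sketch = SYSTEM_BOOTING;

      /* Return non-zero only when a voluntary reschedule is allowed: never during
       * early boot, where the might_sleep() safety checks are also disabled. */
      static int resched_allowed_sketch(void)
      {
              if (system_state_sketch != SYSTEM_RUNNING)
                      return 0;
              return 1;       /* ...normal "reschedule if needed" path follows */
      }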
  17. 18 Feb 2006, 1 commit
    • [PATCH] Introduce CONFIG_DEFAULT_MIGRATION_COST · 4bbf39c2
      Committed by Ingo Molnar
      Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
      
        The boot sequence on s390 sometimes takes ages and we spend a very long
        time (up to one or two minutes) in calibrate_migration_costs.  The time
        spent there differs from boot to boot.  Also the calculated costs differ
        a lot.  I've seen differences of up to a factor of 15 (yes, factor, not
        percent).  Also I doubt that making these measurements makes much sense on
        a completely virtualized architecture where you cannot tell how much cpu
        time you will get anyway.
      
      So introduce the CONFIG_DEFAULT_MIGRATION_COST method for an architecture
      to set the scheduler migration costs.  This turns off automatic detection
      of migration costs.  Makes sense on virtual platforms, where migration
      costs are hard to measure accurately.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      4bbf39c2
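      A hedged sketch of how such a config switch can short-circuit boot-time calibration; only the CONFIG_DEFAULT_MIGRATION_COST name comes from the commit, and the variable layout is simplified:

      /* If the architecture supplies a fixed default, use it and skip the slow,
       * jitter-prone boot-time measurement; otherwise -1 means "calibrate". */
      #ifdef CONFIG_DEFAULT_MIGRATION_COST
      static unsigned long long migration_cost_sketch = CONFIG_DEFAULT_MIGRATION_COST;
      #else
      static unsigned long long migration_cost_sketch = -1ULL;
      #endif

      static void calibrate_migration_cost_sketch(void)
      {
              if (migration_cost_sketch != -1ULL)
                      return;         /* static default chosen via Kconfig */
              /* ... otherwise run the boot-time measurement here ... */
      }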
  18. 15 Feb 2006, 1 commit
    • [PATCH] sched: revert "filter affine wakeups" · d6077cb8
      Committed by Chen, Kenneth W
      Revert commit d7102e95:
      
          [PATCH] sched: filter affine wakeups
      
      It apparently caused a more than 10% performance regression for the aim7 benchmark.
      The setup in use is a 16-cpu HP rx8620 with 64GB of memory and 12 MSA1000s with
      144 disks.  Each disk is 72GB with a single ext3 filesystem (courtesy of HP, who
      supplied the benchmark results).
      
      The problem is, for aim7, the wake-up pattern is random, but it still needs
      load balancing action in the wake-up path to achieve best performance.  With
      the above commit, lack of load balancing hurts that workload.
      
      However, for workloads like database transaction processing, the requirement
      is exactly opposite.  In the wake up path, best performance is achieved with
      absolutely zero load balancing.  We simply wake up the process on the CPU
      it previously ran on.  Worst performance is obtained when we do load
      balancing at wake-up.
      
      There isn't an easy way to auto-detect the workload characteristics.  Ingo's
      earlier patch, which detects an idle CPU and decides whether to load balance
      or not, doesn't perform well with aim7 either, since all CPUs are busy (it
      causes an even bigger performance regression).
      
      Revert commit d7102e95, which causes more
      than 10% performance regression with aim7.
      Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      d6077cb8
  19. 11 Feb 2006, 1 commit
    • [PATCH] sched: remove smpnice · a2000572
      Committed by Nick Piggin
      I don't think the code is quite ready, which is why I asked for Peter's
      additions to also be merged before I acked it (although it turned out that
      it still isn't quite ready with his additions either).
      
      Basically I have had similar observations to Suresh in that it does not
      play nicely with the rest of the balancing infrastructure (and raised
      similar concerns in my review).
      
      The samples (group of 4) I got for "maximum recorded imbalance" on a 2x2
      SMP+HT Xeon are as follows:
      
                  | Following boot | hackbench 20        | hackbench 40
       -----------+----------------+---------------------+---------------------
       2.6.16-rc2 | 30,37,100,112  | 5600,5530,6020,6090 | 6390,7090,8760,8470
       +nosmpnice |  3, 2,  4,  2  |   28, 150, 294, 132 |  348, 348, 294, 347
      
      Hackbench raw performance is down around 15% with smpnice (but that in
      itself isn't a huge deal because it is just a benchmark).  However, the
      samples show that the imbalance passed into move_tasks is increased by
      about a factor of 10-30.  I think this would also go some way to explaining
      latency blips turning up in the balancing code (though I haven't actually
      measured that).
      
      We'll probably have to revert this in the SUSE kernel.
      
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Williams <pwil3058@bigpond.net.au>
      Cc: "Martin J. Bligh" <mbligh@aracnet.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      a2000572
  20. 06 Feb 2006, 2 commits
  21. 02 Feb 2006, 1 commit
    • [PATCH] sys_sched_getaffinity() & hotplug · 2f7016d9
      Committed by Jack Steiner
      Change sched_getaffinity() so that it returns a bitmap that indicates the
      legally schedulable cpus that a task is allowed to run on.
      
      Without this patch, if CONFIG_HOTPLUG_CPU is enabled, sched_getaffinity()
      unconditionally returns (at least on IA64) a mask with NR_CPUS bits set.
      This conveys no useful information except for a kernel compile option.
      
      This fixes a breakage we observed running recent kernels.  We have MPI jobs
      that use sched_getaffinity() to determine where to place their threads.
      Placing them on non-existent cpus is problematic :-)
      Signed-off-by: Jack Steiner <steiner@sgi.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: Nathan Lynch <ntl@pobox.com>
      Cc: Paul Jackson <pj@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      2f7016d9
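      For reference, a small user-space consumer of the fixed behaviour, using only the standard glibc sched_getaffinity() API (nothing here is specific to the patch): it counts the CPUs the calling task may legally run on, which after the fix no longer includes bits for CPUs that do not exist.

      #define _GNU_SOURCE
      #include <sched.h>
      #include <stdio.h>

      int main(void)
      {
              cpu_set_t mask;
              int cpu, count = 0;

              /* pid 0 means "the calling task". */
              if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
                      perror("sched_getaffinity");
                      return 1;
              }
              for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
                      if (CPU_ISSET(cpu, &mask))
                              count++;
              printf("runnable on %d cpu(s)\n", count);
              return 0;
      }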
  22. 01 Feb 2006, 1 commit