- 15 9月, 2009 6 次提交
-
-
由 Peter Zijlstra 提交于
The problem with wake_idle() is that is doesn't respect things like cpu_power, which means it doesn't deal well with SMT nor the recent RT interaction. To cure this, it needs to do what sched_balance_self() does, which leads to the possibility of merging select_task_rq_fair() and sched_balance_self(). Modify sched_balance_self() to: - update_shares() when walking up the domain tree, (it only called it for the top domain, but it should have done this anyway), which allows us to remove this ugly bit from try_to_wake_up(). - do wake_affine() on the smallest domain that contains both this (the waking) and the prev (the wakee) cpu for WAKE invocations. Then use the top-down balance steps it had to replace wake_idle(). This leads to the dissapearance of SD_WAKE_BALANCE and SD_WAKE_IDLE_FAR, with SD_WAKE_IDLE replaced with SD_BALANCE_WAKE. SD_WAKE_AFFINE needs SD_BALANCE_WAKE to be effective. Touch all topology bits to replace the old with new SD flags -- platforms might need re-tuning, enabling SD_BALANCE_WAKE conditionally on a NUMA distance seems like a good additional feature, magny-core and small nehalem systems would want this enabled, systems with slow interconnects would not. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
We're going to want to drop rq->lock in try_to_wake_up() for a longer period of time, however we also want to deal with concurrent waking of the same task, which is currently handled by holding rq->lock. So introduce a new TASK state, namely TASK_WAKING, which indicates someone is already waking the task (other wakers will fail p->state & state). We also keep preemption disabled over the whole ttwu(). Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Rather ugly patch to fully place the sched_balance_self() code inside the fair class. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Move the sched_balance_self() code into sched_fair.c This facilitates the merger of sched_balance_self() and sched_fair::select_task_rq(). Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
In preparation to other code movement, move weighted_cpuload(), source_load() and target_load() before the class includes. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 04 9月, 2009 10 次提交
-
-
由 Ingo Molnar 提交于
This crash: [ 1774.088275] divide error: 0000 [#1] SMP [ 1774.100355] CPU 13 [ 1774.102498] Modules linked in: [ 1774.105631] Pid: 30881, comm: hackbench Not tainted 2.6.31-rc8-tip-01308-g484d664-dirty #1629 X8DTN [ 1774.114807] RIP: 0010:[<ffffffff81041c38>] [<ffffffff81041c38>] sched_balance_self+0x19b/0x2d4 Triggers because update_group_power() modifies the sd tree and does temporary calculations there - not considering that other CPUs could observe intermediate values, such as the zero initial value. Calculate it in a temporary variable instead. (we need no memory barrier as these are all statistical values anyway) Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090904092742.GA11014@elte.hu> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Its a source of fail, also, now that cpu_power is dynamical, its a waste of time. before: <idle>-0 [000] 132.877936: find_busiest_group: avg_load: 0 group_load: 8241 power: 1 after: bash-1689 [001] 137.862151: find_busiest_group: avg_load: 10636288 group_load: 10387 power: 1 [ v2: build fix from From: Andreas Herrmann ] Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Acked-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Acked-by: NGautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083826.425896304@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Gautham R Shenoy 提交于
sgs.group_capacity can now be 0, if for some reason group->__cpu_power happens to be less than SCHED_LOAD_SCALE/2. In that case, we need the following fix to make it work for update_sd_power_savings_stats(). That's because both sum_nr_running and group_capacity are unsigned longs. Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
When the capacity drops low, we want to migrate load away. Allow the load-balancer to remove all tasks when we hit rock bottom. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Acked-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Acked-by: NGautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083826.342231003@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Keep an average on the amount of time spend on RT tasks and use that fraction to scale down the cpu_power for regular tasks. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Acked-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Acked-by: NGautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083826.287778431@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Recompute the cpu_power for each cpu during load-balance. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Acked-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Acked-by: NGautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083826.162033479@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
The idea is that multi-threading a core yields more work capacity than a single thread, provide a way to express a static gain for threads. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Acked-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Acked-by: NGautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083826.073345955@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
In order to prepare for a more dynamic cpu_power, update the group sum while walking the sched domains during load-balance. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Acked-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Acked-by: NGautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083825.985050292@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Do the placement thing using SD flags. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Acked-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Acked-by: NGautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083825.897028974@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
cpu_power is supposed to be a representation of the process capacity of the cpu, not a value to randomly tweak in order to affect placement. Remove the placement hacks. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Acked-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Acked-by: NGautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083825.810860576@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 02 9月, 2009 1 次提交
-
-
由 Arjan van de Ven 提交于
For counting how long an application has been waiting for (disk) IO, there currently is only the HZ sample driven information available, while for all other counters in this class, a high resolution version is available via CONFIG_SCHEDSTATS. In order to make an improved bootchart tool possible, we also need a higher resolution version of the iowait time. This patch below adds this scheduler statistic to the kernel. Signed-off-by: NArjan van de Ven <arjan@linux.intel.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4A64B813.1080506@linux.intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 29 8月, 2009 1 次提交
-
-
由 Anirban Sinha 提交于
... so that it does not share a common name with a function within the same scope. Signed-off-by: NAnirban Sinha <asinha@zeugmasystems.com> LKML-Reference: <DDFD17CC94A9BD49A82147DDF7D545C501EA98A6@exchange.ZeugmaSystems.local> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 28 8月, 2009 1 次提交
-
-
由 Peter Zijlstra 提交于
When re-computing the shares for each task group's cpu representation we need the ratio of weight on each cpu vs the total weight of the sched domain. Since load-balancing is loosely (read not) synchronized, the weight of individual cpus can change between doing the sum and calculating the ratio. The previous patch dealt with only one of the race scenarios, this patch side steps them all by saving a snapshot of all the individual cpu weights, thereby always working on a consistent set. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: torvalds@linux-foundation.org Cc: jes@sgi.com Cc: jens.axboe@oracle.com Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <1251371336.18584.77.camel@twins> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 23 8月, 2009 1 次提交
-
-
由 Paul E. McKenney 提交于
Make RCU-sched, RCU-bh, and RCU-preempt be underlying implementations, with "RCU" defined in terms of one of the three. Update the outdated rcu_qsctr_inc() names, as these functions no longer increment anything. Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: akpm@linux-foundation.org Cc: mathieu.desnoyers@polymtl.ca Cc: josht@linux.vnet.ibm.com Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org LKML-Reference: <12509746132696-git-send-email-> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 21 8月, 2009 1 次提交
-
-
由 Peter Zijlstra 提交于
Patch a5004278 (sched: Fix cgroup smp fairness) introduced the possibility of a divide-by-zero because load-balancing is not synchronized between sched_domains. This can cause the state of cpus to change between the first and second loop over the sched domain in tg_shares_up(). Reported-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jes Sorensen <jes@sgi.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <1250855934.7538.30.camel@twins> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 20 8月, 2009 1 次提交
-
-
由 Hiroshi Shimamoto 提交于
Replace for loop with the macro for_each_class to cleanup. Signed-off-by: NHiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> LKML-Reference: <4A8A277D.4090304@ct.jp.nec.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 19 8月, 2009 12 次提交
-
-
由 Andreas Herrmann 提交于
Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818110229.GM29515@alberich.amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andreas Herrmann 提交于
... to further strip down __build_sched_domains(). Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818110111.GL29515@alberich.amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andreas Herrmann 提交于
For the sake of completeness. Now all calls to init_sched_build_groups() are contained in build_sched_groups(). Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818110013.GK29515@alberich.amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andreas Herrmann 提交于
... to further strip down __build_sched_domains(). Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105928.GJ29515@alberich.amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andreas Herrmann 提交于
... to further strip down __build_sched_domains(). Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105838.GI29515@alberich.amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andreas Herrmann 提交于
... to further strip down __build_sched_domains(). Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105751.GH29515@alberich.amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andreas Herrmann 提交于
... to further strip down __build_sched_domains(). Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105703.GG29515@alberich.amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andreas Herrmann 提交于
... to further strip down __build_sched_domains(). Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105614.GF29515@alberich.amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andreas Herrmann 提交于
... to further strip down __build_sched_domains(). Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105455.GE29515@alberich.amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andreas Herrmann 提交于
... to further strip down __build_sched_domains(). Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105406.GD29515@alberich.amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andreas Herrmann 提交于
Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105300.GC29515@alberich.amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andreas Herrmann 提交于
Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105152.GB29515@alberich.amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 02 8月, 2009 6 次提交
-
-
由 Peter Zijlstra 提交于
Add a lockdep helper to validate that we indeed are the owner of a lock. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Like sched_migrate_task(), set_cpus_allowed_ptr() should hold onto the migration thread too. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Gregory Haskins 提交于
Reflect "active" cpus in the rq->rd->online field, instead of the online_map. The motivation is that things that use the root-domain code (such as cpupri) only care about cpus classified as "active" anyway. By synchronizing the root-domain state with the active map, we allow several optimizations. For instance, we can remove an extra cpumask_and from the scheduler hotpath by utilizing rq->rd->online (since it is now a cached version of cpu_active_map & rq->rd->span). Signed-off-by: NGregory Haskins <ghaskins@novell.com> Acked-by: NPeter Zijlstra <peterz@infradead.org> Acked-by: NMax Krasnyansky <maxk@qualcomm.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090730145723.25226.24493.stgit@dev.haskins.net> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Gregory Haskins 提交于
We currently have an explicit "needs_post" vtable method which returns a stack variable for whether we should later run post-schedule. This leads to an awkward exchange of the variable as it bubbles back up out of the context switch. Peter Zijlstra observed that this information could be stored in the run-queue itself instead of handled on the stack. Therefore, we revert to the method of having context_switch return void, and update an internal rq->post_schedule variable when we require further processing. In addition, we fix a race condition where we try to access current->sched_class without holding the rq->lock. This is technically racy, as the sched-class could change out from under us. Instead, we reference the per-rq post_schedule variable with the runqueue unlocked, but with preemption disabled to see if we need to reacquire the rq->lock. Finally, we clean the code up slightly by removing the #ifdef CONFIG_SMP conditionals from the schedule() call, and implement some inline helper functions instead. This patch passes checkpatch, and rt-migrate. Signed-off-by: NGregory Haskins <ghaskins@novell.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090729150422.17691.55590.stgit@dev.haskins.net> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Steven Rostedt 提交于
The current method for pushing RT tasks after scheduling only happens after a context switch. But we found cases where a task is set up on a run queue to be pushed but the push never happens because the schedule chooses the same task. This bug was found with the help of Gregory Haskins and the use of ftrace (trace_printk). It tooks several days for both of us analyzing the code and the trace output to find this. Signed-off-by: NSteven Rostedt <srostedt@redhat.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090729042526.205923666@goodmis.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
When cgroup group scheduling is built in, skip some code paths if we don't have any (but the root) cgroups configured. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-