- 10 11月, 2007 9 次提交
-
-
由 Srivatsa Vaddagiri 提交于
Sukadev Bhattiprolu reported a kernel crash with control groups. There are couple of problems discovered by Suka's test: - The test requires the cgroup filesystem to be mounted with atleast the cpu and ns options (i.e both namespace and cpu controllers are active in the same hierarchy). # mkdir /dev/cpuctl # mount -t cgroup -ocpu,ns none cpuctl (or simply) # mount -t cgroup none cpuctl -> Will activate all controllers in same hierarchy. - The test invokes clone() with CLONE_NEWNS set. This causes a a new child to be created, also a new group (do_fork->copy_namespaces->ns_cgroup_clone-> cgroup_clone) and the child is attached to the new group (cgroup_clone-> attach_task->sched_move_task). At this point in time, the child's scheduler related fields are uninitialized (including its on_rq field, which it has inherited from parent). As a result sched_move_task thinks its on runqueue, when it isn't. As a solution to this problem, I moved sched_fork() call, which initializes scheduler related fields on a new task, before copy_namespaces(). I am not sure though whether moving up will cause other side-effects. Do you see any issue? - The second problem exposed by this test is that task_new_fair() assumes that parent and child will be part of the same group (which needn't be as this test shows). As a result, cfs_rq->curr can be NULL for the child. The solution is to test for curr pointer being NULL in task_new_fair(). With the patch below, I could run ns_exec() fine w/o a crash. Reported-by: NSukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: NSrivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
clean up the preemption check to not use unnecessary 64-bit variables. This improves code size: text data bss dec hex filename 44227 3326 36 47589 b9e5 sched.o.before 44201 3326 36 47563 b9cb sched.o.after Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
clean up the wakeup preemption check. No code changed: text data bss dec hex filename 44227 3326 36 47589 b9e5 sched.o.before 44227 3326 36 47589 b9e5 sched.o.after Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
wakeup preemption fix: do not make it dependent on p->prio. Preemption purely depends on ->vruntime. This improves preemption in mixed-nice-level workloads. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
remove PREEMPT_RESTRICT. (this is a separate commit so that any regression related to the removal itself is bisectable) Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Yanmin Zhang reported an aim7 regression and bisected it down to: | commit 38ad464d | Author: Ingo Molnar <mingo@elte.hu> | Date: Mon Oct 15 17:00:02 2007 +0200 | | sched: uniform tunings | | use the same defaults on both UP and SMP. fix this by reintroducing similar SMP tunings again. This resolves the regression. (also update the comments to match the ilog2(nr_cpus) tuning effect) Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
we lost the sched_min_granularity tunable to a clever optimization that uses the sched_latency/min_granularity ratio - but the ratio is quite unintuitive to users and can also crash the kernel if the ratio is set to 0. So reintroduce the min_granularity tunable, while keeping the ratio maintained internally. no functionality changed. [ mingo@elte.hu: some fixlets. ] Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Add a few comments to place_entity(). No code changed. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
vslice was missing a factor NICE_0_LOAD, as weight is in weight*NICE_0_LOAD units. the effect of this bug was larger initial slices and thus latency-noisier forks. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 30 10月, 2007 1 次提交
-
-
由 Ingo Molnar 提交于
fix style of swap() macro in kernel/sched_fair.c. ( this macro should eventually move to a general header, as ext3 uses a similar construct too. ) Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 25 10月, 2007 2 次提交
-
-
由 Peter Williams 提交于
At the moment, a lot of load balancing code that is irrelevant to non SMP systems gets included during non SMP builds. This patch addresses this issue and reduces the binary size on non SMP systems: text data bss dec hex filename 10983 28 1192 12203 2fab sched.o.before 10739 28 1192 11959 2eb7 sched.o.after Signed-off-by: NPeter Williams <pwil3058@bigpond.net.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Williams 提交于
At the moment, balance_tasks() provides low level functionality for both move_tasks() and move_one_task() (indirectly) via the load_balance() function (in the sched_class interface) which also provides dual functionality. This dual functionality complicates the interfaces and internal mechanisms and makes the run time overhead of operations that are called with two run queue locks held. This patch addresses this issue and reduces the overhead of these operations. Signed-off-by: NPeter Williams <pwil3058@bigpond.net.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 17 10月, 2007 1 次提交
-
-
由 Srivatsa Vaddagiri 提交于
Child task may be added on a different cpu that the one on which parent is running. In which case, task_new_fair() should check whether the new born task's parent entity should be added as well on the cfs_rq. Patch below fixes the problem in task_new_fair. This could fix the put_prev_task_fair() crashes reported. Reported-by: NKamalesh Babulal <kamalesh@linux.vnet.ibm.com> Reported-by: NAndy Whitcroft <apw@shadowen.org> Signed-off-by: NSrivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 15 10月, 2007 27 次提交
-
-
由 Ingo Molnar 提交于
reintroduce a simplified version of cache-hot/cold scheduling affinity. This improves performance with certain SMP workloads, such as sysbench. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
speed up context-switches a bit by not clearing p->exec_start. (as a side-effect, this also makes p->exec_start a universal timestamp available to cache-hot estimations.) Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
do not wakeup-preempt with SCHED_BATCH tasks, their preemption is batched too, driven by the tick. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
update comment: clarify time-slices and remove obsolete tuning detail. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Galbraith 提交于
Prevent wakeup over-scheduling. Once a task has been preempted by a task of the same or lower priority, it becomes ineligible for repeated preemption by same until it has been ticked, or slept. Instead, the task is marked for preemption at the next tick. Tasks of higher priority still preempt immediately. Signed-off-by: NMike Galbraith <efault@gmx.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Implement feature bit to disable forced preemption. This way it can be checked whether a workload is overscheduling or not. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Dmitry Adamushko 提交于
The following patch (sched: disable sleeper_fairness on SCHED_BATCH) seems to break GROUP_SCHED. Although, it may be 'oops'-less due to the possibility of 'p' being always a valid address. Signed-off-by: NDmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
disable sleeper fairness for batch tasks - they are about batch processing after all. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
unit mis-match: wakeup_gran was used against a vruntime Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
noticed by Peter Zijlstra: fix: move the CPU check into ->task_new_fair(), this way we can call place_entity() and get child ->vruntime right at initial wakeup time. (without this there can be large latencies) Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
-
由 Ingo Molnar 提交于
noticed by Thomas Gleixner: cleanup: function prototype cleanups - move into single line wherever possible. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
cleanup: rename SCHED_FEAT_USE_TREE_AVG to SCHED_FEAT_TREE_AVG, to make SCHED_FEAT_ names more consistent. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Dmitry Adamushko 提交于
make dequeue_entity() / enqueue_entity() and update_stats_dequeue() / update_stats_enqueue() look similar, structure-wise. zero effect, functionality-wise: text data bss dec hex filename 34550 3026 100 37676 932c sched.o.before 34550 3026 100 37676 932c sched.o.after Signed-off-by: NDmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Dmitry Adamushko 提交于
remove obsolete code -- calc_weighted() Signed-off-by: NDmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Alexey Dobriyan 提交于
* save ~300 bytes * activate_idle_task() was moved to avoid a warning bloat-o-meter output: add/remove: 6/0 grow/shrink: 0/16 up/down: 438/-733 (-295) <=== function old new delta __enqueue_entity - 165 +165 finish_task_switch - 110 +110 update_curr_rt - 79 +79 __load_balance_iterator - 32 +32 __task_rq_unlock - 28 +28 find_process_by_pid - 24 +24 do_sched_setscheduler 133 123 -10 sys_sched_rr_get_interval 176 165 -11 sys_sched_getparam 156 145 -11 normalize_rt_tasks 482 470 -12 sched_getaffinity 112 99 -13 sys_sched_getscheduler 86 72 -14 sched_setaffinity 226 212 -14 sched_setscheduler 666 642 -24 load_balance_start_fair 33 9 -24 load_balance_next_fair 33 9 -24 dequeue_task_rt 133 67 -66 put_prev_task_rt 97 28 -69 schedule_tail 133 50 -83 schedule 682 594 -88 enqueue_entity 499 366 -133 task_new_fair 317 180 -137 Signed-off-by: NAlexey Dobriyan <adobriyan@sw.ru> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
tweak wakeup granularity. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Dmitry Adamushko 提交于
The thing is that __pick_next_entity() must never be called when first_fair(cfs_rq) == NULL. It wouldn't be a problem, should 'run_node' be the very first field of 'struct sched_entity' (and it's the second). The 'nr_running != 0' check is _not_ enough, due to the fact that 'current' is not within the tree. Generic paths are ok (e.g. schedule() as put_prev_task() is called previously)... I'm more worried about e.g. migration_call() -> CPU_DEAD_FROZEN -> migrate_dead_tasks()... if 'current' == rq->idle, no problems.. if it's one of the SCHED_NORMAL tasks (or imagine, some other use-cases in the future -- i.e. we should not make outer world dependent on internal details of sched_fair class) -- it may be "Houston, we've got a problem" case. it's +16 bytes to the ".text". Another variant is to make 'run_node' the first data member of 'struct sched_entity' but an additional check (se ! = NULL) is still needed in pick_next_entity(). Signed-off-by: NDmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Ingo Molnar 提交于
Make vslice accurate wrt nice levels, and add some comments while we're at it. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu> Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Ingo Molnar 提交于
mark scheduling classes as const. The speeds up the code a bit and shrinks it: text data bss dec hex filename 40027 4018 292 44337 ad31 sched.o.before 40190 3842 292 44324 ad24 sched.o.after Signed-off-by: NIngo Molnar <mingo@elte.hu> Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Srivatsa Vaddagiri 提交于
There is a possibility that because of task of a group moving from one cpu to another, it may gain more cpu time that desired. See http://marc.info/?l=linux-kernel&m=119073197730334 for details. This is an attempt to fix that problem. Basically it simulates dequeue of higher level entities as if they are going to sleep. Similarly it simulate wakeup of higher level entities as if they are waking up from sleep. Signed-off-by: NSrivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Srivatsa Vaddagiri 提交于
Recent fix to check_preempt_wakeup() to check for preemption at higher levels caused a size bloat for !CONFIG_FAIR_GROUP_SCHED. Fix the problem. 42277 10598 320 53195 cfcb kernel/sched.o-before_this_patch 42216 10598 320 53134 cf8e kernel/sched.o-after_this_patch Signed-off-by: NSrivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Ingo Molnar 提交于
cleanup, remove stale comment. Signed-off-by: NIngo Molnar <mingo@elte.hu> Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Peter Zijlstra 提交于
speed up and simplify vslice calculations. [ From: Mike Galbraith <efault@gmx.de>: build fix ] Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
clean up min_vruntime use. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Dmitry Adamushko 提交于
fix yield bugs due to the current-not-in-rbtree changes: the task is not in the rbtree so rbtree-removal is a no-no. [ From: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>: build fix. ] also, nice code size reduction: kernel/sched.o: text data bss dec hex filename 38323 3506 24 41853 a37d sched.o.before 38236 3506 24 41766 a326 sched.o.after Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NDmitry Adamushko <dmitry.adamushko@gmail.com> Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Srivatsa Vaddagiri 提交于
group scheduler wakeup latency fix: when checking for preemption we must check cross-group too, not just intra-group. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Lee Schermerhorn noticed that set_leftmost() contains dead code, remove this. Reported-by: NLee Schermerhorn <Lee.Schermerhorn@hp.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
-