- 25 11月, 2008 19 次提交
-
-
由 Rusty Russell 提交于
Impact: (future) size reduction for large NR_CPUS. Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves space for small nr_cpu_ids but big CONFIG_NR_CPUS. cpumask_var_t is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: stack usage reduction, (future) size reduction for large NR_CPUS. Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves space for small nr_cpu_ids but big CONFIG_NR_CPUS. The fact cpupro_init is called both before and after the slab is available makes for an ugly parameter unfortunately. We also use cpumask_any_and to get rid of a temporary in cpupri_find. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: (future) size reduction for large NR_CPUS. Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves space for small nr_cpu_ids but big CONFIG_NR_CPUS. cpumask_var_t is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: stack usage reduction, (future) size reduction, cleanup Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves space for small nr_cpu_ids but big CONFIG_NR_CPUS. cpumask_var_t is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK. We can also use cpulist_parse() instead of doing it manually in isolated_cpu_setup. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: stack usage reduction Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves stack space. cpumask_var_t is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK. In this case, we always alloced, but we don't need to any more. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: stack usage reduction Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves space on the stack. cpumask_var_t is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK. Note the removal of the initializer of new_mask: since the first thing we did was "cpus_and(new_mask, new_mask, cpus_allowed)" I just changed that to "cpumask_and(new_mask, in_mask, cpus_allowed);". Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: stack usage reduction With some care, we can avoid needing a temporary cpumask (we can't really allocate here, since we can't fail). This version calls cpuset_cpus_allowed_locked() with the task_rq_lock held. I'm fairly sure this works, but there might be a deadlock hiding. And of course, we can't get rid of the last cpumask on stack until we can use cpumask_of_node instead of node_to_cpumask. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: stack usage reduction Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves space in the stack. cpumask_var_t is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK. Some jiggling here to make sure we always exit at the bottom (so we hit the free_cpumask_var there). Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: stack usage reduction Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves space in the stack. cpumask_var_t is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: stack usage reduction Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves space in the stack. cpumask_var_t is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: (future) size reduction for large NR_CPUS. Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves space for small nr_cpu_ids but big CONFIG_NR_CPUS. cpumask_var_t is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: (future) size reduction for large NR_CPUS. Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves space for small nr_cpu_ids but big CONFIG_NR_CPUS. cpumask_var_t is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK. def_root_domain is static, and so its masks are initialized with alloc_bootmem_cpumask_var. After that, alloc_cpumask_var is used. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: (future) size reduction for large NR_CPUS. Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves space for small nr_cpu_ids but big CONFIG_NR_CPUS. cpumask_var_t is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: (future) size reduction for large NR_CPUS. We move the 'cpumask' member of sched_group to the end, so when we kmalloc it we can do a minimal allocation: saves space for small nr_cpu_ids but big CONFIG_NR_CPUS. Similar trick for 'span' in sched_domain. This isn't quite as good as converting to a cpumask_var_t, as some sched_groups are actually static, but it's safer: we don't have to figure out where to call alloc_cpumask_var/free_cpumask_var. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: trivial wrap of member accesses This eases the transition in the next patch. We also get rid of a temporary cpumask in find_idlest_cpu() thanks to for_each_cpu_and, and sched_balance_self() due to getting weight before setting sd to NULL. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: use new API any_online_cpu() is a good name, but it takes a cpumask_t, not a pointer. There are several places where any_online_cpu() doesn't really want a mask arg at all. Replace all callers with cpumask_any() and cpumask_any_and(). Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: use new general API Using lots of allocs rather than one big alloc is less efficient, but who cares for this setup function? Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NMike Travis <travis@sgi.com> Acked-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Rusty Russell 提交于
Impact: trivial API conversion This is a simple conversion, but note that for_each_cpu() terminates with i >= nr_cpu_ids, not i == NR_CPUS like for_each_cpu_mask() did. I don't convert all of them: sd->span changes in a later patch, so change those iterators there rather than here. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
Impact: cleanup * use node_to_cpumask_ptr in place of node_to_cpumask to reduce stack requirements in sched.c Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 23 11月, 2008 2 次提交
-
-
由 Ingo Molnar 提交于
Impact: cleanup Eliminate #ifdefs in core code by using empty inline functions. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
Impact: use deeper function tracing depth safely Some tests showed that function return tracing needed a more deeper depth of function calls. But it could be unsafe to store these return addresses to the stack. So these arrays will now be allocated dynamically into task_struct of current only when the tracer is activated. Typical scheme when tracer is activated: - allocate a return stack for each task in global list. - fork: allocate the return stack for the newly created task - exit: free return stack of current - idle init: same as fork I chose a default depth of 50. I don't have overruns anymore. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 21 11月, 2008 1 次提交
-
-
由 Vegard Nossum 提交于
Impact: cleanup This commit: commit f7b4cddc Author: Oleg Nesterov <oleg@tv-sign.ru> Date: Tue Oct 16 23:30:56 2007 -0700 do CPU_DEAD migrating under read_lock(tasklist) instead of write_lock_irq(ta Currently move_task_off_dead_cpu() is called under write_lock_irq(tasklist). This means it can't use task_lock() which is needed to improve migrating to take task's ->cpuset into account. Change the code to call move_task_off_dead_cpu() with irqs enabled, and change migrate_live_tasks() to use read_lock(tasklist). ...forgot to update the comment in front of move_task_off_dead_cpu. Reference: http://lkml.org/lkml/2008/6/23/135Signed-off-by: NVegard Nossum <vegard.nossum@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 20 11月, 2008 1 次提交
-
-
由 Ken Chen 提交于
Impact: make load-balancing more consistent In the update_shares() path leading to tg_shares_up(), the calculation of per-cpu cfs_rq shares is rather erratic even under moderate task wake up rate. The problem is that the per-cpu tg->cfs_rq load weight used in the sd_rq_weight aggregation and actual redistribution of the cfs_rq->shares are collected at different time. Under moderate system load, we've seen quite a bit of variation on the cfs_rq->shares and ultimately wildly affects sched_entity's load weight. This patch caches the result of initial per-cpu load weight when doing the sum calculation, and then pass it down to update_group_shares_cpu() for redistributing per-cpu cfs_rq shares. This allows consistent total cfs_rq shares across all CPUs. It also simplifies the rounding and zero load weight check. Signed-off-by: NKen Chen <kenchen@google.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 18 11月, 2008 1 次提交
-
-
由 Li Zefan 提交于
Impact: properly rebuild sched-domains on kmalloc() failure When cpuset failed to generate sched domains due to kmalloc() failure, the scheduler should fallback to the single partition 'fallback_doms' and rebuild sched domains, but now it only destroys but not rebuilds sched domains. The regression was introduced by: | commit dfb512ec | Author: Max Krasnyansky <maxk@qualcomm.com> | Date: Fri Aug 29 13:11:41 2008 -0700 | | sched: arch_reinit_sched_domains() must destroy domains to force rebuild After the above commit, partition_sched_domains(0, NULL, NULL) will only destroy sched domains and partition_sched_domains(1, NULL, NULL) will create the default sched domain. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: <stable@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 17 11月, 2008 1 次提交
-
-
由 Oleg Nesterov 提交于
Impact: remove unnecessary accounting call I don't actually understand account_steal_time() and I failed to find the commit which added account_group_system_time(), but this looks bogus. In any case rq->idle must be single-threaded, so it can't have ->totals. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 16 11月, 2008 1 次提交
-
-
由 Mathieu Desnoyers 提交于
Impact: API *CHANGE*. Must update all tracepoint users. Add DEFINE_TRACE() to tracepoints to let them declare the tracepoint structure in a single spot for all the kernel. It helps reducing memory consumption, especially when declaring a lot of tracepoints, e.g. for kmalloc tracing. *API CHANGE WARNING*: now, DECLARE_TRACE() must be used in headers for tracepoint declarations rather than DEFINE_TRACE(). This is the sane way to do it. The name previously used was misleading. Updates scheduler instrumentation to follow this API change. Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 13 11月, 2008 1 次提交
-
-
由 Ingo Molnar 提交于
Maciej Rutecki reported: > I have this bug during suspend to disk: > > [ 188.592151] Enabling non-boot CPUs ... > [ 188.592151] SMP alternatives: switching to SMP code > [ 188.666058] BUG: using smp_processor_id() in preemptible > [00000000] > code: suspend_to_disk/2934 > [ 188.666064] caller is native_sched_clock+0x2b/0x80 Which, as noted by Linus, was caused by me, via: 7cbaef9c "sched: optimize sched_clock() a bit" Move the rq locking a bit earlier in the initialization sequence, that will make the sched_clock() call in init_idle() non-preemptible. Reported-by: NMaciej Rutecki <maciej.rutecki@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 12 11月, 2008 1 次提交
-
-
由 Balbir Singh 提交于
Impact: fix load balancer load average calculation accuracy cpu_avg_load_per_task() returns a stale value when nr_running is 0. It returns an older stale (caculated when nr_running was non zero) value. This patch returns and sets rq->avg_load_per_task to zero when nr_running is 0. Compile and boot tested on a x86_64 box. Signed-off-by: NBalbir Singh <balbir@linux.vnet.ibm.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 11 11月, 2008 2 次提交
-
-
由 Bharata B Rao 提交于
Impact: improve CPU time accounting of tasks under the cpu accounting controller Add hierarchical accounting to cpu accounting controller and include cpuacct documentation. Currently, while charging the task's cputime to its accounting group, the accounting group hierarchy isn't updated. This patch charges the cputime of a task to its accounting group and all its parent accounting groups. Reported-by: NSrivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: NBharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: NPaul Menage <menage@google.com> Acked-by: NBalbir Singh <balbir@linux.vnet.ibm.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Oleg Nesterov 提交于
Impact: fix hang/crash on ia64 under high load This is ugly, but the simplest patch by far. Unlike other similar routines, account_group_exec_runtime() could be called "implicitly" from within scheduler after exit_notify(). This means we can race with the parent doing release_task(), we can't just check ->signal != NULL. Change __exit_signal() to do spin_unlock_wait(&task_rq(tsk)->lock) before __cleanup_signal() to make sure ->signal can't be freed under task_rq(tsk)->lock. Note that task_rq_unlock_wait() doesn't care about the case when tsk changes cpu/rq under us, this should be OK. Thanks to Ingo who nacked my previous buggy patch. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NIngo Molnar <mingo@elte.hu> Reported-by: NDoug Chapman <doug.chapman@hp.com>
-
- 10 11月, 2008 1 次提交
-
-
由 Peter Zijlstra 提交于
Impact: clean up and fix debug info printout While looking over the sched_debug code I noticed that we printed the rq schedstats for every cfs_rq, ammend this. Also change nr_spead_over into an int, and fix a little buglet in min_vruntime printing. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 07 11月, 2008 4 次提交
-
-
由 Li Zefan 提交于
Impact: cleanup The #if/#endif is ugly. Change SCHED_CPUMASK_ALLOC and SCHED_CPUMASK_FREE to static inline functions. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Li Zefan 提交于
Impact: fix rare memory leak in the sched-domains manual reconfiguration code In the failure path, rd is not attached to a sched domain, so it causes a leak. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Li Zefan 提交于
Impact: re-add incorrectly eliminated sched domain layers (1) on i386 with SCHED_SMT and SCHED_MC enabled # mount -t cgroup -o cpuset xxx /mnt # echo 0 > /mnt/cpuset.sched_load_balance # mkdir /mnt/0 # echo 0 > /mnt/0/cpuset.cpus # dmesg CPU0 attaching sched-domain: domain 0: span 0 level CPU groups: 0 (2) on i386 with SCHED_MC enabled but SCHED_SMT disabled # same with (1) # dmesg CPU0 attaching NULL sched-domain. The bug is that some sched domains may be skipped unintentionally when degenerating (optimizing) sched domains. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Sripathi Kodi 提交于
We have a test case which measures the variation in the amount of time needed to perform a fixed amount of work on the preempt_rt kernel. We started seeing deterioration in it's performance recently. The test should never take more than 10 microseconds, but we started 5-10% failure rate. Using elimination method, we traced the problem to commit 1b12bbc7 (lockdep: re-annotate scheduler runqueues). When LOCKDEP is disabled, this patch only adds an additional function call to double_unlock_balance(). Hence I inlined double_unlock_balance() and the problem went away. Here is a patch to make this change. Signed-off-by: NSripathi Kodi <sripathik@in.ibm.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 05 11月, 2008 1 次提交
-
-
由 Peter Zijlstra 提交于
Impact: improve/change/fix wakeup-buddy scheduling Currently we only have a forward looking buddy, that is, we prefer to schedule to the task we last woke up, under the presumption that its going to consume the data we just produced, and therefore will have cache hot benefits. This allows co-waking producer/consumer task pairs to run ahead of the pack for a little while, keeping their cache warm. Without this, we would interleave all pairs, utterly trashing the cache. This patch introduces a backward looking buddy, that is, suppose that in the above scenario, the consumer preempts the producer before it can go to sleep, we will therefore miss the wakeup from consumer to producer (its already running, after all), breaking the cycle and reverting to the cache-trashing interleaved schedule pattern. The backward buddy will try to schedule back to the task that woke us up in case the forward buddy is not available, under the assumption that the last task will be the one with the most cache hot task around barring current. This will basically allow a task to continue after it got preempted. In order to avoid starvation, we allow either buddy to get wakeup_gran ahead of the pack. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NMike Galbraith <efault@gmx.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 04 11月, 2008 3 次提交
-
-
由 Li Zefan 提交于
Impact: cleanup, add debug check It's wrong to make dattr_new = NULL if doms_new == NULL, it introduces memory leak if dattr_new != NULL. Fortunately dattr_new is always NULL in this case. So remove the code and add a sanity check. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Li Zefan 提交于
Impact: cleanup The sysctl has been unregistered by partition_sched_domains(). Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Li Zefan 提交于
Impact: cleanup Just use the newly introduced sd->name. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 30 10月, 2008 1 次提交
-
-
由 Li Zefan 提交于
Impact: cleanup So handling of sched_features read is simplified. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-