- 09 Jan, 2009 5 commits
-
-
Committed by Li Zefan
Impact: reduce stack usage
Allocate a global cpumask_var_t at boot, and use it in cpuset_attach(), so we won't fail cpuset_attach().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mike Travis <travis@sgi.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Li Zefan
Impact: reduce stack usage
Just use cs->cpus_allowed; there is no need to allocate a cpumask_var_t.
Signed-off-by: Li Zefan <lizf@cn.fujistu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mike Travis <travis@sgi.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Li Zefan
This patchset converts cpuset to use the new cpumask API, and thus removes the on-stack cpumask_t to reduce stack usage.
Before:
  # cat kernel/cpuset.c include/linux/cpuset.h | grep -c cpumask_t
  21
After:
  # cat kernel/cpuset.c include/linux/cpuset.h | grep -c cpumask_t
  0
This patch:
Impact: reduce stack usage
It's safe to call cpulist_scnprintf inside callback_mutex, so we can just remove the cpumask_t; there is no need to allocate a cpumask_var_t.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mike Travis <travis@sgi.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Miao Xie
I found a bug on my dual-cpu box. I created a sub cpuset in the top cpuset and assigned 1 to its cpus. Then we attached some tasks to this sub cpuset. After this, we offlined CPU1. Now the tasks in this new cpuset are moved into the top cpuset automatically because there is no cpu in the sub cpuset. Then we online CPU1, and we find that all the tasks which didn't originally belong to the top cpuset just run on CPU0.
We fix this bug by setting the task's cpus_allowed to cpu_possible_map when attaching it to the top cpuset. This method doesn't need to modify the current behavior of cpusets on CPU hotplug, and all of the tasks in the top cpuset use cpu_possible_map to initialize their cpus_allowed.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
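The rule described in the commit above can be modelled in a few lines of user-space Python. This is only a sketch of the idea, not the kernel code: the function name `cpus_allowed_on_attach` and the use of plain sets for CPU masks are assumptions made for illustration.

```python
def cpus_allowed_on_attach(cpuset_cpus, is_top_cpuset, cpu_possible):
    """Model of the attach rule: a task attached to the top cpuset gets
    every possible CPU, any other cpuset hands out its own 'cpus'.

    All names here are hypothetical; masks are modelled as Python sets.
    """
    if is_top_cpuset:
        return set(cpu_possible)
    return set(cpuset_cpus)


# Dual-CPU scenario from the commit message: CPU1 is offline, so the top
# cpuset currently holds only CPU0, yet the possible map is {0, 1}.
moved_task_mask = cpus_allowed_on_attach({0}, True, {0, 1})
```

With this rule the moved task's mask already contains CPU1, so it can run there again once CPU1 comes back online.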
-
Committed by Lai Jiangshan
task_cs() calls task_subsys_state(). We must use rcu_read_lock() to protect cgroup_subsys_state().
It's correct that top_cpuset is never freed, but cgroup_subsys_state() accesses css_set, and this css_set may be freed when task_cs() is called. We use rcu_read_lock() to protect it.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 07 Jan, 2009 1 commit
-
-
Committed by David Rientjes
When cpusets are enabled, it's necessary to print the triggering task's set of allowable nodes so the subsequently printed meminfo can be interpreted correctly.
We also print the task's cpuset name for informational purposes.
[rientjes@google.com: task lock current before dereferencing cpuset]
Cc: Paul Menage <menage@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 13 Dec, 2008 1 commit
-
-
Committed by Rusty Russell
cpumask: change cpumask_scnprintf, cpumask_parse_user, cpulist_parse, and cpulist_scnprintf to take pointers.
Impact: change calling convention of existing cpumask APIs
Most cpumask functions started with cpus_: these have been replaced by cpumask_ ones which take struct cpumask pointers as expected.
These four functions don't have good replacement names; fortunately they're rarely used, so we just change them over.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: paulus@samba.org
Cc: mingo@redhat.com
Cc: tony.luck@intel.com
Cc: ralf@linux-mips.org
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: cl@linux-foundation.org
Cc: srostedt@redhat.com
-
- 30 Nov, 2008 1 commit
-
-
Committed by Ingo Molnar
this warning:
  kernel/cpuset.c: In function ‘generate_sched_domains’:
  kernel/cpuset.c:588: warning: ‘ndoms’ may be used uninitialized in this function
triggers because GCC does not recognize that ndoms stays uninitialized only if doms is NULL - but that flow is covered at the end of generate_sched_domains().
Help out GCC by initializing this variable to 0. (that's prudent anyway)
Also, this function needs a splitup and code flow simplification: with 160 lines length it's clearly too long.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 20 Nov, 2008 1 commit
-
-
Committed by Miao Xie
After adding a node into the machine, the top cpuset's mems isn't updated.
By reviewing the code, we found that the update function cpuset_track_online_nodes() was invoked after node_states[N_ONLINE] changes. This is wrong because N_ONLINE just means the node has a pgdat; if the node has or was given memory, we use N_HIGH_MEMORY. So we should invoke the update function after node_states[N_HIGH_MEMORY] changes, just like its commit says.
This patch fixes it. We also use the memory hotplug notifier instead of calling cpuset_track_online_nodes() directly.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 18 Nov, 2008 1 commit
-
-
Committed by Li Zefan
Impact: properly rebuild sched-domains on kmalloc() failure
When cpuset fails to generate sched domains due to kmalloc() failure, the scheduler should fall back to the single partition 'fallback_doms' and rebuild sched domains, but now it only destroys the sched domains and does not rebuild them.
The regression was introduced by:
| commit dfb512ec
| Author: Max Krasnyansky <maxk@qualcomm.com>
| Date: Fri Aug 29 13:11:41 2008 -0700
|
| sched: arch_reinit_sched_domains() must destroy domains to force rebuild
After the above commit, partition_sched_domains(0, NULL, NULL) will only destroy sched domains and partition_sched_domains(1, NULL, NULL) will create the default sched domain.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 20 Oct, 2008 2 commits
-
-
Committed by Lai Jiangshan
1) seq_file expects that m->count == m->size when its buf is full, so the current code will cause bugs when buf overflows.
2) It is not good for cpuset to access struct seq_file's fields directly.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Rakib Mullick
Remove the use of the int cpus_nonempty variable from the 'update_flag' function.
Signed-off-by: Md.Rakib H. Mullick <rakib.mullick@gmail.com>
Acked-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 03 Oct, 2008 1 commit
-
-
Committed by Frederic Weisbecker
This fixes a warning on latest -tip:
  kernel/cpuset.c: Dans la fonction «scan_for_empty_cpusets» :
  kernel/cpuset.c:1932: attention : passing argument 1 of «list_add_tail» discards qualifiers from pointer target type
Actually the struct cpuset *root passed as a parameter to scan_for_empty_cpusets is not supposed to be const, since an entry is added to the tail of its list. Just correct the qualifier.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 14 Sep, 2008 1 commit
-
-
Committed by Li Zefan
After the patch:
  commit 0b2f630a
  Author: Miao Xie <miaox@cn.fujitsu.com>
  Date: Fri Jul 25 01:47:21 2008 -0700
  cpusets: restructure the function update_cpumask() and update_nodemask()
It might happen that 'echo 0 > /cpuset/sub/cpus' returned failure but 'cpus' has been changed, because cpus was changed before calling heap_init() which may return -ENOMEM.
This patch restores the original behavior.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
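The ordering problem in the commit above (state changed before a step that can still fail) can be illustrated with a small user-space model. This is a hedged sketch rather than the kernel implementation: `CpusetState`, `update_cpumask_model` and the `alloc_heap` callable (standing in for heap_init()) are hypothetical names.

```python
import errno


class CpusetState:
    """Hypothetical stand-in for a cpuset; only tracks its 'cpus' set."""

    def __init__(self, cpus):
        self.cpus = set(cpus)


def update_cpumask_model(cs, new_cpus, alloc_heap):
    """Run the fallible step first and commit the new mask only on success.

    alloc_heap is an assumed callable modelling heap_init(): it returns 0
    on success or a negative errno such as -ENOMEM.
    """
    ret = alloc_heap()
    if ret:
        return ret            # failure: cs.cpus has not been touched
    cs.cpus = set(new_cpus)   # success: now it is safe to commit
    return 0
```

Because the allocation happens before the assignment, a failed write leaves 'cpus' exactly as it was, which is the behavior the commit says it restores.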
-
- 14 Aug, 2008 1 commit
-
-
Committed by Max Krasnyansky
This is an updated version of my previous cpuset patch on top of the latest mainline git. The patch fixes CPU hotplug handling issues in the current cpusets code, namely circular locking in rebuild_sched_domains() and unsafe access to the cpu_online_map in the cpuset cpu hotplug handler.
This version includes changes suggested by Paul Jackson (naming, comments, style, etc). I also got rid of the separate workqueue thread because it is now safe to call get_online_cpus() from workqueue callbacks.
Here are some more details:
rebuild_sched_domains() is the only way to rebuild sched domains correctly based on the current cpuset settings. What this means is that we need to be able to call it from different contexts, like cpu hotplug for example. Also, the latest scheduler code in -tip now calls rebuild_sched_domains() directly from functions like arch_reinit_sched_domains().
In order to support that properly we need to rework cpuset locking rules to avoid circular dependencies, which is what this patch does. New lock nesting rules are explained in the comments. We can now safely call rebuild_sched_domains() from virtually any context. The only requirement is that it needs to be called under get_online_cpus(). This allows cpu hotplug handlers and the scheduler to call rebuild_sched_domains() directly. The rest of the cpuset code now offloads sched domains rebuilds to a workqueue (async_rebuild_sched_domains()).
This version of the patch addresses comments from the previous review. I fixed all misformatted comments and trailing spaces.
I also factored out the code that builds domain masks and split up CPU and memory hotplug handling. This was needed to simplify locking, to avoid unsafe access to the cpu_online_map from the mem hotplug handler, and in general to make things cleaner.
The patch passes moderate testing (building the kernel with -j 16, creating & removing domains and bringing cpus off/online at the same time) on a quad-core2 based machine.
It passes lockdep checks, even with preemptable RCU enabled. This time I also tested it with the suspend/resume path and everything is working as expected.
Signed-off-by: Max Krasnyansky <maxk@qualcomm.com>
Acked-by: Paul Jackson <pj@sgi.com>
Cc: menage@google.com
Cc: a.p.zijlstra@chello.nl
Cc: vegard.nossum@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 31 Jul, 2008 4 commits
-
-
Committed by Li Zefan
Use cpuset.stack_list rather than kfifo, so we avoid memory allocation for kfifo.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Li Zefan
When multiple cpusets overlap in their 'cpus' and hence form a single sched domain, the largest sched_relax_domain_level among them should be used. But when top_cpuset's sched_load_balance is set, its sched_relax_domain_level is used regardless of the sub-cpusets' values.
This patch fixes it by walking the cpuset hierarchy to find the largest sched_relax_domain_level.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
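The hierarchy walk mentioned in the commit above can be pictured with a small user-space model. This is a sketch under stated assumptions, not the kernel code: the `CpusetNode` class, its field names and the choice to skip cpusets with empty 'cpus' (along with their children, which can only hold subsets) are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class CpusetNode:
    """Hypothetical model of a cpuset in the hierarchy."""
    cpus: frozenset
    relax_domain_level: int = -1
    children: list = field(default_factory=list)


def max_relax_level(root):
    """Walk the tree iteratively and return the largest relax level seen.

    Cpusets with no CPUs are skipped together with their subtrees,
    since a child can only contain a subset of its parent's cpus.
    """
    best = -1
    stack = [root]
    while stack:
        cs = stack.pop()
        if not cs.cpus:
            continue
        best = max(best, cs.relax_domain_level)
        stack.extend(cs.children)
    return best
```

For a top cpuset at level 1 with a child at level 3, the walk yields 3, rather than the top cpuset's own value.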
-
Committed by Lai Jiangshan
All child cpusets contain a subset of the parent's cpus, so we can skip them when partitioning sched domains. This decreases 'csa' greatly for cpusets with a multi-level hierarchy.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Li Zefan
Clean up hierarchy traversal code.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 26 Jul, 2008 7 commits
-
-
Committed by Lai Jiangshan
cpuset_update_task_memory_state() has the local variable struct task_struct *tsk = current. tsk is used 14 times and the statement task_cs(tsk) is used twice in this function, so using task_cs(tsk) instead of task_cs(current) is better for readability.
The cast "(struct cgroup_scanner *)&scan" is also bad for readability. (and "container_of" is used in cpuset_do_move_task(), not "(cpuset_hotplug_scanner *)scan")
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Lai Jiangshan
cgroup(cgroup_scan_tasks) will initialize heap->gt for us. This patch removes started_after() and its helper-function.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Lai Jiangshan
I created lots of empty cpusets (empty cpumasks) and turned off "sched_load_balance" in the top cpuset.
I found that all these empty cpumasks are passed to partition_sched_domains() in rebuild_sched_domains(). This is very time-consuming for partition_sched_domains() and it is unnecessary.
This patch also reduces the memory consumed and some of the work done in rebuild_sched_domains().
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
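The effect described in the commit above can be sketched as a filter over candidate cpusets. This is an illustrative user-space model only: the tuple layout `(name, cpus, load_balance)` and the function name `collect_domain_candidates` are assumptions, and the real selection logic in rebuild_sched_domains() is more involved.

```python
def collect_domain_candidates(cpusets):
    """Return the names of cpusets worth passing on for partitioning.

    cpusets is an assumed list of (name, cpus, load_balance) tuples.
    A cpuset is kept only if it load-balances and has at least one CPU,
    so empty cpumasks never reach the partitioning step.
    """
    return [name for name, cpus, load_balance in cpusets
            if load_balance and cpus]


candidates = collect_domain_candidates([
    ("a", {0, 1}, True),
    ("empty1", set(), True),
    ("empty2", set(), True),
    ("b", {2}, False),
])
```

Only "a" survives here, so the list handed to the partitioning step stays small no matter how many empty cpusets exist.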
-
Committed by Li Zefan
When changing 'sched_relax_domain_level', don't rebuild sched domains if 'cpus' is empty or 'sched_load_balance' is not set.
Also make the comments of rebuild_sched_domains() more readable.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Miao Xie
The bug is that a task may run on a cpu/node which is not in its cpuset.cpus/cpuset.mems.
It can be reproduced by the following commands:
-----------------------------------
# mkdir /dev/cpuset
# mount -t cpuset xxx /dev/cpuset
# mkdir /dev/cpuset/0
# echo 0-1 > /dev/cpuset/0/cpus
# echo 0 > /dev/cpuset/0/mems
# echo $$ > /dev/cpuset/0/tasks
# echo 0 > /sys/devices/system/cpu/cpu1/online
# echo 1 > /sys/devices/system/cpu/cpu1/online
-----------------------------------
There is only CPU0 in cpuset.cpus, but the task in this cpuset runs on both CPU0 and CPU1.
This is because the task's cpus_allowed didn't get updated after the CPU offline/online manipulation. The same applies to mems_allowed.
This patch fixes this bug except for the root cpuset. There is an open question about the root cpuset: whether or not it is necessary to update all the tasks in the root cpuset after a cpu/node goes offline or online.
If we update them, some kernel threads which are bound to a specific cpu will be unbound.
If we don't, there is a bug in the root cpuset, also caused by offline/online manipulation. For example, take a dual-cpu machine. We create a sub cpuset in the root cpuset and assign 1 to its cpus. Then we attach some tasks to this sub cpuset. After this, we offline CPU1. Now the tasks in this new cpuset are moved into the root cpuset automatically because there is no cpu in the sub cpuset. Then we online CPU1, and we find that all the tasks which didn't originally belong to the root cpuset just run on CPU0.
Maybe we need to add a flag in the task_struct to mark which tasks can't be unbound?
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Acked-by: Paul Jackson <pj@sgi.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Paul Menage <menage@google.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Miao Xie
Extract two functions from update_cpumask() and update_nodemask(). They will be used later for updating tasks' cpus_allowed and mems_allowed after CPU/NODE offline/online.
[lizf@cn.fujitsu.com: build fix]
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Acked-by: Paul Jackson <pj@sgi.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Paul Menage
This patch tweaks the signatures of the update_cpumask() and update_nodemask() functions so that they can be called directly as handlers for the new cgroups write_string() method.
This allows cpuset_common_file_write() to be removed.
Signed-off-by: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 23 Jul, 2008 1 commit
-
-
Committed by Miao Xie
Fix wrong domain attr updates, or we will always update the first sched domain attr.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org> [2.6.26.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 18 Jul, 2008 1 commit
-
-
Committed by Max Krasnyansky
This is based on Linus' idea of creating cpu_active_map that prevents the scheduler load balancer from migrating tasks to the cpu that is going down.
It allows us to simplify domain management code and avoid unnecessary domain rebuilds during cpu hotplug event handling.
Please ignore the cpusets part for now. It needs some more work in order to avoid crazy lock nesting. I did, however, simplify and unify the domain reinitialization logic. We now simply call partition_sched_domains() in all the cases. This means that we're using the exact same code paths as in the cpusets case and hence the tests below cover cpusets too. Cpuset changes to make rebuild_sched_domains() callable from various contexts are in a separate patch (right after this one).
This not only boots but also easily handles
  while true; do make clean; make -j 8; done
and
  while true; do on-off-cpu 1; done
at the same time. (on-off-cpu 1 simply does the echo 0/1 > /sys/.../cpu1/online thing).
Surprisingly the box (dual-core Core2) is quite usable. In fact I'm typing this on it right now in gnome-terminal and things are moving just fine.
Also this is running with most of the debug features enabled (lockdep, mutex, etc), with no BUG_ONs or lockdep complaints so far.
I believe I addressed all of Dmitry's comments on Linus' original version. I changed both the fair and rt balancers to mask out non-active cpus, and replaced cpu_is_offline() with !cpu_active() in the main scheduler code where it made sense (to me).
Signed-off-by: Max Krasnyanskiy <maxk@qualcomm.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Gregory Haskins <ghaskins@novell.com>
Cc: dmitry.adamushko@gmail.com
Cc: pj@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 13 Jul, 2008 1 commit
-
-
Committed by Dmitry Adamushko
Commit f18f982a ("sched: CPU hotplug events must not destroy scheduler domains created by the cpusets") introduced a hotplug-related problem as described below:
Upon CPU_DOWN_PREPARE, update_sched_domains() -> detach_destroy_domains(&cpu_online_map) does the following:
/*
 * Force a reinitialization of the sched domains hierarchy. The domains
 * and groups cannot be updated in place without racing with the balancing
 * code, so we temporarily attach all running cpus to the NULL domain
 * which will prevent rebalancing while the sched domains are recalculated.
 */
The sched-domains should be rebuilt when a CPU_DOWN ops. has been completed, effectively either upon CPU_DEAD{_FROZEN} (upon success) or CPU_DOWN_FAILED{_FROZEN} (upon failure -- restore the things to their initial state). That's what update_sched_domains() also does but only for !CPUSETS case.
With f18f982a, sched-domains' reinitialization is delegated to CPUSETS code:
cpuset_handle_cpuhp() -> common_cpu_mem_hotplug_unplug() -> rebuild_sched_domains()
Being called for CPU_UP_PREPARE (and if its callback is called after update_sched_domains()), it just negates all the work done by update_sched_domains() -- i.e. a soon-to-be-offline cpu is included in the sched-domains and that makes it visible to the load-balancer while the CPU_DOWN ops. is in progress.
__migrate_live_tasks() moves the tasks off a 'dead' cpu (it's already "offline" when this function is called).
try_to_wake_up() is called for one of these tasks from another CPU -> the load-balancer (wake_idle()) picks up a "dead" CPU and places the task on it. Then e.g. BUG_ON(rq->nr_running) detects this a bit later -> oops.
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Paul Menage <menage@google.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: miaox@cn.fujitsu.com
Cc: rostedt@goodmis.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 19 Jun, 2008 2 commits
-
-
Committed by Li Zefan
We allow the inputs to be [-1 ... SD_LV_MAX), and return -EINVAL for inputs outside this range.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Acked-by: Paul Jackson <pj@sgi.com>
Acked-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
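The half-open range check from the commit above can be written out as a tiny user-space model. This is a sketch, not the kernel function: `check_relax_domain_level` is a hypothetical name, and the value 6 used for `SD_LV_MAX` is a placeholder assumption, since the real bound comes from the kernel's own definition.

```python
import errno

# Placeholder assumption; the real upper bound is the kernel's SD_LV_MAX.
SD_LV_MAX = 6


def check_relax_domain_level(val, sd_lv_max=SD_LV_MAX):
    """Return 0 if val lies in [-1, sd_lv_max), otherwise -EINVAL."""
    if val < -1 or val >= sd_lv_max:
        return -errno.EINVAL
    return 0
```

Note that the interval is closed at -1 and open at the upper bound, so -1 is accepted while a value equal to the bound is rejected.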
-
Committed by Max Krasnyansky
First issue is not related to the cpusets. We're simply leaking doms_cur. It's allocated in arch_init_sched_domains() which is called for every hotplug event. So we just keep reallocating doms_cur without freeing it. I introduced the free_sched_domains() function that cleans things up.
Second issue is that sched domains created by the cpusets are completely destroyed by the CPU hotplug events. For all CPU hotplug events the scheduler attaches all CPUs to the NULL domain and then puts them all into the single domain, thereby destroying domains created by the cpusets (partition_sched_domains). The solution is simple: when cpusets are enabled the scheduler should not create the default domain and should instead let cpusets do that, which is exactly what the patch does.
Signed-off-by: Max Krasnyansky <maxk@qualcomm.com>
Cc: pj@sgi.com
Cc: menage@google.com
Cc: rostedt@goodmis.org
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 10 Jun, 2008 1 commit
-
-
Committed by David Rientjes
Kthreads that have called kthread_bind() are bound to specific cpus, so other tasks should not be able to change their cpus_allowed from under them. Otherwise, it is possible to move kthreads, such as the migration or software watchdog threads, so they are not allowed access to the cpu they work on.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 07 Jun, 2008 1 commit
-
-
Committed by Lai Jiangshan
Adding a nonexistent cpu to a cpuset is silently ignored. It should return -EINVAL.
Example: (real_nr_cpus <= 4 < NR_CPUS or cpu#4 was just offline)
  # cat cpus
  0-1
  # /bin/echo 4 > cpus
  # /bin/echo $?
  0
  # cat cpus

  #
The same occurs when adding a nonexistent mem. This patch fixes this bug.
And when *buf == "", the check is unneeded.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Paul Jackson <pj@sgi.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
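The intended behavior in the commit above can be modelled in user space as a subset check on the parsed list. This is a hedged sketch and not the kernel code: `parse_cpulist`, `write_cpus` and the `available` set (standing in for whichever CPU map the kernel checks against) are hypothetical.

```python
import errno


def parse_cpulist(text):
    """Parse a list such as '0-1,4' into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty input yields an empty set
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def write_cpus(current, text, available):
    """Return (status, cpus): reject any CPU not in the available set.

    On rejection the current set is returned unchanged with -EINVAL.
    An empty string clears the set and needs no check, since the empty
    set is a subset of anything.
    """
    requested = parse_cpulist(text)
    if not requested <= available:
        return -errno.EINVAL, current
    return 0, requested
```

In the scenario from the commit message, writing "4" on a machine where only CPUs 0-3 are available now fails with -EINVAL and leaves "0-1" in place, instead of silently clearing the list.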
-
- 06 Jun, 2008 1 commit
-
-
Committed by Max Krasnyansky
First issue is not related to the cpusets. We're simply leaking doms_cur. It's allocated in arch_init_sched_domains() which is called for every hotplug event. So we just keep reallocating doms_cur without freeing it. I introduced the free_sched_domains() function that cleans things up.
Second issue is that sched domains created by the cpusets are completely destroyed by the CPU hotplug events. For all CPU hotplug events the scheduler attaches all CPUs to the NULL domain and then puts them all into the single domain, thereby destroying domains created by the cpusets (partition_sched_domains). The solution is simple: when cpusets are enabled the scheduler should not create the default domain and should instead let cpusets do that, which is exactly what the patch does.
Signed-off-by: Max Krasnyansky <maxk@qualcomm.com>
Cc: pj@sgi.com
Cc: menage@google.com
Cc: rostedt@goodmis.org
Cc: mingo@elte.hu
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 09 May, 2008 1 commit
-
-
Committed by Paul Menage
Due to a merge conflict, the sched_relax_domain_level control file was marked as being handled by cpuset_read/write_u64, but the code to handle it was actually in cpuset_common_file_read/write.
Since the value being written/read is in fact a signed integer, it should be treated as such; this patch adds cpuset_read/write_s64 functions, and uses them to handle the sched_relax_domain_level file.
With this patch, the sched_relax_domain_level can be read and written, and the correct contents seen/updated.
Signed-off-by: Paul Menage <menage@google.com>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 29 Apr, 2008 5 commits
-
-
Committed by Paul Menage
This flag provides the hardwalling properties of mem_exclusive, without enforcing the exclusivity. Either mem_hardwall or mem_exclusive is sufficient to prevent GFP_KERNEL allocations from passing outside the cpuset's assigned nodes.
Signed-off-by: Paul Menage <menage@google.com>
Acked-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Paul Menage
Currently the cpusets mem_exclusive flag is overloaded to mean both "no-overlapping" and "no GFP_KERNEL allocations outside this cpuset".
These patches add a new mem_hardwall flag with just the allocation restriction part of the mem_exclusive semantics, without breaking backwards-compatibility for those who continue to use just mem_exclusive. Additionally, the cgroup control file registration for cpusets is cleaned up to reduce boilerplate.
This patch:
This change tidies up the cpusets control file definitions, and reduces the amount of boilerplate required to add/change control files in the future.
Signed-off-by: Paul Menage <menage@google.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Adrian Bunk
Make the following needlessly global functions static:
- cpuset_test_cpumask()
- cpuset_change_cpumask()
- cpuset_do_move_task()
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Paul Menage
Many of the cpusets control files are simple integer values, which don't require the overhead of memory allocations for reads and writes.
Move the handlers for these control files into cpuset_read_u64() and cpuset_write_u64().
[akpm@linux-foundation.org: add missing `break']
Signed-off-by: Paul Menage <menage@google.com>
Cc: "Li Zefan" <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "YAMAMOTO Takashi" <yamamoto@valinux.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Harvey Harrison
kernel/cpuset.c:1268:52: warning: Using plain integer as NULL pointer
kernel/pid_namespace.c:95:24: warning: Using plain integer as NULL pointer
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Reviewed-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-