- 20 4月, 2008 40 次提交
-
-
由 Peter Zijlstra 提交于
Now that the group hierarchy can have an arbitrary depth the O(n^2) nature of RT task dequeues will really hurt. Optimize this by providing space to store the tree path, so we can walk it the other way. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Add some extra debug output so we can get a better overview of the full hierarchy. We print the cgroup path after each cfs_rq, so we can see what group we're looking at. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Implement SMP nice support for the full group hierarchy. On each load-balance action, compile a sched_domain wide view of the full task_group tree. We compute the domain wide view when walking down the hierarchy, and readjust the weights when walking back up. After collecting and readjusting the domain wide view, we try to balance the tasks within the task_groups. The current approach is a naively balance each task group until we've moved the targeted amount of load. Inspired by Srivatsa Vaddsgiri's previous code and Abhishek Chandra's H-SMP paper. XXX: there will be some numerical issues due to the limited nature of SCHED_LOAD_SCALE wrt to representing a task_groups influence on the total weight. When the tree is deep enough, or the task weight small enough, we'll run out of bits. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> CC: Abhishek Chandra <chandra@cs.umn.edu> CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Hidetoshi Seto 提交于
[rebased for sched-devel/latest] - Add a new cpuset file, having levels: sched_relax_domain_level - Modify partition_sched_domains() and build_sched_domains() to take attributes parameter passed from cpuset. - Fill newidle_idx for node domains which currently unused but might be required if sched_relax_domain_level become higher. - We can change the default level by boot option 'relax_domain_level='. Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Hidetoshi Seto 提交于
This patch introduces new feature of cpuset - sched domain customization. This version provides a per-cpuset file 'sched_relax_domain_level' that enable us to change the searching range of scheduler, which used to limit how many cpus the scheduler searches at some schedule events, such as wakening task and running out of runqueue. Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
multi level rt constraints Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Add the full parent<->child relation thing into task_groups as well. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
UID grouping doesn't actually have a task_group representing the root of the task_group tree. Add one. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Dhaval Giani 提交于
This patch makes the group scheduler multi hierarchy aware. [a.p.zijlstra@chello.nl: rt-parts and assorted fixes] Signed-off-by: NDhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Dhaval Giani 提交于
This patch allows tasks and groups to exist in the same cfs_rq. With this change the CFS group scheduling follows a 1/(M+N) model from a 1/(1+N) fairness model where M tasks and N groups exist at the cfs_rq level. [a.p.zijlstra@chello.nl: rt bits and assorted fixes] Signed-off-by: NDhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: NSrivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
Add a new function that accepts a pointer to the "newly allowed cpus" cpumask argument. int set_cpus_allowed_ptr(struct task_struct *p, const cpumask_t *new_mask) The current set_cpus_allowed() function is modified to use the above but this does not result in an ABI change. And with some compiler optimization help, it may not introduce any additional overhead. Additionally, to enforce the read only nature of the new_mask arg, the "const" property is migrated to sub-functions called by set_cpus_allowed. This silences compiler warnings. Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
Move the setting of nr_cpu_ids from sched_init() to start_kernel() so that it's available as early as possible. Note that an arch has the option of setting it even earlier if need be, but it should not result in a different value than the setup_nr_cpu_ids() function. Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
* Remove another cpumask_t variable from stack that was missed in the last kernel_sched_c updates. Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
* Add cpu_sysdev_class functions to display the following maps with cpulist_scnprintf(). cpu_online_map cpu_present_map cpu_possible_map * Small change to include/linux/sysdev.h to allow the attribute name and label to be different (to avoid collision with the "attr_online" entry for bringing cpus on- and off-line.) Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
* Cleaned up references to cpumask_scnprintf() and added new cpulist_scnprintf() interfaces where appropriate. * Fix some small bugs (or code efficiency improvments) for various uses of cpumask_scnprintf. * Clean up some checkpatch errors. Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
* Removed kmalloc (or local array) in show_shared_cpu_map(). * Added show_shared_cpu_list() function. Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
* Here is a simple patch to use an allocated array of cpumasks to represent cpumask_of_cpu() instead of constructing one on the stack. It's based on the Kconfig option "HAVE_CPUMASK_OF_CPU_MAP" which is currently only set for x86_64 SMP. Otherwise the the existing cpumask_of_cpu() is used but has been changed to produce an lvalue so a pointer to it can be used. Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
* Add a static cpumask_t variable "CPU_MASK_ALL_PTR" to use as a pointer reference to CPU_MASK_ALL. This reduces where possible the instances where CPU_MASK_ALL allocates and fills a large array on the stack. Used only if NR_CPUS > BITS_PER_LONG. * Change init/main.c to use new set_cpus_allowed_ptr(). Depends on: [sched-devel]: sched: add new set_cpus_allowed_ptr function Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
* Remove empty cpumask_t (and all non-zero/non-null) variables in SD_*_INIT macros. Use memset(0) to clear. Also, don't inline the initializer functions to save on stack space in build_sched_domains(). * Merge change to include/linux/topology.h that uses the new node_to_cpumask_ptr function in the nr_cpus_node macro into this patch. Depends on: [mm-patch]: asm-generic-add-node_to_cpumask_ptr-macro.patch [sched-devel]: sched: add new set_cpus_allowed_ptr function Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
* Use new node_to_cpumask_ptr. This creates a pointer to the cpumask for a given node. This definition is in mm patch: asm-generic-add-node_to_cpumask_ptr-macro.patch * Use new set_cpus_allowed_ptr function. Depends on: [mm-patch]: asm-generic-add-node_to_cpumask_ptr-macro.patch [sched-devel]: sched: add new set_cpus_allowed_ptr function [x86/latest]: x86: add cpus_scnprintf function Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Greg Banks <gnb@melbourne.sgi.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
* Modify sched_affinity functions to pass cpumask_t variables by reference instead of by value. * Use new set_cpus_allowed_ptr function. Depends on: [sched-devel]: sched: add new set_cpus_allowed_ptr function Cc: Paul Jackson <pj@sgi.com> Cc: Cliff Wickman <cpw@sgi.com> Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
* Modify cpuset_cpus_allowed to return the currently allowed cpuset via a pointer argument instead of as the function return value. * Use new set_cpus_allowed_ptr function. * Cleanup CPU_MASK_ALL and NODE_MASK_ALL uses. Depends on: [sched-devel]: sched: add new set_cpus_allowed_ptr function Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
* Use new set_cpus_allowed_ptr() function added by previous patch, which instead of passing the "newly allowed cpus" cpumask_t arg by value, pass it by pointer: -int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask) +int set_cpus_allowed_ptr(struct task_struct *p, const cpumask_t *new_mask) * Modify CPU_MASK_ALL Depends on: [sched-devel]: sched: add new set_cpus_allowed_ptr function Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
* Use new set_cpus_allowed_ptr() function added by previous patch, which instead of passing the "newly allowed cpus" cpumask_t arg by value, pass it by pointer: -int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask) +int set_cpus_allowed_ptr(struct task_struct *p, const cpumask_t *new_mask) * Cleanup uses of CPU_MASK_ALL. * Collapse other NR_CPUS changes to arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c Use pointers to cpumask_t arguments whenever possible. Depends on: [sched-devel]: sched: add new set_cpus_allowed_ptr function Cc: Len Brown <len.brown@intel.com> Cc: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
* Change fixed size arrays to per_cpu variables or dynamically allocated arrays in sched_init() and sched_init_smp(). (1) static struct sched_entity *init_sched_entity_p[NR_CPUS]; (1) static struct cfs_rq *init_cfs_rq_p[NR_CPUS]; (1) static struct sched_rt_entity *init_sched_rt_entity_p[NR_CPUS]; (1) static struct rt_rq *init_rt_rq_p[NR_CPUS]; static struct sched_group **sched_group_nodes_bycpu[NR_CPUS]; (1) - these arrays are allocated via alloc_bootmem_low() * Change sched_domain_debug_one() to use cpulist_scnprintf instead of cpumask_scnprintf. This reduces the output buffer required and improves readability when large NR_CPU count machines arrive. * In sched_create_group() we allocate new arrays based on nr_cpu_ids. Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
* Replace usages of CPU_MASK_NONE, CPU_MASK_ALL, NODE_MASK_NONE, NODE_MASK_ALL to reduce stack requirements for large NR_CPUS and MAXNODES counts. * In some cases, the cpumask variable was initialized but then overwritten with another value. This is the case for changes like this: - cpumask_t oldmask = CPU_MASK_ALL; + cpumask_t oldmask; Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
* Move large array "struct bootnode nodes" from stack to _initdata section to reduce amount of stack space required. Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
Create a simple macro to always return a pointer to the node_to_cpumask(node) value. This relies on compiler optimization to remove the extra indirection: #define node_to_cpumask_ptr(v, node) \ cpumask_t _##v = node_to_cpumask(node), *v = &_##v For those systems with a large cpumask size, then a true pointer to the array element can be used: #define node_to_cpumask_ptr(v, node) \ cpumask_t *v = &(node_to_cpumask_map[node]) A node_to_cpumask_ptr_next() macro is provided to access another node_to_cpumask value. The other change is to always include asm-generic/topology.h moving the ifdef CONFIG_NUMA to this same file. Note: there are no references to either of these new macros in this patch, only the definition. Based on 2.6.25-rc5-mm1 # alpha Cc: Richard Henderson <rth@twiddle.net> # fujitsu Cc: David Howells <dhowells@redhat.com> # ia64 Cc: Tony Luck <tony.luck@intel.com> # powerpc Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> # sparc Cc: David S. Miller <davem@davemloft.net> Cc: William L. Irwin <wli@holomorphy.com> # x86 Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
Change the following arrays sized by NR_CPUS to be PERCPU variables: static struct op_msrs cpu_msrs[NR_CPUS]; static unsigned long saved_lvtpc[NR_CPUS]; Also some minor complaints from checkpatch.pl fixed. Based on: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git All changes were transparent except for: static void nmi_shutdown(void) { + struct op_msrs *msrs = &__get_cpu_var(cpu_msrs); nmi_enabled = 0; on_each_cpu(nmi_cpu_shutdown, NULL, 0, 1); unregister_die_notifier(&profile_exceptions_nb); - model->shutdown(cpu_msrs); + model->shutdown(msrs); free_msrs(); } The existing code passed a reference to cpu 0's instance of struct op_msrs to model->shutdown, whilst the other functions are passed a reference to <this cpu's> instance of a struct op_msrs. This seemed to be a bug to me even though as long as cpu 0 and <this cpu> are of the same type it would have the same effect...? Cc: Philippe Elie <phil.el@wanadoo.fr> Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
* Change the following static arrays sized by NR_CPUS to per_cpu data variables: _cpuid4_info *cpuid4_info[NR_CPUS]; _index_kobject *index_kobject[NR_CPUS]; kobject * cache_kobject[NR_CPUS]; * Remove the local NR_CPUS array with a kmalloc'd region in show_shared_cpu_map(). Also some minor complaints from checkpatch.pl fixed. Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Travis 提交于
Add a new function cpumask_scnprintf_len() to return the number of characters needed to display "len" cpumask bits. The current method of allocating NR_CPUS bytes is incorrect as what's really needed is 9 characters per 32-bit word of cpumask bits (8 hex digits plus the seperator [','] or the terminating NULL.) This function provides the caller the means to allocate the correct string length. Cc: Paul Jackson <pj@sgi.com> Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Gregory Haskins 提交于
Signed-off-by: NGregory Haskins <ghaskins@novell.com> Acked-by: NSteven Rostedt <srostedt@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Dhaval Giani 提交于
Currently the schedstats implementation does not allow the statistics to be reset. This patch aims to allow that. echo 0 > cpuacct.usage resets the usage. Any other value is not allowed and returns -EINVAL. Signed-off-by: NDhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Dhaval Giani 提交于
Change the variable names to the common convention for the cpuacct subsystem. Signed-off-by: NDhaval Giani <dhaval@linux.vnet.ibm.com> Acked-by: NBalbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Olof Johansson 提交于
I noticed this when looking at an openswan issue. Openswan (ab?)uses the tasklet API to defer processing of packets in some situations, with one packet per tasklet_action(). I started noticing sequences of backwards-ordered sequence numbers coming over the wire, since new tasklets are always queued at the head of the list but processed sequentially. Convert it to instead append new entries to the tail of the list. As an extra bonus, the splicing code in takeover_tasklets() no longer has to iterate over the list. Signed-off-by: NOlof Johansson <olof@lixom.net> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Currently the rt group scheduling does a per cpu runtime limit, however the rt load balancer makes no guarantees about an equal spread of real- time tasks, just that at any one time, the highest priority tasks run. Solve this by making the runtime limit a global property by borrowing excessive runtime from the other cpus once the local limit runs out. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Various SMP balancing algorithms require that the bandwidth period run in sync. Possible improvements are moving the rt_bandwidth thing into root_domain and keeping a span per rt_bandwidth which marks throttled cpus. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-