- 09 Jan 2009, 24 commits
-
-
Committed by Li Zefan

Impact: cleanups, use new cpumask API

Final trivial cleanups: mainly s/cpumask_t/struct cpumask/.

Note there is a FIXME in generate_sched_domains(): a future patch will change struct cpumask *doms to struct cpumask *doms[]. (I suppose Rusty will do this.)

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mike Travis <travis@sgi.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Li Zefan

Impact: use new cpumask API

This patch mainly does the following things:
- change cs->cpus_allowed from cpumask_t to cpumask_var_t
- call alloc_bootmem_cpumask_var() for top_cpuset in cpuset_init_early()
- call alloc_cpumask_var() for other cpusets
- replace cpus_xxx() with cpumask_xxx()

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mike Travis <travis@sgi.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
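For illustration, a minimal sketch of the conversion pattern, assuming a hypothetical struct my_set (the real code converts struct cpuset in kernel/cpuset.c):

    #include <linux/cpumask.h>
    #include <linux/slab.h>

    struct my_set {
            cpumask_var_t cpus_allowed;     /* was: cpumask_t cpus_allowed; */
    };

    static int my_set_init(struct my_set *s)
    {
            /* unlike cpumask_t, a cpumask_var_t must be allocated explicitly */
            if (!alloc_cpumask_var(&s->cpus_allowed, GFP_KERNEL))
                    return -ENOMEM;
            cpumask_clear(s->cpus_allowed);         /* cpus_clear() -> cpumask_clear() */
            return 0;
    }

    static void my_set_free(struct my_set *s)
    {
            free_cpumask_var(s->cpus_allowed);      /* a no-op without CONFIG_CPUMASK_OFFSTACK */
    }

With CONFIG_CPUMASK_OFFSTACK=y the mask lives on the heap and the struct only holds a pointer, which is what shrinks stack and structure footprints on large-NR_CPUS builds.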
-
Committed by Li Zefan

Impact: cleanups, reduce stack usage

This patch prepares for the next patch: once we convert cpuset.cpus_allowed to cpumask_var_t, the struct copy (trialcs = *cs) no longer works.

Another result of this patch is reduced stack usage for trialcs: sizeof(*cs) can be as large as 148 bytes on x86_64, so it's really not good to have it on the stack.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mike Travis <travis@sgi.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
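As a minimal sketch of the idea (struct cpuset's fields here are a stand-in), the trial copy moves from the stack to the heap:

    #include <linux/slab.h>
    #include <linux/string.h>

    struct cpuset {                 /* stand-in for the real one in kernel/cpuset.c */
            unsigned long flags;
            /* ... masks, pointers, etc. ... */
    };

    /* replaces "struct cpuset trialcs = *cs;" on the stack;
     * callers must kfree() the result when done */
    static struct cpuset *alloc_trial_cpuset(const struct cpuset *cs)
    {
            return kmemdup(cs, sizeof(*cs), GFP_KERNEL);
    }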
-
Committed by Li Zefan

Impact: reduce stack usage

Allocate a global cpumask_var_t at boot and use it in cpuset_attach(), so cpuset_attach() can no longer fail on a cpumask allocation.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mike Travis <travis@sgi.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Li Zefan

Impact: reduce stack usage

Just use cs->cpus_allowed directly; there is no need to allocate a cpumask_var_t.

Signed-off-by: Li Zefan <lizf@cn.fujistu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mike Travis <travis@sgi.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Li Zefan

This patchset converts cpuset to the new cpumask API, removing on-stack cpumask_t variables to reduce stack usage.

Before:
    # cat kernel/cpuset.c include/linux/cpuset.h | grep -c cpumask_t
    21
After:
    # cat kernel/cpuset.c include/linux/cpuset.h | grep -c cpumask_t
    0

This patch:

Impact: reduce stack usage

It's safe to call cpulist_scnprintf() inside callback_mutex, so we can simply drop the on-stack cpumask_t; there is no need to allocate a cpumask_var_t.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mike Travis <travis@sgi.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Miao Xie

I found a bug on my dual-CPU box. I created a sub-cpuset in the top cpuset and assigned CPU 1 to its cpus, then attached some tasks to this sub-cpuset. After this, we offlined CPU 1; the tasks in the sub-cpuset were moved into the top cpuset automatically because the sub-cpuset no longer had any CPU. When we onlined CPU 1 again, all the tasks which did not originally belong to the top cpuset kept running on CPU 0 only.

We fix this bug by setting a task's cpus_allowed to cpu_possible_map when attaching it to the top cpuset. This method doesn't modify the current behavior of cpusets on CPU hotplug, and all tasks in the top cpuset use cpu_possible_map to initialize their cpus_allowed.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Lai Jiangshan

task_cs() calls task_subsys_state(), so we must use rcu_read_lock() to protect cgroup_subsys_state(). It's true that top_cpuset is never freed, but cgroup_subsys_state() accesses the css_set, and that css_set may be freed while task_cs() is called. We use rcu_read_lock() to protect it.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Paul Menage

Add css_tryget(), which obtains a counted reference on a CSS. It is used in situations where the caller has a "weak" reference to the CSS, i.e. one that does not protect the cgroup from removal via a reference count, but would instead be cleaned up by a destroy() callback.

css_tryget() returns true on success, or false if the cgroup is being removed.

This is similar to Kamezawa Hiroyuki's patch from a week or two ago, with the difference that if css_tryget() races with a cgroup_rmdir(), css_tryget() will only return false if the cgroup really does get removed.

The implementation biases css->refcnt, so that a refcnt of 1 means "releasable" and 0 means "released or releasing". In the event of a race, css_tryget() distinguishes between "released" and "releasing" by checking for the CSS_REMOVED flag in css->flags.

Signed-off-by: Paul Menage <menage@google.com>
Tested-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
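The biased-refcount scheme described above can be sketched as follows (css_like is a stand-in for the real cgroup_subsys_state):

    #include <linux/atomic.h>
    #include <linux/bitops.h>

    enum { CSS_REMOVED };           /* bit number; stand-in for the real flag */

    struct css_like {
            atomic_t refcnt;        /* biased: 1 == alive with no extra refs */
            unsigned long flags;
    };

    static bool my_css_tryget(struct css_like *css)
    {
            /* the increment fails only when refcnt is 0, i.e. the CSS
             * is released or releasing */
            while (!atomic_inc_not_zero(&css->refcnt)) {
                    /* "released": the removal really completed */
                    if (test_bit(CSS_REMOVED, &css->flags))
                            return false;
                    /* "releasing": an rmdir is in flight and may yet
                     * fail and re-bias refcnt back to 1, so retry */
            }
            return true;
    }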
-
Committed by Paul Menage

These patches introduce new locking/refcount support for cgroups, to reduce the need for subsystems to call cgroup_lock(). This will ultimately allow the atomicity of cgroup_rmdir() (which was removed recently) to be restored.

These three patches give:
1/3 - introduce a per-subsystem hierarchy_mutex which a subsystem can use to prevent changes to its own cgroup tree
2/3 - use hierarchy_mutex in place of calling cgroup_lock() in the memory controller
3/3 - introduce a css_tryget() function similar to the one recently proposed by Kamezawa, but avoiding spurious refcount failures in the event of a race between a css_tryget() and an unsuccessful cgroup_rmdir()

Future patches will likely involve:
- using hierarchy_mutex in place of cgroup_lock() in more subsystems where appropriate
- restoring the atomicity of cgroup_rmdir() with respect to cgroup_create()

This patch:

Add a hierarchy_mutex to the cgroup_subsys object that protects changes to the hierarchy observed by that subsystem. It is taken by the cgroup subsystem (in addition to cgroup_mutex) for the following operations:
- linking a cgroup into that subsystem's cgroup tree
- unlinking a cgroup from that subsystem's cgroup tree
- moving the subsystem to/from a hierarchy (including across the bind() callback)

Thus if the subsystem holds its own hierarchy_mutex, it can safely traverse its own hierarchy.

Signed-off-by: Paul Menage <menage@google.com>
Tested-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Balbir Singh

Add support for building hierarchies in resource counters. Cgroups allows us to build a deep hierarchy, but we currently don't link the resource counters belonging to the memory controller control groups in the same fashion as the corresponding cgroup entries in the cgroup hierarchy. This patch provides the infrastructure for resource counters that have the same hierarchy as their cgroup counterparts.

This set of patches is based on the resource counter hierarchy patches posted by Pavel Emelianov.

NOTE: Building hierarchies is expensive; deeper hierarchies imply charging all the way up to the root. It is known that hierarchies are expensive, so the user needs to be careful and aware of the trade-offs before creating very deep ones.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Cc: Paul Menage <menage@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelianov <xemul@openvz.org>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
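To make the "charging all the way up to the root" cost concrete, here is a minimal sketch of a hierarchical charge with rollback; struct counter and the charge() helper are stand-ins, not the real res_counter code:

    #include <linux/spinlock.h>
    #include <linux/errno.h>

    struct counter {
            unsigned long usage, limit;
            spinlock_t lock;
            struct counter *parent;         /* NULL at the root */
    };

    static int charge(struct counter *c, unsigned long val)
    {
            struct counter *p, *undo;

            /* every ancestor up to the root is charged... */
            for (p = c; p; p = p->parent) {
                    spin_lock(&p->lock);
                    if (p->usage + val > p->limit) {
                            spin_unlock(&p->lock);
                            goto rollback;
                    }
                    p->usage += val;
                    spin_unlock(&p->lock);
            }
            return 0;

    rollback:
            /* ...and on failure, already-charged levels are undone */
            for (undo = c; undo != p; undo = undo->parent) {
                    spin_lock(&undo->lock);
                    undo->usage -= val;
                    spin_unlock(&undo->lock);
            }
            return -ENOMEM;
    }

The per-level locking is why deep hierarchies make every charge proportionally more expensive.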
-
Committed by Paul Menage

Fix races with /proc/sched_debug by freeing cgroup objects via an RCU callback. Thus any cgroup reference obtained from an RCU-safe source will remain valid during the RCU section. Since dentries are also RCU-safe, this allows us to traverse up the tree safely.

Additionally, make cgroup_path() check for a NULL cgrp->dentry, to avoid trying to report a path for a partially created cgroup.

[lizf@cn.fujitsu.com: call deactivate_super() in cgroup_diput()]
Signed-off-by: Paul Menage <menage@google.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Tested-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Gowrishankar M

Once a cgroup's tasks file is populated from the system namespace, a container listing those tasks sees tasks from other namespaces shown as 0. Though this is expected behaviour from the container's end, there is no use in showing these unwanted 0s. With this patch, we check whether a process is visible in the reader's namespace before loading it into the pid array.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Gowrishankar M <gowrishankar.m@in.ibm.com>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
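A sketch of the check, assuming the standard pid-namespace helpers (load_pids is a hypothetical name): task_pid_vnr() already returns 0 for a task that is invisible in the caller's pid namespace, so such tasks can simply be skipped:

    #include <linux/pid.h>
    #include <linux/sched.h>

    static int load_pids(struct task_struct *tsk, pid_t *pidarray, int n)
    {
            /* 0 means "not visible in the current pid namespace" */
            pid_t pid = task_pid_vnr(tsk);

            if (pid > 0)
                    pidarray[n++] = pid;
            return n;
    }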
-
Committed by Li Zefan

Add a common function, link_css_set(), to link a css_set to a cgroup.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Li Zefan

For an inactive hierarchy we have subsys->root == &rootnode, but rootnode's subsys_list is always empty. This conflicts with the code in find_css_set():

    for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
        ...
        if (ss->root->subsys_list.next == &ss->sibling) {
            ...
        }
    }
    if (list_empty(&rootnode.subsys_list)) {
        ...
    }

The above code assumes rootnode.subsys_list links all inactive hierarchies.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Li Zefan

Don't link rootnode into root_list, so that root_list contains active hierarchies only, as the comment indicates, and rename for_each_root() to for_each_active_root(). Also remove a redundant check in cgroup_kill_sb().

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Lai Jiangshan

cgroup_iter_* do not need rcu_read_lock(). In cgroup_enable_task_cg_lists(), do_each_thread() and while_each_thread() are protected by RCU; this is OK because write_lock(&css_set_lock) implies rcu_read_lock() on a non-RT kernel. If we ever need an explicit rcu_read_lock(), it should be added in cgroup_enable_task_cg_lists(), not in cgroup_iter_*.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Lai Jiangshan

In cgroup_attach_task(), tsk may exit while we call find_css_set(), in which case find_css_set() would access an invalid css_set. This patch increases the count before get_css_set() and decreases it after find_css_set().

NOTE: the css_set's refcount is also the task count. After this patch is applied, the task count may be off by one when cgroup_lock() is not held, but I reviewed the other code which uses the task count and it is still correct. No regression was found by review and simple testing, so I do not use two counters in css_set (one for the task count, the other for the refcount, like struct mm_struct). If this fix causes a regression, we will switch to two counters in css_set.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Lai Jiangshan

Use task_lock() to protect tsk->cgroups and get_css_set(tsk->cgroups).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Lai Jiangshan

We don't access struct cgroupfs_root in any fast path, so we should not mark struct cgroupfs_root as RCU-protected; but the comment on struct cgroup_subsys.root suggests otherwise. struct cgroup_subsys.root is used in these places:

1. find_css_set():
    if (ss->root->subsys_list.next == &ss->sibling)
2. rebind_subsystems():
    if (ss->root != &rootnode)
    rcu_assign_pointer(ss->root, root);
    rcu_assign_pointer(subsys[i]->root, &rootnode);
3. cgroup_has_css_refs():
    if (ss->root != cgrp->root)
4. cgroup_init_subsys():
    ss->root = &rootnode;
5. proc_cgroupstats_show():
    ss->name, ss->root->subsys_bits, ss->root->number_of_cgroups, !ss->disabled);
6. cgroup_clone():
    root = subsys->root;
    if ((root != subsys->root) ||

In all these places we either hold cgroup_lock() or do not dereference struct cgroupfs_root. This means we don't need RCU to use struct cgroup_subsys.root, and we should not mark struct cgroupfs_root as RCU-protected.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reviewed-by: Paul Menage <menage@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Lai Jiangshan

We access res->cgroups without task_lock(), so res->cgroups may change under us; it's unreliable, and "if (l == &res->cgroups->tasks)" may stay false forever. We don't need to add any lock to fix this: we just reach struct css_set through struct cg_cgroup_link rather than through struct task_struct. Since we hold css_set_lock, struct cg_cgroup_link is reliable.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reviewed-by: Paul Menage <menage@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Lai Jiangshan

By the time cgroup_post_fork() is called, the child is already visible to find_task_by_vpid(), so child->cgroups may have been changed, leading to an inconsistent result: the old child->cgroups' refcnt is decreased and the new child->cgroups' refcnt is increased, yet child->cg_list is added to the old child->cgroups' list.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reviewed-by: Paul Menage <menage@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Li Zefan

I happened to find that the spinlock in struct ns_cgroup is never used; remove it.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Li Zefan

- In cgroup_clone(), if vfs_mkdir() returns successfully, dentry->d_fsdata will be the pointer to the newly created cgroup and won't be NULL.
- A cgroup file's dentry->d_fsdata won't be NULL; this is guaranteed by cgroup_add_file().
- When walking through the subsystems of a cgroup_fs (using for_each_subsys), cgrp->subsys[ss->subsys_id] won't be NULL; this is guaranteed by cgroup_create().

(Also remove 2 unused variables in cgroup_rmdir().)

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 08 Jan 2009, 4 commits
-
-
Committed by Heiko Carstens

disable_nonboot_cpus() calls _cpu_down(). But _cpu_down() requires that the caller has already created the stop_machine workqueue (as cpu_down() does); otherwise a call to stop_machine() will lead to accesses to random memory regions.

When introducing this new interface (9ea09af3 "stop_machine: introduce stop_machine_create/destroy") I missed the second call site of _cpu_down(). So add the missing stop_machine_create/destroy calls to disable_nonboot_cpus() as well.

Fixes suspend-to-ram/disk and also this bug:

    [ 286.547348] BUG: unable to handle kernel paging request at 6b6b6b6b
    [ 286.548940] IP: [<c0150ca4>] __stop_machine+0x88/0xe3
    [ 286.550598] Oops: 0002 [#1] SMP
    [ 286.560580] Pid: 3273, comm: halt Not tainted (2.6.28-06127-g238c6d54
    [ 286.560580] EIP: is at __stop_machine+0x88/0xe3
    [ 286.560580] Process halt (pid: 3273, ti=f1a28000 task=f4530f30
    [ 286.560580] Call Trace:
    [ 286.560580]  [<c03d04e4>] ? _cpu_down+0x10f/0x234
    [ 286.560580]  [<c012a57e>] ? disable_nonboot_cpus+0x58/0xdc
    [ 286.560580]  [<c01360c0>] ? kernel_poweroff+0x22/0x39
    [ 286.560580]  [<c0136301>] ? sys_reboot+0xde/0x14c
    [ 286.560580]  [<c01331b2>] ? complete_signal+0x179/0x191
    [ 286.560580]  [<c0133396>] ? send_signal+0x1cc/0x1e1
    [ 286.560580]  [<c03de418>] ? _spin_unlock_irqrestore+0x2d/0x3c
    [ 286.560580]  [<c0133b65>] ? group_send_signal_info+0x58/0x61
    [ 286.560580]  [<c0133b9e>] ? kill_pid_info+0x30/0x3a
    [ 286.560580]  [<c0133d49>] ? sys_kill+0x75/0x13a
    [ 286.560580]  [<c01a06cb>] ? mntput_no_expire+0x1f/0x101
    [ 286.560580]  [<c019b3b3>] ? dput+0x1e/0x105
    [ 286.560580]  [<c018ef87>] ? __fput+0x150/0x158
    [ 286.560580]  [<c0157abf>] ? audit_syscall_entry+0x137/0x159
    [ 286.560580]  [<c010329f>] ? sysenter_do_call+0x12/0x34

Reported-and-tested-by: "Justin P. Mattock" <justinmattock@gmail.com>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
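The shape of the fix, as a sketch against that era's stop_machine_create()/stop_machine_destroy() API (both since removed from the kernel; the per-CPU loop is elided):

    #include <linux/stop_machine.h>

    int my_disable_nonboot_cpus(void)
    {
            int error;

            /* _cpu_down() requires the stop_machine workqueue to exist,
             * just as cpu_down() ensures before calling it */
            error = stop_machine_create();
            if (error)
                    return error;

            /* ... loop calling _cpu_down() for each non-boot CPU ... */

            stop_machine_destroy();
            return error;
    }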
-
Committed by Arjan van de Ven

Device drivers that use pci_request_regions() (and similar APIs) have a reasonable expectation that they are the only ones accessing their device. As part of the e1000e hunt, we were afraid that some userland (X or some bootsplash stuff) was mapping the MMIO region that the driver thought it had exclusively, via /dev/mem or via various sysfs resource mappings.

This patch adds the option for device drivers to add their reserved regions to the "banned from /dev/mem use" list, so that both kernel memory and device-exclusive MMIO regions are banned. NOTE: This is only active when CONFIG_STRICT_DEVMEM is set.

In addition to the config option, a kernel parameter iomem=relaxed is provided for the cases where developers want to diagnose driver issues from userspace in the field.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
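A sketch of a driver opting in; my_probe and the "mydrv" name are hypothetical, while pci_request_regions_exclusive() is the opt-in variant this series adds:

    #include <linux/pci.h>

    static int my_probe(struct pci_dev *pdev, const struct pci_device_id *id)
    {
            int err;

            err = pci_enable_device(pdev);
            if (err)
                    return err;

            /* like pci_request_regions(), but also marks the BARs
             * IORESOURCE_EXCLUSIVE so /dev/mem mappings are refused
             * (only enforced with CONFIG_STRICT_DEVMEM) */
            err = pci_request_regions_exclusive(pdev, "mydrv");
            if (err)
                    pci_disable_device(pdev);
            return err;
    }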
-
Committed by Arjan van de Ven

While tracking the asynchronous calls during boot using the initcall_debug convention is useful, doing it once the kernel has finished booting is actually harmful, now that we use asynchronous operations post-boot as well.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
-
Committed by Arjan van de Ven

Right now, most of the kernel boot is strictly synchronous, so various hardware delays are incurred sequentially. In order to make the kernel boot faster, this patch introduces infrastructure to allow some of the initialization steps to run asynchronously, which will hide significant portions of the hardware delays in practice.

In order not to change device order and other similar observables, this patch does NOT do full parallel initialization. Rather, it operates more in the way an out-of-order CPU does: the work may be done out of order and asynchronously, but the observable effects (instruction retirement, for the CPU) still happen in the original sequence.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
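A sketch of the resulting API with a hypothetical slow init step; async_synchronize_cookie() is what provides the "in-order retirement" of observable effects:

    #include <linux/async.h>
    #include <linux/init.h>

    static void my_slow_init(void *data, async_cookie_t cookie)
    {
            /* hardware delays here may overlap with other boot work... */

            /* ...but wait for all earlier-issued async work before
             * producing observable effects, like in-order retirement */
            async_synchronize_cookie(cookie);
            /* device registration and other visible steps go here */
    }

    static int __init my_driver_init(void)
    {
            async_schedule(my_slow_init, NULL);
            return 0;
    }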
-
- 07 Jan 2009, 12 commits
-
-
Committed by Peter Zijlstra

Vaidyanathan Srinivasan reported:

> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.28-autotest-tip-sv #1
> ---------------------------------------------
> klogd/5062 is trying to acquire lock:
> (&rq->lock){++..}, at: [<ffffffff8022aca2>] task_rq_lock+0x45/0x7e
>
> but task is already holding lock:
> (&rq->lock){++..}, at: [<ffffffff805f7354>] schedule+0x158/0xa31

with sched_mc set to 2 (it is off by default). Strictly speaking we'll not deadlock, because ttwu will not be able to place the migration task on our rq; but since the code can deal with both rqs getting unlocked, this seems the easiest way out.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Johannes Weiner

Prevent passing an order to bitmap_find_free_region() that is larger than the actual bitmap can represent. Such requests can come from device drivers that have no idea how big the DMA region is and need to rely on dma_alloc_from_coherent() to sort it out for them.

Reported-by: Guennadi Liakhovetski <lg@denx.de>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Dmitry Baryshkov <dbaryshkov@gmail.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
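The guard amounts to something like the following sketch (alloc_region is a hypothetical wrapper; the real check lives in dma_alloc_from_coherent()):

    #include <linux/bitmap.h>
    #include <linux/errno.h>

    static int alloc_region(unsigned long *bitmap, int bitmap_bits, int order)
    {
            /* reject requests bigger than the whole bitmap, instead of
             * letting bitmap_find_free_region() scan past its end */
            if ((1 << order) > bitmap_bits)
                    return -ENOMEM;

            return bitmap_find_free_region(bitmap, bitmap_bits, order);
    }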
-
Committed by Andrew Morton

This thing was rather stupidly coded. Rework it all prior to making changes.

Also, rename the local variable `page': kernel readers expect something called `page' to have type `struct page *'.

Cc: Guennadi Liakhovetski <lg@denx.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Dmitry Baryshkov <dbaryshkov@gmail.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Andrew Morton

If bitmap_find_free_region() fails and DMA_MEMORY_EXCLUSIVE is not set, the function fails to write anything to *ret and returns 1. This causes dma_alloc_coherent() to return an uninitialised value, crashing the kernel, perhaps via DMA to a random address.

Fix that by returning zero in this case, so the caller proceeds to allocate the memory from the generic memory allocator.

Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Dmitry Baryshkov <dbaryshkov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
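The caller contract this restores, as a sketch (my_dma_alloc is hypothetical, and the fallback path omits the arch-specific setting of *handle): a return of 1 means "handled here, use *ret"; a return of 0 means "fall back to the generic allocator":

    #include <linux/dma-mapping.h>
    #include <linux/gfp.h>
    #include <linux/mm.h>

    void *my_dma_alloc(struct device *dev, size_t size,
                       dma_addr_t *handle, gfp_t gfp)
    {
            void *ret;

            /* returns 1 if the per-device coherent area handled it */
            if (dma_alloc_from_coherent(dev, size, handle, &ret))
                    return ret;     /* may be NULL if DMA_MEMORY_EXCLUSIVE */

            /* 0: fall back to the generic allocator (sketch only;
             * *handle must still be set up by the real arch code) */
            return (void *)__get_free_pages(gfp, get_order(size));
    }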
-
Committed by Hidehiro Kawai

Introduce a new kernel parameter, `coredump_filter'. Setting a value for this parameter changes the default bitmask of coredump_filter. It is useful for users who want to change coredump_filter settings for the whole system at boot time; without this parameter, users have to change the coredump_filter setting for each /proc/<pid>/ in an initializing script.

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
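A sketch of how such a boot parameter can be wired up; default_filter stands in for the real per-mm default-flag plumbing, so the names here are assumptions:

    #include <linux/init.h>
    #include <linux/kernel.h>

    static unsigned long default_filter;    /* applied to each new mm */

    static int __init coredump_filter_setup(char *s)
    {
            /* accepts e.g. coredump_filter=0x23 on the command line */
            default_filter = simple_strtoul(s, NULL, 0);
            return 1;
    }
    __setup("coredump_filter=", coredump_filter_setup);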
-
Committed by Sukadev Bhattiprolu

POSIX requires si_pid to be the process id of the sender, so ->si_pid should really be set to 'tgid'. This change does alter behavior in two ways:
- When sending pdeath_signal on re-parenting to a sub-thread, ->si_pid can no longer be used to identify the thread that did the re-parenting, since it now shows the tgid instead of the thread id.
- A multi-threaded application that expects to find the specific thread that encountered a SIGPIPE using ->si_pid will now break.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Sukadev Bhattiprolu

For SEND_SIG_NOINFO, si_pid is currently set to the pid of the sender in the sender's active pid namespace. But if the receiver is in a different pid namespace, that pid is meaningless to it. E.g. when the parent sends the 'pdeath_signal' to a child that is in a descendant pid namespace, we should set si_pid to 0.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Randy Dunlap

Fix the varargs kernel-doc format in kmod.c: use @... instead of @varargs.

    Warning(kernel/kmod.c:67): Excess function parameter or struct member 'varargs' description in 'request_module'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
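For reference, the corrected kernel-doc form looks like this (comment text abridged):

    /**
     * request_module - try to load a kernel module
     * @fmt: printf-style format string for the module name
     * @...: arguments as specified in the format string
     */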
-
Committed by Masami Hiramatsu

Allow kprobes to probe module __init routines. When __init functions are freed, kprobes which probe those functions have their "Gone" flag set. These "Gone" probes are disarmed from the code and can never be enabled again.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Masami Hiramatsu

Add a module notifier call which signals that the state of a module has changed from MODULE_STATE_COMING to MODULE_STATE_LIVE.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
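A sketch of a consumer of this notification (my_module_cb and my_watcher_init are hypothetical names):

    #include <linux/module.h>
    #include <linux/notifier.h>

    static int my_module_cb(struct notifier_block *nb, unsigned long val,
                            void *data)
    {
            struct module *mod = data;

            /* fires once the module has finished loading */
            if (val == MODULE_STATE_LIVE)
                    pr_info("module %s is now live\n", mod->name);
            return NOTIFY_DONE;
    }

    static struct notifier_block my_module_nb = {
            .notifier_call = my_module_cb,
    };

    static int __init my_watcher_init(void)
    {
            return register_module_notifier(&my_module_nb);
    }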
-
Committed by Masami Hiramatsu

Remove the called_from argument from kprobes, which had been used to prevent a kernel module from probing itself. Since we no longer keep the module's refcount after registering a kprobe, there is no reason to check that. This patch also simplifies the registering/unregistering functions, because we no longer need __builtin_return_address(0), which was passed as called_from.

[ananth@in.ibm.com: build fix]
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Masami Hiramatsu

Allow kprobes to probe __exit routines. This adds a flags member to struct kprobe. When a module is freed (kprobes hooks a module notifier to get this event), kprobes which probe functions in that module have the "Gone" flag set in the flags member. These "Gone" probes can never be enabled again. Users can check the GONE flag through debugfs.

This also removes mod_refcounted, because we could never free a module while a kprobe held a reference on it.

[akpm@linux-foundation.org: document some locking]
[mhiramat@redhat.com: bugfix: pass aggr_kprobe to arch_remove_kprobe]
[mhiramat@redhat.com: bugfix: release old_p's insn_slot before error return]
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
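The flag in question is KPROBE_FLAG_GONE (the kprobe_gone() helper in <linux/kprobes.h> wraps the same test); a minimal sketch of a check before re-arming a probe:

    #include <linux/kprobes.h>

    static bool my_kprobe_usable(struct kprobe *p)
    {
            /* probes whose target code was freed must not be re-armed */
            return !(p->flags & KPROBE_FLAG_GONE);
    }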
-