- 09 1月, 2009 2 次提交
-
-
由 Lai Jiangshan 提交于
When cgroup_post_fork() is called, child is seen by find_task_by_vpid(), so child->cgroups maybe be changed, It'll incorrect. child->cgroups<old>'s refcnt is decreased child->cgroups<new>'s refcnt is increased but child->cg_list is added to child->cgroups<old>'s list. Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Reviewed-by: NPaul Menage <menage@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
- In cgroup_clone(), if vfs_mkdir() returns successfully, dentry->d_fsdata will be the pointer to the newly created cgroup and won't be NULL. - a cgroup file's dentry->d_fsdata won't be NULL, guaranteed by cgroup_add_file(). - When walking through the subsystems of a cgroup_fs (using for_each_subsys), cgrp->subsys[ss->subsys_id] won't be NULL, guaranteed by cgroup_create(). (Also remove 2 unused variables in cgroup_rmdir(). Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 1月, 2009 1 次提交
-
-
由 Hugh Dickins 提交于
cgroup_mm_owner_callbacks() was brought in to support the memrlimit controller, but sneaked into mainline ahead of it. That controller has now been shelved, and the mm_owner_changed() args were inadequate for it anyway (they needed an mm pointer instead of a task pointer). Remove the dead code, and restore mm_update_next_owner() locking to how it was before: taking mmap_sem there does nothing for memcontrol.c, now the only user of mm->owner. Signed-off-by: NHugh Dickins <hugh@veritas.com> Cc: Paul Menage <menage@google.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 1月, 2009 1 次提交
-
-
由 Al Viro 提交于
... and don't bother in callers. Don't bother with zeroing i_blocks, while we are at it - it's already been zeroed. i_mode is not worth the effort; it has no common default value. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 05 1月, 2009 1 次提交
-
-
由 Li Zefan 提交于
The race is calling cgroup_clone() while umounting the ns cgroup subsys, and thus cgroup_clone() might access invalid cgroup_fs, or kill_sb() is called after cgroup_clone() created a new dir in it. The BUG I triggered is BUG_ON(root->number_of_cgroups != 1); ------------[ cut here ]------------ kernel BUG at kernel/cgroup.c:1093! invalid opcode: 0000 [#1] SMP ... Process umount (pid: 5177, ti=e411e000 task=e40c4670 task.ti=e411e000) ... Call Trace: [<c0493df7>] ? deactivate_super+0x3f/0x51 [<c04a3600>] ? mntput_no_expire+0xb3/0xdd [<c04a3ab2>] ? sys_umount+0x265/0x2ac [<c04a3b06>] ? sys_oldumount+0xd/0xf [<c0403911>] ? sysenter_do_call+0x12/0x31 ... EIP: [<c0456e76>] cgroup_kill_sb+0x23/0xe0 SS:ESP 0068:e411ef2c ---[ end trace c766c1be3bf944ac ]--- Cc: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 24 12月, 2008 2 次提交
-
-
由 Li Zefan 提交于
If cgroup_get_rootdir() failed, free_cg_links() will be called in the failure path, but tmp_cg_links hasn't been initialized at that time. I introduced this bug in the 2.6.27 merge window. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Acked-by: NSerge Hallyn <serue@us.ibm.com> Cc: Paul Menage <menage@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Sharyathi Nagesh 提交于
Remove spurious warning messages that are thrown onto the console during cgroup operations. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NSharyathi Nagesh <sharyathi@in.ibm.com> Acked-by: NSerge E. Hallyn <serge@hallyn.com> Cc: Paul Menage <menage@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 12月, 2008 1 次提交
-
-
由 Paul Menage 提交于
When a cgroup is removed, it's unlinked from its parent's children list, but not actually freed until the last dentry on it is released (at which point cgrp->root->number_of_cgroups is decremented). Currently rebind_subsystems checks for the top cgroup's child list being empty in order to rebind subsystems into or out of a hierarchy - this can result in the set of subsystems bound to a hierarchy being removed-but-not-freed cgroup. The simplest fix for this is to forbid remounts that change the set of subsystems on a hierarchy that has removed-but-not-freed cgroups. This bug can be reproduced via: mkdir /mnt/cg mount -t cgroup -o ns,freezer cgroup /mnt/cg mkdir /mnt/cg/foo sleep 1h < /mnt/cg/foo & rmdir /mnt/cg/foo mount -t cgroup -o remount,ns,devices,freezer cgroup /mnt/cg kill $! Though the above will cause oops in -mm only but not mainline, but the bug can cause memory leak in mainline (and even oops) Signed-off-by: NPaul Menage <menage@google.com> Reviewed-by: NLi Zefan <lizf@cn.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 11月, 2008 2 次提交
-
-
由 Li Zefan 提交于
Try this, and you'll get oops immediately: # cd Documentation/accounting/ # gcc -o getdelays getdelays.c # mount -t cgroup -o debug xxx /mnt # ./getdelays -C /mnt/tasks Because a normal file's dentry->d_fsdata is a pointer to struct cftype, not struct cgroup. After the patch, it returns EINVAL if we try to get cgroupstats from a normal file. Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Acked-by: NPaul Menage <menage@google.com> Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x, 2.6.27.x] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KAMEZAWA Hiroyuki 提交于
As Balbir pointed out, memcg's pre_destroy handler has potential deadlock. It has following lock sequence. cgroup_mutex (cgroup_rmdir) -> pre_destroy -> mem_cgroup_pre_destroy-> force_empty -> cpu_hotplug.lock. (lru_add_drain_all-> schedule_work-> get_online_cpus) But, cpuset has following. cpu_hotplug.lock (call notifier) -> cgroup_mutex. (within notifier) Then, this lock sequence should be fixed. Considering how pre_destroy works, it's not necessary to holding cgroup_mutex() while calling it. As a side effect, we don't have to wait at this mutex while memcg's force_empty works.(it can be long when there are tons of pages.) Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: NBalbir Singh <balbir@linux.vnet.ibm.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 11月, 2008 3 次提交
-
-
由 David Howells 提交于
Use RCU to access another task's creds and to release a task's own creds. This means that it will be possible for the credentials of a task to be replaced without another task (a) requiring a full lock to read them, and (b) seeing deallocated memory. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NJames Morris <jmorris@namei.org> Acked-by: NSerge Hallyn <serue@us.ibm.com> Signed-off-by: NJames Morris <jmorris@namei.org>
-
由 David Howells 提交于
Separate the task security context from task_struct. At this point, the security data is temporarily embedded in the task_struct with two pointers pointing to it. Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in entry.S via asm-offsets. With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NJames Morris <jmorris@namei.org> Acked-by: NSerge Hallyn <serue@us.ibm.com> Signed-off-by: NJames Morris <jmorris@namei.org>
-
由 David Howells 提交于
Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: NDavid Howells <dhowells@redhat.com> Reviewed-by: NJames Morris <jmorris@namei.org> Acked-by: NSerge Hallyn <serue@us.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-audit@redhat.com Cc: containers@lists.linux-foundation.org Cc: linux-mm@kvack.org Signed-off-by: NJames Morris <jmorris@namei.org>
-
- 07 11月, 2008 1 次提交
-
-
由 Li Zefan 提交于
This fixes an oops when reading /proc/sched_debug. A cgroup won't be removed completely until finishing cgroup_diput(), so we shouldn't invalidate cgrp->dentry in cgroup_rmdir(). Otherwise, when a group is being removed while cgroup_path() gets called, we may trigger NULL dereference BUG. The bug can be reproduced: # cat test.sh #!/bin/sh mount -t cgroup -o cpu xxx /mnt for (( ; ; )) { mkdir /mnt/sub rmdir /mnt/sub } # ./test.sh & # cat /proc/sched_debug BUG: unable to handle kernel NULL pointer dereference at 00000038 IP: [<c045a47f>] cgroup_path+0x39/0x90 ... Call Trace: [<c0420344>] ? print_cfs_rq+0x6e/0x75d [<c0421160>] ? sched_debug_show+0x72d/0xc1e ... Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Acked-by: NPaul Menage <menage@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> [2.6.26.x, 2.6.27.x] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 27 10月, 2008 1 次提交
-
-
由 Stephen Rothwell 提交于
/scratch/sfr/next/kernel/cgroup.c: In function 'cgroup_tasks_start': /scratch/sfr/next/kernel/cgroup.c:2107: warning: unused variable 'i' Introduced in commit cc31edce "cgroups: convert tasks file to use a seq_file with shared pid array". Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 10月, 2008 2 次提交
-
-
由 Paul Menage 提交于
Rather than pre-generating the entire text for the "tasks" file each time the file is opened, we instead just generate/update the array of process ids and use a seq_file to report these to userspace. All open file handles on the same "tasks" file can share a pid array, which may be updated any time that no thread is actively reading the array. By sharing the array, the potential for userspace to DoS the system by opening many handles on the same "tasks" file is removed. [Based on a patch by Lai Jiangshan, extended to use seq_file] Signed-off-by: NPaul Menage <menage@google.com> Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Lai Jiangshan 提交于
put_css_set_taskexit may be called when find_css_set is called on other cpu. And the race will occur: put_css_set_taskexit side find_css_set side | atomic_dec_and_test(&kref->refcount) | /* kref->refcount = 0 */ | .................................................................... | read_lock(&css_set_lock) | find_existing_css_set | get_css_set | read_unlock(&css_set_lock); .................................................................... __release_css_set | .................................................................... | /* use a released css_set */ | [put_css_set is the same. But in the current code, all put_css_set are put into cgroup mutex critical region as the same as find_css_set.] [akpm@linux-foundation.org: repair comments] [menage@google.com: eliminate race in css_set refcounting] Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NPaul Menage <menage@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 10月, 2008 1 次提交
-
-
由 Balbir Singh 提交于
This patch adds an additional field to the mm_owner callbacks. This field is required to get to the mm that changed. Hold mmap_sem in write mode before calling the mm_owner_changed callback [hugh@veritas.com: fix mmap_sem deadlock] Signed-off-by: NBalbir Singh <balbir@linux.vnet.ibm.com> Cc: Sudhir Kumar <skumar@linux.vnet.ibm.com> Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Cc: Paul Menage <menage@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Pavel Emelianov <xemul@openvz.org> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: NHugh Dickins <hugh@veritas.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 29 9月, 2008 1 次提交
-
-
由 Balbir Singh 提交于
There's a race between mm->owner assignment and swapoff, more easily seen when task slab poisoning is turned on. The condition occurs when try_to_unuse() runs in parallel with an exiting task. A similar race can occur with callers of get_task_mm(), such as /proc/<pid>/<mmstats> or ptrace or page migration. CPU0 CPU1 try_to_unuse looks at mm = task0->mm increments mm->mm_users task 0 exits mm->owner needs to be updated, but no new owner is found (mm_users > 1, but no other task has task->mm = task0->mm) mm_update_next_owner() leaves mmput(mm) decrements mm->mm_users task0 freed dereferencing mm->owner fails The fix is to notify the subsystem via mm_owner_changed callback(), if no new owner is found, by specifying the new task as NULL. Jiri Slaby: mm->owner was set to NULL prior to calling cgroup_mm_owner_callbacks(), but must be set after that, so as not to pass NULL as old owner causing oops. Daisuke Nishimura: mm_update_next_owner() may set mm->owner to NULL, but mem_cgroup_from_task() and its callers need to take account of this situation to avoid oops. Hugh Dickins: Lockdep warning and hang below exec_mmap() when testing these patches. exit_mm() up_reads mmap_sem before calling mm_update_next_owner(), so exec_mmap() now needs to do the same. And with that repositioning, there's now no point in mm_need_new_owner() allowing for NULL mm. Reported-by: NHugh Dickins <hugh@veritas.com> Signed-off-by: NBalbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: NJiri Slaby <jirislaby@gmail.com> Signed-off-by: NDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: NHugh Dickins <hugh@veritas.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Paul Menage <menage@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 31 7月, 2008 3 次提交
-
-
由 Li Zefan 提交于
It's not small enough, and has 2 call sites. text data bss dec hex filename 12813 1676 4832 19321 4b79 cgroup.o.orig 12775 1676 4832 19283 4b53 cgroup.o Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
- just call free_cg_links() in allocate_cg_links() - the list will get initialized in allocate_cg_links(), so don't init it twice Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
There's a leak if copy_from_user() returns failure. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 27 7月, 2008 2 次提交
-
-
由 Al Viro 提交于
fs.h needs path.h, not namei.h; nfs_fs.h doesn't need it at all. Several places in the tree needed direct include. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Adrian Bunk 提交于
cgroup_seqfile_release() can become static. Signed-off-by: NAdrian Bunk <bunk@kernel.org> Acked-by: NPaul Menage <menage@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 7月, 2008 9 次提交
-
-
由 Serge E. Hallyn 提交于
cgroup_clone creates a new cgroup with the pid of the task. This works correctly for unshare, but for clone cgroup_clone is called from copy_namespaces inside copy_process, which happens before the new pid is created. As a result, the new cgroup was created with current's pid. This patch: 1. Moves the call inside copy_process to after the new pid is created 2. Passes the struct pid into ns_cgroup_clone (as it is not yet attached to the task) 3. Passes a name from ns_cgroup_clone() into cgroup_clone() so as to keep cgroup_clone() itself simpler 4. Uses pid_vnr() to get the process id value, so that the pid used to name the new cgroup is always the pid as it would be known to the task which did the cloning or unsharing. I think that is the most intuitive thing to do. This way, task t1 does clone(CLONE_NEWPID) to get t2, which does clone(CLONE_NEWPID) to get t3, then the cgroup for t3 will be named for the pid by which t2 knows t3. (Thanks to Dan Smith for finding the main bug) Changelog: June 11: Incorporate Paul Menage's feedback: don't pass NULL to ns_cgroup_clone from unshare, and reduce patch size by using 'nodename' in cgroup_clone. June 10: Original version [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NSerge Hallyn <serge@us.ibm.com> Acked-by: NPaul Menage <menage@google.com> Tested-by: NDan Smith <danms@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Paul Menage 提交于
This patch changes attach_task_by_pid() to take a u64 rather than a string; as a result it can be called directly as a control groups write_u64 handler, and cgroup_common_file_write() can be removed. Signed-off-by: NPaul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Paul Menage 提交于
This patch moves the write handler for the cgroups notify_on_release file into a separate handler. This handler requires no cgroups locking since it relies on atomic bitops for synchronization. Signed-off-by: NPaul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Paul Menage 提交于
This patch contains cleanups suggested by reviewers for the recent write_string() patchset: - pair cgroup_lock_live_group() with cgroup_unlock() in cgroup.c for clarity, rather than directly unlocking cgroup_mutex. - make the return type of cgroup_lock_live_group() a bool - use a #define'd constant for the local buffer size in read/write functions Signed-off-by: NPaul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Acked-by: NSerge Hallyn <serue@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Paul Menage 提交于
Adds cgroup_release_agent_write() and cgroup_release_agent_show() methods to handle writing/reading the path to a cgroup hierarchy's release agent. As a result, cgroup_common_file_read() is now unnecessary. As part of the change, a previously-tolerated race in cgroup_release_agent() is avoided by copying the current release_agent_path prior to calling call_usermode_helper(). Signed-off-by: NPaul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@in.ibm.com> Acked-by: NSerge Hallyn <serue@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Paul Menage 提交于
This patch adds a write_string() method for cgroups control files. The semantics are that a buffer is copied from userspace to kernelspace and the handler function invoked on that buffer. The buffer is guaranteed to be nul-terminated, and no longer than max_write_len (defaulting to 64 bytes if unspecified). Later patches will convert existing raw file write handlers in control group subsystems to use this method. Signed-off-by: NPaul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: Pavel Emelyanov <xemul@openvz.org> Acked-by: NBalbir Singh <balbir@in.ibm.com> Acked-by: NSerge Hallyn <serue@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
- need_forkexit_callback will be read only after system boot. - use_task_css_set_links will be read only after it's set. And these 2 variables are checked when a new process is forked. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Acked-by: NPaul Menage <menage@google.com> Acked-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KOSAKI Motohiro 提交于
```----------------------- while() { list_entry(); ... } ``` ----------------------- is equivalent to following code. -------------------------- list_for_each_entry(){ ... } -------------------------- later can review easily more. this patch is just clean up. it doesn't have any behavor change. Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
The function does not modify anything (except the temporary css template), so it's sufficient to hold read lock. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Acked-by: NPaul Menage <menage@google.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 5月, 2008 1 次提交
-
-
由 Cedric Le Goater 提交于
This is a slight change in the namespace cgroup subsystem api. The change is that previously when cgroup_clone() was called (currently only from the unshare path in ns_proxy cgroup, you'd get a new group named "node_$pid" whereas now you'll get a group named after just your pid.) The only users who would notice it are those who are using the ns_proxy cgroup subsystem to auto-create cgroups when namespaces are unshared - something of an experimental feature, which I think really needs more complete container/namespace support in order to be useful. I suspect the only users are Cedric and Serge, or maybe a few others on containers@lists.linux-foundation.org. And in fact it would only be noticed by the users who make the assumption about how the name is generated, rather than getting it from the /proc/<pid>/cgroups file for the process in question. Whether the change is actually needed or not I'm fairly agnostic on, but I guess it is more elegant to just use the pid as the new group name rather than adding a fairly arbitrary "node_" prefix on the front. [menage@google.com: provided changelog] Signed-off-by: NCedric Le Goater <clg@fr.ibm.com> Cc: "Paul Menage" <menage@google.com> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 4月, 2008 1 次提交
-
-
由 Miklos Szeredi 提交于
Add a new BDI capability flag: BDI_CAP_NO_ACCT_WB. If this flag is set, then don't update the per-bdi writeback stats from test_set_page_writeback() and test_clear_page_writeback(). Misc cleanups: - convert bdi_cap_writeback_dirty() and friends to static inline functions - create a flag that includes all three dirty/writeback related flags, since almst all users will want to have them toghether Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 29 4月, 2008 5 次提交
-
-
由 Balbir Singh 提交于
Remove the mem_cgroup member from mm_struct and instead adds an owner. This approach was suggested by Paul Menage. The advantage of this approach is that, once the mm->owner is known, using the subsystem id, the cgroup can be determined. It also allows several control groups that are virtually grouped by mm_struct, to exist independent of the memory controller i.e., without adding mem_cgroup's for each controller, to mm_struct. A new config option CONFIG_MM_OWNER is added and the memory resource controller selects this config option. This patch also adds cgroup callbacks to notify subsystems when mm->owner changes. The mm_cgroup_changed callback is called with the task_lock() of the new task held and is called just prior to changing the mm->owner. I am indebted to Paul Menage for the several reviews of this patchset and helping me make it lighter and simpler. This patch was tested on a powerpc box, it was compiled with both the MM_OWNER config turned on and off. After the thread group leader exits, it's moved to init_css_state by cgroup_exit(), thus all future charges from runnings threads would be redirected to the init_css_set's subsystem. Signed-off-by: NBalbir Singh <balbir@linux.vnet.ibm.com> Cc: Pavel Emelianov <xemul@openvz.org> Cc: Hugh Dickins <hugh@veritas.com> Cc: Sudhir Kumar <skumar@linux.vnet.ibm.com> Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Cc: Hirokazu Takahashi <taka@valinux.co.jp> Cc: David Rientjes <rientjes@google.com>, Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: NPekka Enberg <penberg@cs.helsinki.fi> Reviewed-by: NPaul Menage <menage@google.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Serge E. Hallyn 提交于
Introduce a read_seq() helper in cftype, which uses seq_file to print out lists. Use it in the devices cgroup. Also split devices.allow into two files, so now devices.deny and devices.allow are the ones to use to manipulate the whitelist, while devices.list outputs the cgroup's current whitelist. Signed-off-by: NSerge E. Hallyn <serue@us.ibm.com> Acked-by: NPaul Menage <menage@google.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
Now we can run through the hash table instead of running through the linked-list. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Reviewed-by: NPaul Menage <menage@google.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
We are at system boot and there is only 1 cgroup group (i,e, init_css_set), so we don't need to run through the css_set linked list. Neither do we need to run through the task list, since no processes have been created yet. Also referring to a comment in cgroup.h: struct css_set { ... /* * Set of subsystem states, one for each subsystem. This array * is immutable after creation apart from the init_css_set * during subsystem registration (at boot time). */ struct cgroup_subsys_state *subsys[CGROUP_SUBSYS_COUNT]; } Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Reviewed-by: NPaul Menage <menage@google.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
When we attach a process to a different cgroup, the css_set linked-list will be run through to find a suitable existing css_set to use. This patch implements a hash table for better performance. The following benchmarks have been tested: For N in 1, 5, 10, 50, 100, 500, 1000, create N cgroups with one sleeping task in each, and then move an additional task through each cgroup in turn. Here is a test result: N Loop orig - Time(s) hash - Time(s) ---------------------------------------------- 1 10000 1.201231728 1.196311177 5 2000 1.065743872 1.040566424 10 1000 0.991054735 0.986876440 50 200 0.976554203 0.969608733 100 100 0.998504680 0.969218270 500 20 1.157347764 0.962602963 1000 10 1.619521852 1.085140172 Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Reviewed-by: NPaul Menage <menage@google.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-