- 13 12月, 2013 3 次提交
-
-
由 Paul Moore 提交于
Previously selinux_skb_peerlbl_sid() would only check for labeled IPsec security labels on inbound packets, this patch enables it to check both inbound and outbound traffic for labeled IPsec security labels. Reported-by: NJanak Desai <Janak.Desai@gtri.gatech.edu> Cc: stable@vger.kernel.org Signed-off-by: NPaul Moore <pmoore@redhat.com>
-
由 Paul Moore 提交于
In selinux_ip_postroute() we perform access checks based on the packet's security label. For locally generated traffic we get the packet's security label from the associated socket; this works in all cases except for TCP SYN-ACK packets. In the case of SYN-ACK packet's the correct security label is stored in the connection's request_sock, not the server's socket. Unfortunately, at the point in time when selinux_ip_postroute() is called we can't query the request_sock directly, we need to recreate the label using the same logic that originally labeled the associated request_sock. See the inline comments for more explanation. Reported-by: NJanak Desai <Janak.Desai@gtri.gatech.edu> Tested-by: NJanak Desai <Janak.Desai@gtri.gatech.edu> Cc: stable@vger.kernel.org Signed-off-by: NPaul Moore <pmoore@redhat.com>
-
由 Paul Moore 提交于
In selinux_ip_output() we always label packets based on the parent socket. While this approach works in almost all cases, it doesn't work in the case of TCP SYN-ACK packets when the correct label is not the label of the parent socket, but rather the label of the larval socket represented by the request_sock struct. Unfortunately, since the request_sock isn't queued on the parent socket until *after* the SYN-ACK packet is sent, we can't lookup the request_sock to determine the correct label for the packet; at this point in time the best we can do is simply pass/NF_ACCEPT the packet. It must be said that simply passing the packet without any explicit labeling action, while far from ideal, is not terrible as the SYN-ACK packet will inherit any IP option based labeling from the initial connection request so the label *should* be correct and all our access controls remain in place so we shouldn't have to worry about information leaks. Reported-by: NJanak Desai <Janak.Desai@gtri.gatech.edu> Tested-by: NJanak Desai <Janak.Desai@gtri.gatech.edu> Cc: stable@vger.kernel.org Signed-off-by: NPaul Moore <pmoore@redhat.com>
-
- 05 12月, 2013 1 次提交
-
-
由 Geyslan G. Bem 提交于
Free 'ctx_str' when necessary. Signed-off-by: NGeyslan G. Bem <geyslan@gmail.com> Signed-off-by: NPaul Moore <pmoore@redhat.com>
-
- 16 10月, 2013 2 次提交
-
-
由 John Johansen 提交于
BugLink: http://bugs.launchpad.net/bugs/1235977 The profile introspection seq file has a locking bug when policy is viewed from a virtual root (task in a policy namespace), introspection from the real root is not affected. The test for root while (parent) { is correct for the real root, but incorrect for tasks in a policy namespace. This allows the task to walk backup the policy tree past its virtual root causing it to be unlocked before the virtual root should be in the p_stop fn. This results in the following lockdep back trace: [ 78.479744] [ BUG: bad unlock balance detected! ] [ 78.479792] 3.11.0-11-generic #17 Not tainted [ 78.479838] ------------------------------------- [ 78.479885] grep/2223 is trying to release lock (&ns->lock) at: [ 78.479952] [<ffffffff817bf3be>] mutex_unlock+0xe/0x10 [ 78.480002] but there are no more locks to release! [ 78.480037] [ 78.480037] other info that might help us debug this: [ 78.480037] 1 lock held by grep/2223: [ 78.480037] #0: (&p->lock){+.+.+.}, at: [<ffffffff812111bd>] seq_read+0x3d/0x3d0 [ 78.480037] [ 78.480037] stack backtrace: [ 78.480037] CPU: 0 PID: 2223 Comm: grep Not tainted 3.11.0-11-generic #17 [ 78.480037] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 78.480037] ffffffff817bf3be ffff880007763d60 ffffffff817b97ef ffff8800189d2190 [ 78.480037] ffff880007763d88 ffffffff810e1c6e ffff88001f044730 ffff8800189d2190 [ 78.480037] ffffffff817bf3be ffff880007763e00 ffffffff810e5bd6 0000000724fe56b7 [ 78.480037] Call Trace: [ 78.480037] [<ffffffff817bf3be>] ? mutex_unlock+0xe/0x10 [ 78.480037] [<ffffffff817b97ef>] dump_stack+0x54/0x74 [ 78.480037] [<ffffffff810e1c6e>] print_unlock_imbalance_bug+0xee/0x100 [ 78.480037] [<ffffffff817bf3be>] ? mutex_unlock+0xe/0x10 [ 78.480037] [<ffffffff810e5bd6>] lock_release_non_nested+0x226/0x300 [ 78.480037] [<ffffffff817bf2fe>] ? __mutex_unlock_slowpath+0xce/0x180 [ 78.480037] [<ffffffff817bf3be>] ? mutex_unlock+0xe/0x10 [ 78.480037] [<ffffffff810e5d5c>] lock_release+0xac/0x310 [ 78.480037] [<ffffffff817bf2b3>] __mutex_unlock_slowpath+0x83/0x180 [ 78.480037] [<ffffffff817bf3be>] mutex_unlock+0xe/0x10 [ 78.480037] [<ffffffff81376c91>] p_stop+0x51/0x90 [ 78.480037] [<ffffffff81211408>] seq_read+0x288/0x3d0 [ 78.480037] [<ffffffff811e9d9e>] vfs_read+0x9e/0x170 [ 78.480037] [<ffffffff811ea8cc>] SyS_read+0x4c/0xa0 [ 78.480037] [<ffffffff817ccc9d>] system_call_fastpath+0x1a/0x1f Signed-off-by: NJohn Johansen <john.johansen@canonical.com> Signed-off-by: NJames Morris <james.l.morris@oracle.com>
-
由 John Johansen 提交于
BugLink: http://bugs.launchpad.net/bugs/1235523 This fixes the following kmemleak trace: unreferenced object 0xffff8801e8c35680 (size 32): comm "apparmor_parser", pid 691, jiffies 4294895667 (age 13230.876s) hex dump (first 32 bytes): e0 d3 4e b5 ac 6d f4 ed 3f cb ee 48 1c fd 40 cf ..N..m..?..H..@. 5b cc e9 93 00 00 00 00 00 00 00 00 00 00 00 00 [............... backtrace: [<ffffffff817a97ee>] kmemleak_alloc+0x4e/0xb0 [<ffffffff811ca9f3>] __kmalloc+0x103/0x290 [<ffffffff8138acbc>] aa_calc_profile_hash+0x6c/0x150 [<ffffffff8138074d>] aa_unpack+0x39d/0xd50 [<ffffffff8137eced>] aa_replace_profiles+0x3d/0xd80 [<ffffffff81376937>] profile_replace+0x37/0x50 [<ffffffff811e9f2d>] vfs_write+0xbd/0x1e0 [<ffffffff811ea96c>] SyS_write+0x4c/0xa0 [<ffffffff817ccb1d>] system_call_fastpath+0x1a/0x1f [<ffffffffffffffff>] 0xffffffffffffffff Signed-off-by: NJohn Johansen <john.johansen@canonical.com> Signed-off-by: NJames Morris <james.l.morris@oracle.com>
-
- 05 10月, 2013 3 次提交
-
-
由 Linus Torvalds 提交于
Now avc_audit() has no more users with that parameter. Remove it. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
.. so get rid of it. The only indirect users were all the avc_has_perm() callers which just expanded to have a zero flags argument. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
Every single user passes in '0'. I think we had non-zero users back in some stone age when selinux_inode_permission() was implemented in terms of inode_has_perm(), but that complicated case got split up into a totally separate code-path so that we could optimize the much simpler special cases. See commit 2e334057 ("SELinux: delay initialization of audit data in selinux_inode_permission") for example. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 9月, 2013 2 次提交
-
-
由 John Johansen 提交于
The recent 3.12 pull request for apparmor was missing a couple rcu _protected access modifiers. Resulting in the follow suspicious RCU usage [ 29.804534] [ INFO: suspicious RCU usage. ] [ 29.804539] 3.11.0+ #5 Not tainted [ 29.804541] ------------------------------- [ 29.804545] security/apparmor/include/policy.h:363 suspicious rcu_dereference_check() usage! [ 29.804548] [ 29.804548] other info that might help us debug this: [ 29.804548] [ 29.804553] [ 29.804553] rcu_scheduler_active = 1, debug_locks = 1 [ 29.804558] 2 locks held by apparmor_parser/1268: [ 29.804560] #0: (sb_writers#9){.+.+.+}, at: [<ffffffff81120a4c>] file_start_write+0x27/0x29 [ 29.804576] #1: (&ns->lock){+.+.+.}, at: [<ffffffff811f5d88>] aa_replace_profiles+0x166/0x57c [ 29.804589] [ 29.804589] stack backtrace: [ 29.804595] CPU: 0 PID: 1268 Comm: apparmor_parser Not tainted 3.11.0+ #5 [ 29.804599] Hardware name: ASUSTeK Computer Inc. UL50VT /UL50VT , BIOS 217 03/01/2010 [ 29.804602] 0000000000000000 ffff8800b95a1d90 ffffffff8144eb9b ffff8800b94db540 [ 29.804611] ffff8800b95a1dc0 ffffffff81087439 ffff880138cc3a18 ffff880138cc3a18 [ 29.804619] ffff8800b9464a90 ffff880138cc3a38 ffff8800b95a1df0 ffffffff811f5084 [ 29.804628] Call Trace: [ 29.804636] [<ffffffff8144eb9b>] dump_stack+0x4e/0x82 [ 29.804642] [<ffffffff81087439>] lockdep_rcu_suspicious+0xfc/0x105 [ 29.804649] [<ffffffff811f5084>] __aa_update_replacedby+0x53/0x7f [ 29.804655] [<ffffffff811f5408>] __replace_profile+0x11f/0x1ed [ 29.804661] [<ffffffff811f6032>] aa_replace_profiles+0x410/0x57c [ 29.804668] [<ffffffff811f16d4>] profile_replace+0x35/0x4c [ 29.804674] [<ffffffff81120fa3>] vfs_write+0xad/0x113 [ 29.804680] [<ffffffff81121609>] SyS_write+0x44/0x7a [ 29.804687] [<ffffffff8145bfd2>] system_call_fastpath+0x16/0x1b [ 29.804691] [ 29.804694] =============================== [ 29.804697] [ INFO: suspicious RCU usage. ] [ 29.804700] 3.11.0+ #5 Not tainted [ 29.804703] ------------------------------- [ 29.804706] security/apparmor/policy.c:566 suspicious rcu_dereference_check() usage! [ 29.804709] [ 29.804709] other info that might help us debug this: [ 29.804709] [ 29.804714] [ 29.804714] rcu_scheduler_active = 1, debug_locks = 1 [ 29.804718] 2 locks held by apparmor_parser/1268: [ 29.804721] #0: (sb_writers#9){.+.+.+}, at: [<ffffffff81120a4c>] file_start_write+0x27/0x29 [ 29.804733] #1: (&ns->lock){+.+.+.}, at: [<ffffffff811f5d88>] aa_replace_profiles+0x166/0x57c [ 29.804744] [ 29.804744] stack backtrace: [ 29.804750] CPU: 0 PID: 1268 Comm: apparmor_parser Not tainted 3.11.0+ #5 [ 29.804753] Hardware name: ASUSTeK Computer Inc. UL50VT /UL50VT , BIOS 217 03/01/2010 [ 29.804756] 0000000000000000 ffff8800b95a1d80 ffffffff8144eb9b ffff8800b94db540 [ 29.804764] ffff8800b95a1db0 ffffffff81087439 ffff8800b95b02b0 0000000000000000 [ 29.804772] ffff8800b9efba08 ffff880138cc3a38 ffff8800b95a1dd0 ffffffff811f4f94 [ 29.804779] Call Trace: [ 29.804786] [<ffffffff8144eb9b>] dump_stack+0x4e/0x82 [ 29.804791] [<ffffffff81087439>] lockdep_rcu_suspicious+0xfc/0x105 [ 29.804798] [<ffffffff811f4f94>] aa_free_replacedby_kref+0x4d/0x62 [ 29.804804] [<ffffffff811f4f47>] ? aa_put_namespace+0x17/0x17 [ 29.804810] [<ffffffff811f4f0b>] kref_put+0x36/0x40 [ 29.804816] [<ffffffff811f5423>] __replace_profile+0x13a/0x1ed [ 29.804822] [<ffffffff811f6032>] aa_replace_profiles+0x410/0x57c [ 29.804829] [<ffffffff811f16d4>] profile_replace+0x35/0x4c [ 29.804835] [<ffffffff81120fa3>] vfs_write+0xad/0x113 [ 29.804840] [<ffffffff81121609>] SyS_write+0x44/0x7a [ 29.804847] [<ffffffff8145bfd2>] system_call_fastpath+0x16/0x1b Reported-by: miles.lane@gmail.com CC: paulmck@linux.vnet.ibm.com Signed-off-by: NJohn Johansen <john.johansen@canonical.com> Signed-off-by: NJames Morris <james.l.morris@oracle.com>
-
由 Tyler Hicks 提交于
Use the shash interface, rather than the hash interface, when hashing AppArmor profiles. The shash interface does not use scatterlists and it is a better fit for what AppArmor needs. This fixes a kernel paging BUG when aa_calc_profile_hash() is passed a buffer from vmalloc(). The hash interface requires callers to handle vmalloc() buffers differently than what AppArmor was doing. Due to vmalloc() memory not being physically contiguous, each individual page behind the buffer must be assigned to a scatterlist with sg_set_page() and then the scatterlist passed to crypto_hash_update(). The shash interface does not have that limitation and allows vmalloc() and kmalloc() buffers to be handled in the same manner. BugLink: https://launchpad.net/bugs/1216294/ BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=62261Signed-off-by: NTyler Hicks <tyhicks@canonical.com> Acked-by: NSeth Arnold <seth.arnold@canonical.com> Signed-off-by: NJohn Johansen <john.johansen@canonical.com> Signed-off-by: NJames Morris <james.l.morris@oracle.com>
-
- 27 9月, 2013 2 次提交
-
-
由 Paul Moore 提交于
The SELinux/NetLabel glue code has a locking bug that affects systems with NetLabel enabled, see the kernel error message below. This patch corrects this problem by converting the bottom half socket lock to a more conventional, and correct for this call-path, lock_sock() call. =============================== [ INFO: suspicious RCU usage. ] 3.11.0-rc3+ #19 Not tainted ------------------------------- net/ipv4/cipso_ipv4.c:1928 suspicious rcu_dereference_protected() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by ping/731: #0: (slock-AF_INET/1){+.-...}, at: [...] selinux_netlbl_socket_connect #1: (rcu_read_lock){.+.+..}, at: [<...>] netlbl_conn_setattr stack backtrace: CPU: 1 PID: 731 Comm: ping Not tainted 3.11.0-rc3+ #19 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 0000000000000001 ffff88006f659d28 ffffffff81726b6a ffff88003732c500 ffff88006f659d58 ffffffff810e4457 ffff88006b845a00 0000000000000000 000000000000000c ffff880075aa2f50 ffff88006f659d90 ffffffff8169bec7 Call Trace: [<ffffffff81726b6a>] dump_stack+0x54/0x74 [<ffffffff810e4457>] lockdep_rcu_suspicious+0xe7/0x120 [<ffffffff8169bec7>] cipso_v4_sock_setattr+0x187/0x1a0 [<ffffffff8170f317>] netlbl_conn_setattr+0x187/0x190 [<ffffffff8170f195>] ? netlbl_conn_setattr+0x5/0x190 [<ffffffff8131ac9e>] selinux_netlbl_socket_connect+0xae/0xc0 [<ffffffff81303025>] selinux_socket_connect+0x135/0x170 [<ffffffff8119d127>] ? might_fault+0x57/0xb0 [<ffffffff812fb146>] security_socket_connect+0x16/0x20 [<ffffffff815d3ad3>] SYSC_connect+0x73/0x130 [<ffffffff81739a85>] ? sysret_check+0x22/0x5d [<ffffffff810e5e2d>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [<ffffffff81373d4e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff815d52be>] SyS_connect+0xe/0x10 [<ffffffff81739a59>] system_call_fastpath+0x16/0x1b Cc: stable@vger.kernel.org Signed-off-by: NPaul Moore <pmoore@redhat.com>
-
由 Duan Jiong 提交于
Signed-off-by: NDuan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
- 31 8月, 2013 2 次提交
-
-
由 Serge Hallyn 提交于
We allow task A to change B's nice level if it has a supserset of B's privileges, or of it has CAP_SYS_NICE. Also allow it if A has CAP_SYS_NICE with respect to B - meaning it is root in the same namespace, or it created B's namespace. Signed-off-by: NSerge Hallyn <serge.hallyn@canonical.com> Reviewed-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
-
由 Eric W. Biederman 提交于
As the capabilites and capability bounding set are per user namespace properties it is safe to allow changing them with just CAP_SETPCAP permission in the user namespace. Acked-by: NSerge Hallyn <serge.hallyn@canonical.com> Tested-by: NRichard Weinberger <richard@nod.at> Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
- 29 8月, 2013 2 次提交
-
-
由 Eric Paris 提交于
This reverts commit 308ab70c. It breaks my FC6 test box. /dev/pts is not mounted. dmesg says SELinux: mount invalid. Same superblock, different security settings for (dev devpts, type devpts) Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NEric Paris <eparis@redhat.com>
-
由 Anand Avati 提交于
Not considering sub filesystem has the following limitation. Support for SELinux in FUSE is dependent on the particular userspace filesystem, which is identified by the subtype. For e.g, GlusterFS, a FUSE based filesystem supports SELinux (by mounting and processing FUSE requests in different threads, avoiding the mount time deadlock), whereas other FUSE based filesystems (identified by a different subtype) have the mount time deadlock. By considering the subtype of the filesytem in the SELinux policies, allows us to specify a filesystem subtype, in the following way: fs_use_xattr fuse.glusterfs gen_context(system_u:object_r:fs_t,s0); This way not all FUSE filesystems are put in the same bucket and subjected to the limitations of the other subtypes. Signed-off-by: NAnand Avati <avati@redhat.com> Signed-off-by: NEric Paris <eparis@redhat.com>
-
- 20 8月, 2013 1 次提交
-
-
由 Steven Rostedt 提交于
The apparmor module parameters for param_ops_aabool and param_ops_aalockpolicy are both based off of the param_ops_bool, and can handle a NULL value passed in as val. Have it enable the new KERNEL_PARAM_FL_NOARGS flag to allow the parameters to be set without having to state "=y" or "=1". Cc: John Johansen <john.johansen@canonical.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
- 15 8月, 2013 15 次提交
-
-
由 John Johansen 提交于
Provide userspace the ability to introspect a sha1 hash value for each profile currently loaded. Signed-off-by: NJohn Johansen <john.johansen@canonical.com> Acked-by: NSeth Arnold <seth.arnold@canonical.com>
-
由 John Johansen 提交于
Signed-off-by: NJohn Johansen <john.johansen@canonical.com> Acked-by: NSeth Arnold <seth.arnold@canonical.com>
-
由 John Johansen 提交于
Add the dynamic namespace relative profiles file to the interace, to allow introspection of loaded profiles and their modes. Signed-off-by: NJohn Johansen <john.johansen@canonical.com> Acked-by: NKees Cook <kees@ubuntu.com>
-
由 John Johansen 提交于
Add the ability to take in and report a human readable profile attachment string for profiles so that attachment specifications can be easily inspected. Signed-off-by: NJohn Johansen <john.johansen@canonical.com> Acked-by: NSeth Arnold <seth.arnold@canonical.com>
-
由 John Johansen 提交于
Add basic interface files to access namespace and profile information. The interface files are created when a profile is loaded and removed when the profile or namespace is removed. Signed-off-by: NJohn Johansen <john.johansen@canonical.com>
-
由 John Johansen 提交于
Allow emulating the default profile behavior from boot, by allowing loading of a profile in the unconfined state into a new NS. Signed-off-by: NJohn Johansen <john.johansen@canonical.com> Acked-by: NSeth Arnold <seth.arnold@canonical.com>
-
由 John Johansen 提交于
Signed-off-by: NJohn Johansen <john.johansen@canonical.com>
-
由 John Johansen 提交于
namespaces now completely use the unconfined profile to track the refcount and rcu freeing cycle. So rework the code to simplify (track everything through the profile path right up to the end), and move the rcu_head from policy base to profile as the namespace no longer needs it. Signed-off-by: NJohn Johansen <john.johansen@canonical.com> Acked-by: NSeth Arnold <seth.arnold@canonical.com>
-
由 John Johansen 提交于
ns->unconfined is being used read side without locking, nor rcu but is being updated when a namespace is removed. This works for the root ns which is never removed but has a race window and can cause failures when children namespaces are removed. Also ns and ns->unconfined have a circular refcounting dependency that is problematic and must be broken. Currently this is done incorrectly when the namespace is destroyed. Fix this by forward referencing unconfined via the replacedby infrastructure instead of directly updating the ns->unconfined pointer. Remove the circular refcount dependency by making the ns and its unconfined profile share the same refcount. Signed-off-by: NJohn Johansen <john.johansen@canonical.com> Acked-by: NSeth Arnold <seth.arnold@canonical.com>
-
由 John Johansen 提交于
remove the use of replaced by chaining and move to profile invalidation and lookup to handle task replacement. Replacement chaining can result in large chains of profiles being pinned in memory when one profile in the chain is use. With implicit labeling this will be even more of a problem, so move to a direct lookup method. Signed-off-by: NJohn Johansen <john.johansen@canonical.com>
-
由 John Johansen 提交于
Signed-off-by: NJohn Johansen <john.johansen@canonical.com>
-
由 John Johansen 提交于
previously profiles had to be loaded one at a time, which could result in cases where a replacement of a set would partially succeed, and then fail resulting in inconsistent policy. Allow multiple profiles to replaced "atomically" so that the replacement either succeeds or fails for the entire set of profiles. Signed-off-by: NJohn Johansen <john.johansen@canonical.com>
-
由 John Johansen 提交于
Add a policy directory to features to contain features that can affect policy compilation but do not affect mediation. Eg of such features would be types of dfa compression supported, etc. Signed-off-by: NJohn Johansen <john.johansen@canonical.com> Acked-by: NKees Cook <kees@ubuntu.com>
-
由 John Johansen 提交于
Signed-off-by: NJohn Johansen <john.johansen@canonical.com>
-
由 Tetsuo Handa 提交于
This is a follow-up to commit b5b3ee6c "apparmor: no need to delay vfree()". Since vmalloc() will do "size = PAGE_ALIGN(size);", we don't need to check for "size >= sizeof(struct work_struct)". Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NJohn Johansen <john.johansen@canonical.com>
-
- 13 8月, 2013 1 次提交
-
-
由 Rafal Krypa 提交于
Smack interface for loading rules has always parsed only single rule from data written to it. This requires user program to call one write() per each rule it wants to load. This change makes it possible to write multiple rules, separated by new line character. Smack will load at most PAGE_SIZE-1 characters and properly return number of processed bytes. In case when user buffer is larger, it will be additionally truncated. All characters after last \n will not get parsed to avoid partial rule near input buffer boundary. Signed-off-by: NRafal Krypa <r.krypa@samsung.com>
-
- 09 8月, 2013 6 次提交
-
-
由 Tejun Heo 提交于
Previously, all css descendant iterators didn't include the origin (root of subtree) css in the iteration. The reasons were maintaining consistency with css_for_each_child() and that at the time of introduction more use cases needed skipping the origin anyway; however, given that css_is_descendant() considers self to be a descendant, omitting the origin css has become more confusing and looking at the accumulated use cases rather clearly indicates that including origin would result in simpler code overall. While this is a change which can easily lead to subtle bugs, cgroup API including the iterators has recently gone through major restructuring and no out-of-tree changes will be applicable without adjustments making this a relatively acceptable opportunity for this type of change. The conversions are mostly straight-forward. If the iteration block had explicit origin handling before or after, it's moved inside the iteration. If not, if (pos == origin) continue; is added. Some conversions add extra reference get/put around origin handling by consolidating origin handling and the rest. While the extra ref operations aren't strictly necessary, this shouldn't cause any noticeable difference. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com> Acked-by: NVivek Goyal <vgoyal@redhat.com> Acked-by: NAristeu Rozanski <aris@redhat.com> Acked-by: NMichal Hocko <mhocko@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com>
-
由 Tejun Heo 提交于
cgroup is currently in the process of transitioning to using css (cgroup_subsys_state) as the primary handle instead of cgroup in subsystem API. For hierarchy iterators, this is beneficial because * In most cases, css is the only thing subsystems care about anyway. * On the planned unified hierarchy, iterations for different subsystems will need to skip over different subtrees of the hierarchy depending on which subsystems are enabled on each cgroup. Passing around css makes it unnecessary to explicitly specify the subsystem in question as css is intersection between cgroup and subsystem * For the planned unified hierarchy, css's would need to be created and destroyed dynamically independent from cgroup hierarchy. Having cgroup core manage css iteration makes enforcing deref rules a lot easier. Most subsystem conversions are straight-forward. Noteworthy changes are * blkio: cgroup_to_blkcg() is no longer used. Removed. * freezer: cgroup_freezer() is no longer used. Removed. * devices: cgroup_to_devcgroup() is no longer used. Removed. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com> Acked-by: NMichal Hocko <mhocko@suse.cz> Acked-by: NVivek Goyal <vgoyal@redhat.com> Acked-by: NAristeu Rozanski <aris@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Jens Axboe <axboe@kernel.dk>
-
由 Tejun Heo 提交于
cgroup is currently in the process of transitioning to using struct cgroup_subsys_state * as the primary handle instead of struct cgroup. Please see the previous commit which converts the subsystem methods for rationale. This patch converts all cftype file operations to take @css instead of @cgroup. cftypes for the cgroup core files don't have their subsytem pointer set. These will automatically use the dummy_css added by the previous patch and can be converted the same way. Most subsystem conversions are straight forwards but there are some interesting ones. * freezer: update_if_frozen() is also converted to take @css instead of @cgroup for consistency. This will make the code look simpler too once iterators are converted to use css. * memory/vmpressure: mem_cgroup_from_css() needs to be exported to vmpressure while mem_cgroup_from_cont() can be made static. Updated accordingly. * cpu: cgroup_tg() doesn't have any user left. Removed. * cpuacct: cgroup_ca() doesn't have any user left. Removed. * hugetlb: hugetlb_cgroup_form_cgroup() doesn't have any user left. Removed. * net_cls: cgrp_cls_state() doesn't have any user left. Removed. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com> Acked-by: NMichal Hocko <mhocko@suse.cz> Acked-by: NVivek Goyal <vgoyal@redhat.com> Acked-by: NAristeu Rozanski <aris@redhat.com> Acked-by: NDaniel Wagner <daniel.wagner@bmw-carit.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Steven Rostedt <rostedt@goodmis.org>
-
由 Tejun Heo 提交于
cgroup is currently in the process of transitioning to using struct cgroup_subsys_state * as the primary handle instead of struct cgroup * in subsystem implementations for the following reasons. * With unified hierarchy, subsystems will be dynamically bound and unbound from cgroups and thus css's (cgroup_subsys_state) may be created and destroyed dynamically over the lifetime of a cgroup, which is different from the current state where all css's are allocated and destroyed together with the associated cgroup. This in turn means that cgroup_css() should be synchronized and may return NULL, making it more cumbersome to use. * Differing levels of per-subsystem granularity in the unified hierarchy means that the task and descendant iterators should behave differently depending on the specific subsystem the iteration is being performed for. * In majority of the cases, subsystems only care about its part in the cgroup hierarchy - ie. the hierarchy of css's. Subsystem methods often obtain the matching css pointer from the cgroup and don't bother with the cgroup pointer itself. Passing around css fits much better. This patch converts all cgroup_subsys methods to take @css instead of @cgroup. The conversions are mostly straight-forward. A few noteworthy changes are * ->css_alloc() now takes css of the parent cgroup rather than the pointer to the new cgroup as the css for the new cgroup doesn't exist yet. Knowing the parent css is enough for all the existing subsystems. * In kernel/cgroup.c::offline_css(), unnecessary open coded css dereference is replaced with local variable access. This patch shouldn't cause any behavior differences. v2: Unnecessary explicit cgrp->subsys[] deref in css_online() replaced with local variable @css as suggested by Li Zefan. Rebased on top of new for-3.12 which includes for-3.11-fixes so that ->css_free() invocation added by da0a12ca ("cgroup: fix a leak when percpu_ref_init() fails") is converted too. Suggested by Li Zefan. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com> Acked-by: NMichal Hocko <mhocko@suse.cz> Acked-by: NVivek Goyal <vgoyal@redhat.com> Acked-by: NAristeu Rozanski <aris@redhat.com> Acked-by: NDaniel Wagner <daniel.wagner@bmw-carit.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Steven Rostedt <rostedt@goodmis.org>
-
由 Tejun Heo 提交于
Currently, controllers have to explicitly follow the cgroup hierarchy to find the parent of a given css. cgroup is moving towards using cgroup_subsys_state as the main controller interface construct, so let's provide a way to climb the hierarchy using just csses. This patch implements css_parent() which, given a css, returns its parent. The function is guarnateed to valid non-NULL parent css as long as the target css is not at the top of the hierarchy. freezer, cpuset, cpu, cpuacct, hugetlb, memory, net_cls and devices are converted to use css_parent() instead of accessing cgroup->parent directly. * __parent_ca() is dropped from cpuacct and its usage is replaced with parent_ca(). The only difference between the two was NULL test on cgroup->parent which is now embedded in css_parent() making the distinction moot. Note that eventually a css->parent field will be added to css and the NULL check in css_parent() will go away. This patch shouldn't cause any behavior differences. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-
由 Tejun Heo 提交于
css (cgroup_subsys_state) is usually embedded in a subsys specific data structure. Subsystems either use container_of() directly to cast from css to such data structure or has an accessor function wrapping such cast. As cgroup as whole is moving towards using css as the main interface handle, add and update such accessors to ease dealing with css's. All accessors explicitly handle NULL input and return NULL in those cases. While this looks like an extra branch in the code, as all controllers specific data structures have css as the first field, the casting doesn't involve any offsetting and the compiler can trivially optimize out the branch. * blkio, freezer, cpuset, cpu, cpuacct and net_cls didn't have such accessor. Added. * memory, hugetlb and devices already had one but didn't explicitly handle NULL input. Updated. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com>
-