1. 16 Oct, 2013 2 commits
    • apparmor: fix bad lock balance when introspecting policy · ed2c7da3
      John Johansen committed
      BugLink: http://bugs.launchpad.net/bugs/1235977
      
      The profile introspection seq file has a locking bug when policy is viewed
      from a virtual root (a task in a policy namespace); introspection from the
      real root is not affected.
      
      The test for root
          while (parent) {
      is correct for the real root, but incorrect for tasks in a policy namespace.
      This allows the task to walk back up the policy tree past its virtual root,
      causing the virtual root to be unlocked earlier than it should be; that
      unlock belongs in the p_stop function.
      
      This results in the following lockdep back trace:
      [   78.479744] [ BUG: bad unlock balance detected! ]
      [   78.479792] 3.11.0-11-generic #17 Not tainted
      [   78.479838] -------------------------------------
      [   78.479885] grep/2223 is trying to release lock (&ns->lock) at:
      [   78.479952] [<ffffffff817bf3be>] mutex_unlock+0xe/0x10
      [   78.480002] but there are no more locks to release!
      [   78.480037]
      [   78.480037] other info that might help us debug this:
      [   78.480037] 1 lock held by grep/2223:
      [   78.480037]  #0:  (&p->lock){+.+.+.}, at: [<ffffffff812111bd>] seq_read+0x3d/0x3d0
      [   78.480037]
      [   78.480037] stack backtrace:
      [   78.480037] CPU: 0 PID: 2223 Comm: grep Not tainted 3.11.0-11-generic #17
      [   78.480037] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [   78.480037]  ffffffff817bf3be ffff880007763d60 ffffffff817b97ef ffff8800189d2190
      [   78.480037]  ffff880007763d88 ffffffff810e1c6e ffff88001f044730 ffff8800189d2190
      [   78.480037]  ffffffff817bf3be ffff880007763e00 ffffffff810e5bd6 0000000724fe56b7
      [   78.480037] Call Trace:
      [   78.480037]  [<ffffffff817bf3be>] ? mutex_unlock+0xe/0x10
      [   78.480037]  [<ffffffff817b97ef>] dump_stack+0x54/0x74
      [   78.480037]  [<ffffffff810e1c6e>] print_unlock_imbalance_bug+0xee/0x100
      [   78.480037]  [<ffffffff817bf3be>] ? mutex_unlock+0xe/0x10
      [   78.480037]  [<ffffffff810e5bd6>] lock_release_non_nested+0x226/0x300
      [   78.480037]  [<ffffffff817bf2fe>] ? __mutex_unlock_slowpath+0xce/0x180
      [   78.480037]  [<ffffffff817bf3be>] ? mutex_unlock+0xe/0x10
      [   78.480037]  [<ffffffff810e5d5c>] lock_release+0xac/0x310
      [   78.480037]  [<ffffffff817bf2b3>] __mutex_unlock_slowpath+0x83/0x180
      [   78.480037]  [<ffffffff817bf3be>] mutex_unlock+0xe/0x10
      [   78.480037]  [<ffffffff81376c91>] p_stop+0x51/0x90
      [   78.480037]  [<ffffffff81211408>] seq_read+0x288/0x3d0
      [   78.480037]  [<ffffffff811e9d9e>] vfs_read+0x9e/0x170
      [   78.480037]  [<ffffffff811ea8cc>] SyS_read+0x4c/0xa0
      [   78.480037]  [<ffffffff817ccc9d>] system_call_fastpath+0x1a/0x1f
      Signed-off-by: John Johansen <john.johansen@canonical.com>
      Signed-off-by: James Morris <james.l.morris@oracle.com>
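The lock imbalance described in ed2c7da3 can be reproduced in miniature. The following is a minimal userspace sketch, not the AppArmor code: `struct ns`, `walk_up_buggy()`, `walk_up_bounded()` and `stop()` are hypothetical names, and a counter stands in for the mutex so that an unbalanced unlock shows up as a negative depth. It assumes the walk releases each namespace as it climbs and that the stop step releases the viewer's root once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a policy namespace; not the kernel struct. */
struct ns {
    struct ns *parent;
    int lock_depth;   /* +1 on lock, -1 on unlock; negative means imbalance */
};

static void ns_lock(struct ns *n)   { n->lock_depth++; }
static void ns_unlock(struct ns *n) { n->lock_depth--; }

/* Buggy walk: climbs until the real root, ignoring the viewer's root. */
static void walk_up_buggy(struct ns *ns)
{
    struct ns *parent;

    assert(ns != NULL);
    parent = ns->parent;
    while (parent) {
        ns_unlock(ns);
        ns = parent;
        parent = parent->parent;
    }
}

/* Bounded walk: stops at the viewer's (possibly virtual) root. */
static void walk_up_bounded(struct ns *root, struct ns *ns)
{
    assert(root != NULL && ns != NULL);
    while (ns != root) {
        ns_unlock(ns);
        ns = ns->parent;
    }
}

/* Models the stop step: release the root lock taken at the start. */
static void stop(struct ns *root) { ns_unlock(root); }

/* Returns the final lock depth of the virtual root after one pass. */
static int run(int bounded)
{
    struct ns real  = { NULL, 0 };
    struct ns virt  = { &real, 0 };   /* the viewer's virtual root */
    struct ns child = { &virt, 0 };

    ns_lock(&virt);                   /* start: lock the viewer's root */
    ns_lock(&child);                  /* descend into a child namespace */
    if (bounded)
        walk_up_bounded(&virt, &child);
    else
        walk_up_buggy(&child);
    stop(&virt);
    return virt.lock_depth;
}
```

With the bounded walk the virtual root ends at depth 0. With the `while (parent)` test it is released once during the walk and again in `stop()`, ending at -1, which is the kind of imbalance lockdep reports in the trace above.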
    • apparmor: fix memleak of the profile hash · 5cb3e91e
      John Johansen committed
      BugLink: http://bugs.launchpad.net/bugs/1235523
      
      This fixes the following kmemleak trace:
      unreferenced object 0xffff8801e8c35680 (size 32):
        comm "apparmor_parser", pid 691, jiffies 4294895667 (age 13230.876s)
        hex dump (first 32 bytes):
          e0 d3 4e b5 ac 6d f4 ed 3f cb ee 48 1c fd 40 cf  ..N..m..?..H..@.
          5b cc e9 93 00 00 00 00 00 00 00 00 00 00 00 00  [...............
        backtrace:
          [<ffffffff817a97ee>] kmemleak_alloc+0x4e/0xb0
          [<ffffffff811ca9f3>] __kmalloc+0x103/0x290
          [<ffffffff8138acbc>] aa_calc_profile_hash+0x6c/0x150
          [<ffffffff8138074d>] aa_unpack+0x39d/0xd50
          [<ffffffff8137eced>] aa_replace_profiles+0x3d/0xd80
          [<ffffffff81376937>] profile_replace+0x37/0x50
          [<ffffffff811e9f2d>] vfs_write+0xbd/0x1e0
          [<ffffffff811ea96c>] SyS_write+0x4c/0xa0
          [<ffffffff817ccb1d>] system_call_fastpath+0x1a/0x1f
          [<ffffffffffffffff>] 0xffffffffffffffff
      Signed-off-by: John Johansen <john.johansen@canonical.com>
      Signed-off-by: James Morris <james.l.morris@oracle.com>
  2. 05 Oct, 2013 3 commits
  3. 30 Sep, 2013 2 commits
    • apparmor: fix suspicious RCU usage warning in policy.c/policy.h · 4cd4fc77
      John Johansen committed
      The recent 3.12 pull request for apparmor was missing a couple of rcu _protected
      access modifiers, resulting in the following suspicious RCU usage warnings:
      
       [   29.804534] [ INFO: suspicious RCU usage. ]
       [   29.804539] 3.11.0+ #5 Not tainted
       [   29.804541] -------------------------------
       [   29.804545] security/apparmor/include/policy.h:363 suspicious rcu_dereference_check() usage!
       [   29.804548]
       [   29.804548] other info that might help us debug this:
       [   29.804548]
       [   29.804553]
       [   29.804553] rcu_scheduler_active = 1, debug_locks = 1
       [   29.804558] 2 locks held by apparmor_parser/1268:
       [   29.804560]  #0:  (sb_writers#9){.+.+.+}, at: [<ffffffff81120a4c>] file_start_write+0x27/0x29
       [   29.804576]  #1:  (&ns->lock){+.+.+.}, at: [<ffffffff811f5d88>] aa_replace_profiles+0x166/0x57c
       [   29.804589]
       [   29.804589] stack backtrace:
       [   29.804595] CPU: 0 PID: 1268 Comm: apparmor_parser Not tainted 3.11.0+ #5
       [   29.804599] Hardware name: ASUSTeK Computer Inc.         UL50VT          /UL50VT    , BIOS 217     03/01/2010
       [   29.804602]  0000000000000000 ffff8800b95a1d90 ffffffff8144eb9b ffff8800b94db540
       [   29.804611]  ffff8800b95a1dc0 ffffffff81087439 ffff880138cc3a18 ffff880138cc3a18
       [   29.804619]  ffff8800b9464a90 ffff880138cc3a38 ffff8800b95a1df0 ffffffff811f5084
       [   29.804628] Call Trace:
       [   29.804636]  [<ffffffff8144eb9b>] dump_stack+0x4e/0x82
       [   29.804642]  [<ffffffff81087439>] lockdep_rcu_suspicious+0xfc/0x105
       [   29.804649]  [<ffffffff811f5084>] __aa_update_replacedby+0x53/0x7f
       [   29.804655]  [<ffffffff811f5408>] __replace_profile+0x11f/0x1ed
       [   29.804661]  [<ffffffff811f6032>] aa_replace_profiles+0x410/0x57c
       [   29.804668]  [<ffffffff811f16d4>] profile_replace+0x35/0x4c
       [   29.804674]  [<ffffffff81120fa3>] vfs_write+0xad/0x113
       [   29.804680]  [<ffffffff81121609>] SyS_write+0x44/0x7a
       [   29.804687]  [<ffffffff8145bfd2>] system_call_fastpath+0x16/0x1b
       [   29.804691]
       [   29.804694] ===============================
       [   29.804697] [ INFO: suspicious RCU usage. ]
       [   29.804700] 3.11.0+ #5 Not tainted
       [   29.804703] -------------------------------
       [   29.804706] security/apparmor/policy.c:566 suspicious rcu_dereference_check() usage!
       [   29.804709]
       [   29.804709] other info that might help us debug this:
       [   29.804709]
       [   29.804714]
       [   29.804714] rcu_scheduler_active = 1, debug_locks = 1
       [   29.804718] 2 locks held by apparmor_parser/1268:
       [   29.804721]  #0:  (sb_writers#9){.+.+.+}, at: [<ffffffff81120a4c>] file_start_write+0x27/0x29
       [   29.804733]  #1:  (&ns->lock){+.+.+.}, at: [<ffffffff811f5d88>] aa_replace_profiles+0x166/0x57c
       [   29.804744]
       [   29.804744] stack backtrace:
       [   29.804750] CPU: 0 PID: 1268 Comm: apparmor_parser Not tainted 3.11.0+ #5
       [   29.804753] Hardware name: ASUSTeK Computer Inc.         UL50VT          /UL50VT    , BIOS 217     03/01/2010
       [   29.804756]  0000000000000000 ffff8800b95a1d80 ffffffff8144eb9b ffff8800b94db540
       [   29.804764]  ffff8800b95a1db0 ffffffff81087439 ffff8800b95b02b0 0000000000000000
       [   29.804772]  ffff8800b9efba08 ffff880138cc3a38 ffff8800b95a1dd0 ffffffff811f4f94
       [   29.804779] Call Trace:
       [   29.804786]  [<ffffffff8144eb9b>] dump_stack+0x4e/0x82
       [   29.804791]  [<ffffffff81087439>] lockdep_rcu_suspicious+0xfc/0x105
       [   29.804798]  [<ffffffff811f4f94>] aa_free_replacedby_kref+0x4d/0x62
       [   29.804804]  [<ffffffff811f4f47>] ? aa_put_namespace+0x17/0x17
       [   29.804810]  [<ffffffff811f4f0b>] kref_put+0x36/0x40
       [   29.804816]  [<ffffffff811f5423>] __replace_profile+0x13a/0x1ed
       [   29.804822]  [<ffffffff811f6032>] aa_replace_profiles+0x410/0x57c
       [   29.804829]  [<ffffffff811f16d4>] profile_replace+0x35/0x4c
       [   29.804835]  [<ffffffff81120fa3>] vfs_write+0xad/0x113
       [   29.804840]  [<ffffffff81121609>] SyS_write+0x44/0x7a
       [   29.804847]  [<ffffffff8145bfd2>] system_call_fastpath+0x16/0x1b
      
      Reported-by: miles.lane@gmail.com
      CC: paulmck@linux.vnet.ibm.com
      Signed-off-by: John Johansen <john.johansen@canonical.com>
      Signed-off-by: James Morris <james.l.morris@oracle.com>
    • apparmor: Use shash crypto API interface for profile hashes · 71ac7f62
      Tyler Hicks committed
      Use the shash interface, rather than the hash interface, when hashing
      AppArmor profiles. The shash interface does not use scatterlists and it
      is a better fit for what AppArmor needs.
      
      This fixes a kernel paging BUG when aa_calc_profile_hash() is passed a
      buffer from vmalloc(). The hash interface requires callers to handle
      vmalloc() buffers differently than what AppArmor was doing. Due to
      vmalloc() memory not being physically contiguous, each individual page
      behind the buffer must be assigned to a scatterlist with sg_set_page()
      and then the scatterlist passed to crypto_hash_update().
      
      The shash interface does not have that limitation and allows vmalloc()
      and kmalloc() buffers to be handled in the same manner.
      
      BugLink: https://launchpad.net/bugs/1216294/
      BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=62261
      Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
      Acked-by: Seth Arnold <seth.arnold@canonical.com>
      Signed-off-by: John Johansen <john.johansen@canonical.com>
      Signed-off-by: James Morris <james.l.morris@oracle.com>
  4. 27 Sep, 2013 2 commits
    • selinux: correct locking in selinux_netlbl_socket_connect() · 42d64e1a
      Paul Moore committed
      The SELinux/NetLabel glue code has a locking bug that affects systems
      with NetLabel enabled (see the kernel error message below).  This patch
      corrects the problem by converting the bottom half socket lock to a
      more conventional lock_sock() call, which is the correct lock for this
      call path.
      
       ===============================
       [ INFO: suspicious RCU usage. ]
       3.11.0-rc3+ #19 Not tainted
       -------------------------------
       net/ipv4/cipso_ipv4.c:1928 suspicious rcu_dereference_protected() usage!
      
       other info that might help us debug this:
      
       rcu_scheduler_active = 1, debug_locks = 0
       2 locks held by ping/731:
        #0:  (slock-AF_INET/1){+.-...}, at: [...] selinux_netlbl_socket_connect
        #1:  (rcu_read_lock){.+.+..}, at: [<...>] netlbl_conn_setattr
      
       stack backtrace:
       CPU: 1 PID: 731 Comm: ping Not tainted 3.11.0-rc3+ #19
       Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        0000000000000001 ffff88006f659d28 ffffffff81726b6a ffff88003732c500
        ffff88006f659d58 ffffffff810e4457 ffff88006b845a00 0000000000000000
        000000000000000c ffff880075aa2f50 ffff88006f659d90 ffffffff8169bec7
       Call Trace:
        [<ffffffff81726b6a>] dump_stack+0x54/0x74
        [<ffffffff810e4457>] lockdep_rcu_suspicious+0xe7/0x120
        [<ffffffff8169bec7>] cipso_v4_sock_setattr+0x187/0x1a0
        [<ffffffff8170f317>] netlbl_conn_setattr+0x187/0x190
        [<ffffffff8170f195>] ? netlbl_conn_setattr+0x5/0x190
        [<ffffffff8131ac9e>] selinux_netlbl_socket_connect+0xae/0xc0
        [<ffffffff81303025>] selinux_socket_connect+0x135/0x170
        [<ffffffff8119d127>] ? might_fault+0x57/0xb0
        [<ffffffff812fb146>] security_socket_connect+0x16/0x20
        [<ffffffff815d3ad3>] SYSC_connect+0x73/0x130
        [<ffffffff81739a85>] ? sysret_check+0x22/0x5d
        [<ffffffff810e5e2d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
        [<ffffffff81373d4e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
        [<ffffffff815d52be>] SyS_connect+0xe/0x10
        [<ffffffff81739a59>] system_call_fastpath+0x16/0x1b
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Paul Moore <pmoore@redhat.com>
    • D
      7d1db4b2
  5. 31 Aug, 2013 2 commits
  6. 29 Aug, 2013 2 commits
    • Revert "SELinux: do not handle seclabel as a special flag" · 0b4bdb35
      Eric Paris committed
      This reverts commit 308ab70c.
      
      It breaks my FC6 test box.  /dev/pts is not mounted.  dmesg says
      
      SELinux: mount invalid.  Same superblock, different security settings
      for (dev devpts, type devpts)
      
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • selinux: consider filesystem subtype in policies · 102aefdd
      Anand Avati committed
      Not considering the filesystem subtype has the following limitation: support
      for SELinux in FUSE depends on the particular userspace
      filesystem, which is identified by the subtype. For example, GlusterFS,
      a FUSE-based filesystem, supports SELinux (by mounting and processing
      FUSE requests in different threads, avoiding the mount-time
      deadlock), whereas other FUSE-based filesystems (identified by a
      different subtype) have the mount-time deadlock.
      
      Considering the subtype of the filesystem in the SELinux policies
      allows us to specify a filesystem subtype in the following way:
      
      fs_use_xattr fuse.glusterfs gen_context(system_u:object_r:fs_t,s0);
      
      This way not all FUSE filesystems are put in the same bucket and
      subjected to the limitations of the other subtypes.
      Signed-off-by: Anand Avati <avati@redhat.com>
      Signed-off-by: Eric Paris <eparis@redhat.com>
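The idea of matching policy entries on "type.subtype" (as in the `fuse.glusterfs` line quoted in 102aefdd) can be sketched in plain C. This is only an illustration under stated assumptions: `struct fs_rule` and `lookup_behavior()` are hypothetical, the behavior strings are made up, and the fallback to the bare type when no subtype entry matches is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical policy entry: filesystem name -> labeling behavior. */
struct fs_rule {
    const char *fstype;     /* e.g. "fuse" or "fuse.glusterfs" */
    const char *behavior;   /* illustrative values only */
};

/*
 * Look up the behavior for a mounted filesystem.  A "type.subtype" entry
 * wins when present; otherwise fall back to the bare type (an assumption
 * of this sketch).  Returns NULL when nothing matches.
 */
static const char *lookup_behavior(const struct fs_rule *rules, size_t n,
                                   const char *type, const char *subtype)
{
    char full[64];
    size_t i;

    assert(type != NULL);

    if (subtype && *subtype) {
        int len = snprintf(full, sizeof(full), "%s.%s", type, subtype);

        /* Only search when the combined name fit in the buffer. */
        if (len > 0 && (size_t)len < sizeof(full)) {
            for (i = 0; i < n; i++)
                if (strcmp(rules[i].fstype, full) == 0)
                    return rules[i].behavior;
        }
    }

    for (i = 0; i < n; i++)
        if (strcmp(rules[i].fstype, type) == 0)
            return rules[i].behavior;

    return NULL;
}
```

With one entry for `fuse` and one for `fuse.glusterfs`, a GlusterFS mount picks up the subtype entry while any other FUSE filesystem gets the generic one, which is the "not all in the same bucket" effect the message describes.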
  7. 20 Aug, 2013 1 commit
  8. 15 Aug, 2013 15 commits
  9. 13 Aug, 2013 1 commit
    • Smack: parse multiple rules per write to load2, up to PAGE_SIZE-1 bytes · 10289b0f
      Rafal Krypa committed
      The Smack interface for loading rules has always parsed only a single rule from
      the data written to it. This requires a user program to call write() once for
      each rule it wants to load.
      This change makes it possible to write multiple rules, separated by the newline
      character. Smack will load at most PAGE_SIZE-1 characters and properly
      return the number of processed bytes. When the user buffer is larger, it
      will additionally be truncated: all characters after the last \n will not get
      parsed, to avoid a partial rule near the input buffer boundary.
      Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
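The truncation rule in 10289b0f can be sketched as a small helper that decides how many bytes of a write are parsed. This is one reading of the commit message, not the Smack source: `bytes_to_parse()` and `SKETCH_PAGE_SIZE` are hypothetical, the page size of 4096 is assumed, and the sketch returns 0 when an oversized buffer has no newline in its first PAGE_SIZE-1 bytes (the message does not say how the real interface reports that case).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096   /* assumed page size, for illustration */

/*
 * Number of bytes of buf that would be parsed for a write of count bytes.
 * A buffer that fits is parsed whole.  A larger one is capped at
 * PAGE_SIZE-1 and then cut back to the last '\n', so that no partial
 * rule is left at the end of the parsed region.
 */
static size_t bytes_to_parse(const char *buf, size_t count)
{
    assert(buf != NULL);

    if (count < SKETCH_PAGE_SIZE)
        return count;

    count = SKETCH_PAGE_SIZE - 1;
    while (count > 0 && buf[count - 1] != '\n')
        count--;
    return count;
}
```

A caller that writes more than a page of rules would use the returned byte count to resume from the first unparsed rule on its next write().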
  10. 09 Aug, 2013 7 commits
    • cgroup: make css_for_each_descendant() and friends include the origin css in the iteration · bd8815a6
      Tejun Heo committed
      Previously, all css descendant iterators didn't include the origin
      (root of subtree) css in the iteration.  The reasons were maintaining
      consistency with css_for_each_child() and that at the time of
      introduction more use cases needed skipping the origin anyway;
      however, given that css_is_descendant() considers self to be a
      descendant, omitting the origin css has become more confusing and
      looking at the accumulated use cases rather clearly indicates that
      including origin would result in simpler code overall.
      
      While this is a change which can easily lead to subtle bugs, cgroup
      API including the iterators has recently gone through major
      restructuring and no out-of-tree changes will be applicable without
      adjustments making this a relatively acceptable opportunity for this
      type of change.
      
      The conversions are mostly straight-forward.  If the iteration block
      had explicit origin handling before or after, it's moved inside the
      iteration.  If not, if (pos == origin) continue; is added.  Some
      conversions add extra reference get/put around origin handling by
      consolidating origin handling and the rest.  While the extra ref
      operations aren't strictly necessary, this shouldn't cause any
      noticeable difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Aristeu Rozanski <aris@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
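The iteration change in bd8815a6 (the origin is now visited, and callers that do not want it add `if (pos == origin) continue;`) can be illustrated with a generic pre-order tree walk. This is a userspace sketch with hypothetical names (`struct node`, `next_descendant_pre()`, `for_each_descendant_pre()`); it is not the cgroup iterator and ignores RCU and locking entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tree node; a stand-in, not struct cgroup_subsys_state. */
struct node {
    struct node *parent, *first_child, *next_sibling;
};

/*
 * Pre-order successor of pos within the subtree rooted at origin.
 * pos == NULL starts the walk and yields origin itself, so the origin
 * is part of the iteration.
 */
static struct node *next_descendant_pre(struct node *pos, struct node *origin)
{
    assert(origin != NULL);

    if (!pos)
        return origin;
    if (pos->first_child)
        return pos->first_child;

    /* Climb until a sibling exists, never leaving origin's subtree. */
    while (pos != origin) {
        if (pos->next_sibling)
            return pos->next_sibling;
        pos = pos->parent;
    }
    return NULL;
}

#define for_each_descendant_pre(pos, origin)                      \
    for ((pos) = next_descendant_pre(NULL, (origin)); (pos);      \
         (pos) = next_descendant_pre((pos), (origin)))

/* Visits every node in the subtree, origin included. */
static int count_with_origin(struct node *origin)
{
    struct node *pos;
    int n = 0;

    for_each_descendant_pre(pos, origin)
        n++;
    return n;
}

/* Caller that wants the old behavior skips the origin explicitly. */
static int count_strict_descendants(struct node *origin)
{
    struct node *pos;
    int n = 0;

    for_each_descendant_pre(pos, origin) {
        if (pos == origin)
            continue;
        n++;
    }
    return n;
}
```

The explicit skip in `count_strict_descendants()` mirrors the conversion pattern the commit message describes for call sites that had no origin handling of their own.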
    • cgroup: make hierarchy iterators deal with cgroup_subsys_state instead of cgroup · 492eb21b
      Tejun Heo committed
      cgroup is currently in the process of transitioning to using css
      (cgroup_subsys_state) as the primary handle instead of cgroup in
      subsystem API.  For hierarchy iterators, this is beneficial because
      
      * In most cases, css is the only thing subsystems care about anyway.
      
      * On the planned unified hierarchy, iterations for different
        subsystems will need to skip over different subtrees of the
        hierarchy depending on which subsystems are enabled on each cgroup.
        Passing around css makes it unnecessary to explicitly specify the
        subsystem in question, as a css is the intersection between a cgroup
        and a subsystem.
      
      * For the planned unified hierarchy, css's would need to be created
        and destroyed dynamically independent from cgroup hierarchy.  Having
        cgroup core manage css iteration makes enforcing deref rules a lot
        easier.
      
      Most subsystem conversions are straight-forward.  Noteworthy changes
      are
      
      * blkio: cgroup_to_blkcg() is no longer used.  Removed.
      
      * freezer: cgroup_freezer() is no longer used.  Removed.
      
      * devices: cgroup_to_devcgroup() is no longer used.  Removed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Aristeu Rozanski <aris@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Jens Axboe <axboe@kernel.dk>
    • cgroup: pass around cgroup_subsys_state instead of cgroup in file methods · 182446d0
      Tejun Heo committed
      cgroup is currently in the process of transitioning to using struct
      cgroup_subsys_state * as the primary handle instead of struct cgroup.
      Please see the previous commit which converts the subsystem methods
      for rationale.
      
      This patch converts all cftype file operations to take @css instead of
      @cgroup.  cftypes for the cgroup core files don't have their subsystem
      pointer set.  These will automatically use the dummy_css added by the
      previous patch and can be converted the same way.
      
      Most subsystem conversions are straightforward but there are some
      interesting ones.
      
      * freezer: update_if_frozen() is also converted to take @css instead
        of @cgroup for consistency.  This will make the code look simpler
        too once iterators are converted to use css.
      
      * memory/vmpressure: mem_cgroup_from_css() needs to be exported to
        vmpressure while mem_cgroup_from_cont() can be made static.
        Updated accordingly.
      
      * cpu: cgroup_tg() doesn't have any user left.  Removed.
      
      * cpuacct: cgroup_ca() doesn't have any user left.  Removed.
      
      * hugetlb: hugetlb_cgroup_from_cgroup() doesn't have any user left.
        Removed.
      
      * net_cls: cgrp_cls_state() doesn't have any user left.  Removed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Aristeu Rozanski <aris@redhat.com>
      Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Steven Rostedt <rostedt@goodmis.org>
    • cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods · eb95419b
      Tejun Heo committed
      cgroup is currently in the process of transitioning to using struct
      cgroup_subsys_state * as the primary handle instead of struct cgroup *
      in subsystem implementations for the following reasons.
      
      * With unified hierarchy, subsystems will be dynamically bound and
        unbound from cgroups and thus css's (cgroup_subsys_state) may be
        created and destroyed dynamically over the lifetime of a cgroup,
        which is different from the current state where all css's are
        allocated and destroyed together with the associated cgroup.  This
        in turn means that cgroup_css() should be synchronized and may
        return NULL, making it more cumbersome to use.
      
      * Differing levels of per-subsystem granularity in the unified
        hierarchy means that the task and descendant iterators should behave
        differently depending on the specific subsystem the iteration is
        being performed for.
      
      * In the majority of cases, a subsystem only cares about its part of the
        cgroup hierarchy - i.e. the hierarchy of css's.  Subsystem methods
        often obtain the matching css pointer from the cgroup and don't
        bother with the cgroup pointer itself.  Passing around css fits
        much better.
      
      This patch converts all cgroup_subsys methods to take @css instead of
      @cgroup.  The conversions are mostly straight-forward.  A few
      noteworthy changes are
      
      * ->css_alloc() now takes css of the parent cgroup rather than the
        pointer to the new cgroup as the css for the new cgroup doesn't
        exist yet.  Knowing the parent css is enough for all the existing
        subsystems.
      
      * In kernel/cgroup.c::offline_css(), unnecessary open coded css
        dereference is replaced with local variable access.
      
      This patch shouldn't cause any behavior differences.
      
      v2: Unnecessary explicit cgrp->subsys[] deref in css_online() replaced
          with local variable @css as suggested by Li Zefan.
      
          Rebased on top of new for-3.12 which includes for-3.11-fixes so
          that ->css_free() invocation added by da0a12ca ("cgroup: fix a
          leak when percpu_ref_init() fails") is converted too.  Suggested
          by Li Zefan.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Aristeu Rozanski <aris@redhat.com>
      Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Steven Rostedt <rostedt@goodmis.org>
    • cgroup: add css_parent() · 63876986
      Tejun Heo committed
      Currently, controllers have to explicitly follow the cgroup hierarchy
      to find the parent of a given css.  cgroup is moving towards using
      cgroup_subsys_state as the main controller interface construct, so
      let's provide a way to climb the hierarchy using just csses.
      
      This patch implements css_parent() which, given a css, returns its
      parent.  The function is guaranteed to return a valid non-NULL parent css as
      long as the target css is not at the top of the hierarchy.
      
      freezer, cpuset, cpu, cpuacct, hugetlb, memory, net_cls and devices
      are converted to use css_parent() instead of accessing cgroup->parent
      directly.
      
      * __parent_ca() is dropped from cpuacct and its usage is replaced with
        parent_ca().  The only difference between the two was NULL test on
        cgroup->parent which is now embedded in css_parent() making the
        distinction moot.  Note that eventually a css->parent field will be
        added to css and the NULL check in css_parent() will go away.
      
      This patch shouldn't cause any behavior differences.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
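The behavior described in 63876986 (follow the cgroup's parent pointer, return NULL at the top of the hierarchy, otherwise return the parent cgroup's css for the same subsystem) can be sketched as follows. The structures here are simplified, hypothetical stand-ins (`struct css_node`, `struct cgrp_sketch`, `css_parent_sketch()`), and the per-cgroup `subsys[]` array indexed by a subsystem id is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SUBSYS_COUNT 2   /* arbitrary number of subsystems */

struct cgrp_sketch;

/* Hypothetical, simplified css: knows its cgroup and its subsystem id. */
struct css_node {
    struct cgrp_sketch *cgroup;
    int subsys_id;
};

/* Hypothetical, simplified cgroup: a parent link and one css per subsystem. */
struct cgrp_sketch {
    struct cgrp_sketch *parent;
    struct css_node *subsys[SKETCH_SUBSYS_COUNT];
};

/*
 * Return the css of the same subsystem on the parent cgroup, or NULL when
 * css belongs to the top of the hierarchy.  The NULL test on the parent
 * cgroup lives here so that callers do not have to repeat it.
 */
static struct css_node *css_parent_sketch(struct css_node *css)
{
    struct cgrp_sketch *parent_cgrp;

    assert(css != NULL && css->cgroup != NULL);
    parent_cgrp = css->cgroup->parent;
    return parent_cgrp ? parent_cgrp->subsys[css->subsys_id] : NULL;
}
```

Folding the NULL test into the helper is what makes a separate "unchecked" variant such as the `__parent_ca()` mentioned above unnecessary.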
    • cgroup: add/update accessors which obtain subsys specific data from css · a7c6d554
      Tejun Heo committed
      css (cgroup_subsys_state) is usually embedded in a subsys specific
      data structure.  Subsystems either use container_of() directly to cast
      from css to such data structure or have an accessor function wrapping
      such cast.  As cgroup as a whole is moving towards using css as the main
      interface handle, add and update such accessors to ease dealing with
      css's.
      
      All accessors explicitly handle NULL input and return NULL in those
      cases.  While this looks like an extra branch in the code, as all
      controller-specific data structures have css as the first field, the
      casting doesn't involve any offsetting and the compiler can trivially
      optimize out the branch.
      
      * blkio, freezer, cpuset, cpu, cpuacct and net_cls didn't have such
        accessor.  Added.
      
      * memory, hugetlb and devices already had one but didn't explicitly
        handle NULL input.  Updated.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
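The accessor pattern from a7c6d554 (a container_of()-style cast that maps NULL to NULL, with the css as the first member so the cast needs no pointer adjustment) looks roughly like this in userspace C. `struct css_stub`, `struct freezer_sketch`, `sketch_container_of()` and `css_freezer_sketch()` are hypothetical names; the macro is a simplified version built on `offsetof`, not the kernel's container_of().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal css. */
struct css_stub {
    int refcnt;
};

/* Hypothetical controller-specific structure embedding the css first. */
struct freezer_sketch {
    struct css_stub css;   /* first field, so its offset is 0 */
    int state;
};

/* Simplified container_of(): step back from a member to its container. */
#define sketch_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Accessor that maps a css to its container and NULL to NULL. */
static struct freezer_sketch *css_freezer_sketch(struct css_stub *css)
{
    /* Layout assumption of this sketch: css sits at offset 0. */
    assert(offsetof(struct freezer_sketch, css) == 0);
    return css ? sketch_container_of(css, struct freezer_sketch, css) : NULL;
}
```

Because the member offset is 0, the cast leaves the pointer value unchanged, which is why the message expects the compiler to be able to drop the NULL branch.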
    • cgroup: s/cgroup_subsys_state/cgroup_css/ s/task_subsys_state/task_css/ · 8af01f56
      Tejun Heo committed
      The names of the two struct cgroup_subsys_state accessors -
      cgroup_subsys_state() and task_subsys_state() - are somewhat awkward.
      The former clashes with the type name and the latter doesn't even
      indicate it's somehow related to cgroup.
      
      We're about to revamp a large portion of the cgroup API, so let's rename
      them so that they're less awkward.  Most per-controller usages of the
      accessors are localized in accessor wrappers and given the amount of
      scheduled changes, this isn't gonna add any noticeable headache.
      
      Rename cgroup_subsys_state() to cgroup_css() and task_subsys_state()
      to task_css().  This patch is pure rename.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
  11. 06 Aug, 2013 1 commit
  12. 02 Aug, 2013 2 commits
    • Smack: network label match fix · 677264e8
      Casey Schaufler committed
      The Smack code that matches incoming CIPSO tags with Smack labels
      reaches through the NetLabel interfaces and compares the network
      data with the CIPSO header associated with a Smack label. This was
      done in an ill-advised attempt to optimize performance. It works
      so long as the categories fit in a single capset, but this isn't
      always the case.
      
      This patch changes the Smack code to use the appropriate NetLabel
      interfaces to compare the incoming CIPSO header with the CIPSO
      header associated with a label. It will always match the CIPSO
      headers correctly.
      
      Targeted for git://git.gitorious.org/smack-next/kernel.git
      Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
    • security: smack: add a hash table to quicken smk_find_entry() · 4d7cf4a1
      Tomasz Stanislawski committed
      Accepted for the smack-next tree after changing the number of
      slots from 128 to 16.
      
      This patch adds a hash table to quicken searching of a smack label by its name.
      
      Basically, the patch improves performance of SMACK initialization.  Parsing of
      rules involves translation from a string to a smack_known (aka label) entity
      which is done in smk_find_entry().
      
      The current implementation of the function iterates over a global list of
      smack_known resulting in O(N) complexity for smk_find_entry().  The total
      complexity of SMACK initialization becomes O(rules * labels).  Therefore it
      scales quadratically with the complexity of a system.
      
      Applying the patch reduces the complexity of smk_find_entry() to O(1) as long
      as the number of labels is in the hundreds. If the number of labels increases,
      please update the SMACK_HASH_SLOTS constant defined in security/smack/smack.h.
      Making this constant configurable through Kconfig or cmdline might be a good
      idea.
      
      The size of the hash table was adjusted experimentally.  The rule set used by
      TIZEN contains circa 17K rules for 500 labels.  The table below contains
      results of SMACK initialization using the 'time smackctl apply' bash command.
      'Ref' is a kernel without this patch applied. The remaining column headings
      are values of SMACK_HASH_SLOTS.  Every measurement was repeated three
      times to reduce noise.
      
           |  Ref  |   1   |   2   |   4   |   8   |   16  |   32  |   64  |  128  |  256  |  512
      --------------------------------------------------------------------------------------------
      Run1 | 1.156 | 1.096 | 0.883 | 0.764 | 0.692 | 0.667 | 0.649 | 0.633 | 0.634 | 0.629 | 0.620
      Run2 | 1.156 | 1.111 | 0.885 | 0.764 | 0.694 | 0.661 | 0.649 | 0.651 | 0.634 | 0.638 | 0.623
      Run3 | 1.160 | 1.107 | 0.886 | 0.764 | 0.694 | 0.671 | 0.661 | 0.638 | 0.631 | 0.624 | 0.638
      AVG  | 1.157 | 1.105 | 0.885 | 0.764 | 0.693 | 0.666 | 0.653 | 0.641 | 0.633 | 0.630 | 0.627
      
      Surprisingly, a single hlist is slightly faster than a double-linked list.
      The speed-up saturates near 64 slots.  Therefore I chose value 128 to provide
      some margin if more labels were used.
      It looks like I/O becomes the new bottleneck.
      Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
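The lookup structure described in 4d7cf4a1 is a fixed number of hash slots, each holding a short chain of labels, so that finding a label by name only scans one chain. The sketch below is a generic userspace version with hypothetical names (`struct label`, `label_insert()`, `label_find()`); it uses 16 slots, the value the message says was accepted, and the djb2 string hash, which is an arbitrary choice for illustration and not necessarily what Smack uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_HASH_SLOTS 16   /* must be a power of two for the mask below */

/* Hypothetical label entry; a stand-in for a known-label record. */
struct label {
    const char *name;
    struct label *next;        /* singly linked chain within a slot */
};

static struct label *slots[SKETCH_HASH_SLOTS];

/* djb2 string hash reduced to a slot index (arbitrary choice). */
static unsigned int hash_name(const char *s)
{
    unsigned int h = 5381;

    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h & (SKETCH_HASH_SLOTS - 1);
}

/* Add a label at the head of its slot's chain. */
static void label_insert(struct label *l)
{
    unsigned int i;

    assert(l != NULL && l->name != NULL);
    i = hash_name(l->name);
    l->next = slots[i];
    slots[i] = l;
}

/* Find a label by name; only the matching slot's chain is scanned. */
static struct label *label_find(const char *name)
{
    struct label *l;

    for (l = slots[hash_name(name)]; l; l = l->next)
        if (strcmp(l->name, name) == 0)
            return l;
    return NULL;
}
```

With N labels spread over S slots, a lookup scans roughly N/S entries on average, which is consistent with the timings above flattening out once the slot count approaches the point where chains are already short.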