1. 21 1月, 2016 16 次提交
  2. 16 1月, 2016 6 次提交
  3. 15 1月, 2016 12 次提交
  4. 30 12月, 2015 1 次提交
    • V
      mm: memcontrol: fix possible memcg leak due to interrupted reclaim · 6df38689
      Vladimir Davydov 提交于
      Memory cgroup reclaim can be interrupted with mem_cgroup_iter_break()
      once enough pages have been reclaimed, in which case, in contrast to a
      full round-trip over a cgroup sub-tree, the current position stored in
      mem_cgroup_reclaim_iter of the target cgroup does not get invalidated
      and so is left holding the reference to the last scanned cgroup.  If the
      target cgroup does not get scanned again (we might have just reclaimed
      the last page or all processes might exit and free their memory
      voluntary), we will leak it, because there is nobody to put the
      reference held by the iterator.
      
      The problem is easy to reproduce by running the following command
      sequence in a loop:
      
          mkdir /sys/fs/cgroup/memory/test
          echo 100M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
          echo $$ > /sys/fs/cgroup/memory/test/cgroup.procs
          memhog 150M
          echo $$ > /sys/fs/cgroup/memory/cgroup.procs
          rmdir test
      
      The cgroups generated by it will never get freed.
      
      This patch fixes this issue by making mem_cgroup_iter avoid taking
      reference to the current position.  In order not to hit use-after-free
      bug while running reclaim in parallel with cgroup deletion, we make use
      of ->css_released cgroup callback to clear references to the dying
      cgroup in all reclaim iterators that might refer to it.  This callback
      is called right before scheduling rcu work which will free css, so if we
      access iter->position from rcu read section, we might be sure it won't
      go away under us.
      
      [hannes@cmpxchg.org: clean up css ref handling]
      Fixes: 5ac8fb31 ("mm: memcontrol: convert reclaim iterator to simple css refcounting")
      Signed-off-by: NVladimir Davydov <vdavydov@virtuozzo.com>
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: NMichal Hocko <mhocko@kernel.org>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: <stable@vger.kernel.org>	[3.19+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6df38689
  5. 28 12月, 2015 1 次提交
    • R
      cgroup: Fix uninitialized variable warning · eed67d75
      Ross Zwisler 提交于
      Commit 1f7dd3e5 ("cgroup: fix handling of multi-destination migration
      from subtree_control enabling") introduced the following compiler warning:
      
      mm/memcontrol.c: In function ‘mem_cgroup_can_attach’:
      mm/memcontrol.c:4790:9: warning: ‘memcg’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         mc.to = memcg;
               ^
      
      Fix this by initializing 'memcg' to NULL.
      
      This was found using gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6).
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      eed67d75
  6. 13 12月, 2015 2 次提交
  7. 03 12月, 2015 1 次提交
    • T
      cgroup: fix handling of multi-destination migration from subtree_control enabling · 1f7dd3e5
      Tejun Heo 提交于
      Consider the following v2 hierarchy.
      
        P0 (+memory) --- P1 (-memory) --- A
                                       \- B
             
      P0 has memory enabled in its subtree_control while P1 doesn't.  If
      both A and B contain processes, they would belong to the memory css of
      P1.  Now if memory is enabled on P1's subtree_control, memory csses
      should be created on both A and B and A's processes should be moved to
      the former and B's processes the latter.  IOW, enabling controllers
      can cause atomic migrations into different csses.
      
      The core cgroup migration logic has been updated accordingly but the
      controller migration methods haven't and still assume that all tasks
      migrate to a single target css; furthermore, the methods were fed the
      css in which subtree_control was updated which is the parent of the
      target csses.  pids controller depends on the migration methods to
      move charges and this made the controller attribute charges to the
      wrong csses often triggering the following warning by driving a
      counter negative.
      
       WARNING: CPU: 1 PID: 1 at kernel/cgroup_pids.c:97 pids_cancel.constprop.6+0x31/0x40()
       Modules linked in:
       CPU: 1 PID: 1 Comm: systemd Not tainted 4.4.0-rc1+ #29
       ...
        ffffffff81f65382 ffff88007c043b90 ffffffff81551ffc 0000000000000000
        ffff88007c043bc8 ffffffff810de202 ffff88007a752000 ffff88007a29ab00
        ffff88007c043c80 ffff88007a1d8400 0000000000000001 ffff88007c043bd8
       Call Trace:
        [<ffffffff81551ffc>] dump_stack+0x4e/0x82
        [<ffffffff810de202>] warn_slowpath_common+0x82/0xc0
        [<ffffffff810de2fa>] warn_slowpath_null+0x1a/0x20
        [<ffffffff8118e031>] pids_cancel.constprop.6+0x31/0x40
        [<ffffffff8118e0fd>] pids_can_attach+0x6d/0xf0
        [<ffffffff81188a4c>] cgroup_taskset_migrate+0x6c/0x330
        [<ffffffff81188e05>] cgroup_migrate+0xf5/0x190
        [<ffffffff81189016>] cgroup_attach_task+0x176/0x200
        [<ffffffff8118949d>] __cgroup_procs_write+0x2ad/0x460
        [<ffffffff81189684>] cgroup_procs_write+0x14/0x20
        [<ffffffff811854e5>] cgroup_file_write+0x35/0x1c0
        [<ffffffff812e26f1>] kernfs_fop_write+0x141/0x190
        [<ffffffff81265f88>] __vfs_write+0x28/0xe0
        [<ffffffff812666fc>] vfs_write+0xac/0x1a0
        [<ffffffff81267019>] SyS_write+0x49/0xb0
        [<ffffffff81bcef32>] entry_SYSCALL_64_fastpath+0x12/0x76
      
      This patch fixes the bug by removing @css parameter from the three
      migration methods, ->can_attach, ->cancel_attach() and ->attach() and
      updating cgroup_taskset iteration helpers also return the destination
      css in addition to the task being migrated.  All controllers are
      updated accordingly.
      
      * Controllers which don't care whether there are one or multiple
        target csses can be converted trivially.  cpu, io, freezer, perf,
        netclassid and netprio fall in this category.
      
      * cpuset's current implementation assumes that there's single source
        and destination and thus doesn't support v2 hierarchy already.  The
        only change made by this patchset is how that single destination css
        is obtained.
      
      * memory migration path already doesn't do anything on v2.  How the
        single destination css is obtained is updated and the prep stage of
        mem_cgroup_can_attach() is reordered to accomodate the change.
      
      * pids is the only controller which was affected by this bug.  It now
        correctly handles multi-destination migrations and no longer causes
        counter underflow from incorrect accounting.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-and-tested-by: NDaniel Wagner <daniel.wagner@bmw-carit.de>
      Cc: Aleksa Sarai <cyphar@cyphar.com>
      1f7dd3e5
  8. 07 11月, 2015 1 次提交