1. 20 5月, 2016 5 次提交
    • M
      oom, oom_reaper: try to reap tasks which skip regular OOM killer path · 3ef22dff
      Michal Hocko 提交于
      If either the current task is already killed or PF_EXITING or a selected
      task is PF_EXITING then the oom killer is suppressed and so is the oom
      reaper.  This patch adds try_oom_reaper which checks the given task and
      queues it for the oom reaper if that is safe to be done meaning that the
      task doesn't share the mm with an alive process.
      
      This might help to release the memory pressure while the task tries to
      exit.
      
      [akpm@linux-foundation.org: fix nommu build]
      Signed-off-by: NMichal Hocko <mhocko@suse.com>
      Cc: Raushaniya Maksudova <rmaksudova@parallels.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3ef22dff
    • H
      mm: update_lru_size do the __mod_zone_page_state · 9d5e6a9f
      Hugh Dickins 提交于
      Konstantin Khlebnikov pointed out (nearly four years ago, when lumpy
      reclaim was removed) that lru_size can be updated by -nr_taken once per
      call to isolate_lru_pages(), instead of page by page.
      
      Update it inside isolate_lru_pages(), or at its two callsites? I chose
      to update it at the callsites, rearranging and grouping the updates by
      nr_taken and nr_scanned together in both.
      
      With one exception, mem_cgroup_update_lru_size(,lru,) is then used where
      __mod_zone_page_state(,NR_LRU_BASE+lru,) is used; and we shall be adding
      some more calls in a future commit.  Make the code a little smaller and
      simpler by incorporating stat update in lru_size update.
      
      The exception was move_active_pages_to_lru(), which aggregated the
      pgmoved stat update separately from the individual lru_size updates; but
      I still think this a simplification worth making.
      
      However, the __mod_zone_page_state is not peculiar to mem_cgroups: so
      better use the name update_lru_size, calls mem_cgroup_update_lru_size
      when CONFIG_MEMCG.
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andres Lagar-Cavilla <andreslc@google.com>
      Cc: Yang Shi <yang.shi@linaro.org>
      Cc: Ning Qu <quning@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9d5e6a9f
    • H
      mm: update_lru_size warn and reset bad lru_size · ca707239
      Hugh Dickins 提交于
      Though debug kernels have a VM_BUG_ON to help protect from misaccounting
      lru_size, non-debug kernels are liable to wrap it around: and then the
      vast unsigned long size draws page reclaim into a loop of repeatedly
      doing nothing on an empty list, without even a cond_resched().
      
      That soft lockup looks confusingly like an over-busy reclaim scenario,
      with lots of contention on the lru_lock in shrink_inactive_list(): yet
      has a totally different origin.
      
      Help differentiate with a custom warning in
      mem_cgroup_update_lru_size(), even in non-debug kernels; and reset the
      size to avoid the lockup.  But the particular bug which suggested this
      change was mine alone, and since fixed.
      
      Make it a WARN_ONCE: the first occurrence is the most informative, a
      flurry may follow, yet even when rate-limited little more is learnt.
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andres Lagar-Cavilla <andreslc@google.com>
      Cc: Yang Shi <yang.shi@linaro.org>
      Cc: Ning Qu <quning@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Andres Lagar-Cavilla <andreslc@google.com>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ca707239
    • M
      mm/memcontrol.c:mem_cgroup_select_victim_node(): clarify comment · fda3d69b
      Michal Hocko 提交于
      > The comment seems to have not much to do with the code?
      
      I guess the comment tries to say that the code path is triggered when we
      charge the page which happens _before_ it is added to the LRU list and
      so last_scanned_node might contain the stale data.
      
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fda3d69b
    • A
      include/linux/nodemask.h: create next_node_in() helper · 0edaf86c
      Andrew Morton 提交于
      Lots of code does
      
      	node = next_node(node, XXX);
      	if (node == MAX_NUMNODES)
      		node = first_node(XXX);
      
      so create next_node_in() to do this and use it in various places.
      
      [mhocko@suse.com: use next_node_in() helper]
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Acked-by: NMichal Hocko <mhocko@kernel.org>
      Signed-off-by: NMichal Hocko <mhocko@suse.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Laura Abbott <lauraa@codeaurora.org>
      Cc: Hui Zhu <zhuhui@xiaomi.com>
      Cc: Wang Xiaoqiang <wangxq10@lzu.edu.cn>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0edaf86c
  2. 26 4月, 2016 1 次提交
    • T
      memcg: relocate charge moving from ->attach to ->post_attach · 264a0ae1
      Tejun Heo 提交于
      Hello,
      
      So, this ended up a lot simpler than I originally expected.  I tested
      it lightly and it seems to work fine.  Petr, can you please test these
      two patches w/o the lru drain drop patch and see whether the problem
      is gone?
      
      Thanks.
      ------ 8< ------
      If charge moving is used, memcg performs relabeling of the affected
      pages from its ->attach callback which is called under both
      cgroup_threadgroup_rwsem and thus can't create new kthreads.  This is
      fragile as various operations may depend on workqueues making forward
      progress which relies on the ability to create new kthreads.
      
      There's no reason to perform charge moving from ->attach which is deep
      in the task migration path.  Move it to ->post_attach which is called
      after the actual migration is finished and cgroup_threadgroup_rwsem is
      dropped.
      
      * move_charge_struct->mm is added and ->can_attach is now responsible
        for pinning and recording the target mm.  mem_cgroup_clear_mc() is
        updated accordingly.  This also simplifies mem_cgroup_move_task().
      
      * mem_cgroup_move_task() is now called from ->post_attach instead of
        ->attach.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: NMichal Hocko <mhocko@kernel.org>
      Debugged-and-tested-by: NPetr Mladek <pmladek@suse.com>
      Reported-by: NCyril Hrubis <chrubis@suse.cz>
      Reported-by: NJohannes Weiner <hannes@cmpxchg.org>
      Fixes: 1ed13287 ("sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem")
      Cc: <stable@vger.kernel.org> # 4.4+
      264a0ae1
  3. 18 3月, 2016 12 次提交
  4. 16 3月, 2016 5 次提交
  5. 22 1月, 2016 1 次提交
  6. 21 1月, 2016 16 次提交