1. 19 10月, 2020 2 次提交
    • R
      mm: kmem: move memcg_kmem_bypass() calls to get_mem/obj_cgroup_from_current() · 279c3393
      Roman Gushchin 提交于
      Patch series "mm: kmem: kernel memory accounting in an interrupt context".
      
      This patchset implements memcg-based memory accounting of allocations made
      from an interrupt context.
      
      Historically, such allocations were passed unaccounted mostly because
      charging the memory cgroup of the current process wasn't an option.  Also
      performance reasons were likely a reason too.
      
      The remote charging API allows to temporarily overwrite the currently
      active memory cgroup, so that all memory allocations are accounted towards
      some specified memory cgroup instead of the memory cgroup of the current
      process.
      
      This patchset extends the remote charging API so that it can be used from
      an interrupt context.  Then it removes the fence that prevented the
      accounting of allocations made from an interrupt context.  It also
      contains a couple of optimizations/code refactorings.
      
      This patchset doesn't directly enable accounting for any specific
      allocations, but prepares the code base for it.  The bpf memory accounting
      will likely be the first user of it: a typical example is a bpf program
      parsing an incoming network packet, which allocates an entry in hashmap
      map to store some information.
      
      This patch (of 4):
      
      Currently memcg_kmem_bypass() is called before obtaining the current
      memory/obj cgroup using get_mem/obj_cgroup_from_current().  Moving
      memcg_kmem_bypass() into get_mem/obj_cgroup_from_current() reduces the
      number of call sites and allows further code simplifications.
      Signed-off-by: NRoman Gushchin <guro@fb.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: NShakeel Butt <shakeelb@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Link: http://lkml.kernel.org/r/20200827225843.1270629-1-guro@fb.com
      Link: http://lkml.kernel.org/r/20200827225843.1270629-2-guro@fb.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      279c3393
    • R
      mm, memcg: rework remote charging API to support nesting · b87d8cef
      Roman Gushchin 提交于
      Currently the remote memcg charging API consists of two functions:
      memalloc_use_memcg() and memalloc_unuse_memcg(), which set and clear the
      memcg value, which overwrites the memcg of the current task.
      
        memalloc_use_memcg(target_memcg);
        <...>
        memalloc_unuse_memcg();
      
      It works perfectly for allocations performed from a normal context,
      however an attempt to call it from an interrupt context or just nest two
      remote charging blocks will lead to an incorrect accounting.  On exit from
      the inner block the active memcg will be cleared instead of being
      restored.
      
        memalloc_use_memcg(target_memcg);
      
        memalloc_use_memcg(target_memcg_2);
          <...>
          memalloc_unuse_memcg();
      
          Error: allocation here are charged to the memcg of the current
          process instead of target_memcg.
      
        memalloc_unuse_memcg();
      
      This patch extends the remote charging API by switching to a single
      function: struct mem_cgroup *set_active_memcg(struct mem_cgroup *memcg),
      which sets the new value and returns the old one.  So a remote charging
      block will look like:
      
        old_memcg = set_active_memcg(target_memcg);
        <...>
        set_active_memcg(old_memcg);
      
      This patch is heavily based on the patch by Johannes Weiner, which can be
      found here: https://lkml.org/lkml/2020/5/28/806 .
      Signed-off-by: NRoman Gushchin <guro@fb.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: NShakeel Butt <shakeelb@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Dan Schatzberg <dschatzberg@fb.com>
      Link: https://lkml.kernel.org/r/20200821212056.3769116-1-guro@fb.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b87d8cef
  2. 14 10月, 2020 11 次提交
  3. 27 9月, 2020 1 次提交
  4. 25 9月, 2020 1 次提交
  5. 06 9月, 2020 1 次提交
    • M
      memcg: fix use-after-free in uncharge_batch · f1796544
      Michal Hocko 提交于
      syzbot has reported an use-after-free in the uncharge_batch path
      
        BUG: KASAN: use-after-free in instrument_atomic_write include/linux/instrumented.h:71 [inline]
        BUG: KASAN: use-after-free in atomic64_sub_return include/asm-generic/atomic-instrumented.h:970 [inline]
        BUG: KASAN: use-after-free in atomic_long_sub_return include/asm-generic/atomic-long.h:113 [inline]
        BUG: KASAN: use-after-free in page_counter_cancel mm/page_counter.c:54 [inline]
        BUG: KASAN: use-after-free in page_counter_uncharge+0x3d/0xc0 mm/page_counter.c:155
        Write of size 8 at addr ffff8880371c0148 by task syz-executor.0/9304
      
        CPU: 0 PID: 9304 Comm: syz-executor.0 Not tainted 5.8.0-syzkaller #0
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        Call Trace:
          __dump_stack lib/dump_stack.c:77 [inline]
          dump_stack+0x1f0/0x31e lib/dump_stack.c:118
          print_address_description+0x66/0x620 mm/kasan/report.c:383
          __kasan_report mm/kasan/report.c:513 [inline]
          kasan_report+0x132/0x1d0 mm/kasan/report.c:530
          check_memory_region_inline mm/kasan/generic.c:183 [inline]
          check_memory_region+0x2b5/0x2f0 mm/kasan/generic.c:192
          instrument_atomic_write include/linux/instrumented.h:71 [inline]
          atomic64_sub_return include/asm-generic/atomic-instrumented.h:970 [inline]
          atomic_long_sub_return include/asm-generic/atomic-long.h:113 [inline]
          page_counter_cancel mm/page_counter.c:54 [inline]
          page_counter_uncharge+0x3d/0xc0 mm/page_counter.c:155
          uncharge_batch+0x6c/0x350 mm/memcontrol.c:6764
          uncharge_page+0x115/0x430 mm/memcontrol.c:6796
          uncharge_list mm/memcontrol.c:6835 [inline]
          mem_cgroup_uncharge_list+0x70/0xe0 mm/memcontrol.c:6877
          release_pages+0x13a2/0x1550 mm/swap.c:911
          tlb_batch_pages_flush mm/mmu_gather.c:49 [inline]
          tlb_flush_mmu_free mm/mmu_gather.c:242 [inline]
          tlb_flush_mmu+0x780/0x910 mm/mmu_gather.c:249
          tlb_finish_mmu+0xcb/0x200 mm/mmu_gather.c:328
          exit_mmap+0x296/0x550 mm/mmap.c:3185
          __mmput+0x113/0x370 kernel/fork.c:1076
          exit_mm+0x4cd/0x550 kernel/exit.c:483
          do_exit+0x576/0x1f20 kernel/exit.c:793
          do_group_exit+0x161/0x2d0 kernel/exit.c:903
          get_signal+0x139b/0x1d30 kernel/signal.c:2743
          arch_do_signal+0x33/0x610 arch/x86/kernel/signal.c:811
          exit_to_user_mode_loop kernel/entry/common.c:135 [inline]
          exit_to_user_mode_prepare+0x8d/0x1b0 kernel/entry/common.c:166
          syscall_exit_to_user_mode+0x5e/0x1a0 kernel/entry/common.c:241
          entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Commit 1a3e1f40 ("mm: memcontrol: decouple reference counting from
      page accounting") reworked the memcg lifetime to be bound the the struct
      page rather than charges.  It also removed the css_put_many from
      uncharge_batch and that is causing the above splat.
      
      uncharge_batch() is supposed to uncharge accumulated charges for all
      pages freed from the same memcg.  The queuing is done by uncharge_page
      which however drops the memcg reference after it adds charges to the
      batch.  If the current page happens to be the last one holding the
      reference for its memcg then the memcg is OK to go and the next page to
      be freed will trigger batched uncharge which needs to access the memcg
      which is gone already.
      
      Fix the issue by taking a reference for the memcg in the current batch.
      
      Fixes: 1a3e1f40 ("mm: memcontrol: decouple reference counting from page accounting")
      Reported-by: syzbot+b305848212deec86eabe@syzkaller.appspotmail.com
      Reported-by: syzbot+b5ea6fb6f139c8b9482b@syzkaller.appspotmail.com
      Signed-off-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: NShakeel Butt <shakeelb@google.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Hugh Dickins <hughd@google.com>
      Link: https://lkml.kernel.org/r/20200820090341.GC5033@dhcp22.suse.czSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f1796544
  6. 15 8月, 2020 1 次提交
  7. 14 8月, 2020 1 次提交
    • J
      mm: memcontrol: fix warning when allocating the root cgroup · 9f457179
      Johannes Weiner 提交于
      Commit 3e38e0aa ("mm: memcg: charge memcg percpu memory to the
      parent cgroup") adds memory tracking to the memcg kernel structures
      themselves to make cgroups liable for the memory they are consuming
      through the allocation of child groups (which can be significant).
      
      This code is a bit awkward as it's spread out through several functions:
      The outermost function does memalloc_use_memcg(parent) to set up
      current->active_memcg, which designates which cgroup to charge, and the
      inner functions pass GFP_ACCOUNT to request charging for specific
      allocations.  To make sure this dependency is satisfied at all times -
      to make sure we don't randomly charge whoever is calling the functions -
      the inner functions warn on !current->active_memcg.
      
      However, this triggers a false warning when the root memcg itself is
      allocated.  No parent exists in this case, and so current->active_memcg
      is rightfully NULL.  It's a false positive, not indicative of a bug.
      
      Delete the warnings for now, we can revisit this later.
      
      Fixes: 3e38e0aa ("mm: memcg: charge memcg percpu memory to the parent cgroup")
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: NRoman Gushchin <guro@fb.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9f457179
  8. 13 8月, 2020 4 次提交
  9. 08 8月, 2020 18 次提交