1. 16 12月, 2020 4 次提交
  2. 15 11月, 2020 1 次提交
    • L
      mm/slub: fix panic in slab_alloc_node() · 22e4663e
      Laurent Dufour 提交于
      While doing memory hot-unplug operation on a PowerPC VM running 1024 CPUs
      with 11TB of ram, I hit the following panic:
      
          BUG: Kernel NULL pointer dereference on read at 0x00000007
          Faulting instruction address: 0xc000000000456048
          Oops: Kernel access of bad area, sig: 11 [#2]
          LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS= 2048 NUMA pSeries
          Modules linked in: rpadlpar_io rpaphp
          CPU: 160 PID: 1 Comm: systemd Tainted: G      D           5.9.0 #1
          NIP:  c000000000456048 LR: c000000000455fd4 CTR: c00000000047b350
          REGS: c00006028d1b77a0 TRAP: 0300   Tainted: G      D            (5.9.0)
          MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24004228  XER: 00000000
          CFAR: c00000000000f1b0 DAR: 0000000000000007 DSISR: 40000000 IRQMASK: 0
          GPR00: c000000000455fd4 c00006028d1b7a30 c000000001bec800 0000000000000000
          GPR04: 0000000000000dc0 0000000000000000 00000000000374ef c00007c53df99320
          GPR08: 000007c53c980000 0000000000000000 000007c53c980000 0000000000000000
          GPR12: 0000000000004400 c00000001e8e4400 0000000000000000 0000000000000f6a
          GPR16: 0000000000000000 c000000001c25930 c000000001d62528 00000000000000c1
          GPR20: c000000001d62538 c00006be469e9000 0000000fffffffe0 c0000000003c0ff8
          GPR24: 0000000000000018 0000000000000000 0000000000000dc0 0000000000000000
          GPR28: c00007c513755700 c000000001c236a4 c00007bc4001f800 0000000000000001
          NIP [c000000000456048] __kmalloc_node+0x108/0x790
          LR [c000000000455fd4] __kmalloc_node+0x94/0x790
          Call Trace:
            kvmalloc_node+0x58/0x110
            mem_cgroup_css_online+0x10c/0x270
            online_css+0x48/0xd0
            cgroup_apply_control_enable+0x2c4/0x470
            cgroup_mkdir+0x408/0x5f0
            kernfs_iop_mkdir+0x90/0x100
            vfs_mkdir+0x138/0x250
            do_mkdirat+0x154/0x1c0
            system_call_exception+0xf8/0x200
            system_call_common+0xf0/0x27c
          Instruction dump:
          e93e0000 e90d0030 39290008 7cc9402a e94d0030 e93e0000 7ce95214 7f89502a
          2fbc0000 419e0018 41920230 e9270010 <89290007> 7f994800 419e0220 7ee6bb78
      
      This pointing to the following code:
      
          mm/slub.c:2851
                  if (unlikely(!object || !node_match(page, node))) {
          c000000000456038:       00 00 bc 2f     cmpdi   cr7,r28,0
          c00000000045603c:       18 00 9e 41     beq     cr7,c000000000456054 <__kmalloc_node+0x114>
          node_match():
          mm/slub.c:2491
                  if (node != NUMA_NO_NODE && page_to_nid(page) != node)
          c000000000456040:       30 02 92 41     beq     cr4,c000000000456270 <__kmalloc_node+0x330>
          page_to_nid():
          include/linux/mm.h:1294
          c000000000456044:       10 00 27 e9     ld      r9,16(r7)
          c000000000456048:       07 00 29 89     lbz     r9,7(r9)	<<<< r9 = NULL
          node_match():
          mm/slub.c:2491
          c00000000045604c:       00 48 99 7f     cmpw    cr7,r25,r9
          c000000000456050:       20 02 9e 41     beq     cr7,c000000000456270 <__kmalloc_node+0x330>
      
      The panic occurred in slab_alloc_node() when checking for the page's node:
      
      	object = c->freelist;
      	page = c->page;
      	if (unlikely(!object || !node_match(page, node))) {
      		object = __slab_alloc(s, gfpflags, node, addr, c);
      		stat(s, ALLOC_SLOWPATH);
      
      The issue is that object is not NULL while page is NULL which is odd but
      may happen if the cache flush happened after loading object but before
      loading page.  Thus checking for the page pointer is required too.
      
      The cache flush is done through an inter processor interrupt when a
      piece of memory is off-lined.  That interrupt is triggered when a memory
      hot-unplug operation is initiated and offline_pages() is calling the
      slub's MEM_GOING_OFFLINE callback slab_mem_going_offline_callback()
      which is calling flush_cpu_slab().  If that interrupt is caught between
      the reading of c->freelist and the reading of c->page, this could lead
      to such a situation.  That situation is expected and the later call to
      this_cpu_cmpxchg_double() will detect the change to c->freelist and redo
      the whole operation.
      
      In commit 6159d0f5 ("mm/slub.c: page is always non-NULL in
      node_match()") check on the page pointer has been removed assuming that
      page is always valid when it is called.  It happens that this is not
      true in that particular case, so check for page before calling
      node_match() here.
      
      Fixes: 6159d0f5 ("mm/slub.c: page is always non-NULL in node_match()")
      Signed-off-by: NLaurent Dufour <ldufour@linux.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Acked-by: NChristoph Lameter <cl@linux.com>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Nathan Lynch <nathanl@linux.ibm.com>
      Cc: Scott Cheloha <cheloha@linux.ibm.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20201027190406.33283-1-ldufour@linux.ibm.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      22e4663e
  3. 17 10月, 2020 1 次提交
  4. 14 10月, 2020 4 次提交
  5. 04 10月, 2020 1 次提交
  6. 06 9月, 2020 1 次提交
  7. 08 8月, 2020 21 次提交
  8. 17 7月, 2020 1 次提交
  9. 26 6月, 2020 1 次提交
  10. 18 6月, 2020 1 次提交
  11. 05 6月, 2020 1 次提交
  12. 04 6月, 2020 2 次提交
    • J
      mm/page_alloc: integrate classzone_idx and high_zoneidx · 97a225e6
      Joonsoo Kim 提交于
      classzone_idx is just different name for high_zoneidx now.  So, integrate
      them and add some comment to struct alloc_context in order to reduce
      future confusion about the meaning of this variable.
      
      The accessor, ac_classzone_idx() is also removed since it isn't needed
      after integration.
      
      In addition to integration, this patch also renames high_zoneidx to
      highest_zoneidx since it represents more precise meaning.
      Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: NBaoquan He <bhe@redhat.com>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Ye Xiaolong <xiaolong.ye@intel.com>
      Link: http://lkml.kernel.org/r/1587095923-7515-3-git-send-email-iamjoonsoo.kim@lge.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      97a225e6
    • W
      mm/slub: fix a memory leak in sysfs_slab_add() · dde3c6b7
      Wang Hai 提交于
      syzkaller reports for memory leak when kobject_init_and_add() returns an
      error in the function sysfs_slab_add() [1]
      
      When this happened, the function kobject_put() is not called for the
      corresponding kobject, which potentially leads to memory leak.
      
      This patch fixes the issue by calling kobject_put() even if
      kobject_init_and_add() fails.
      
      [1]
        BUG: memory leak
        unreferenced object 0xffff8880a6d4be88 (size 8):
        comm "syz-executor.3", pid 946, jiffies 4295772514 (age 18.396s)
        hex dump (first 8 bytes):
          70 69 64 5f 33 00 ff ff                          pid_3...
        backtrace:
           kstrdup+0x35/0x70 mm/util.c:60
           kstrdup_const+0x3d/0x50 mm/util.c:82
           kvasprintf_const+0x112/0x170 lib/kasprintf.c:48
           kobject_set_name_vargs+0x55/0x130 lib/kobject.c:289
           kobject_add_varg lib/kobject.c:384 [inline]
           kobject_init_and_add+0xd8/0x170 lib/kobject.c:473
           sysfs_slab_add+0x1d8/0x290 mm/slub.c:5811
           __kmem_cache_create+0x50a/0x570 mm/slub.c:4384
           create_cache+0x113/0x1e0 mm/slab_common.c:407
           kmem_cache_create_usercopy+0x1a1/0x260 mm/slab_common.c:505
           kmem_cache_create+0xd/0x10 mm/slab_common.c:564
           create_pid_cachep kernel/pid_namespace.c:54 [inline]
           create_pid_namespace kernel/pid_namespace.c:96 [inline]
           copy_pid_ns+0x77c/0x8f0 kernel/pid_namespace.c:148
           create_new_namespaces+0x26b/0xa30 kernel/nsproxy.c:95
           unshare_nsproxy_namespaces+0xa7/0x1e0 kernel/nsproxy.c:229
           ksys_unshare+0x3d2/0x770 kernel/fork.c:2969
           __do_sys_unshare kernel/fork.c:3037 [inline]
           __se_sys_unshare kernel/fork.c:3035 [inline]
           __x64_sys_unshare+0x2d/0x40 kernel/fork.c:3035
           do_syscall_64+0xa1/0x530 arch/x86/entry/common.c:295
      
      Fixes: 80da026a ("mm/slub: fix slab double-free in case of duplicate sysfs filename")
      Reported-by: NHulk Robot <hulkci@huawei.com>
      Signed-off-by: NWang Hai <wanghai38@huawei.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Link: http://lkml.kernel.org/r/20200602115033.1054-1-wanghai38@huawei.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dde3c6b7
  13. 03 6月, 2020 1 次提交
    • Q
      mm/slub: fix stack overruns with SLUB_STATS · a68ee057
      Qian Cai 提交于
      There is no need to copy SLUB_STATS items from root memcg cache to new
      memcg cache copies.  Doing so could result in stack overruns because the
      store function only accepts 0 to clear the stat and returns an error for
      everything else while the show method would print out the whole stat.
      
      Then, the mismatch of the lengths returns from show and store methods
      happens in memcg_propagate_slab_attrs():
      
      	else if (root_cache->max_attr_size < ARRAY_SIZE(mbuf))
      		buf = mbuf;
      
      max_attr_size is only 2 from slab_attr_store(), then, it uses mbuf[64]
      in show_stat() later where a bounch of sprintf() would overrun the stack
      variable.  Fix it by always allocating a page of buffer to be used in
      show_stat() if SLUB_STATS=y which should only be used for debug purpose.
      
        # echo 1 > /sys/kernel/slab/fs_cache/shrink
        BUG: KASAN: stack-out-of-bounds in number+0x421/0x6e0
        Write of size 1 at addr ffffc900256cfde0 by task kworker/76:0/53251
      
        Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
        Workqueue: memcg_kmem_cache memcg_kmem_cache_create_func
        Call Trace:
          number+0x421/0x6e0
          vsnprintf+0x451/0x8e0
          sprintf+0x9e/0xd0
          show_stat+0x124/0x1d0
          alloc_slowpath_show+0x13/0x20
          __kmem_cache_create+0x47a/0x6b0
      
        addr ffffc900256cfde0 is located in stack of task kworker/76:0/53251 at offset 0 in frame:
         process_one_work+0x0/0xb90
      
        this frame has 1 object:
         [32, 72) 'lockdep_map'
      
        Memory state around the buggy address:
         ffffc900256cfc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
         ffffc900256cfd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        >ffffc900256cfd80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
                                                               ^
         ffffc900256cfe00: 00 00 00 00 00 f2 f2 f2 00 00 00 00 00 00 00 00
         ffffc900256cfe80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ==================================================================
        Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: __kmem_cache_create+0x6ac/0x6b0
        Workqueue: memcg_kmem_cache memcg_kmem_cache_create_func
        Call Trace:
          __kmem_cache_create+0x6ac/0x6b0
      
      Fixes: 107dab5c ("slub: slub-specific propagation changes")
      Signed-off-by: NQian Cai <cai@lca.pw>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Glauber Costa <glauber@scylladb.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Link: http://lkml.kernel.org/r/20200429222356.4322-1-cai@lca.pwSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a68ee057