1. 11 Nov, 2022 16 commits
  2. 10 Nov, 2022 2 commits
    • page_alloc: fix invalid watermark check on a negative value · 48bc7241
      Jaewon Kim committed
      stable inclusion
      from stable-v5.10.135
      commit 2670f76a563124478d0d14e603b38b73b99c389c
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZWFM
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=2670f76a563124478d0d14e603b38b73b99c389c
      
      --------------------------------
      
      commit 9282012f upstream.
      
      There was a report that a task is waiting at the
      throttle_direct_reclaim. The pgscan_direct_throttle in vmstat was
      increasing.
      
      This is a bug where zone_watermark_fast returns true even when the free
      page count is very low. The commit f27ce0e1 ("page_alloc: consider highatomic
      reserve in watermark fast") changed the watermark fast path to consider the
      highatomic reserve. But it did not handle the negative value case, which
      can happen when the reserved_highatomic pageblock is bigger than the
      actual free page count.
      
      If the watermark is considered ok for the negative value, allocating
      contexts for order-0 will consume all free pages without direct reclaim,
      and eventually free pages may become depleted except for highatomic free pages.
      
      Allocating contexts may then fall into throttle_direct_reclaim. This
      symptom can easily occur on a system where wmark min is low and other
      reclaimers such as kswapd do not free pages quickly.
      
      Handle the negative case by using MIN.
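The clamping idea can be illustrated with a small userspace model. This is only a sketch, not the kernel code: the function names are hypothetical, and it assumes the negative value slips through because the comparison against the watermark is done in unsigned arithmetic, which the message itself does not state.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the check before the fix: the subtraction can go
 * negative, and converting that to unsigned yields a huge value.
 */
static bool fast_ok_unclamped(long free_pages, long reserved,
			      unsigned long threshold)
{
	long usable = free_pages - reserved;	/* may be negative */

	return (unsigned long)usable > threshold;
}

/*
 * Hypothetical model of the check after the fix: subtract at most
 * `usable`, so the result never drops below zero.
 */
static bool fast_ok_clamped(unsigned long free_pages, unsigned long reserved,
			    unsigned long threshold)
{
	unsigned long usable = free_pages;

	usable -= (usable < reserved) ? usable : reserved;	/* min() */

	return usable > threshold;
}
```

With 10 free pages, 100 reserved pages and a threshold of 50, the unclamped model reports the watermark as ok, while the clamped model does not.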
      
      Link: https://lkml.kernel.org/r/20220725095212.25388-1-jaewon31.kim@samsung.com
      Fixes: f27ce0e1 ("page_alloc: consider highatomic reserve in watermark fast")
      Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
      Reported-by: GyeongHwan Hong <gh21.hong@samsung.com>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Yong-Taek Lee <ytk.lee@samsung.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      Reviewed-by: Wei Li <liwei391@huawei.com>
    • mm/mempolicy: fix uninit-value in mpol_rebind_policy() · 7057a3c7
      Wang Cheng committed
      stable inclusion
      from stable-v5.10.134
      commit ddb3f0b68863bd1c5f43177eea476bce316d4993
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5ZVR7
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ddb3f0b68863bd1c5f43177eea476bce316d4993
      
      --------------------------------
      
      commit 018160ad upstream.
      
      mpol_set_nodemask() (mm/mempolicy.c) does not set up the nodemask when
      pol->mode is MPOL_LOCAL.  Check pol->mode before accessing
      pol->w.cpuset_mems_allowed in mpol_rebind_policy() (mm/mempolicy.c).
      
      BUG: KMSAN: uninit-value in mpol_rebind_policy mm/mempolicy.c:352 [inline]
      BUG: KMSAN: uninit-value in mpol_rebind_task+0x2ac/0x2c0 mm/mempolicy.c:368
       mpol_rebind_policy mm/mempolicy.c:352 [inline]
       mpol_rebind_task+0x2ac/0x2c0 mm/mempolicy.c:368
       cpuset_change_task_nodemask kernel/cgroup/cpuset.c:1711 [inline]
       cpuset_attach+0x787/0x15e0 kernel/cgroup/cpuset.c:2278
       cgroup_migrate_execute+0x1023/0x1d20 kernel/cgroup/cgroup.c:2515
       cgroup_migrate kernel/cgroup/cgroup.c:2771 [inline]
       cgroup_attach_task+0x540/0x8b0 kernel/cgroup/cgroup.c:2804
       __cgroup1_procs_write+0x5cc/0x7a0 kernel/cgroup/cgroup-v1.c:520
       cgroup1_tasks_write+0x94/0xb0 kernel/cgroup/cgroup-v1.c:539
       cgroup_file_write+0x4c2/0x9e0 kernel/cgroup/cgroup.c:3852
       kernfs_fop_write_iter+0x66a/0x9f0 fs/kernfs/file.c:296
       call_write_iter include/linux/fs.h:2162 [inline]
       new_sync_write fs/read_write.c:503 [inline]
       vfs_write+0x1318/0x2030 fs/read_write.c:590
       ksys_write+0x28b/0x510 fs/read_write.c:643
       __do_sys_write fs/read_write.c:655 [inline]
       __se_sys_write fs/read_write.c:652 [inline]
       __x64_sys_write+0xdb/0x120 fs/read_write.c:652
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Uninit was created at:
       slab_post_alloc_hook mm/slab.h:524 [inline]
       slab_alloc_node mm/slub.c:3251 [inline]
       slab_alloc mm/slub.c:3259 [inline]
       kmem_cache_alloc+0x902/0x11c0 mm/slub.c:3264
       mpol_new mm/mempolicy.c:293 [inline]
       do_set_mempolicy+0x421/0xb70 mm/mempolicy.c:853
       kernel_set_mempolicy mm/mempolicy.c:1504 [inline]
       __do_sys_set_mempolicy mm/mempolicy.c:1510 [inline]
       __se_sys_set_mempolicy+0x44c/0xb60 mm/mempolicy.c:1507
       __x64_sys_set_mempolicy+0xd8/0x110 mm/mempolicy.c:1507
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      KMSAN: uninit-value in mpol_rebind_task (2)
      https://syzkaller.appspot.com/bug?id=d6eb90f952c2a5de9ea718a1b873c55cb13b59dc
      
      This patch also seems to fix the bug below.
      KMSAN: uninit-value in mpol_rebind_mm (2)
      https://syzkaller.appspot.com/bug?id=f2fecd0d7013f54ec4162f60743a2b28df40926b
      
      The uninit-value is pol->w.cpuset_mems_allowed in mpol_rebind_policy().
      When syzkaller reproducer runs to the beginning of mpol_new(),
      
      	    mpol_new() mm/mempolicy.c
      	  do_mbind() mm/mempolicy.c
      	kernel_mbind() mm/mempolicy.c
      
      `mode` is 1(MPOL_PREFERRED), nodes_empty(*nodes) is `true` and `flags`
      is 0. Then
      
      	mode = MPOL_LOCAL;
      	...
      	policy->mode = mode;
      	policy->flags = flags;
      
      will be executed. So in mpol_set_nodemask(),
      
      	    mpol_set_nodemask() mm/mempolicy.c
      	  do_mbind()
      	kernel_mbind()
      
      pol->mode is 4 (MPOL_LOCAL), so `nodemask` in `pol` is not initialized,
      yet it will be accessed in mpol_rebind_policy().
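The early mode check described above can be sketched in userspace as follows. The struct, the `SK_` names and the single-word mask are hypothetical simplifications for illustration; only the numeric values 1 (MPOL_PREFERRED) and 4 (MPOL_LOCAL) come from the message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the policy modes; 1 and 4 match the message. */
enum sk_mpol_mode {
	SK_MPOL_DEFAULT,
	SK_MPOL_PREFERRED,
	SK_MPOL_BIND,
	SK_MPOL_INTERLEAVE,
	SK_MPOL_LOCAL,
};

/* Hypothetical, simplified policy: the mask is never set for SK_MPOL_LOCAL. */
struct sk_policy {
	int mode;
	unsigned long cpuset_mems_allowed;
};

/* Returns true if the policy was rebound to newmask. */
static bool sk_rebind_policy(struct sk_policy *pol, unsigned long newmask)
{
	/* Bail out before reading the mask of a missing or local policy. */
	if (pol == NULL || pol->mode == SK_MPOL_LOCAL)
		return false;

	if (pol->cpuset_mems_allowed == newmask)
		return false;

	pol->cpuset_mems_allowed = newmask;
	return true;
}
```

The point of the ordering is that the mode is tested first, so the uninitialized mask of a local policy is never read.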
      
      Link: https://lkml.kernel.org/r/20220512123428.fq3wofedp6oiotd4@ppc.localdomain
      Signed-off-by: Wang Cheng <wanngchenng@gmail.com>
      Reported-by: <syzbot+217f792c92599518a2ab@syzkaller.appspotmail.com>
      Tested-by: <syzbot+217f792c92599518a2ab@syzkaller.appspotmail.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      Reviewed-by: Wei Li <liwei391@huawei.com>
  3. 09 Nov, 2022 22 commits
    • mm/sharepool: fix the incorrect judgement of the addr range · b1e17d35
      Zhou Guanghui committed
      hulk inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5XQS4
      CVE: NA
      
      --------------------------------
      
      The address range of dvpp is [start, start + size), so the value
      start + size itself lies outside the address range.
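A half-open range check of this kind can be sketched as below. The function name is hypothetical and this is not the sharepool code; it only shows that the end address is excluded.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: true if addr lies in [start, start + size).
 * Comparing addr - start against size avoids computing start + size,
 * which could overflow near the top of the address space.
 */
static bool sk_in_half_open_range(unsigned long addr, unsigned long start,
				  unsigned long size)
{
	return addr >= start && addr - start < size;
}
```

For a range starting at 0x1000 with size 0x1000, 0x1fff is inside and 0x2000 is not.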
      Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
    • mm/sharepool: Fix sharepool hugepage cgroup uncount error. · 107e2b7c
      Guo Mengqi committed
      hulk inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5XQS4
      CVE: NA
      
      --------------------------------
      
      If PF_MEMALLOC is set in current->flags, the memory cgroup does not check
      current's allocations against the memory usage limit, which can cause the
      system to run out of memory.
      
      According to
      https://lkml.indiana.edu/hypermail/linux/kernel/0911.2/00576.html,
      PF_MEMALLOC shall only be used when more memory is sure to be freed as a
      result of this allocation.
      
      Do not use PF_MEMALLOC; instead, remove __GFP_RECLAIM from gfp_mask to
      ensure no reclaim.
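Clearing the reclaim bits from an allocation mask can be sketched as follows. The `SK_` flag values are hypothetical placeholders, not the kernel's real bit assignments; the only property relied on is that the combined reclaim mask is the OR of the direct-reclaim and kswapd-reclaim bits.

```c
/* Hypothetical bit values, chosen for illustration only. */
#define SK_GFP_DIRECT_RECLAIM	0x400u
#define SK_GFP_KSWAPD_RECLAIM	0x800u
#define SK_GFP_RECLAIM		(SK_GFP_DIRECT_RECLAIM | SK_GFP_KSWAPD_RECLAIM)

/* Returns gfp_mask with both reclaim bits cleared, other bits untouched. */
static unsigned int sk_gfp_without_reclaim(unsigned int gfp_mask)
{
	return gfp_mask & ~SK_GFP_RECLAIM;
}
```

Unlike setting a per-task flag, this only affects the one allocation that uses the modified mask.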
      Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>
    • mm/sharepool: Rebind the numa node when fallback to normal pages · 1343dd93
      Wang Wensheng committed
      hulk inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5XQS4
      CVE: NA
      
      --------------------------------
      
      When we allocate memory using SP_HUGEPAGE, we try normal pages when
      there are not enough hugepages. The specified numa node information
      gets lost when we fall back to normal pages. As a result, we could
      allocate memory from a numa node other than the one we specified.
      
      The solution is to rebind the node before retrying.
      Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
    • mm/sharepool: Remove the leading double underlines for function name · 95618625
      Zhang Zekun committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5XQS4
      CVE: NA
      
      ----------------------------------------------
      
      Rename __insert_sp_area to insert_sp_area.
      Rename __find_sp_area_locked to find_sp_area_locked.
      Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
    • mm/sharepool: Fix code-style warnings · c3c8461e
      Zhang Zekun committed
      hulk inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5XQS4
      CVE: NA
      
      -----------------------------------------
      
      1. Remove the inline keyword before sp_mapping_find().
      2. Do not declare or define reserved identifiers.
      3. Add braces in if and else/else if statements.
      4. The pointer operator (*) must have a space either before or after it.
      5. Use parentheses to make the order of evaluation explicit in
         sp_remap_kva_to_vma(), sp_node_id(), init_local_group().
      6. In addition, rename __find_sp_area() to get_sp_area() to show
         that this function need not be called with the lock held
         and to imply that it increases the use_count.
      Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
    • mm/sharepool: fix hugepage_rsvd count increase error · bec70574
      Guo Mengqi committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5RO2H
      CVE: NA
      
      --------------------------------
      
      When nr_hugepages is configured, sharepool allocates hugepages first
      from the hugetlb pool, then from the buddy system once the pool has been
      used up. The current page release function treats the buddy system
      hugepages as hugetlb pages, which causes HugePages_Rsvd to increase
      improperly.
      
      Add a check in the page release function:
          if the page is temporary, do not call hugetlb_unreserve_pages.
      Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>
    • mm/sharepool: check size=0 in mg_sp_make_share_k2u() · 564272e8
      Guo Mengqi committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5QQPG
      CVE: NA
      
      --------------------------------
      
      Add a check for size 0 in mg_sp_make_share_k2u() to avoid passing a
      0-size spa to __insert_sp_area().
      Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>
    • mm/sharepool: fix potential AA deadlock · d9fb53bf
      Guo Mengqi committed
      hulk inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5R0X9
      CVE: NA
      
      --------------------------------
      
      Fix an AA deadlock caused by a nested lock in mg_sp_group_add_task().
      
      Deadlock path:
      
      mg_sp_group_add_task()
      
          down_write(sp_group_sem)
          find_or_alloc_sp_group()
      	!spg_valid()
      	sp_group_drop()
      	    free_sp_group() -> down_write(sp_group_sem)
          ---> AA deadlock
      Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>
    • mm/sharepool: delete unused codes · 872ebaa0
      Guo Mengqi committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5QETC
      CVE: NA
      
      --------------------------------
      
      sp_make_share_k2u now only supports vmalloc addresses. Therefore, delete
      the fallback handling case.
      
      Also, master is guaranteed not to be freed until master->node_list is emptied.
      Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>
    • mm/sharepool: bugfix for 2M U2K · e6a23a8d
      Zhou Guanghui committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5PZDX
      CVE: NA
      
      --------------------------------
      
      Whether a userspace mapping is huge-mapped can only be determined after
      walking its page table. So uva_align should be calculated again after
      walking the page table if the mapping is huge-mapped.
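Recomputing an aligned address once the mapping type is known can be sketched as below. The names and the 4 KiB / 2 MiB sizes are assumptions for illustration (the 2 MiB figure follows the "2M" in the commit title); this is not the sharepool implementation.

```c
/* Assumed page sizes for this sketch: 4 KiB base pages, 2 MiB huge pages. */
#define SK_PAGE_SIZE	4096UL
#define SK_PMD_SIZE	(2UL * 1024 * 1024)

/* Round addr down to a multiple of size; size must be a power of two. */
static unsigned long sk_align_down(unsigned long addr, unsigned long size)
{
	return addr & ~(size - 1);
}

/*
 * Hypothetical helper: pick the alignment only after the page table walk
 * has told us whether the mapping is huge-mapped.
 */
static unsigned long sk_uva_align(unsigned long uva, int is_huge_mapped)
{
	return sk_align_down(uva, is_huge_mapped ? SK_PMD_SIZE : SK_PAGE_SIZE);
}
```

The same user address yields two different aligned bases, which is why an alignment computed before the walk can be wrong for a huge mapping.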
      Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
    • mm/sharepool: Support alloc ro mapping · d9687e45
      Chen Jun committed
      hulk inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5I72Q
      CVE: NA
      
      --------------------------------
      
      1. Split the sharepool normal area (8T) into a sharepool readonly area
         (64G) and a sharepool normal area (8T - 64G).
      2. User programs cannot write to addresses in the sharepool readonly
         area.
      3. Add SP_PROT_FOCUS for sp_alloc.
      4. sp_alloc with SP_PROT_RO | SP_PROT_FOCUS returns a virtual address
         within the sharepool readonly area.
      5. Other user programs that are added with write prot still cannot
         write to addresses in the sharepool readonly area.
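The size arithmetic of the split in item 1 can be sketched as below. All names are hypothetical, and where the readonly area sits inside the 8T region is not stated in the message, so the start address is left as a caller-supplied parameter.

```c
#include <stdbool.h>

#define SK_GB			(1ULL << 30)
#define SK_TB			(1ULL << 40)

/* Sizes taken from the list above. */
#define SK_POOL_TOTAL_SIZE	(8 * SK_TB)
#define SK_POOL_RO_SIZE		(64 * SK_GB)
#define SK_POOL_NORMAL_SIZE	(SK_POOL_TOTAL_SIZE - SK_POOL_RO_SIZE)

/*
 * Hypothetical helper: true if addr falls in the readonly area that
 * begins at ro_start (placement of ro_start is an assumption).
 */
static bool sk_in_ro_area(unsigned long long addr, unsigned long long ro_start)
{
	return addr >= ro_start && addr - ro_start < SK_POOL_RO_SIZE;
}
```

After the split, the normal area is 8128G, which together with the 64G readonly area adds back up to 8T.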
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
    • mm/sharepool: Extract sp_mapping_find · 60d69023
      Chen Jun committed
      hulk inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5I72Q
      CVE: NA
      
      --------------------------------
      
      Extract the logic for obtaining an sp_mapping by address into a function,
      sp_mapping_find.
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
    • mm/sharepool: replace spg->{dvpp|normal} with spg->mapping[SP_MAPPING_{DVPP|NORMAL}] · 91bc1d52
      Chen Jun committed
      hulk inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5I72Q
      CVE: NA
      
      --------------------------------
      
      spg->dvpp and spg->normal can be combined into one array.
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
    • mm/sharepool: Rename sp_mapping.flag to sp_mapping.type · ef12ea35
      Chen Jun committed
      hulk inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5I72Q
      CVE: NA
      
      --------------------------------
      
      sp_mapping.flag is now only used to distinguish sp_mapping types,
      so 'type' is a more suitable name.
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
    • mm/sharepool: Make the definitions of MMAP_SHARE_POOL_{START|16G_START} more readable · 14cd3fb0
      Chen Jun committed
      hulk inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5I72Q
      CVE: NA
      
      --------------------------------
      
      "TASK_SIZE - MMAP_SHARE_POOL_DVPP_SIZE" is puzzling.
      
      MMAP_SHARE_POOL_START = MMAP_SHARE_POOL_END - MMAP_SHARE_POOL_SIZE and
      MMAP_SHARE_POOL_16G_START = MMAP_SHARE_POOL_END - MMAP_SHARE_POOL_DVPP_SIZE
      make the memory layout more intuitive.
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
    • mm/sharepool: Avoid UAF on mm · a151f824
      Zhou Guanghui committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5PIA6
      CVE: NA
      
      --------------------------------
      
      Use get_task_mm to prevent the mm from being released while the
      information in mm_struct is in use.
      Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
    • mm/sharepool: Check the maximum value of spg_id · 99b7756c
      Zhou Guanghui committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5PIA4
      CVE: NA
      
      --------------------------------
      
      The maximum value of spg_id is checked to ensure that spg_id is
      within the valid range:
      SPG_ID_DEFAULT or [SPG_ID_MIN, SPG_ID_AUTO)
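The validity rule above can be written as a small predicate. The numeric values below are illustrative assumptions only, not taken from the sharepool headers; the shape of the check (one special value, or a half-open interval) is what the message describes.

```c
#include <stdbool.h>

/* Assumed values for illustration; the real constants may differ. */
#define SK_SPG_ID_DEFAULT	0
#define SK_SPG_ID_MIN		1
#define SK_SPG_ID_AUTO		200000

/* True if spg_id is the default id or lies in [MIN, AUTO). */
static bool sk_spg_id_valid(int spg_id)
{
	return spg_id == SK_SPG_ID_DEFAULT ||
	       (spg_id >= SK_SPG_ID_MIN && spg_id < SK_SPG_ID_AUTO);
}
```

Note that the upper bound is exclusive, so an id equal to the AUTO value is rejected.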
      Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
    • mm/sharepool: Avoid UAF on spa · 27d0e771
      Zhou Guanghui committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5PIA0
      CVE: NA
      
      --------------------------------
      
      The spa is used during update_mem_usage, but it may already have
      been released by a concurrent mg_sp_unshare.
      Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
    • mm/sharepool: delete unnecessary judgment · 142bfed2
      Zhou Guanghui committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5PIA2
      CVE: NA
      
      --------------------------------
      
      When a process is added to a group, mm->mm_users increases by one.
      When a process is deleted from a group, mm->mm_users decreases by
      one. The count cannot drop to 0 here because this function is
      preceded by get_task_mm.
      Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
    • mm/sharepool: Fix UAF reported by KASAN · 19896d2c
      Wang Wensheng committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5PD4P
      CVE: NA
      
      --------------------------------
      
      [ 2058.802818][  T290] BUG: KASAN: use-after-free in get_process_sp_res+0x70/0x134
      [ 2058.810194][  T290] Read of size 8 at addr ffff00088dc6ab28 by task test_debug_loop/290
      [ 2058.820520][  T290] CPU: 5 PID: 290 Comm: test_debug_loop Tainted: G        W  OE     5.10.0+ #2
      [ 2058.829377][  T290] Hardware name: EVB(EP) (DT)
      [ 2058.833982][  T290] Call trace:
      [ 2058.837217][  T290]  dump_backtrace+0x0/0x30c
      [ 2058.841660][  T290]  show_stack+0x20/0x30
      [ 2058.845758][  T290]  dump_stack+0x120/0x1b0
      [ 2058.850028][  T290]  print_address_description.constprop.0+0x2c/0x1fc
      [ 2058.856555][  T290]  __kasan_report+0xfc/0x160
      [ 2058.861086][  T290]  kasan_report+0x44/0xb0
      [ 2058.865356][  T290]  __asan_load8+0x94/0xd0
      [ 2058.869623][  T290]  get_process_sp_res+0x70/0x134
      [ 2058.874501][  T290]  proc_usage_show+0x1ac/0x304
      [ 2058.879208][  T290]  seq_read_iter+0x254/0x750
      [ 2058.883728][  T290]  proc_reg_read_iter+0x100/0x140
      [ 2058.888689][  T290]  new_sync_read+0x1cc/0x2c0
      [ 2058.893215][  T290]  vfs_read+0x1f4/0x250
      [ 2058.897304][  T290]  ksys_read+0xcc/0x170
      [ 2058.901399][  T290]  __arm64_sys_read+0x4c/0x60
      [ 2058.906016][  T290]  el0_svc_common.constprop.0+0xb4/0x2a0
      [ 2058.911584][  T290]  do_el0_svc+0x8c/0xb0
      [ 2058.915677][  T290]  el0_svc+0x20/0x30
      [ 2058.919503][  T290]  el0_sync_handler+0xb0/0xbc
      [ 2058.924114][  T290]  el0_sync+0x180/0x1c0
      [ 2058.928190][  T290]
      [ 2058.930444][  T290] Allocated by task 2176:
      [ 2058.934714][  T290]  kasan_save_stack+0x28/0x60
      [ 2058.939328][  T290]  __kasan_kmalloc.constprop.0+0xc8/0xf0
      [ 2058.944909][  T290]  kasan_kmalloc+0x10/0x20
      [ 2058.949268][  T290]  kmem_cache_alloc_trace+0x128/0xabc
      [ 2058.954577][  T290]  create_spg_node+0x58/0x214
      [ 2058.959188][  T290]  local_group_add_task+0x30/0x14c
      [ 2058.964231][  T290]  init_local_group+0xd0/0x1a0
      [ 2058.968936][  T290]  sp_init_group_master_locked.part.0+0x19c/0x290
      [ 2058.975298][  T290]  mg_sp_group_add_task+0x73c/0xdb0
      [ 2058.980456][  T290]  dev_sp_add_group+0x124/0x2dc [sharepool_dev]
      [ 2058.986647][  T290]  dev_ioctl+0x21c/0x2ec [sharepool_dev]
      [ 2058.992222][  T290]  __arm64_sys_ioctl+0xd8/0x120
      [ 2058.997010][  T290]  el0_svc_common.constprop.0+0xb4/0x2a0
      [ 2059.002572][  T290]  do_el0_svc+0x8c/0xb0
      [ 2059.006662][  T290]  el0_svc+0x20/0x30
      [ 2059.010489][  T290]  el0_sync_handler+0xb0/0xbc
      [ 2059.015101][  T290]  el0_sync+0x180/0x1c0
      [ 2059.019176][  T290]
      [ 2059.021427][  T290] Freed by task 4125:
      [ 2059.025343][  T290]  kasan_save_stack+0x28/0x60
      [ 2059.029949][  T290]  kasan_set_track+0x28/0x40
      [ 2059.034476][  T290]  kasan_set_free_info+0x24/0x50
      [ 2059.039347][  T290]  __kasan_slab_free+0x104/0x1ac
      [ 2059.044227][  T290]  kasan_slab_free+0x14/0x20
      [ 2059.048744][  T290]  kfree+0x164/0xb94
      [ 2059.052576][  T290]  sp_group_post_exit+0xf0/0x980
      [ 2059.057448][  T290]  mmput.part.0+0xb4/0x220
      [ 2059.061790][  T290]  mmput+0x2c/0x40
      [ 2059.065450][  T290]  exit_mm+0x27c/0x3a0
      [ 2059.069450][  T290]  do_exit+0x2a0/0x790
      [ 2059.073448][  T290]  do_group_exit+0x64/0x100
      [ 2059.077884][  T290]  get_signal+0x1fc/0x9fc
      [ 2059.082144][  T290]  do_signal+0x110/0x2cc
      [ 2059.086320][  T290]  do_notify_resume+0x158/0x2b0
      [ 2059.091108][  T290]  work_pending+0xc/0x6d4
      [ 2059.095358][  T290]
      Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
    • mm/sharepool: fix deadlock in sp_check_mmap_addr · 78c82ea5
      Guo Mengqi committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5OE1J
      CVE: NA
      
      --------------------------------
      
      Fix a deadlock indicated below:
      
      [  171.669844] Chain exists of:
      [  171.669844]   &mm->mmap_lock --> sp_group_sem --> &spg->rw_lock
      [  171.669844]
      [  171.671469]  Possible unsafe locking scenario:
      [  171.671469]
      [  171.672121]        CPU0                    CPU1
      [  171.672415]        ----                    ----
      [  171.672706]   lock(&spg->rw_lock);
      [  171.673114]                                lock(sp_group_sem);
      [  171.673706]                                lock(&spg->rw_lock);
      [  171.674208]   lock(&mm->mmap_lock);
      [  171.674863]
      [  171.674863]  *** DEADLOCK ***
      
      sharepool takes its locks in the order:
      sp_group_sem --> &spg->rw_lock --> mm->mmap_lock
      However, sp_check_mmap_addr() requests sp_group_sem while
      mm->mmap_lock is held, which is: mm->mmap_lock --> sp_group_sem.
      This causes an ABBA problem.
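The ordering rule can be modeled by giving each lock a rank and only allowing acquisitions in increasing rank. This is a hypothetical illustration of the rule stated above, not how lockdep or sharepool implements it.

```c
#include <stdbool.h>

/* Ranks follow the order sp_group_sem --> spg->rw_lock --> mm->mmap_lock. */
enum sk_lock_rank {
	SK_RANK_NONE = 0,		/* no lock held */
	SK_RANK_SP_GROUP_SEM = 1,
	SK_RANK_SPG_RW_LOCK = 2,
	SK_RANK_MMAP_LOCK = 3,
};

/*
 * True if taking `next` while the highest-ranked held lock is `held`
 * respects the documented order.
 */
static bool sk_lock_order_ok(enum sk_lock_rank held, enum sk_lock_rank next)
{
	return next > held;
}
```

Under this model, requesting sp_group_sem while mm->mmap_lock is held is flagged as an ordering violation, which matches the ABBA scenario in the report.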
      
      This happens in:
      
      [  171.642687] the existing dependency chain (in reverse order) is:
      [  171.643745]
      [  171.643745] -> #2 (&spg->rw_lock){++++}-{3:3}:
      [  171.644639]        __lock_acquire+0x6f4/0xc40
      [  171.645189]        lock_acquire+0x2f0/0x3c8
      [  171.645631]        down_read+0x64/0x2d8
      [  171.646075]        proc_usage_by_group+0x50/0x258 (spg->rw_lock)
      [  171.646542]        idr_for_each+0x6c/0xf0
      [  171.647011]        proc_group_usage_show+0x140/0x178
      [  171.647629]        seq_read_iter+0xe4/0x498
      [  171.648217]        proc_reg_read_iter+0xa8/0xe0
      [  171.648776]        new_sync_read+0xfc/0x1a0
      [  171.649002]        vfs_read+0x1ac/0x1c8
      [  171.649217]        ksys_read+0x74/0xf8
      [  171.649596]        __arm64_sys_read+0x24/0x30
      [  171.649934]        el0_svc_common.constprop.0+0x8c/0x270
      [  171.650528]        do_el0_svc+0x34/0xb8
      [  171.651069]        el0_svc+0x1c/0x28
      [  171.651278]        el0_sync_handler+0x8c/0xb0
      [  171.651636]        el0_sync+0x168/0x180
      [  171.652118]
      [  171.652118] -> #1 (sp_group_sem){++++}-{3:3}:
      [  171.652692]        __lock_acquire+0x6f4/0xc40
      [  171.653059]        lock_acquire+0x2f0/0x3c8
      [  171.653303]        down_read+0x64/0x2d8
      [  171.653704]        mg_is_sharepool_addr+0x184/0x340 (&sp_group_sem)
      [  171.654085]        sp_check_mmap_addr+0x64/0x108
      [  171.654668]        arch_get_unmapped_area_topdown+0x9c/0x528
      [  171.655370]        thp_get_unmapped_area+0x54/0x68
      [  171.656170]        get_unmapped_area+0x94/0x160
      [  171.656415]        __do_mmap_mm+0xd4/0x540
      [  171.656629]        do_mmap+0x98/0x648
      [  171.656838]        vm_mmap_pgoff+0xc0/0x188
      [  171.657129]        vm_mmap+0x6c/0x98
      [  171.657619]        elf_map+0xe0/0x118
      [  171.657835]        load_elf_binary+0x4ec/0xfd8
      [  171.658103]        bprm_execve.part.9+0x3ec/0x840
      [  171.658448]        bprm_execve+0x7c/0xb0
      [  171.658919]        kernel_execve+0x18c/0x198
      [  171.659500]        run_init_process+0xf0/0x108
      [  171.660073]        try_to_run_init_process+0x20/0x58
      [  171.660558]        kernel_init+0xcc/0x120
      [  171.660862]        ret_from_fork+0x10/0x18
      [  171.661273]
      [  171.661273] -> #0 (&mm->mmap_lock){++++}-{3:3}:
      [  171.661885]        check_prev_add+0xa4/0xbd8
      [  171.662229]        validate_chain+0xf54/0x14b8
      [  171.662705]        __lock_acquire+0x6f4/0xc40
      [  171.663310]        lock_acquire+0x2f0/0x3c8
      [  171.663658]        down_write+0x60/0x208
      [  171.664179]        mg_sp_alloc+0x24c/0x1150 (mm->mmap_lock)
      [  171.665245]        dev_ioctl+0x1128/0x1fb8 [sharepool_dev]
      [  171.665688]        __arm64_sys_ioctl+0xb0/0xe8
      [  171.666250]        el0_svc_common.constprop.0+0x8c/0x270
      [  171.667255]        do_el0_svc+0x34/0xb8
      [  171.667806]        el0_svc+0x1c/0x28
      [  171.668249]        el0_sync_handler+0x8c/0xb0
      [  171.668661]        el0_sync+0x168/0x180
      Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>
    • mm/sharepool: fix deadlock in spa_stat_of_mapping_show · 608669b7
      Guo Mengqi committed
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5OE1J
      CVE: NA
      
      --------------------------------
      
      The mutex protecting spm_dvpp_list has an ABBA deadlock with
      spg->rw_lock. Adding a process to a sharepool group while running
      cat /proc/sharepool/spa_stat at the same time reproduces the
      problem.
      
      Remove spg->rw_lock to avoid this.
      
      [ 1101.013480] INFO: task test:3567 blocked for more than 30 seconds.
      [ 1101.014378]      Tainted: G           OE     5.10.0+ #45
      [ 1101.015707] task:test state:D stack:    0 pid: 3567
      [ 1101.016464] Call trace:
      [ 1101.016736] __switch_to+0xc0/0x128
      [ 1101.017082] __schedule+0x3fc/0x898
      [ 1101.017626] schedule+0x48/0xd8
      [ 1101.017981] schedule_preempt_disabled+0x14/0x20
      [ 1101.018519] __mutex_lock.isra.1+0x160/0x638
      [ 1101.018899] __mutex_lock_slowpath+0x24/0x30
      [ 1101.019291] mutex_lock+0x5c/0x68
      [ 1101.019607] sp_mapping_create+0x118/0x1b0
      [ 1101.019963] sp_init_group_master_locked.part.9+0x10c/0x288
      [ 1101.020356] mg_sp_group_add_task.part.16+0x7dc/0xcd0
      [ 1101.020750] mg_sp_group_add_task+0x54/0xd0
      [ 1101.021120] dev_ioctl+0x360/0x1e20 [sharepool_dev]
      [ 1101.022171] __arm64_sys_ioctl+0xb0/0xe8
      [ 1101.022695] el0_svc_common.constprop.0+0x88/0x268
      [ 1101.023143] do_el0_svc+0x34/0xb8
      [ 1101.023487] el0_svc+0x1c/0x28
      [ 1101.023775] el0_sync_handler+0x8c/0xb0
      [ 1101.024120] el0_sync+0x168/0x180
      Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>