Commits · b05721fc91d07290dba910cc35ff5081f7cd8513 · openeuler / Kernel

06 Mar, 2023 · 14 commits

mm/sharepool: Move spa_num field to sp_group. · b05721fc

Committed by Xu Qiang on Mar 04, 2023

hulk inclusion
category: other
bugzilla: https://gitee.com/openeuler/kernel/issues/I6ET9W

----------------------------------------------

spa_num is not general information but differentiated information, so
it should not be placed in sp_spg_stat. Move it to sp_group.
Signed-off-by: Xu Qiang <xuqiang36@huawei.com>

mm/sharepool: Delete unused mm in sp_proc_stat. · 7663c791

Committed by Xu Qiang on Mar 04, 2023

hulk inclusion
category: other
bugzilla: https://gitee.com/openeuler/kernel/issues/I6ET9W

----------------------------------------------
Signed-off-by: Xu Qiang <xuqiang36@huawei.com>

mm/sharepool: Delete unused spg_id and hugepage_failures. · 30dd244b

Committed by Xu Qiang on Mar 04, 2023

hulk inclusion
category: other
bugzilla: https://gitee.com/openeuler/kernel/issues/I6ET9W

----------------------------------------------
Signed-off-by: Xu Qiang <xuqiang36@huawei.com>

mm/sharepool: Modify error message in mg_sp_group_del_task · 20ac7750

Committed by Wang Wensheng on Mar 04, 2023

hulk inclusion
category: other
bugzilla: https://gitee.com/openeuler/kernel/issues/I6G76L

----------------------------------------------

1. Give more information in the error log.
2. There is no need to limit the rate.
3. Add a '\n' at the end of the format string.
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>

mm/sharepool: Fix null-pointer-dereference in sp_free_area · b5780e12

Committed by Wang Wensheng on Mar 04, 2023

hulk inclusion
category: other
bugzilla: https://gitee.com/openeuler/kernel/issues/I6G76L

----------------------------------------------

When a process is deleted from a group, it must not allocate memory from
the shared group; otherwise a use-after-free (UAF) occurs. We checked for
this, but the check did not prevent sp_alloc from running concurrently
with del_task: the process could allocate memory after the check had
passed, which violates the requirement and causes problems. The solution
is to place the check inside the critical section, so that no memory can
be allocated after the check has passed.
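
The idea of moving the check into the critical section can be sketched with a small single-threaded model. This is only an illustration under stated assumptions: `group_model`, `model_alloc` and `model_del_task` are hypothetical names, the lock is a plain flag rather than a kernel lock, and the error codes are chosen for the example, not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a sharepool group; not the kernel's sp_group. */
struct group_model {
    bool locked;        /* models the lock protecting the group */
    bool task_deleted;  /* set once the task has left the group */
    int  nr_allocs;     /* memory areas the task still holds */
};

static void group_lock(struct group_model *g)
{
    assert(!g->locked);  /* single-threaded model: nesting is a bug */
    g->locked = true;
}

static void group_unlock(struct group_model *g)
{
    assert(g->locked);
    g->locked = false;
}

/* Allocation checks membership and allocates under the same lock. */
static int model_alloc(struct group_model *g)
{
    int ret = 0;

    group_lock(g);
    if (g->task_deleted)
        ret = -ESRCH;
    else
        g->nr_allocs++;
    group_unlock(g);
    return ret;
}

/* Deletion performs its "no memory held" check inside the critical section. */
static int model_del_task(struct group_model *g)
{
    int ret = 0;

    group_lock(g);
    if (g->nr_allocs > 0)
        ret = -EBUSY;
    else
        g->task_deleted = true;
    group_unlock(g);
    return ret;
}
```

Because the check and the state change share one critical section, an allocation either completes before the check (and deletion is refused) or runs after it (and the allocation is refused).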

[ T7596] Unable to handle kernel NULL pointer dereference at virtual
address 0000000000000098
[ T7596] Mem abort info:
[ T7596]   ESR = 0x96000004
[ T7596]   EC = 0x25: DABT (current EL), IL = 32 bits
[ T7596]   SET = 0, FnV = 0
[ T7596]   EA = 0, S1PTW = 0
[ T7596] Data abort info:
[ T7596]   ISV = 0, ISS = 0x00000004
[ T7596]   CM = 0, WnR = 0
[ T7596] user pgtable: 4k pages, 48-bit VAs, pgdp=00000001040a3000
[ T7596] [0000000000000098] pgd=0000000000000000, p4d=0000000000000000
[ T7596] Internal error: Oops: 96000004 [#1] SMP
[ T7596] Modules linked in: sharepool_dev(OE) [last unloaded: demo]
[ T7596] CPU: 1 PID: 7596 Comm: test_sp_group_d Tainted: G OE 5.10.0+ #8
[ T7596] Hardware name: linux,dummy-virt (DT)
[ T7596] pstate: 20000005 (nzCv daif -PAN -UAO -TCO BTYPE=--)
[ T7596] pc : sp_free_area+0x34/0x120
[ T7596] lr : sp_free_area+0x30/0x120
[ T7596] sp : ffff80001c6a3b20
[ T7596] x29: ffff80001c6a3b20 x28: 0000000000000009
[ T7596] x27: 0000000000000000 x26: ffff800011c49d20
[ T7596] x25: ffff0000c227f6c0 x24: 0000000000000008
[ T7596] x23: ffff0000c0cf0ce8 x22: 0000000000000001
[ T7596] x21: ffff0000c4082b30 x20: 0000000000000000
[ T7596] x19: ffff0000c4082b00 x18: 0000000000000000
[ T7596] x17: 0000000000000000 x16: 0000000000000000
[ T7596] x15: 0000000000000000 x14: 0000000000000000
[ T7596] x13: 0000000000000000 x12: ffff0005fffe12c0
[ T7596] x11: 0000000000000008 x10: ffff0005fffe12c0
[ T7596] x9 : ffff8000103eb690 x8 : 0000000000000001
[ T7596] x7 : 0000000000210d00 x6 : 0000000000000000
[ T7596] x5 : ffff8000123edea0 x4 : 0000000000000030
[ T7596] x3 : ffffeff000000000 x2 : 0000eff000000000
[ T7596] x1 : 0000e80000000000 x0 : 0000000000000000
[ T7596] Call trace:
[ T7596]  sp_free_area+0x34/0x120
[ T7596]  __sp_area_drop_locked+0x3c/0x60
[ T7596]  sp_area_drop+0x80/0xbc
[ T7596]  remove_vma+0x54/0x70
[ T7596]  exit_mmap+0x114/0x1d0
[ T7596]  mmput+0x90/0x1ec
[ T7596]  exit_mm+0x1d0/0x2f0
[ T7596]  do_exit+0x180/0x400
[ T7596]  do_group_exit+0x40/0x114
[ T7596]  get_signal+0x1e8/0x720
[ T7596]  do_signal+0x11c/0x1e4
[ T7596]  do_notify_resume+0x15c/0x250
[ T7596]  work_pending+0xc/0x6d8
[ T7596] Code: f9400001 f9402c00 97fff0e5 aa0003f4 (f9404c00)
[ T7596] ---[ end trace 3c8368d77e758ebd ]---
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>

mm/sharepool: Simplify sp_unshare_uva() · 10e8409f

Committed by Wang Wensheng on Mar 04, 2023

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I650K6

--------------------------------

The unshare process for k2task can be unified with that for k2spg,
since a local sp group exists for k2task.
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>

mm/sharepool: Rename sp_group operations · 7325bc5b

Committed by Wang Wensheng on Mar 04, 2023

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I650K6

--------------------------------

Rename sp_group_drop[_locked] to sp_group_put[_locked].
Rename __sp_find_spg[_locked] to sp_group_get[_locked].
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>

mm/sharepool: Simplify sp_make_share_k2u() · 5cd5e8ce

Committed by Wang Wensheng on Mar 04, 2023

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I650K6

--------------------------------

The process for k2task can be unified with that for k2spg, since a
local sp group exists for k2task.
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>

mm/sharepool: Reorganize create_spg() · 7dd71d62

Committed by Wang Wensheng on Mar 04, 2023

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I650K6

--------------------------------

1. Extract a function that initializes all the members of a newly
   allocated sp_group, to reduce the size of create_spg().
2. Move the idr_alloc to the end of the function, since we should not
   add an uninitialized sp_group to the global idr.
3. Rename the file for hugetlb map.
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
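
Points 1 and 2 of the create_spg() commit above follow a common pattern: initialize an object completely, then publish it. A minimal user-space sketch, assuming a fixed-size array in place of the global idr; `struct group`, `group_init`, `group_create`, `group_find` and `REGISTRY_SIZE` are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define REGISTRY_SIZE 8   /* hypothetical capacity of the model registry */

struct group {            /* hypothetical stand-in for sp_group */
    int id;
    int refcount;
    int proc_num;
};

/* Models the global idr: once a pointer is stored here, lookups can see it. */
static struct group *registry[REGISTRY_SIZE];

/* One helper sets every member, keeping the create function short. */
static void group_init(struct group *g, int id)
{
    g->id = id;
    g->refcount = 1;
    g->proc_num = 0;
}

/* Publish only after initialization (models calling idr_alloc last). */
static struct group *group_create(int id)
{
    struct group *g;

    if (id < 0 || id >= REGISTRY_SIZE || registry[id])
        return NULL;

    g = malloc(sizeof(*g));
    if (!g)
        return NULL;

    group_init(g, id);
    registry[id] = g;     /* last step: the object becomes visible */
    return g;
}

static struct group *group_find(int id)
{
    if (id < 0 || id >= REGISTRY_SIZE)
        return NULL;
    return registry[id];
}
```

With this ordering, a lookup can never return a group whose members are still unset.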

mm/sharepool: Add helper for master_list · 9cb23272

Committed by Xu Qiang on Mar 04, 2023

hulk inclusion
category: other
bugzilla: https://gitee.com/openeuler/kernel/issues/I6HRGK

----------------------------------------------

Add two helper functions, sp_add_group_master and
sp_del_group_master, to manipulate master_list.
Signed-off-by: Xu Qiang <xuqiang36@huawei.com>

mm/sharepool: Refactoring proc file interface similar code · 793b069d

Committed by Xu Qiang on Mar 04, 2023

hulk inclusion
category: other
bugzilla: https://gitee.com/openeuler/kernel/issues/I6HRGK

----------------------------------------------

spa_overview_show, spg_info_show and spg_overview_show contain
similar code.

The solution is to extract the differences into a function-like macro.
Signed-off-by: Xu Qiang <xuqiang36@huawei.com>
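
The technique named in the commit above, factoring near-identical functions into a function-like macro, can be sketched as follows. The names (`struct stats`, `DEFINE_STAT_SHOW`, `alloc_show`, `k2u_show`) and the output format are hypothetical; the real proc handlers are not shown in this log.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct stats {            /* hypothetical statistics record */
    long alloc_kb;
    long k2u_kb;
};

/*
 * Function-like macro: the shared body is written once, and only the
 * function name, label and field that differ are passed in.
 */
#define DEFINE_STAT_SHOW(name, label, field)                          \
static int name(char *buf, size_t len, const struct stats *s)         \
{                                                                     \
    return snprintf(buf, len, "%s: %ld KB\n", label, s->field);       \
}

DEFINE_STAT_SHOW(alloc_show, "alloc", alloc_kb)
DEFINE_STAT_SHOW(k2u_show, "k2u", k2u_kb)
```

Each macro invocation expands to one complete function, so a change to the shared body only has to be made in one place.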

mm/sharepool: Don't display sharepool statistics in the container · 250a1f57

Committed by Zhou Guanghui on Mar 04, 2023

ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6HRGK

--------------------------------------------

The sharepool statistics record the sharepool memory used by all
containers in the system. A process inside a container should not be
able to query the sharepool memory allocated by processes in other
containers.

Therefore, to solve this problem, do not allow the sharepool statistics
to be queried from inside a container.
Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>

mm/sharepool: Fix NULL pointer dereference in mg_sp_group_del_task · 3c4cb588

Committed by Wang Wensheng on Mar 04, 2023

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I650K6

--------------------------------

If we try to delete, from a specified group, a task that has not been
added to any group, a NULL pointer dereference occurs.
[  162.566615] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[  162.567699] Mem abort info:
[  162.567971]   ESR = 0x96000006
[  162.568187]   EC = 0x25: DABT (current EL), IL = 32 bits
[  162.568508]   SET = 0, FnV = 0
[  162.568670]   EA = 0, S1PTW = 0
[  162.568794] Data abort info:
[  162.568906]   ISV = 0, ISS = 0x00000006
[  162.569032]   CM = 0, WnR = 0
[  162.569314] user pgtable: 4k pages, 48-bit VAs, pgdp=00000001029e0000
[  162.569516] [0000000000000008] pgd=00000001026da003, p4d=00000001026da003, pud=0000000102a90003, pmd=0000000000000000
[  162.570346] Internal error: Oops: 96000006 [#1] SMP
[  162.570524] CPU: 0 PID: 880 Comm: test_sp_group_d Tainted: G        W  O      5.10.0+ #1
[  162.570868] Hardware name: linux,dummy-virt (DT)
[  162.571053] pstate: 00000005 (nzcv daif -PAN -UAO -TCO BTYPE=--)
[  162.571370] pc : mg_sp_group_del_task+0x164/0x488
[  162.571511] lr : mg_sp_group_del_task+0x158/0x488
[  162.571644] sp : ffff8000127d3ca0
[  162.571749] x29: ffff8000127d3ca0 x28: ffff372281b8c140
[  162.571922] x27: 0000000000000000 x26: ffff372280b261c0
[  162.572090] x25: ffffd075db9a9000 x24: ffffd075db9a90f8
[  162.572259] x23: ffffd075db9a90e0 x22: 0000000000000371
[  162.572425] x21: ffff372280826b00 x20: 0000000000000000
[  162.572592] x19: ffffd075db12b000 x18: 0000000000000000
[  162.572756] x17: 0000000000000000 x16: ffffd075da51e60c
[  162.572923] x15: 0000ffffdcf1a540 x14: 0000000000000000
[  162.573087] x13: 0000000000000000 x12: 0000000000000000
[  162.573250] x11: 0000000000000040 x10: ffffd075db5f1908
[  162.573415] x9 : ffffd075db5f1900 x8 : ffff3722816f54b0
[  162.573579] x7 : 0000000000000000 x6 : 0000000000000000
[  162.573741] x5 : ffff3722816f5488 x4 : 0000000000000000
[  162.573906] x3 : ffff372280b2620c x2 : ffff37228036b4a0
[  162.574069] x1 : 0000000000000000 x0 : ffff372280b261c0
[  162.574239] Call trace:
[  162.574336]  mg_sp_group_del_task+0x164/0x488
[  162.575262]  dev_ioctl+0x10cc/0x2478 [sharepool_dev]
[  162.575443]  __arm64_sys_ioctl+0xb4/0xf0
[  162.575585]  el0_svc_common.constprop.0+0xe4/0x2d4
[  162.575726]  do_el0_svc+0x34/0xa8
[  162.575838]  el0_svc+0x1c/0x28
[  162.575941]  el0_sync_handler+0x90/0xf0
[  162.576060]  el0_sync+0x168/0x180
[  162.576391] Code: 97f4d4bf aa0003fa b4001580 f9420c01 (f8408c20)
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>

mm/sharepool: Fix a double free problem caused by init_local_group · 144c1dd2

Committed by Chen Jun on Mar 04, 2023

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I64Y5Y

-------------------------------

If local_group_add_task fails in init_local_group, the same id is
freed from the ida twice.

init_local_group
  local_group_add_task    // failed
  goto free_spg

free_spg:
  free_sp_group_locked
    free_sp_group_id      // free spg->id
free_spg_id:
  free_new_spg_id         // double free spg->id

To fix it, return before calling free_new_spg_id.
Signed-off-by: Chen Jun <chenjun102@huawei.com>
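
The error-path shape described in the commit above can be modeled in user space. This is a sketch under assumptions: the id allocator, `struct group`, `init_group` and the `double_frees` counter are hypothetical and only mirror the label structure from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_ID 4
static bool id_used[MAX_ID];
static int double_frees;   /* counts frees of an id that is not in use */

static int id_alloc(void)
{
    for (int i = 0; i < MAX_ID; i++) {
        if (!id_used[i]) {
            id_used[i] = true;
            return i;
        }
    }
    return -1;
}

static void id_free(int id)
{
    if (!id_used[id])
        double_frees++;
    id_used[id] = false;
}

struct group { int id; };

/* Releasing the group also releases its id. */
static void group_free(struct group *g)
{
    id_free(g->id);
    free(g);
}

/* add_task_fails models the add-task step returning an error. */
static int init_group(bool add_task_fails, struct group **out)
{
    int ret = -1;
    int id = id_alloc();
    struct group *g;

    if (id < 0)
        return ret;

    g = malloc(sizeof(*g));
    if (!g)
        goto free_id;
    g->id = id;            /* from here on the group owns the id */

    if (add_task_fails)
        goto free_group;

    *out = g;
    return 0;

free_group:
    group_free(g);         /* already frees the id */
    return ret;            /* return here instead of falling through */
free_id:
    id_free(id);
    return ret;
}
```

Without the `return` after `group_free(g)`, control would fall through into `free_id` and release the same id a second time.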

15 Nov, 2022 · 4 commits

sharepool: fix sp_alloc_populate no fallocate bug · 60a48ea1

Committed by Guo Mengqi on Nov 14, 2022

ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I612UG
CVE: NA

--------------------------------

do_mm_populate() tries to allocate physical pages from the start of the
required range [start, end), and returns an error on the first allocation
failure without releasing the pages allocated before.
That means we must release the shared-file range whenever
do_mm_populate() fails.

Remove need_fallocate, and always call sp_fallocate() on the error path
of sp_alloc_mmap_populate().
Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>

mm/sharepool: Fix add group failed with errno 28 · 7086bdba

Committed by Xu Qiang on Nov 14, 2022

ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I612UG
CVE: NA

--------------------------------

We increase task->mm->mm_users by one when we add the task to a
sharepool group. Correspondingly, we should drop the mm_users count when
the task exits. Currently we hijack the mmput function: if it is not
called from a task's exit path, it returns early after decreasing
mm->mm_users by one (just as mmput would do); otherwise we decrease
mm->mm_users by the number of groups the task was added to. This has two
problems:
1. It makes mmput and sp_group_exit hard to understand.
2. Judging whether the task (and its mm) is exiting and decreasing its
   mm_users count are not done atomically. We use this condition:
     mm->mm_users == master->count + MM_WOULD_FREE(1)
   If someone else changes mm->mm_users between those two steps,
   mm->mm_users would be wrong and the mm_struct could never be released.

Suppose the following process:

        proc1                                        proc2

1)      mmput
          |
          V
2)  enter sp_group_exit and
    'mm->mm_users == master->count + 1' is true
3)        |                                         mmget
          V
4)  decrease mm->mm_users by master->count
          |
          V
5)  enter __mmput and release mm_struct
    if mm->mm_users == 1
6)                                                  mmput

The statistical structure that has the same id as the task would be leaked
together with the mm_struct, so the next time we try to create a statistical
structure with the same id, we get a failure.

We fix this by moving sp_group_exit into do_exit(), where the task is
actually exiting. We no longer need to judge whether the task is exiting
when someone calls mmput, so there is no chance of changing mm_users wrongly.
Signed-off-by: Xu Qiang <xuqiang36@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>

mm: sharepool: Fix static check warning · fab907d0

Committed by Zhang Zekun on Nov 14, 2022

ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I612UG
CVE: NA

--------------------------------

Fix the following static check warning:
Use parentheses to specify the sequence of expressions, instead of using
the default priority. Parentheses should be used with bitwise operators.

Fix this by adding parentheses to the expression.
Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
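
The commit above does not show the flagged expression, so the following is only a generic illustration of why static checkers ask for parentheses around bitwise operators in C. The flag names and values are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define FLAG_RO    0x1u   /* hypothetical flag bits */
#define FLAG_HUGE  0x2u

/* Explicit parentheses: first mask, then compare. */
static bool is_huge(unsigned int flags)
{
    return (flags & FLAG_HUGE) == FLAG_HUGE;
}

/*
 * What the unparenthesized form "flags & FLAG_HUGE == FLAG_HUGE" means:
 * == binds tighter than &, so the comparison is evaluated first and the
 * result (1) is then masked against flags.
 */
static bool is_huge_default_precedence(unsigned int flags)
{
    return flags & (FLAG_HUGE == FLAG_HUGE);
}
```

The second function tests bit 0 rather than FLAG_HUGE, which is why relying on default precedence here is easy to get wrong.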

mm/sharepool: Use "tgid" instead of "pid" to find a task · 32c81f1b

Committed by Zhang Zekun on Nov 14, 2022

ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I612UG
CVE: NA

--------------------------------

To support the container scenario, use tgid instead of pid to find a
specific task. In normal cases, "tgid" represents a process in init_pid_ns,
so this patch should not introduce problems to existing code.

Rename the input parameter "int pid" to "int tgid" in the following
exported interfaces:
1. mg_sp_group_id_by_pid()
2. mg_sp_group_add_task()
3. mg_sp_group_del_task()
4. mg_sp_make_share_k2u()
5. mg_sp_make_share_u2k()
6. mg_sp_config_dvpp_range()

Besides, rename the parameter in these static functions as well:
1. __sp_find_spg_locked()
2. __sp_find_spg()

The following functions use "current->pid" to find the spg; change
"current->pid" to "current->tgid":
1. find_or_alloc_sp_group()
2. sp_alloc_prepare()
3. mg_sp_make_share_k2u()
Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>

09 Nov, 2022 · 22 commits

mm/sharepool: fix the incorrect judgement of the addr range · b1e17d35

Committed by Zhou Guanghui on Nov 03, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5XQS4
CVE: NA

--------------------------------

The address range of dvpp is [start, start + size), so the value of
start + size itself can be outside the address range.
Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
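
The commit above does not show the corrected condition, so the sketch below only illustrates the half-open-range point it makes: the end value start + size is one past the last valid address and may equal the upper bound. The window bounds and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical window bounds, chosen only for illustration. */
#define WINDOW_START UINT64_C(0x100000000)
#define WINDOW_END   UINT64_C(0x500000000)   /* exclusive */

/* True if the half-open range [start, start + size) fits in the window. */
static bool range_in_window(uint64_t start, uint64_t size)
{
    if (size == 0 || start < WINDOW_START || start >= WINDOW_END)
        return false;
    /* Compare sizes instead of computing start + size, which could wrap. */
    return size <= WINDOW_END - start;
}

/* A check that wrongly treats start + size as an address inside the range. */
static bool range_in_window_strict_end(uint64_t start, uint64_t size)
{
    uint64_t end = start + size;

    return start >= WINDOW_START && end >= start && end < WINDOW_END;
}
```

A range that ends exactly at the upper bound is valid for the first function but is rejected by the second.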

mm/sharepool: Fix sharepool hugepage cgroup uncount error. · 107e2b7c

Committed by Guo Mengqi on Nov 03, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5XQS4
CVE: NA

--------------------------------

If PF_MEMALLOC is set in current->flags, the memory cgroup will not check
current's allocations against the memory usage limit, which can cause the
system to run out of memory.

According to
https://lkml.indiana.edu/hypermail/linux/kernel/0911.2/00576.html,
PF_MEMALLOC shall only be used when more memory is sure to be freed as a
result of this allocation.

Do not use PF_MEMALLOC; instead, remove __GFP_RECLAIM from gfp_mask to
ensure no reclaim.
Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>

mm/sharepool: Rebind the numa node when fallback to normal pages · 1343dd93

Committed by Wang Wensheng on Nov 03, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5XQS4
CVE: NA

--------------------------------

When we allocate memory using SP_HUGEPAGE, we fall back to normal pages
when there are not enough hugepages. The specified numa node information
gets lost when we fall back to normal pages. As a result, we could
allocate memory from a numa node other than the one we specified.

The solution is to rebind the node before retrying.
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>

mm/sharepool: Remove the leading double underscores for function name · 95618625

Committed by Zhang Zekun on Nov 03, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5XQS4
CVE: NA

----------------------------------------------

Rename __insert_sp_area to insert_sp_area.
Rename __find_sp_area_locked to find_sp_area_locked.
Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>

mm/sharepool: Fix code-style warnings · c3c8461e

Committed by Zhang Zekun on Nov 03, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5XQS4
CVE: NA

-----------------------------------------

1. Remove the inline specifier before sp_mapping_find().
2. Do not declare or define reserved identifiers.
3. Add braces in if and else/else if statements.
4. A pointer asterisk (*) must have a space on at least one side.
5. Use parentheses to specify the order of evaluation in
   sp_remap_kva_to_vma(), sp_node_id(), init_local_group().
6. Besides, rename __find_sp_area() to get_sp_area() to indicate that
   this function need not be called with the lock held and to imply
   that it increases the use_count.
Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>

mm/sharepool: fix hugepage_rsvd count increase error · bec70574

Committed by Guo Mengqi on Nov 03, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5RO2H
CVE: NA

--------------------------------

When nr_hugepages is configured, sharepool allocates hugepages first
from the hugetlb pool, then from the buddy system once the pool has been
used up. The current page release function treats the buddy system
hugepages as hugetlb pages, which causes HugePages_Rsvd to increase
improperly.

Add a check in the page release function:
    if the page is temporary, do not call hugetlb_unreserve_pages.
Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>

mm/sharepool: check size=0 in mg_sp_make_share_k2u() · 564272e8

Committed by Guo Mengqi on Nov 03, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5QQPG
CVE: NA

--------------------------------

Add a size-0 check in mg_sp_make_share_k2u() to avoid passing a 0-size
spa to __insert_sp_area().
Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>

mm/sharepool: fix potential AA deadlock · d9fb53bf

Committed by Guo Mengqi on Nov 03, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5R0X9
CVE: NA

--------------------------------

Fix an AA deadlock caused by a nested lock in mg_sp_group_add_task().

Deadlock path:

mg_sp_group_add_task()

    down_write(sp_group_sem)
    find_or_alloc_sp_group()
        !spg_valid()
        sp_group_drop()
            free_sp_group() -> down_write(sp_group_sem)
    ---> AA deadlock
Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>
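
The deadlock path above is a same-lock re-acquisition. The commit message does not spell out the fix, so the sketch below only models one common remedy, a `_locked` variant for callers that already hold the lock (the log's later sp_group_drop[_locked] naming suggests such variants exist). All names here are hypothetical, and the lock is a flag that reports EDEADLK where a real rwsem would block forever.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical non-recursive lock; relocking reports EDEADLK instead of hanging. */
struct model_lock { bool held; };

static int lock_acquire(struct model_lock *l)
{
    if (l->held)
        return -EDEADLK;   /* a real rwsem would block forever here */
    l->held = true;
    return 0;
}

static void lock_release(struct model_lock *l)
{
    l->held = false;
}

static struct model_lock group_sem;
static int groups_freed;

/* Caller must already hold group_sem. */
static void group_drop_locked(void)
{
    groups_freed++;
}

/* Takes group_sem itself, so it must not be called with the lock held. */
static int group_drop(void)
{
    int ret = lock_acquire(&group_sem);

    if (ret)
        return ret;
    group_drop_locked();
    lock_release(&group_sem);
    return 0;
}

/* Models the add-task path; use_locked_variant selects the safe call. */
static int add_task(bool use_locked_variant)
{
    int ret = lock_acquire(&group_sem);

    if (ret)
        return ret;
    if (use_locked_variant)
        group_drop_locked();
    else
        ret = group_drop();    /* nested acquire of group_sem: AA */
    lock_release(&group_sem);
    return ret;
}
```

Splitting a function into a locking wrapper and a `_locked` body lets each caller choose the form that matches the locks it already holds.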

mm/sharepool: delete unused codes · 872ebaa0

Committed by Guo Mengqi on Nov 03, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5QETC
CVE: NA

--------------------------------

sp_make_share_k2u only supports vmalloc addresses now. Therefore, delete
the handling of a backup case.

Also, master is guaranteed not to be freed until master->node_list is
emptied.
Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>

mm/sharepool: bugfix for 2M U2K · e6a23a8d

Committed by Zhou Guanghui on Nov 03, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5PZDX
CVE: NA

--------------------------------

We can determine whether a userspace mapping is huge-mapped after walking
its pagetable. So uva_align should be recalculated after walking the
pagetable if the mapping is huge-mapped.
Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>

mm/sharepool: Support alloc ro mapping · d9687e45

Committed by Chen Jun on Nov 03, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5I72Q
CVE: NA

--------------------------------

1. Split the sharepool normal area (8T) into a sharepool readonly area
   (64G) and a sharepool normal area (8T - 64G).
2. User programs cannot write to addresses in the sharepool readonly
   area.
3. Add SP_PROT_FOCUS for sp_alloc.
4. sp_alloc with SP_PROT_RO | SP_PROT_FOCUS returns a virtual address
   within the sharepool readonly area.
5. Other user programs that are added to the group with write prot
   cannot write to addresses in the sharepool readonly area.
Signed-off-by: Chen Jun <chenjun102@huawei.com>

mm/sharepool: Extract sp_mapping_find · 60d69023

Committed by Chen Jun on Nov 03, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5I72Q
CVE: NA

--------------------------------

Extract the logic of obtaining an sp_mapping by address into a function,
sp_mapping_find.
Signed-off-by: Chen Jun <chenjun102@huawei.com>

mm/sharepool: replace spg->{dvpp|normal} with spg->mapping[SP_MAPPING_{DVPP|NORMAL}] · 91bc1d52

Committed by Chen Jun on Nov 03, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5I72Q
CVE: NA

--------------------------------

spg->dvpp and spg->normal can be combined into one array.
Signed-off-by: Chen Jun <chenjun102@huawei.com>

mm/sharepool: Rename sp_mapping.flag to sp_mapping.type · ef12ea35

Committed by Chen Jun on Nov 03, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5I72Q
CVE: NA

--------------------------------

Now, sp_mapping.flag is only used to distinguish sp_mapping types.
So 'type' is a more suitable name.
Signed-off-by: Chen Jun <chenjun102@huawei.com>

mm/sharepool: Make the definitions of MMAP_SHARE_POOL_{START|16G_START} more readable · 14cd3fb0

Committed by Chen Jun on Nov 03, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5I72Q
CVE: NA

--------------------------------

"TASK_SIZE - MMAP_SHARE_POOL_DVPP_SIZE" is puzzling.

MMAP_SHARE_POOL_START = MMAP_SHARE_POOL_END - MMAP_SHARE_POOL_SIZE and
MMAP_SHARE_POOL_16G_START = MMAP_SHARE_POOL_END - MMAP_SHARE_POOL_DVPP_SIZE
make the memory layout more intuitive.
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>

mm/sharepool: Avoid UAF on mm · a151f824

Committed by Zhou Guanghui on Nov 03, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5PIA6
CVE: NA

--------------------------------

Use get_task_mm to prevent the mm from being released while the
information in mm_struct is being used.
Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>

mm/sharepool: Check the maximum value of spg_id · 99b7756c

Committed by Zhou Guanghui on Nov 03, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5PIA4
CVE: NA

--------------------------------

The maximum value of spg_id is checked to ensure that the value
of spg_id is within the valid range:
SPG_ID_DEFAULT or [SPG_ID_MIN, SPG_ID_AUTO)
Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
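
The valid range stated in the commit above can be written as a small predicate. The constant names come from the commit message, but the numeric values below are hypothetical placeholders, and `spg_id_valid` is an illustrative name rather than the kernel's function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical values; the real constants are defined by the kernel source. */
#define SPG_ID_DEFAULT 0
#define SPG_ID_MIN     1
#define SPG_ID_AUTO    100000

/* Valid ids: SPG_ID_DEFAULT, or any id in [SPG_ID_MIN, SPG_ID_AUTO). */
static bool spg_id_valid(int spg_id)
{
    if (spg_id == SPG_ID_DEFAULT)
        return true;
    return spg_id >= SPG_ID_MIN && spg_id < SPG_ID_AUTO;
}
```

The upper bound is exclusive, so SPG_ID_AUTO itself is rejected by this predicate.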

mm/sharepool: Avoid UAF on spa · 27d0e771

Committed by Zhou Guanghui on Nov 03, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5PIA0
CVE: NA

--------------------------------

The spa is used during update_mem_usage, but it may already have been
released by a concurrent mg_sp_unshare.
Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>

mm/sharepool: delete unnecessary judgment · 142bfed2

Committed by Zhou Guanghui on Nov 03, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5PIA2
CVE: NA

--------------------------------

When a process is added to a group, mm->mm_users increases by one.
When a process is deleted from a group, mm->mm_users decreases by
one. It cannot drop to 0 at that point because this function is
preceded by get_task_mm.
Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>

mm/sharepool: Fix UAF reported by KASAN · 19896d2c

Committed by Wang Wensheng on Nov 03, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5PD4P
CVE: NA

--------------------------------

[ 2058.802818][  T290] BUG: KASAN: use-after-free in get_process_sp_res+0x70/0x134
[ 2058.810194][  T290] Read of size 8 at addr ffff00088dc6ab28 by task test_debug_loop/290
[ 2058.820520][  T290] CPU: 5 PID: 290 Comm: test_debug_loop Tainted: G        W  OE     5.10.0+ #2
[ 2058.829377][  T290] Hardware name: EVB(EP) (DT)
[ 2058.833982][  T290] Call trace:
[ 2058.837217][  T290]  dump_backtrace+0x0/0x30c
[ 2058.841660][  T290]  show_stack+0x20/0x30
[ 2058.845758][  T290]  dump_stack+0x120/0x1b0
[ 2058.850028][  T290]  print_address_description.constprop.0+0x2c/0x1fc
[ 2058.856555][  T290]  __kasan_report+0xfc/0x160
[ 2058.861086][  T290]  kasan_report+0x44/0xb0
[ 2058.865356][  T290]  __asan_load8+0x94/0xd0
[ 2058.869623][  T290]  get_process_sp_res+0x70/0x134
[ 2058.874501][  T290]  proc_usage_show+0x1ac/0x304
[ 2058.879208][  T290]  seq_read_iter+0x254/0x750
[ 2058.883728][  T290]  proc_reg_read_iter+0x100/0x140
[ 2058.888689][  T290]  new_sync_read+0x1cc/0x2c0
[ 2058.893215][  T290]  vfs_read+0x1f4/0x250
[ 2058.897304][  T290]  ksys_read+0xcc/0x170
[ 2058.901399][  T290]  __arm64_sys_read+0x4c/0x60
[ 2058.906016][  T290]  el0_svc_common.constprop.0+0xb4/0x2a0
[ 2058.911584][  T290]  do_el0_svc+0x8c/0xb0
[ 2058.915677][  T290]  el0_svc+0x20/0x30
[ 2058.919503][  T290]  el0_sync_handler+0xb0/0xbc
[ 2058.924114][  T290]  el0_sync+0x180/0x1c0
[ 2058.928190][  T290]
[ 2058.930444][  T290] Allocated by task 2176:
[ 2058.934714][  T290]  kasan_save_stack+0x28/0x60
[ 2058.939328][  T290]  __kasan_kmalloc.constprop.0+0xc8/0xf0
[ 2058.944909][  T290]  kasan_kmalloc+0x10/0x20
[ 2058.949268][  T290]  kmem_cache_alloc_trace+0x128/0xabc
[ 2058.954577][  T290]  create_spg_node+0x58/0x214
[ 2058.959188][  T290]  local_group_add_task+0x30/0x14c
[ 2058.964231][  T290]  init_local_group+0xd0/0x1a0
[ 2058.968936][  T290]  sp_init_group_master_locked.part.0+0x19c/0x290
[ 2058.975298][  T290]  mg_sp_group_add_task+0x73c/0xdb0
[ 2058.980456][  T290]  dev_sp_add_group+0x124/0x2dc [sharepool_dev]
[ 2058.986647][  T290]  dev_ioctl+0x21c/0x2ec [sharepool_dev]
[ 2058.992222][  T290]  __arm64_sys_ioctl+0xd8/0x120
[ 2058.997010][  T290]  el0_svc_common.constprop.0+0xb4/0x2a0
[ 2059.002572][  T290]  do_el0_svc+0x8c/0xb0
[ 2059.006662][  T290]  el0_svc+0x20/0x30
[ 2059.010489][  T290]  el0_sync_handler+0xb0/0xbc
[ 2059.015101][  T290]  el0_sync+0x180/0x1c0
[ 2059.019176][  T290]
[ 2059.021427][  T290] Freed by task 4125:
[ 2059.025343][  T290]  kasan_save_stack+0x28/0x60
[ 2059.029949][  T290]  kasan_set_track+0x28/0x40
[ 2059.034476][  T290]  kasan_set_free_info+0x24/0x50
[ 2059.039347][  T290]  __kasan_slab_free+0x104/0x1ac
[ 2059.044227][  T290]  kasan_slab_free+0x14/0x20
[ 2059.048744][  T290]  kfree+0x164/0xb94
[ 2059.052576][  T290]  sp_group_post_exit+0xf0/0x980
[ 2059.057448][  T290]  mmput.part.0+0xb4/0x220
[ 2059.061790][  T290]  mmput+0x2c/0x40
[ 2059.065450][  T290]  exit_mm+0x27c/0x3a0
[ 2059.069450][  T290]  do_exit+0x2a0/0x790
[ 2059.073448][  T290]  do_group_exit+0x64/0x100
[ 2059.077884][  T290]  get_signal+0x1fc/0x9fc
[ 2059.082144][  T290]  do_signal+0x110/0x2cc
[ 2059.086320][  T290]  do_notify_resume+0x158/0x2b0
[ 2059.091108][  T290]  work_pending+0xc/0x6d4
[ 2059.095358][  T290]
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>

mm/sharepool: fix deadlock in sp_check_mmap_addr · 78c82ea5

Committed by Guo Mengqi on Nov 03, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OE1J
CVE: NA

--------------------------------

Fix a deadlock indicated below:

[  171.669844] Chain exists of:
[  171.669844]   &mm->mmap_lock --> sp_group_sem --> &spg->rw_lock
[  171.669844]
[  171.671469]  Possible unsafe locking scenario:
[  171.671469]
[  171.672121]        CPU0                    CPU1
[  171.672415]        ----                    ----
[  171.672706]   lock(&spg->rw_lock);
[  171.673114]                                lock(sp_group_sem);
[  171.673706]                                lock(&spg->rw_lock);
[  171.674208]   lock(&mm->mmap_lock);
[  171.674863]
[  171.674863]  *** DEADLOCK ***

sharepool takes locks in this order:
sp_group_sem --> &spg->rw_lock --> mm->mmap_lock
However, sp_check_mmap_addr() requests sp_group_sem while mm->mmap_lock
is held, which gives the order mm->mmap_lock --> sp_group_sem.
This causes an ABBA problem.

This happens in:

[  171.642687] the existing dependency chain (in reverse order) is:
[  171.643745]
[  171.643745] -> #2 (&spg->rw_lock){++++}-{3:3}:
[  171.644639]        __lock_acquire+0x6f4/0xc40
[  171.645189]        lock_acquire+0x2f0/0x3c8
[  171.645631]        down_read+0x64/0x2d8
[  171.646075]        proc_usage_by_group+0x50/0x258 (spg->rw_lock)
[  171.646542]        idr_for_each+0x6c/0xf0
[  171.647011]        proc_group_usage_show+0x140/0x178
[  171.647629]        seq_read_iter+0xe4/0x498
[  171.648217]        proc_reg_read_iter+0xa8/0xe0
[  171.648776]        new_sync_read+0xfc/0x1a0
[  171.649002]        vfs_read+0x1ac/0x1c8
[  171.649217]        ksys_read+0x74/0xf8
[  171.649596]        __arm64_sys_read+0x24/0x30
[  171.649934]        el0_svc_common.constprop.0+0x8c/0x270
[  171.650528]        do_el0_svc+0x34/0xb8
[  171.651069]        el0_svc+0x1c/0x28
[  171.651278]        el0_sync_handler+0x8c/0xb0
[  171.651636]        el0_sync+0x168/0x180
[  171.652118]
[  171.652118] -> #1 (sp_group_sem){++++}-{3:3}:
[  171.652692]        __lock_acquire+0x6f4/0xc40
[  171.653059]        lock_acquire+0x2f0/0x3c8
[  171.653303]        down_read+0x64/0x2d8
[  171.653704]        mg_is_sharepool_addr+0x184/0x340 (&sp_group_sem)
[  171.654085]        sp_check_mmap_addr+0x64/0x108
[  171.654668]        arch_get_unmapped_area_topdown+0x9c/0x528
[  171.655370]        thp_get_unmapped_area+0x54/0x68
[  171.656170]        get_unmapped_area+0x94/0x160
[  171.656415]        __do_mmap_mm+0xd4/0x540
[  171.656629]        do_mmap+0x98/0x648
[  171.656838]        vm_mmap_pgoff+0xc0/0x188
[  171.657129]        vm_mmap+0x6c/0x98
[  171.657619]        elf_map+0xe0/0x118
[  171.657835]        load_elf_binary+0x4ec/0xfd8
[  171.658103]        bprm_execve.part.9+0x3ec/0x840
[  171.658448]        bprm_execve+0x7c/0xb0
[  171.658919]        kernel_execve+0x18c/0x198
[  171.659500]        run_init_process+0xf0/0x108
[  171.660073]        try_to_run_init_process+0x20/0x58
[  171.660558]        kernel_init+0xcc/0x120
[  171.660862]        ret_from_fork+0x10/0x18
[  171.661273]
[  171.661273] -> #0 (&mm->mmap_lock){++++}-{3:3}:
[  171.661885]        check_prev_add+0xa4/0xbd8
[  171.662229]        validate_chain+0xf54/0x14b8
[  171.662705]        __lock_acquire+0x6f4/0xc40
[  171.663310]        lock_acquire+0x2f0/0x3c8
[  171.663658]        down_write+0x60/0x208
[  171.664179]        mg_sp_alloc+0x24c/0x1150 (mm->mmap_lock)
[  171.665245]        dev_ioctl+0x1128/0x1fb8 [sharepool_dev]
[  171.665688]        __arm64_sys_ioctl+0xb0/0xe8
[  171.666250]        el0_svc_common.constprop.0+0x8c/0x270
[  171.667255]        do_el0_svc+0x34/0xb8
[  171.667806]        el0_svc+0x1c/0x28
[  171.668249]        el0_sync_handler+0x8c/0xb0
[  171.668661]        el0_sync+0x168/0x180
Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>

mm/sharepool: fix deadlock in spa_stat_of_mapping_show · 608669b7

Committed by Guo Mengqi on Nov 03, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OE1J
CVE: NA

--------------------------------

The mutex protecting spm_dvpp_list has an ABBA deadlock with
spg->rw_lock. Adding a process to a sharepool group while running
cat /proc/sharepool/spa_stat at the same time reproduces the
problem.

Remove spg->rw_lock to avoid this.

[ 1101.013480]INFO: task test:3567 blocked for more than 30 seconds.
[ 1101.014378]      Tainted: G           OE     5.10.0+ #45
[ 1101.015707]task:test state:D stack:    0 pid: 3567
[ 1101.016464]Call trace:
[ 1101.016736] __switch_to+0xc0/0x128
[ 1101.017082] __schedule+0x3fc/0x898
[ 1101.017626] schedule+0x48/0xd8
[ 1101.017981] schedule_preempt_disabled+0x14/0x20
[ 1101.018519] __mutex_lock.isra.1+0x160/0x638
[ 1101.018899] __mutex_lock_slowpath+0x24/0x30
[ 1101.019291] mutex_lock+0x5c/0x68
[ 1101.019607] sp_mapping_create+0x118/0x1b0
[ 1101.019963] sp_init_group_master_locked.part.9+0x10c/0x288
[ 1101.020356] mg_sp_group_add_task.part.16+0x7dc/0xcd0
[ 1101.020750] mg_sp_group_add_task+0x54/0xd0
[ 1101.021120] dev_ioctl+0x360/0x1e20 [sharepool_dev]
[ 1101.022171] __arm64_sys_ioctl+0xb0/0xe8
[ 1101.022695] el0_svc_common.constprop.0+0x88/0x268
[ 1101.023143] do_el0_svc+0x34/0xb8
[ 1101.023487] el0_svc+0x1c/0x28
[ 1101.023775] el0_sync_handler+0x8c/0xb0
[ 1101.024120] el0_sync+0x168/0x180
Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>
