06 Jul 2022 (40 commits)

Submitted by Yang Jihong

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Add klp_module_delete_safety_check to run a calltrace check during module deletion and avoid unsafe resource release.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Yang Jihong

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Add arch_klp_module_check_calltrace to check whether the stacks of all tasks are within the code segment of the module.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

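A rough sketch of what such a per-arch check can look like, assuming the generic stack_trace_save_tsk() and within_module_core() helpers; the function name and the fixed-depth buffer are illustrative and not the openEuler implementation:

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/sched/signal.h>
#include <linux/stacktrace.h>

/* Illustrative only: reject unsafe module deletion if any task's saved
 * return addresses still point into the module's code segment. */
static int check_module_calltrace(struct module *mod)
{
        struct task_struct *g, *t;
        unsigned long entries[32];
        unsigned int i, nr;

        rcu_read_lock();
        for_each_process_thread(g, t) {
                nr = stack_trace_save_tsk(t, entries, ARRAY_SIZE(entries), 0);
                for (i = 0; i < nr; i++) {
                        if (within_module_core(entries[i], mod)) {
                                rcu_read_unlock();
                                pr_err("task %s is running in module %s\n",
                                       t->comm, mod->name);
                                return -EBUSY;
                        }
                }
        }
        rcu_read_unlock();
        return 0;
}
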
Submitted by Yang Jihong

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Split the calltrace check code into an independent do_check_calltrace helper so that it can also be used for the module calltrace check. No functional change.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Yang Jihong

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Add arch_klp_module_check_calltrace to check whether the stacks of all tasks are within the code segment of the module.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Yang Jihong

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Split the calltrace check code into an independent do_check_calltrace helper so that it can also be used for the module calltrace check. No functional change.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Yang Jihong

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Add arch_klp_module_check_calltrace to check whether the stacks of all tasks are within the code segment of the module.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Yang Jihong

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Split the calltrace check code into an independent do_check_calltrace helper so that it can also be used for the module calltrace check. No functional change.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Yang Jihong

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Add arch_klp_module_check_calltrace to check whether the stacks of all tasks are within the code segment of the module.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Yang Jihong

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Split the calltrace check code into an independent do_check_calltrace helper so that it can also be used for the module calltrace check. No functional change.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Yang Jihong

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Add arch_klp_module_check_calltrace to check whether the stacks of all tasks are within the code segment of the module.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Yang Jihong

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Split the calltrace check code into an independent do_check_calltrace helper so that it can also be used for the calltrace check of the klp module. No functional change.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Yang Jihong

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Add breakpoint exception optimization support to improve the livepatch success rate on ppc64/ppc32.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Yang Jihong

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

A trampoline needs to be created before adding a breakpoint on PPC64. Change livepatch_create_btamp to a public function and delete the redundant input parameter "struct module *me".

Also fix an issue where the livepatch branch stub is not created when the address of the modified function is a branch function.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Yang Jihong

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Add breakpoint exception optimization support to improve the livepatch success rate on arm.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Yang Jihong

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Add breakpoint exception optimization support to improve the livepatch success rate on arm64.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Yang Jihong

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

On the ARM architecture, the callback for handling the BRK exception needs to be registered in advance, so provide an architecture-specific init interface.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Li Huafei

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Implement the arch_klp_{check,add,remove}_breakpoint interfaces to support breakpoint exception optimization.

Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Li Huafei

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Commit 86e35fae ("livepatch: checks only if the replaced instruction is on the stack") optimizes stack checking. However, for extremely hot functions the replaced instruction may still be on the stack, so there is room for further optimization.

By inserting a breakpoint exception instruction at the entry of the patched old function, we can divert calls from the old function to the new function. During the stack check, only tasks that entered the old function before the breakpoint was inserted then need to be considered, which increases the probability of passing the check. If the stack check fails, we sleep for a period of time and try again, giving tasks inside the old function a chance to run out of the instruction replacement area.

We first enable the patch using the normal process, that is, without inserting breakpoints. If the first enable fails and the force flag KLP_STACK_OPTIMIZE is set for all functions of the patch, we fall back to breakpoint exception optimization.

Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

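As a rough illustration of the flow described above (helper names such as klp_try_enable_patch(), klp_add_breakpoints(), KLP_RETRY_COUNT and KLP_RETRY_INTERVAL_MS are made up for this sketch, not the real openEuler symbols):

#include <linux/delay.h>
#include <linux/livepatch.h>

#define KLP_RETRY_COUNT         5       /* assumed values for the sketch */
#define KLP_RETRY_INTERVAL_MS   10

/* Hypothetical helpers standing in for the real enable/breakpoint code. */
int klp_try_enable_patch(struct klp_patch *patch);
bool klp_all_funcs_stack_optimized(struct klp_patch *patch);
int klp_add_breakpoints(struct klp_patch *patch);
void klp_remove_breakpoints(struct klp_patch *patch);

static int klp_enable_with_breakpoint_retry(struct klp_patch *patch)
{
        int ret, i;

        /* First try the normal path: no breakpoints, full stack check. */
        ret = klp_try_enable_patch(patch);
        if (!ret || !klp_all_funcs_stack_optimized(patch))
                return ret;

        /* Divert new callers of the old functions via a breakpoint. */
        ret = klp_add_breakpoints(patch);
        if (ret)
                return ret;

        for (i = 0; i < KLP_RETRY_COUNT; i++) {
                /* Only tasks already inside an old function can fail now. */
                ret = klp_try_enable_patch(patch);
                if (ret != -EBUSY)
                        break;
                /* Let those tasks run out of the replacement area. */
                msleep(KLP_RETRY_INTERVAL_MS);
        }

        klp_remove_breakpoints(patch);
        return ret;
}
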
Submitted by Li Huafei

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

klp_find_func_node() is used to traverse the klp_func_list linked list and is currently called only with the klp_mutex lock held. A subsequent patch needs to access klp_func_list from the exception handling path, where klp_mutex cannot be taken. Switch the traversal of klp_func_list to the RCU interface and perform RCU synchronization when deleting nodes.

Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

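A minimal sketch of the resulting pattern, assuming a klp_func_node with a list_head member named "node" (the structure layout here is illustrative): readers in exception context traverse under rcu_read_lock(), while writers still serialize on klp_mutex and synchronize before freeing.

#include <linux/mutex.h>
#include <linux/rculist.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

static DEFINE_MUTEX(klp_mutex);
static LIST_HEAD(klp_func_list);

struct klp_func_node {
        struct list_head node;
        void *old_func;
        /* ... */
};

static struct klp_func_node *klp_find_func_node(const void *old_func)
{
        struct klp_func_node *func_node;

        /* Callers hold either klp_mutex or rcu_read_lock(). */
        list_for_each_entry_rcu(func_node, &klp_func_list, node) {
                if (func_node->old_func == old_func)
                        return func_node;
        }
        return NULL;
}

static void klp_del_func_node(struct klp_func_node *func_node)
{
        mutex_lock(&klp_mutex);
        list_del_rcu(&func_node->node);
        mutex_unlock(&klp_mutex);

        synchronize_rcu();      /* wait for exception-context readers */
        kfree(func_node);
}
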
Submitted by Li Huafei

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Delete the duplicated klp_compare_address() code in each arch.

Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Li Huafei

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

Currently, arch_klp_code_modify_{prepare, post_process} is implemented only on x86. It is used to take the 'text_mutex' lock before entering stop_machine to modify code, and to release the lock afterwards. klp_mem_prepare() needs to hold 'text_mutex' only while saving the old instruction code on x86, to ensure that it reads valid instructions. Move klp_mem_prepare() before arch_klp_code_modify_prepare() and take the lock separately around the instruction-saving step, narrowing the 'text_mutex' critical section.

Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

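A sketch of the narrowed locking, under the assumption that only the old-instruction snapshot needs text_mutex outside the arch hooks; klp_mem_prepare() here is simplified and klp_try_enable_patch_fn() is an illustrative name, while text_mutex, stop_machine() and the arch hooks are the real interfaces:

#include <linux/livepatch.h>
#include <linux/memory.h>       /* text_mutex */
#include <linux/mutex.h>
#include <linux/stop_machine.h>

/* Assumed to be provided by the livepatch core / arch code. */
int klp_mem_prepare(struct klp_patch *patch);
int arch_klp_code_modify_prepare(void);
void arch_klp_code_modify_post_process(void);
int klp_try_enable_patch_fn(void *data);

static int klp_enable_patch_sketch(struct klp_patch *patch)
{
        int ret;

        /* Take text_mutex only around saving the old instructions. */
        mutex_lock(&text_mutex);
        ret = klp_mem_prepare(patch);
        mutex_unlock(&text_mutex);
        if (ret)
                return ret;

        /* On x86 this takes text_mutex again around the code patching. */
        ret = arch_klp_code_modify_prepare();
        if (ret)
                return ret;

        ret = stop_machine(klp_try_enable_patch_fn, patch, cpu_online_mask);

        arch_klp_code_modify_post_process();
        return ret;
}
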
Submitted by Li Huafei

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CJ7X

--------------------------------

When klp_mem_prepare() fails, the resources it has already requested are not released. Clean up each newly requested resource on the error return path.

Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

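The fix follows the usual kernel error-unwinding idiom; the sketch below is illustrative only (the func_node field is assumed from the openEuler livepatch code and may not match it exactly):

#include <linux/livepatch.h>
#include <linux/slab.h>

static int klp_mem_prepare_sketch(struct klp_patch *patch)
{
        struct klp_object *obj;
        struct klp_func *func;

        klp_for_each_object(patch, obj) {
                klp_for_each_func(obj, func) {
                        func->func_node = kzalloc(sizeof(*func->func_node),
                                                  GFP_KERNEL);
                        if (!func->func_node)
                                goto out_free;
                }
        }
        return 0;

out_free:
        /* Release every node allocated before the failure. */
        klp_for_each_object(patch, obj) {
                klp_for_each_func(obj, func) {
                        kfree(func->func_node);
                        func->func_node = NULL;
                }
        }
        return -ENOMEM;
}
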
Submitted by luhuaxin

euleros inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5ETJZ
CVE: NA

--------

openEuler OpenSSL now supports SM certificates. The key type should be set to EVP_PKEY_SM2 before use.

Signed-off-by: luhuaxin <luhuaxin1@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Hyeonggon Yoo

mainline inclusion
from mainline-v5.18-rc7
commit 2839b099
category: bugfix
bugzilla: 187071, https://gitee.com/openeuler/kernel/issues/I5DLA7
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2839b0999c20c9f6bf353849c69370e121e2fa1a

--------------------------------

When kfence fails to initialize kfence pool, it frees the pool. But it does not reset memcg_data and PG_slab flag. Below is a BUG because of this. Let's fix it by resetting memcg_data and PG_slab flag before free.

[ 0.089149] BUG: Bad page state in process swapper/0 pfn:3d8e06
[ 0.089149] page:ffffea46cf638180 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x3d8e06
[ 0.089150] memcg:ffffffff94a475d1
[ 0.089150] flags: 0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff)
[ 0.089151] raw: 0017ffffc0000200 ffffea46cf638188 ffffea46cf638188 0000000000000000
[ 0.089152] raw: 0000000000000000 0000000000000000 00000000ffffffff ffffffff94a475d1
[ 0.089152] page dumped because: page still charged to cgroup
[ 0.089153] Modules linked in:
[ 0.089153] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B W 5.18.0-rc1+ #965
[ 0.089154] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
[ 0.089154] Call Trace:
[ 0.089155]  <TASK>
[ 0.089155]  dump_stack_lvl+0x49/0x5f
[ 0.089157]  dump_stack+0x10/0x12
[ 0.089158]  bad_page.cold+0x63/0x94
[ 0.089159]  check_free_page_bad+0x66/0x70
[ 0.089160]  __free_pages_ok+0x423/0x530
[ 0.089161]  __free_pages_core+0x8e/0xa0
[ 0.089162]  memblock_free_pages+0x10/0x12
[ 0.089164]  memblock_free_late+0x8f/0xb9
[ 0.089165]  kfence_init+0x68/0x92
[ 0.089166]  start_kernel+0x789/0x992
[ 0.089167]  x86_64_start_reservations+0x24/0x26
[ 0.089168]  x86_64_start_kernel+0xa9/0xaf
[ 0.089170]  secondary_startup_64_no_verify+0xd5/0xdb
[ 0.089171]  </TASK>

Link: https://lkml.kernel.org/r/YnPG3pQrqfcgOlVa@hyeyoo
Fixes: 0ce20dd8 ("mm: add Kernel Electric-Fence infrastructure")
Fixes: 8f0b3649 ("mm: kfence: fix objcgs vector allocation")
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Conflicts:
	mm/kfence/core.c

Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Muchun Song

mainline inclusion
from mainline-v5.18-rc1
commit 8f0b3649
category: bugfix
bugzilla: 187071, https://gitee.com/openeuler/kernel/issues/I5DLA7
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8f0b36497303487d5a32c75789c77859cc2ee895

--------------------------------

If a kfence object is allocated to be used for the objects vector, then that slot of the pool ends up occupied permanently, since the vector is never freed. The solutions could be (1) freeing the vector when the kfence object is freed or (2) allocating all vectors statically. Since the memory consumption of object vectors is low, it is better to choose (2) to fix the issue, and it can also reduce the overhead of vector allocation in the future.

Link: https://lkml.kernel.org/r/20220328132843.16624-1-songmuchun@bytedance.com
Fixes: d3fb45f3 ("mm, kfence: insert KFENCE hooks for SLAB")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Conflicts:
	mm/kfence/core.c

Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Jackie Liu

mainline inclusion
from mainline-v5.19-rc1
commit 83d7d04f
category: bugfix
bugzilla: 187071, https://gitee.com/openeuler/kernel/issues/I5DLA7
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=83d7d04f9d2ef354858b2a8444aee38e41ec1699

--------------------------------

Print information on KFENCE state changes so that they show up in dmesg and can be recorded by syslog. Also, set kfence_enabled to false only when needed.

Link: https://lkml.kernel.org/r/20220518073105.3160335-1-liu.yun@linux.dev
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Co-developed-by: Marco Elver <elver@google.com>
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Conflicts:
	mm/kfence/core.c

Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by huangshaobo

mainline inclusion
from mainline-v5.19-rc1
commit 3c81b3bb
category: bugfix
bugzilla: 187071, https://gitee.com/openeuler/kernel/issues/I5DLA7
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3c81b3bb0a33e2b555edb8d7eb99a7ae4f17d8bb

--------------------------------

Out-of-bounds accesses that aren't caught by a guard page will result in corruption of canary memory. In pathological cases, where an object has certain alignment requirements, an out-of-bounds access might never be caught by the guard page. Such corruptions, however, are only detected on kfree() normally. If the bug causes the kernel to panic before kfree(), KFENCE has no opportunity to report the issue. Such corruptions may also indicate failing memory or other faults.

To provide some more information in such cases, add the option to check canary bytes on panic. This might help narrow the search for the panic cause; but, due to only having the allocation stack trace, such reports are difficult to use to diagnose an issue alone. In most cases, such reports are inactionable, and this is therefore an opt-in feature (disabled by default).

[akpm@linux-foundation.org: add __read_mostly, per Marco]
Link: https://lkml.kernel.org/r/20220425022456.44300-1-huangshaobo6@huawei.com
Signed-off-by: huangshaobo <huangshaobo6@huawei.com>
Suggested-by: chenzefeng <chenzefeng2@huawei.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Xiaoming Ni <nixiaoming@huawei.com>
Cc: Wangbing <wangbing6@huawei.com>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Conflicts:
	mm/kfence/core.c

Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

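The upstream change boils down to an opt-in panic notifier; a hedged sketch is shown below, with kfence_check_all_canaries() as an assumed helper name (the notifier-chain API itself is the standard kernel one):

#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/notifier.h>

void kfence_check_all_canaries(void);   /* assumed: walks pool, reports corruption */

static bool kfence_check_on_panic __read_mostly;        /* opt-in, default off */

static int kfence_check_canary_callback(struct notifier_block *nb,
                                        unsigned long reason, void *arg)
{
        if (kfence_check_on_panic)
                kfence_check_all_canaries();
        return NOTIFY_OK;
}

static struct notifier_block kfence_check_canary_notifier = {
        .notifier_call = kfence_check_canary_callback,
};

static void __init kfence_register_check_canary_notifier(void)
{
        atomic_notifier_chain_register(&panic_notifier_list,
                                       &kfence_check_canary_notifier);
}
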
Submitted by Peng Liu

mainline inclusion
from mainline-v5.18-rc1
commit 3cb1c962
category: bugfix
bugzilla: 187071, https://gitee.com/openeuler/kernel/issues/I5DLA7
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3cb1c9620eeeb67c614c0732a35861b0b1efdc53

--------------------------------

When CONFIG_KFENCE_NUM_OBJECTS is set to a big number, kfence kunit-test-case test_gfpzero will eat up nearly all the CPU's resources and rcu_stall is reported as the following log which is cut from a physical server.

rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 68-....: (14422 ticks this GP) idle=6ce/1/0x4000000000000002 softirq=592/592 fqs=7500 (t=15004 jiffies g=10677 q=20019)
Task dump for CPU 68:
task:kunit_try_catch state:R running task
stack: 0 pid: 9728 ppid: 2 flags:0x0000020a
Call trace:
 dump_backtrace+0x0/0x1e4
 show_stack+0x20/0x2c
 sched_show_task+0x148/0x170
 ...
 rcu_sched_clock_irq+0x70/0x180
 update_process_times+0x68/0xb0
 tick_sched_handle+0x38/0x74
 ...
 gic_handle_irq+0x78/0x2c0
 el1_irq+0xb8/0x140
 kfree+0xd8/0x53c
 test_alloc+0x264/0x310 [kfence_test]
 test_gfpzero+0xf4/0x840 [kfence_test]
 kunit_try_run_case+0x48/0x20c
 kunit_generic_run_threadfn_adapter+0x28/0x34
 kthread+0x108/0x13c
 ret_from_fork+0x10/0x18

To avoid rcu_stall and unacceptable latency, a schedule point is added to test_gfpzero.

Link: https://lkml.kernel.org/r/20220309083753.1561921-4-liupeng256@huawei.com
Signed-off-by: Peng Liu <liupeng256@huawei.com>
Reviewed-by: Marco Elver <elver@google.com>
Tested-by: Brendan Higgins <brendanhiggins@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Wang Kefeng <wangkefeng.wang@huawei.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

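The gist of the fix is a cond_resched() inside the test's allocation loop; the sketch below simplifies the real test_gfpzero() in mm/kfence/kfence_test.c, and test_alloc()/test_free()/ALLOCATE_ANY stand in for the test helpers there:

#include <kunit/test.h>
#include <linux/kthread.h>
#include <linux/sched.h>
#include <linux/slab.h>

static void test_gfpzero_sketch(struct kunit *test)
{
        const size_t size = 32;
        char *buf1, *buf2;

        buf1 = test_alloc(test, size, GFP_KERNEL, ALLOCATE_ANY);
        test_free(buf1);

        /* Keep allocating until the same KFENCE slot comes back. */
        for (;;) {
                buf2 = test_alloc(test, size, GFP_KERNEL, ALLOCATE_ANY);
                if (buf2 == buf1)
                        break;
                test_free(buf2);

                if (kthread_should_stop()) {
                        kunit_warn(test, "giving up, cannot get same object back\n");
                        return;
                }
                cond_resched();         /* avoid RCU stalls with huge object counts */
        }

        /* ... the real test then checks that the slot was zeroed ... */
}
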
Submitted by Peng Liu

mainline inclusion
from mainline-v5.18-rc1
commit adf50545
category: bugfix
bugzilla: 187071, https://gitee.com/openeuler/kernel/issues/I5DLA7
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=adf505457032c11b79b5a7c277c62ff5d61b17c2

--------------------------------

Patch series "kunit: fix a UAF bug and do some optimization", v2.

This series is to fix UAF (use after free) when running kfence test case test_gfpzero, which is time costly. This UAF bug can be easily triggered by setting CONFIG_KFENCE_NUM_OBJECTS = 65535. Furthermore, some optimization for kunit tests has been done.

This patch (of 3):

Kunit will create a new thread to run an actual test case, and the main process will wait for the completion of the actual test thread until overtime. The variable "struct kunit test" has local property in function kunit_try_catch_run, and will be used in the test case thread. Task kunit_try_catch_run will free "struct kunit test" when kunit runs overtime, but the actual test case is still run and an UAF bug will be triggered.

The above problem has been both observed in a physical machine and qemu platform when running kfence kunit tests. The problem can be triggered when setting CONFIG_KFENCE_NUM_OBJECTS = 65535. Under this setting, the test case test_gfpzero will cost hours and kunit will run to overtime. The follows show the panic log.

BUG: unable to handle page fault for address: ffffffff82d882e9
Call Trace:
 kunit_log_append+0x58/0xd0
 ...
 test_alloc.constprop.0.cold+0x6b/0x8a [kfence_test]
 test_gfpzero.cold+0x61/0x8ab [kfence_test]
 kunit_try_run_case+0x4c/0x70
 kunit_generic_run_threadfn_adapter+0x11/0x20
 kthread+0x166/0x190
 ret_from_fork+0x22/0x30
Kernel panic - not syncing: Fatal exception
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014

To solve this problem, the test case thread should be stopped when the kunit frame runs overtime. The stop signal will send in function kunit_try_catch_run, and test_gfpzero will handle it.

Link: https://lkml.kernel.org/r/20220309083753.1561921-1-liupeng256@huawei.com
Link: https://lkml.kernel.org/r/20220309083753.1561921-2-liupeng256@huawei.com
Signed-off-by: Peng Liu <liupeng256@huawei.com>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Tested-by: Brendan Higgins <brendanhiggins@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Wang Kefeng <wangkefeng.wang@huawei.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Conflicts:
	mm/kfence/kfence_test.c

Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Liu Shixin

hulk inclusion
category: feature
bugzilla: 187071, https://gitee.com/openeuler/kernel/issues/I5DLA7

--------------------------------

KFENCE requires the linear map to be mapped at page granularity, and this must be done very early. To save page-table memory, arm64 maps only the pages in the KFENCE pool itself at page granularity, so the kfence pool cannot be allocated from the buddy system.

For the flexibility of KFENCE, extend sample_interval to control whether enabling kfence after system startup (re-enabling) is supported. Once sample_interval is set to -1 on arm64, memory for the kfence pool is allocated from early memory no matter whether KFENCE is enabled or not.

Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Liu Shixin

hulk inclusion
category: feature
bugzilla: 187071, https://gitee.com/openeuler/kernel/issues/I5DLA7

--------------------------------

Since re-enabling KFENCE is supported, make it compatible with dynamically configured objects.

Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Tianchen Ding

mainline inclusion
from mainline-v5.18-rc1
commit b33f778b
category: feature
bugzilla: 187071, https://gitee.com/openeuler/kernel/issues/I5DLA7
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b33f778bba5ef3f76fe6708c611346c1ea03acd4

--------------------------------

Allow enabling KFENCE after system startup by allocating its pool via the page allocator. This provides the flexibility to enable KFENCE even if it wasn't enabled at boot time.

Link: https://lkml.kernel.org/r/20220307074516.6920-3-dtcccc@linux.alibaba.com
Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
Reviewed-by: Marco Elver <elver@google.com>
Tested-by: Peng Liu <liupeng256@huawei.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Conflicts:
	mm/kfence/core.c

Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Tianchen Ding

mainline inclusion
from mainline-v5.18-rc1
commit 698361bc
category: feature
bugzilla: 187071, https://gitee.com/openeuler/kernel/issues/I5DLA7
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=698361bca2d59fd29d46c757163854454df477f1

--------------------------------

Patch series "provide the flexibility to enable KFENCE", v3.

If CONFIG_CONTIG_ALLOC is not supported, we fall back to alloc_pages_exact(). Allocating pages in this way is limited by MAX_ORDER (default 11), so we will not support allocating the kfence pool after system startup with a large KFENCE_NUM_OBJECTS. When handling failures in kfence_init_pool_late(), we pair free_pages_exact() with alloc_pages_exact() for compatibility, though it actually does the same as free_contig_range().

This patch (of 2):

If KFENCE is once disabled by:

  echo 0 > /sys/module/kfence/parameters/sample_interval

it can never be re-enabled until the next reboot. Allow re-enabling it by writing a positive number to sample_interval.

Link: https://lkml.kernel.org/r/20220307074516.6920-1-dtcccc@linux.alibaba.com
Link: https://lkml.kernel.org/r/20220307074516.6920-2-dtcccc@linux.alibaba.com
Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

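Seen from the module-parameter side, the behaviour described above amounts to a custom setter; the sketch below is only loosely modelled on mm/kfence/core.c, and kfence_enable_late()/kfence_enabled are assumed names rather than a verbatim copy:

#include <linux/kernel.h>
#include <linux/moduleparam.h>

static unsigned long kfence_sample_interval __read_mostly = 100;
static bool kfence_enabled;
int kfence_enable_late(void);   /* assumed late-init path: allocate pool, arm timer */

static int param_set_sample_interval(const char *val,
                                     const struct kernel_param *kp)
{
        unsigned long num;
        int ret = kstrtoul(val, 0, &num);

        if (ret < 0)
                return ret;

        /* 0 disables sampling; a positive value (re-)enables KFENCE. */
        *((unsigned long *)kp->arg) = num;

        if (num && system_state != SYSTEM_BOOTING && !READ_ONCE(kfence_enabled))
                return kfence_enable_late();
        return 0;
}

static const struct kernel_param_ops sample_interval_param_ops = {
        .set = param_set_sample_interval,
        .get = param_get_ulong,
};
module_param_cb(sample_interval, &sample_interval_param_ops,
                &kfence_sample_interval, 0600);
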
Submitted by Oscar Salvador

mainline inclusion
from mainline-v5.11-rc1
commit 32409cba
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5E2IG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=32409cba3f66810626c1c15b728c31968d6bfa92

--------------------------------

The memory_failure and soft_offline paths now drain pcplists by calling get_hwpoison_page. memory_failure flags the page as HWPoison beforehand, so that the page can no longer go into a pcplist, and soft_offline_page only flags a page as HWPoison if 1) we took the page off a buddy freelist, 2) the page was in use and we migrated it, or 3) it was a clean pagecache page. Because of that, a page can no longer be poisoned while sitting in a pcplist.

Link: https://lkml.kernel.org/r/20201013144447.6706-5-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Oscar Salvador

mainline inclusion
from mainline-v5.11-rc1
commit a8b2c2ce
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5E2IG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a8b2c2ce89d4e01062de69b89cafad97cd0fc01b

--------------------------------

The crux of the matter is that historically we left poisoned pages in the buddy system because we have some checks in place when allocating a page that are gatekeeper for poisoned pages. Unfortunately, we do have other users (e.g: compaction [1]) that scan buddy freelists and try to get a page from there without checking whether the page is HWPoison.

As I stated already, I think it is fundamentally wrong to keep HWPoison pages within the buddy systems, checks in place or not. Let us fix this the same way we did for soft_offline [2], taking the page off the buddy freelist so it is completely unreachable.

Note that this is fairly simple to trigger, as we only need to poison free buddy pages (madvise MADV_HWPOISON) and then run some sort of memory stress system.

Just for a matter of reference, I put a dump_page() in compaction_alloc() to trigger for HWPoison patches:

page:0000000012b2982b refcount:1 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x1d5db
flags: 0xfffffc0800000(hwpoison)
raw: 000fffffc0800000 ffffea00007573c8 ffffc90000857de0 0000000000000000
raw: 0000000000000001 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: compaction_alloc
CPU: 4 PID: 123 Comm: kcompactd0 Tainted: G E 5.9.0-rc2-mm1-1-default+ #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
Call Trace:
 dump_stack+0x6d/0x8b
 compaction_alloc+0xb2/0xc0
 migrate_pages+0x2a6/0x12a0
 compact_zone+0x5eb/0x11c0
 proactive_compact_node+0x89/0xf0
 kcompactd+0x2d0/0x3a0
 kthread+0x118/0x130
 ret_from_fork+0x22/0x30

After that, if e.g: a process faults in the page, it will get killed unexpectedly. Fix it by containing the page immediately.

Besides that, two more changes can be noticed:

* MF_DELAYED no longer suits as we are fixing the issue by containing the page immediately, so it no longer relies on the allocation-time checks to stop HWPoison pages from being handed over again unless they are unpoisoned, so we fixed the situation. Because of that, let us use MF_RECOVERED from now on.

* The second block that handles PageBuddy pages is no longer needed: We call shake_page and then check whether the page is Buddy because shake_page calls drain_all_pages, which sends pcp-pages back to the buddy freelists, so we could have a chance to handle free pages. Currently, get_hwpoison_page already calls drain_all_pages, and we call get_hwpoison_page right before coming here, so we should be on the safe side.

[1] https://lore.kernel.org/linux-mm/20190826104144.GA7849@linux/T/#u
[2] https://patchwork.kernel.org/cover/11792607/

[osalvador@suse.de: take the poisoned subpage off the buddy freelists]
Link: https://lkml.kernel.org/r/20201013144447.6706-4-osalvador@suse.de
Link: https://lkml.kernel.org/r/20201013144447.6706-3-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Conflicts:
	mm/memory-failure.c

Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Oscar Salvador

mainline inclusion
from mainline-v5.11-rc1
commit 17e395b6
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5E2IG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=17e395b60f5b3dea204fcae60c7b38e84a00d87a

--------------------------------

Patch series "HWpoison: further fixes and cleanups", v5.

This patchset includes some more fixes and a cleanup.

Patch#2 and patch#3 are both fixes for taking a HWpoison page off a buddy freelist, since having them there has proved to be bad (see [1] and patch#2's commit log). Patch#3 does the same for hugetlb pages.

[1] https://lkml.org/lkml/2020/9/22/565

This patch (of 4):

A page with 0-refcount and !PageBuddy could perfectly be a pcppage. Currently, we bail out with an error if we encounter such a page, meaning that we do not handle pcppages neither from hard-offline nor from soft-offline path.

Fix this by draining pcplists whenever we find this kind of page and retry the check again. It might be that pcplists have been spilled into the buddy allocator and so we can handle it.

Link: https://lkml.kernel.org/r/20201013144447.6706-1-osalvador@suse.de
Link: https://lkml.kernel.org/r/20201013144447.6706-2-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

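A sketch of the retry logic described above (a simplification, not the exact mainline hunk); get_page_unless_zero(), PageBuddy(), page_zone() and drain_all_pages() are the real helpers involved:

#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/page-flags.h>

/* Returns 1 if we pinned an in-use page, 0 if the page is free, -EBUSY otherwise. */
static int get_any_page_sketch(struct page *page)
{
        if (get_page_unless_zero(page))
                return 1;

        if (PageBuddy(page))
                return 0;

        /* 0-refcount and !PageBuddy: maybe a pcppage, flush pcplists and recheck. */
        drain_all_pages(page_zone(page));

        if (PageBuddy(page))
                return 0;

        return -EBUSY;
}
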
Submitted by Chen Zhongjin

maillist inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5EU5D?from=project-issue
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=44f2910f800ba58c2276cd96e69c456378e6212a

--------------------------------

csdlock_debug is an early_param used to enable the csd_lock_wait feature. It uses static_branch_enable(), which triggers a bug at boot time: in the early_param stage, static_branch_enable() calls __page_to_pfn() before sparse_init(). This causes a panic on arm64 when CONFIG_SPARSEMEM_VMEMMAP=n, so change the early_param to __setup to avoid the problem.

Reported-by: Chen jingwen <chenjingwen6@huawei.com>
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

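For reference, the difference between the two registration macros is only when the handler runs; a simplified handler registered via __setup() is sketched below (the real csd_lock_wait code flips static keys, which is omitted here):

#include <linux/init.h>
#include <linux/kernel.h>

static bool csdlock_debug_enabled;

static int __init csdlock_debug(char *str)
{
        int val = 0;

        get_option(&str, &val);
        if (val)
                csdlock_debug_enabled = true;   /* the real code enables a static key */

        /* __setup() handlers run after mm_init()/sparse_init(), unlike early_param(). */
        return 1;       /* option consumed */
}
__setup("csdlock_debug=", csdlock_debug);
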
Submitted by Wenpeng Liang

mainline inclusion
from mainline-for-linus
commit 813c9802
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CHIG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=813c980294d48362ead5422b056072ed214ca2bf

----------------------------------------------------------------------

To reduce the code size and make the code clearer, replace all roce_get_xxx() with hr_reg_read() to read the data fields.

Link: https://lore.kernel.org/r/20220512080012.38728-3-liangwenpeng@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Zhengfeng Luo <luozhengfeng@h-partners.com>
Reviewed-by: Yangyang Li <liyangyang20@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Wenpeng Liang

mainline inclusion
from mainline-for-linus
commit 82600b2d
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CHIG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=82600b2d3cd57428bdb03c66ae67708d3c8f7281

----------------------------------------------------------------------

To reduce the code size and make the code clearer, replace all roce_set_xxx() with hr_reg_xxx() to write the data fields.

Link: https://lore.kernel.org/r/20220512080012.38728-2-liangwenpeng@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Zhengfeng Luo <luozhengfeng@h-partners.com>
Reviewed-by: Yangyang Li <liyangyang20@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

Submitted by Yixing Liu

mainline inclusion
from mainline-for-linus
commit db5dfbf5
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5CHIG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=db5dfbf5b201df65c1f5332c4d9d5e7c2f42396b

----------------------------------------------------------------------

The bt number of cqc_timer on HIP09 increases compared with that of HIP08. Therefore, cqc_timer_bt_num and num_cqc_timer do not match, and the driver may fail to allocate cqc_timer. The driver should uniformly use cqc_timer_bt_num to represent the bt number of cqc_timer.

Fixes: 0e40dc2f ("RDMA/hns: Add timer allocation support for hip08")
Link: https://lore.kernel.org/r/20220429093545.58070-1-liangwenpeng@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Zhengfeng Luo <luozhengfeng@h-partners.com>
Reviewed-by: Yangyang Li <liyangyang20@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>