1. 03 July 2021, 18 commits
  2. 15 June 2021, 17 commits
    • net: caif: add proper error handling · b31be42a
      Pavel Skripkin authored
      stable inclusion
      from stable-5.10.43
      commit d6db727457dd29938524f04b301c83ac67cccb87
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      commit a2805dca upstream.
      
      caif_enroll_dev() can fail in some cases. Ignoring
      these failures can lead to a memory leak, because the
      link_support pointer is never assigned anywhere.
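      A userspace sketch of the fix's shape (enroll_dev() and register_dev() are illustrative stand-ins, not the actual caif functions): when enrollment fails, the caller still owns link_support and must free it on the error path.

```c
#include <stdlib.h>

struct cflayer { int id; };

static struct cflayer *registered; /* stand-in for the framework's reference */

/* Stand-in for caif_enroll_dev(): on failure it does NOT take
 * ownership of link_support, so the caller must free it. */
static int enroll_dev(struct cflayer *link_support, int fail)
{
    if (fail)
        return -1;
    registered = link_support; /* success: framework now holds the pointer */
    return 0;
}

static int register_dev(int fail)
{
    struct cflayer *link_support = calloc(1, sizeof(*link_support));
    if (!link_support)
        return -1;
    int err = enroll_dev(link_support, fail);
    if (err) {
        free(link_support); /* the fix: no longer leaked on failure */
        return err;
    }
    return 0;
}
```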
      
      Fixes: 7c18d220 ("caif: Restructure how link caif link layer enroll")
      Cc: stable@vger.kernel.org
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      b31be42a
    • net: caif: added cfserl_release function · 4c29ce11
      Pavel Skripkin authored
      stable inclusion
      from stable-5.10.43
      commit dac53568c6ac4c4678e6c2f4e66314b99ec85f03
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      commit bce130e7 upstream.
      
      Added cfserl_release() function.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      4c29ce11
    • bus: ti-sysc: Fix am335x resume hang for usb otg module · 9f181b28
      Tony Lindgren authored
      stable inclusion
      from stable-5.10.43
      commit d551b8e857775a6ea48f365d9611fe5c470008a3
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 4d7b324e ]
      
      On am335x, suspend and resume only works once, and the system hangs if
      suspend is attempted again. However, it turns out suspend and resume work
      fine multiple times if the USB OTG driver for the musb controller is loaded.
      
      The issue is caused by the interconnect target module losing context
      during suspend, and it needs a restore on resume to be reconfigured again,
      as debugged earlier by Dave Gerlach <d-gerlach@ti.com>.
      
      There are also other modules that need a restore on resume, like gpmc as
      noted by Dave. So let's add a common way to restore an interconnect
      target module based on a quirk flag. For now, let's enable the quirk for
      am335x otg only to fix the suspend and resume issue.
      
      As gpmc is not causing hangs based on tests with BeagleBone, let's patch
      gpmc separately. For gpmc, we also need a hardware reset done before
      restore according to Dave.
      
      To reinit the modules, we decouple system suspend from PM runtime. We
      replace calls to pm_runtime_force_suspend() and pm_runtime_force_resume()
      with direct calls to internal functions and rely on the driver internal
      state. There is no point in trying to handle complex system suspend and
      resume quirks via PM runtime.
      
      This issue should have already been noticed with commit 1819ef2e
      ("bus: ti-sysc: Use swsup quirks also for am335x musb"), when quirk
      handling was added for am335x otg swsup. But it went unnoticed
      because having the musb driver loaded hides the issue, and suspend and
      resume work once without the driver loaded.
      
      Fixes: 1819ef2e ("bus: ti-sysc: Use swsup quirks also for am335x musb")
      Suggested-by: Dave Gerlach <d-gerlach@ti.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      9f181b28
    • net/mlx5: DR, Create multi-destination flow table with level less than 64 · c912d5a8
      Yevgeny Kliteynik authored
      stable inclusion
      from stable-5.10.43
      commit 2a8cda3867cd06fbc3f414a78e1c692f973d21e4
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 216214c6 ]
      
      A flow table that contains flows pointing to multiple flow tables or
      multiple TIRs must have a level lower than 64. In our case this applies
      to the multi-destination flow table.
      Fix the level of the created table to comply with the HW spec
      definitions, while still making sure that its level is lower than that
      of the SW-owned tables, so that it is possible to point from the
      multi-destination FW table to SW tables.
      
      Fixes: 34583bee ("net/mlx5: DR, Create multi-destination table for SW-steering use")
      Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
      Reviewed-by: Alex Vesker <valex@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      c912d5a8
    • net/tls: Fix use-after-free after the TLS device goes down and up · aa3905c0
      Maxim Mikityanskiy authored
      stable inclusion
      from stable-5.10.43
      commit f1d4184f128dede82a59a841658ed40d4e6d3aa2
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit c55dcdd4 ]
      
      When a netdev with active TLS offload goes down, tls_device_down is
      called to stop the offload and tear down the TLS context. However, the
      socket stays alive and still points to the TLS context, which is now
      deallocated. If the netdev goes up while the connection is still active,
      and the data flow resumes after a number of TCP retransmissions, this
      leads to a use-after-free of the TLS context.
      
      This commit addresses this bug by keeping the context alive until its
      normal destruction, and implements the necessary fallbacks, so that the
      connection can resume in software (non-offloaded) kTLS mode.
      
      On the TX side tls_sw_fallback is used to encrypt all packets. The RX
      side already has all the necessary fallbacks, because receiving
      non-decrypted packets is supported. The thing needed on the RX side is
      to block resync requests, which are normally produced after receiving
      non-decrypted packets.
      
      The necessary synchronization is implemented for a graceful teardown:
      first the fallbacks are deployed, then the driver resources are released
      (it used to be possible to have a tls_dev_resync after tls_dev_del).
      
      A new flag called TLS_RX_DEV_DEGRADED is added to indicate the fallback
      mode. It's used to skip the RX resync logic completely, as it becomes
      useless, and some objects may be released (for example, resync_async,
      which is allocated and freed by the driver).
      
      Fixes: e8f69799 ("net/tls: Add generic NIC offload infrastructure")
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      aa3905c0
    • net/tls: Replace TLS_RX_SYNC_RUNNING with RCU · ea55ff3c
      Maxim Mikityanskiy authored
      stable inclusion
      from stable-5.10.43
      commit 874ece252ed269f5ac1f55167a3f2735ab0f249f
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 05fc8b6c ]
      
      RCU synchronization is guaranteed to finish in finite time, unlike a
      busy loop that polls a flag. This patch is a preparation for the bugfix
      in the next patch, where the same synchronize_net() call will also be
      used to sync with the TX datapath.
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      ea55ff3c
    • net: usb: cdc_ncm: don't spew notifications · 5146c41f
      Grant Grundler authored
      stable inclusion
      from stable-5.10.43
      commit 70df000fb8808e2ea63e6beb2ff570c083592614
      bugzilla: 109284
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit de658a19 ]
      
      RTL8156 sends notifications about every 32ms.
      Only display/log notifications when something changes.
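      The dedup idea can be sketched in plain C (struct and function names are illustrative, not the driver's code): cache the last reported link state and speeds, and log only when a notification actually differs from the cached state.

```c
#include <stdbool.h>

/* Last state seen; notifications matching it are duplicates. */
struct ncm_log_state {
    bool have_prev;
    bool link_up;
    unsigned rx_mbps, tx_mbps;
};

/* Returns true if this notification should be logged. */
static bool should_log(struct ncm_log_state *st,
                       bool link_up, unsigned rx_mbps, unsigned tx_mbps)
{
    if (st->have_prev && st->link_up == link_up &&
        st->rx_mbps == rx_mbps && st->tx_mbps == tx_mbps)
        return false;            /* repeat notification every 32ms: stay quiet */
    st->have_prev = true;
    st->link_up = link_up;
    st->rx_mbps = rx_mbps;
    st->tx_mbps = tx_mbps;
    return true;                 /* state changed: worth a log line */
}

static struct ncm_log_state g_state;

static bool note_link(bool up, unsigned rx, unsigned tx)
{
    return should_log(&g_state, up, rx, tx);
}
```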
      
      This issue has been reported by others:
      	https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1832472
      	https://lkml.org/lkml/2020/8/27/1083
      
      ...
      [785962.779840] usb 1-1: new high-speed USB device number 5 using xhci_hcd
      [785962.929944] usb 1-1: New USB device found, idVendor=0bda, idProduct=8156, bcdDevice=30.00
      [785962.929949] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=6
      [785962.929952] usb 1-1: Product: USB 10/100/1G/2.5G LAN
      [785962.929954] usb 1-1: Manufacturer: Realtek
      [785962.929956] usb 1-1: SerialNumber: 000000001
      [785962.991755] usbcore: registered new interface driver cdc_ether
      [785963.017068] cdc_ncm 1-1:2.0: MAC-Address: 00:24:27:88:08:15
      [785963.017072] cdc_ncm 1-1:2.0: setting rx_max = 16384
      [785963.017169] cdc_ncm 1-1:2.0: setting tx_max = 16384
      [785963.017682] cdc_ncm 1-1:2.0 usb0: register 'cdc_ncm' at usb-0000:00:14.0-1, CDC NCM, 00:24:27:88:08:15
      [785963.019211] usbcore: registered new interface driver cdc_ncm
      [785963.023856] usbcore: registered new interface driver cdc_wdm
      [785963.025461] usbcore: registered new interface driver cdc_mbim
      [785963.038824] cdc_ncm 1-1:2.0 enx002427880815: renamed from usb0
      [785963.089586] cdc_ncm 1-1:2.0 enx002427880815: network connection: disconnected
      [785963.121673] cdc_ncm 1-1:2.0 enx002427880815: network connection: disconnected
      [785963.153682] cdc_ncm 1-1:2.0 enx002427880815: network connection: disconnected
      ...
      
      This is about 2 KB per second and will overwrite the entire contents of a
      1 MB dmesg buffer in under 10 minutes, rendering it useless for debugging
      many kernel problems.
      
      This is also an extra 180 MB/day in /var/log (over 1 GB per week),
      rendering the majority of those logs useless too.
      
      When the link is up (the expected state), the spew rate is more than
      2x higher:
      ...
      [786139.600992] cdc_ncm 2-1:2.0 enx002427880815: network connection: connected
      [786139.632997] cdc_ncm 2-1:2.0 enx002427880815: 2500 mbit/s downlink 2500 mbit/s uplink
      [786139.665097] cdc_ncm 2-1:2.0 enx002427880815: network connection: connected
      [786139.697100] cdc_ncm 2-1:2.0 enx002427880815: 2500 mbit/s downlink 2500 mbit/s uplink
      [786139.729094] cdc_ncm 2-1:2.0 enx002427880815: network connection: connected
      [786139.761108] cdc_ncm 2-1:2.0 enx002427880815: 2500 mbit/s downlink 2500 mbit/s uplink
      ...
      
      Chrome OS cannot support RTL8156 until this is fixed.
      Signed-off-by: Grant Grundler <grundler@chromium.org>
      Reviewed-by: Hayes Wang <hayeswang@realtek.com>
      Link: https://lore.kernel.org/r/20210120011208.3768105-1-grundler@chromium.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      5146c41f
    • SUNRPC: More fixes for backlog congestion · b73c5591
      Trond Myklebust authored
      stable inclusion
      from stable-5.10.42
      commit 899b5131e74cd09ddd204addf763deaf6870ab57
      bugzilla: 55093
      CVE: NA
      
      --------------------------------
      
      commit e86be3a0 upstream.
      
      Ensure that we fix the XPRT_CONGESTED starvation issue for RDMA as well
      as for socket-based transports.
      Ensure we always initialise the request after waking up from the backlog
      list.
      
      Fixes: e877a88d ("SUNRPC in case of backlog, hand free slots directly to waiting task")
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      b73c5591
    • net: zero-initialize tc skb extension on allocation · 97bc331b
      Vlad Buslov authored
      stable inclusion
      from stable-5.10.42
      commit ac493452e937b8939eaf2d24cac51a4804b6c20e
      bugzilla: 55093
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 9453d45e ]
      
      The function skb_ext_add() doesn't initialize the created skb extension
      with any value and leaves that up to the user. However, since an
      extension of type TC_SKB_EXT originally contained only a single value,
      tc_skb_ext->chain, its users used to just assign the chain value without
      first setting the whole extension memory to zero. This assumption changed
      when the TC_SKB_EXT extension was extended with additional fields, but
      not all users were updated to initialize the new fields, which leads to
      use of uninitialized memory afterwards. UBSAN log:
      
      [  778.299821] UBSAN: invalid-load in net/openvswitch/flow.c:899:28
      [  778.301495] load of value 107 is not a valid value for type '_Bool'
      [  778.303215] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.12.0-rc7+ #2
      [  778.304933] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
      [  778.307901] Call Trace:
      [  778.308680]  <IRQ>
      [  778.309358]  dump_stack+0xbb/0x107
      [  778.310307]  ubsan_epilogue+0x5/0x40
      [  778.311167]  __ubsan_handle_load_invalid_value.cold+0x43/0x48
      [  778.312454]  ? memset+0x20/0x40
      [  778.313230]  ovs_flow_key_extract.cold+0xf/0x14 [openvswitch]
      [  778.314532]  ovs_vport_receive+0x19e/0x2e0 [openvswitch]
      [  778.315749]  ? ovs_vport_find_upcall_portid+0x330/0x330 [openvswitch]
      [  778.317188]  ? create_prof_cpu_mask+0x20/0x20
      [  778.318220]  ? arch_stack_walk+0x82/0xf0
      [  778.319153]  ? secondary_startup_64_no_verify+0xb0/0xbb
      [  778.320399]  ? stack_trace_save+0x91/0xc0
      [  778.321362]  ? stack_trace_consume_entry+0x160/0x160
      [  778.322517]  ? lock_release+0x52e/0x760
      [  778.323444]  netdev_frame_hook+0x323/0x610 [openvswitch]
      [  778.324668]  ? ovs_netdev_get_vport+0xe0/0xe0 [openvswitch]
      [  778.325950]  __netif_receive_skb_core+0x771/0x2db0
      [  778.327067]  ? lock_downgrade+0x6e0/0x6f0
      [  778.328021]  ? lock_acquire+0x565/0x720
      [  778.328940]  ? generic_xdp_tx+0x4f0/0x4f0
      [  778.329902]  ? inet_gro_receive+0x2a7/0x10a0
      [  778.330914]  ? lock_downgrade+0x6f0/0x6f0
      [  778.331867]  ? udp4_gro_receive+0x4c4/0x13e0
      [  778.332876]  ? lock_release+0x52e/0x760
      [  778.333808]  ? dev_gro_receive+0xcc8/0x2380
      [  778.334810]  ? lock_downgrade+0x6f0/0x6f0
      [  778.335769]  __netif_receive_skb_list_core+0x295/0x820
      [  778.336955]  ? process_backlog+0x780/0x780
      [  778.337941]  ? mlx5e_rep_tc_netdevice_event_unregister+0x20/0x20 [mlx5_core]
      [  778.339613]  ? seqcount_lockdep_reader_access.constprop.0+0xa7/0xc0
      [  778.341033]  ? kvm_clock_get_cycles+0x14/0x20
      [  778.342072]  netif_receive_skb_list_internal+0x5f5/0xcb0
      [  778.343288]  ? __kasan_kmalloc+0x7a/0x90
      [  778.344234]  ? mlx5e_handle_rx_cqe_mpwrq+0x9e0/0x9e0 [mlx5_core]
      [  778.345676]  ? mlx5e_xmit_xdp_frame_mpwqe+0x14d0/0x14d0 [mlx5_core]
      [  778.347140]  ? __netif_receive_skb_list_core+0x820/0x820
      [  778.348351]  ? mlx5e_post_rx_mpwqes+0xa6/0x25d0 [mlx5_core]
      [  778.349688]  ? napi_gro_flush+0x26c/0x3c0
      [  778.350641]  napi_complete_done+0x188/0x6b0
      [  778.351627]  mlx5e_napi_poll+0x373/0x1b80 [mlx5_core]
      [  778.352853]  __napi_poll+0x9f/0x510
      [  778.353704]  ? mlx5_flow_namespace_set_mode+0x260/0x260 [mlx5_core]
      [  778.355158]  net_rx_action+0x34c/0xa40
      [  778.356060]  ? napi_threaded_poll+0x3d0/0x3d0
      [  778.357083]  ? sched_clock_cpu+0x18/0x190
      [  778.358041]  ? __common_interrupt+0x8e/0x1a0
      [  778.359045]  __do_softirq+0x1ce/0x984
      [  778.359938]  __irq_exit_rcu+0x137/0x1d0
      [  778.360865]  irq_exit_rcu+0xa/0x20
      [  778.361708]  common_interrupt+0x80/0xa0
      [  778.362640]  </IRQ>
      [  778.363212]  asm_common_interrupt+0x1e/0x40
      [  778.364204] RIP: 0010:native_safe_halt+0xe/0x10
      [  778.365273] Code: 4f ff ff ff 4c 89 e7 e8 50 3f 40 fe e9 dc fe ff ff 48 89 df e8 43 3f 40 fe eb 90 cc e9 07 00 00 00 0f 00 2d 74 05 62 00 fb f4 <c3> 90 e9 07 00 00 00 0f 00 2d 64 05 62 00 f4 c3 cc cc 0f 1f 44 00
      [  778.369355] RSP: 0018:ffffffff84407e48 EFLAGS: 00000246
      [  778.370570] RAX: ffff88842de46a80 RBX: ffffffff84425840 RCX: ffffffff83418468
      [  778.372143] RDX: 000000000026f1da RSI: 0000000000000004 RDI: ffffffff8343af5e
      [  778.373722] RBP: fffffbfff0884b08 R08: 0000000000000000 R09: ffff88842de46bcb
      [  778.375292] R10: ffffed1085bc8d79 R11: 0000000000000001 R12: 0000000000000000
      [  778.376860] R13: ffffffff851124a0 R14: 0000000000000000 R15: dffffc0000000000
      [  778.378491]  ? rcu_eqs_enter.constprop.0+0xb8/0xe0
      [  778.379606]  ? default_idle_call+0x5e/0xe0
      [  778.380578]  default_idle+0xa/0x10
      [  778.381406]  default_idle_call+0x96/0xe0
      [  778.382350]  do_idle+0x3d4/0x550
      [  778.383153]  ? arch_cpu_idle_exit+0x40/0x40
      [  778.384143]  cpu_startup_entry+0x19/0x20
      [  778.385078]  start_kernel+0x3c7/0x3e5
      [  778.385978]  secondary_startup_64_no_verify+0xb0/0xbb
      
      Fix the issue by providing a new function, tc_skb_ext_alloc(), that
      allocates the tc skb extension and initializes its memory to 0 before
      returning it to the caller. Change all existing users to use the new API
      instead of calling skb_ext_add() directly.
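      A userspace sketch of the wrapper's behavior (the struct layout and fake_skb_ext_add() are illustrative stand-ins for the kernel types, not the real API): allocate, then zero, so callers never see stale memory in the new fields.

```c
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

/* Illustrative layout only; the real struct lives in the kernel headers. */
struct tc_skb_ext {
    uint32_t chain;
    uint16_t mru;
    uint8_t  post_ct;
};

/* Stand-in for skb_ext_add(): returns raw storage and, like the real
 * helper, does not zero it. 0xA5 simulates stale memory contents. */
static struct tc_skb_ext *fake_skb_ext_add(void)
{
    struct tc_skb_ext *ext = malloc(sizeof(*ext));
    if (ext)
        memset(ext, 0xA5, sizeof(*ext));
    return ext;
}

/* The new wrapper, in spirit: allocate, then zero before returning. */
static struct tc_skb_ext *tc_skb_ext_alloc(void)
{
    struct tc_skb_ext *ext = fake_skb_ext_add();
    if (ext)
        memset(ext, 0, sizeof(*ext));
    return ext;
}

/* Helper for checking that every field comes back zeroed. */
static int ext_is_zeroed(void)
{
    struct tc_skb_ext *ext = tc_skb_ext_alloc();
    int ok = ext && ext->chain == 0 && ext->mru == 0 && ext->post_ct == 0;
    free(ext);
    return ok;
}
```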
      
      Fixes: 038ebb1a ("net/sched: act_ct: fix miss set mru for ovs after defrag in act_ct")
      Fixes: d29334c1 ("net/sched: act_api: fix miss set post_ct for ovs after do conntrack in act_ct")
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Acked-by: Cong Wang <cong.wang@bytedance.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      97bc331b
    • net: sched: fix tx action rescheduling issue during deactivation · 006771a8
      Yunsheng Lin authored
      stable inclusion
      from stable-5.10.42
      commit 2f23d5bcd9f89c239da83abd6270f5f0d9dd95bc
      bugzilla: 55093
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 102b55ee ]
      
      Currently qdisc_run() checks STATE_DEACTIVATED of a lockless
      qdisc before calling __qdisc_run(), which ultimately clears
      STATE_MISSED once all the skbs are dequeued. If STATE_DEACTIVATED
      is set before clearing STATE_MISSED, there may be a rescheduling
      of net_tx_action() at the end of qdisc_run_end(), see below:
      
      CPU0(net_tx_action)  CPU1(__dev_xmit_skb)  CPU2(dev_deactivate)
                .                   .                     .
                .            set STATE_MISSED             .
                .           __netif_schedule()            .
                .                   .           set STATE_DEACTIVATED
                .                   .                qdisc_reset()
                .                   .                     .
                .<---------------   .              synchronize_net()
      clear __QDISC_STATE_SCHED  |  .                     .
                .                |  .                     .
                .                |  .            some_qdisc_is_busy()
                .                |  .               return *false*
                .                |  .                     .
        test STATE_DEACTIVATED   |  .                     .
      __qdisc_run() *not* called |  .                     .
                .                |  .                     .
         test STATE_MISSED       |  .                     .
       __netif_schedule()--------|  .                     .
                .                   .                     .
                .                   .                     .
      
      __qdisc_run() is not called by net_tx_action() on CPU0 because
      CPU2 has set the STATE_DEACTIVATED flag during dev_deactivate(), and
      STATE_MISSED is only cleared in __qdisc_run(); __netif_schedule()
      is then called at the end of qdisc_run_end(), causing the tx action
      rescheduling problem.
      
      qdisc_run() called by net_tx_action() runs in softirq context,
      which should have the same semantics as the qdisc_run() called by
      __dev_xmit_skb() under rcu_read_lock_bh(). And since there is a
      synchronize_net() between the STATE_DEACTIVATED flag being set and
      qdisc_reset()/some_qdisc_is_busy() in dev_deactivate(), we can safely
      bail out for a deactivated lockless qdisc in net_tx_action(), and
      qdisc_reset() will reset all skbs not yet dequeued.
      
      So add the rcu_read_lock() explicitly to protect the qdisc_run()
      and do the STATE_DEACTIVATED checking in net_tx_action() before
      calling qdisc_run_begin(). Another option is to do the checking in
      the qdisc_run_end(), but it will add unnecessary overhead for
      non-tx_action case, because __dev_queue_xmit() will not see qdisc
      with STATE_DEACTIVATED after synchronize_net(), the qdisc with
      STATE_DEACTIVATED can only be seen by net_tx_action() because of
      __netif_schedule().
      
      The STATE_DEACTIVATED check in qdisc_run() is there to avoid a race
      between net_tx_action() and qdisc_reset(), see:
      commit d518d2ed ("net/sched: fix race between deactivation
      and dequeue for NOLOCK qdisc"). As the bailout added above for
      deactivated lockless qdiscs in net_tx_action() provides better
      protection for the race without calling qdisc_run() at all,
      remove the STATE_DEACTIVATED check from qdisc_run().
      
      After qdisc_reset(), there is no skb in qdisc to be dequeued, so
      clear the STATE_MISSED in dev_reset_queue() too.
      
      Fixes: 6b3ba914 ("net: sched: allow qdiscs to handle locking")
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
      V8: Clearing STATE_MISSED before calling __netif_schedule() has
          avoided the endless rescheduling problem, but there may still
          be an unnecessary reschedule, so adjust the commit log.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      006771a8
    • net: sched: fix packet stuck problem for lockless qdisc · e3ee623d
      Yunsheng Lin authored
      stable inclusion
      from stable-5.10.42
      commit 21c71510925308a1d81513e1519165c063d1b57c
      bugzilla: 55093
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit a90c57f2 ]
      
      The lockless qdisc has the following concurrency problem:
          cpu0                 cpu1
           .                     .
      q->enqueue                 .
           .                     .
      qdisc_run_begin()          .
           .                     .
      dequeue_skb()              .
           .                     .
      sch_direct_xmit()          .
           .                     .
           .                q->enqueue
           .             qdisc_run_begin()
           .            return and do nothing
           .                     .
      qdisc_run_end()            .
      
      cpu1 enqueues a skb without calling __qdisc_run() because cpu0
      has not released the lock yet and spin_trylock() returns false
      for cpu1 in qdisc_run_begin(), and cpu0 does not see the skb
      enqueued by cpu1 when calling dequeue_skb(), because cpu1 may
      enqueue the skb after cpu0 calls dequeue_skb() and before
      cpu0 calls qdisc_run_end().
      
      The lockless qdisc has another concurrency problem when
      tx_action is involved:
      
      cpu0(serving tx_action)     cpu1             cpu2
                .                   .                .
                .              q->enqueue            .
                .            qdisc_run_begin()       .
                .              dequeue_skb()         .
                .                   .            q->enqueue
                .                   .                .
                .             sch_direct_xmit()      .
                .                   .         qdisc_run_begin()
                .                   .       return and do nothing
                .                   .                .
       clear __QDISC_STATE_SCHED    .                .
       qdisc_run_begin()            .                .
       return and do nothing        .                .
                .                   .                .
                .            qdisc_run_end()         .
      
      This patch fixes the above data race by:
      1. If the first spin_trylock() returns false and STATE_MISSED is
         not set, set STATE_MISSED and retry another spin_trylock() in
         case the other CPU may not see STATE_MISSED after it releases the
         lock.
      2. Reschedule if STATE_MISSED is set after the lock is released
         at the end of qdisc_run_end().
      
      For the tx_action case, STATE_MISSED is also set when cpu1 is at the
      end of qdisc_run_end(), so tx_action will be rescheduled again
      to dequeue the skb enqueued by cpu2.
      
      Clear STATE_MISSED before retrying a dequeue when dequeuing
      returns NULL, in order to reduce the overhead of the second
      spin_trylock() and the __netif_schedule() call.
      
      Also clear STATE_MISSED before calling __netif_schedule()
      at the end of qdisc_run_end(), to avoid doing another round of
      dequeuing in pfifo_fast_dequeue().
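      The trylock-plus-STATE_MISSED protocol described above can be sketched with C11 atomics (a single-process illustration of the idea, not the kernel's implementation; the atomic flag stands in for the qdisc seqlock and the counter stands in for __netif_schedule()):

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag seqlock = ATOMIC_FLAG_INIT; /* stand-in for the qdisc lock */
static atomic_bool state_missed;
static int reschedules;                        /* counts "__netif_schedule()" */

static bool qdisc_run_begin(void)
{
    if (!atomic_flag_test_and_set(&seqlock))
        return true;                   /* got the lock: we will dequeue */
    if (atomic_load(&state_missed))
        return false;                  /* holder will reschedule anyway */
    atomic_store(&state_missed, true); /* record the missed attempt */
    /* Retry once: the holder may have released the lock before it
     * could see STATE_MISSED. */
    return !atomic_flag_test_and_set(&seqlock);
}

static void qdisc_run_end(void)
{
    atomic_flag_clear(&seqlock);
    /* Clear MISSED first, then reschedule if anyone missed us. */
    if (atomic_exchange(&state_missed, false))
        reschedules++;
}
```

      The second trylock closes the window where the lock holder releases the lock after our first failed attempt but before it could observe STATE_MISSED.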
      
      The performance impact of this patch, tested using pktgen and
      dummy netdev with pfifo_fast qdisc attached:
      
       threads  without+this_patch   with+this_patch      delta
          1        2.61Mpps            2.60Mpps           -0.3%
          2        3.97Mpps            3.82Mpps           -3.7%
          4        5.62Mpps            5.59Mpps           -0.5%
          8        2.78Mpps            2.77Mpps           -0.3%
         16        2.22Mpps            2.22Mpps           -0.0%
      
      Fixes: 6b3ba914 ("net: sched: allow qdiscs to handle locking")
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Tested-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      e3ee623d
    • net: really orphan skbs tied to closing sk · 62127e1d
      Paolo Abeni authored
      stable inclusion
      from stable-5.10.42
      commit 1f1b431a4fcd96a6be85ab5a61bd874960d182cf
      bugzilla: 55093
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 098116e7 ]
      
      If the owning socket is shutting down - e.g. the sock reference
      count has already dropped to 0 and only sk_wmem_alloc is keeping
      the sock alive - skb_orphan_partial() becomes a no-op.
      
      When forwarding packets over veth with GRO enabled, the above
      causes refcount errors.
      
      This change addresses the issue with a plain skb_orphan() call
      in the critical scenario.
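      The decision can be sketched as follows (struct sock_sketch and orphan_for() are illustrative, not kernel types): once the sock's reference count has dropped to zero, a partial orphan would be a no-op, so a full skb_orphan() is used instead.

```c
/* Illustrative stand-in for the relevant bits of struct sock. */
struct sock_sketch {
    int refcnt; /* like sk_refcnt: 0 means the sock is shutting down */
    int wmem;   /* like sk_wmem_alloc: may still keep the sock alive */
};

enum orphan_kind { ORPHAN_PARTIAL, ORPHAN_FULL };

/* Pick the orphaning strategy for an skb tied to this sock. */
static enum orphan_kind orphan_for(const struct sock_sketch *sk)
{
    if (sk->refcnt > 0)
        return ORPHAN_PARTIAL; /* keep wmem accounting tied to the sock */
    return ORPHAN_FULL;        /* closing sock: detach the skb entirely */
}
```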
      
      Fixes: 9adc89af ("net: let skb_orphan_partial wake-up waiters.")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      62127e1d
    • linux/bits.h: fix compilation error with GENMASK · cffb222b
      Rikard Falkeborn authored
      stable inclusion
      from stable-5.10.42
      commit 1354ec840899e87259286cc844d4c161ea86fae7
      bugzilla: 55093
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit f747e666 ]
      
      GENMASK() has an input check which uses __builtin_choose_expr() to
      enable a compile time sanity check of its inputs if they are known at
      compile time.
      
      However, it turns out that __builtin_constant_p() does not always return
      a compile time constant [0].  It was thought this problem was fixed with
      gcc 4.9 [1], but apparently this is not the case [2].
      
      Switch to use __is_constexpr() instead which always returns a compile time
      constant, regardless of its inputs.
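      Condensed for illustration (the macros below mirror the kernel's bits/const headers, with the GENMASK input check elided for brevity):

```c
/* The __is_constexpr() trick: (long)(x) * 0l is an integer constant
 * expression only if x is, which makes the (void *) cast a null pointer
 * constant and steers the ?: result type to int *. Otherwise the result
 * type is void *, and sizeof(void) is 1 under the GNU C extension, so
 * the comparison with sizeof(int) fails. */
#define __is_constexpr(x) \
	(sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))

#define BITS_PER_LONG (8 * sizeof(long))

/* GENMASK(h, l): a mask with bits h..l (inclusive) set. */
#define GENMASK(h, l) \
	(((~0UL) - (1UL << (l)) + 1) & (~0UL >> (BITS_PER_LONG - 1 - (h))))
```

      Because __is_constexpr() is itself always an integer constant expression, it is safe to feed it to __builtin_choose_expr() regardless of whether its argument is known at compile time, which is exactly what __builtin_constant_p() failed to guarantee.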
      
      Link: https://lore.kernel.org/lkml/42b4342b-aefc-a16a-0d43-9f9c0d63ba7a@rasmusvillemoes.dk [0]
      Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19449 [1]
      Link: https://lore.kernel.org/lkml/1ac7bbc2-45d9-26ed-0b33-bf382b8d858b@I-love.SAKURA.ne.jp [2]
      Link: https://lkml.kernel.org/r/20210511203716.117010-1-rikard.falkeborn@gmail.com
      Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
      Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
      Cc: Ard Biesheuvel <ardb@kernel.org>
      Cc: Yury Norov <yury.norov@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      cffb222b
    • netfilter: flowtable: Remove redundant hw refresh bit · f85c5227
      Roi Dayan authored
      stable inclusion
      from stable-5.10.42
      commit 6d6bc8c75290866e59ee25ab6bb0114eb166b980
      bugzilla: 55093
      CVE: NA
      
      --------------------------------
      
      commit c07531c0 upstream.
      
      Offloading conns could fail for multiple reasons, and a hw refresh bit
      is set to try to re-offload the conn on the next sw packet.
      But there are cases, now and in the future, where the hw refresh bit
      is not set yet a refresh could succeed.
      Remove the hw refresh bit and do an offload refresh whenever requested.
      There won't be a new work entry if a work is already pending
      anyway, as there is the hw pending bit.
      
      Fixes: 8b3646d6 ("net/sched: act_ct: Support refreshing the flow table entries")
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      f85c5227
    • E
      {net,vdpa}/mlx5: Configure interface MAC into mpfs L2 table · b4bb871d
      Committed by Eli Cohen
      stable inclusion
      from stable-5.10.42
      commit 89a0e388c6f2ea73434727ca4780150874e225a7
      bugzilla: 55093
      CVE: NA
      
      --------------------------------
      
      commit 7c9f131f upstream.
      
      net/mlx5: Expose MPFS configuration API
      
      MPFS is the multi physical function switch that bridges traffic between
      the physical port and any physical functions associated with it. The
      driver is required to add or remove MAC entries to properly forward
      incoming traffic to the correct physical function.
      
      We export the API to control MPFS so that other drivers, such as
      mlx5_vdpa, are able to add the MAC addresses of their network interfaces.
      
      The MAC address of the vdpa interface must be configured into the MPFS L2
      address. Failing to do so could cause, in some NIC configurations, failure
      to forward packets to the vdpa network device instance.
      
      Fix this by adding calls to update the MPFS table.
      
      CC: <mst@redhat.com>
      CC: <jasowang@redhat.com>
      CC: <virtualization@lists.linux-foundation.org>
      Fixes: 1a86b377 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
      Signed-off-by: Eli Cohen <elic@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      b4bb871d
    • R
      drivers: base: Fix device link removal · b9dbe96f
      Committed by Rafael J. Wysocki
      stable inclusion
      from stable-5.10.42
      commit d007150b4e15bfcb8d36cfd88a5645d42e44d383
      bugzilla: 55093
      CVE: NA
      
      --------------------------------
      
      commit 80dd33cf upstream.
      
      When device_link_free() drops references to the supplier and
      consumer devices of the device link going away and the reference
      being dropped turns out to be the last one for any of those
      device objects, its ->release callback will be invoked and it
      may sleep which goes against the SRCU callback execution
      requirements.
      
      To address this issue, make the device link removal code carry out
      the device_link_free() actions preceded by SRCU synchronization from
      a separate work item (the "long" workqueue is used for that, because
      it does not matter when the device link memory is released and it may
      take time to get to that point) instead of using SRCU callbacks.
      
      While at it, make the code work analogously when SRCU is not enabled
      to reduce the differences between the SRCU and non-SRCU cases.
      
      Fixes: 843e600b ("driver core: Fix sleeping in invalid context during device link deletion")
      Cc: stable <stable@vger.kernel.org>
      Reported-by: chenxiang (M) <chenxiang66@hisilicon.com>
      Tested-by: chenxiang (M) <chenxiang66@hisilicon.com>
      Reviewed-by: Saravana Kannan <saravanak@google.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Link: https://lore.kernel.org/r/5722787.lOV4Wx5bFT@kreacher
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      b9dbe96f
    • M
      mac80211: properly handle A-MSDUs that start with an RFC 1042 header · 79a5f094
      Committed by Mathy Vanhoef
      stable inclusion
      from stable-5.10.42
      commit e3561d5af01c2c49ed52c8d2644be752d5b13ec2
      bugzilla: 55093
      CVE: NA
      
      --------------------------------
      
      commit a1d5ff56 upstream.
      
      Properly parse A-MSDUs whose first 6 bytes happen to equal an RFC 1042
      header. This can occur in practice when the destination MAC address
      equals AA:AA:03:00:00:00. More importantly, this simplifies the next
      patch to mitigate A-MSDU injection attacks.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mathy Vanhoef <Mathy.Vanhoef@kuleuven.be>
      Link: https://lore.kernel.org/r/20210511200110.0b2b886492f0.I23dd5d685fe16d3b0ec8106e8f01b59f499dffed@changeid
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      79a5f094
  3. 04 Jun, 2021 5 commits