提交 · edf2f9444632383dc6a6b016f9e37f43cd2a6511 · openeuler / Kernel

03 6月, 2021 40 次提交

ALSA: hdspm: don't disable if not enabled · edf2f944

由 Tong Zhang 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit 9df07b0661e7793e54464f9f115eba25397d0d5c
bugzilla: 51875
CVE: NA

--------------------------------

[ Upstream commit 790f5719 ]

hdspm wants to disable a not enabled pci device, which makes kernel
throw a warning. Make sure the device is enabled before calling disable.

[    1.786391] snd_hdspm 0000:00:03.0: disabling already-disabled device
[    1.786400] WARNING: CPU: 0 PID: 182 at drivers/pci/pci.c:2146 pci_disable_device+0x91/0xb0
[    1.795181] Call Trace:
[    1.795320]  snd_hdspm_card_free+0x58/0xa0 [snd_hdspm]
[    1.795595]  release_card_device+0x4b/0x80 [snd]
[    1.795860]  device_release+0x3b/0xa0
[    1.796072]  kobject_put+0x94/0x1b0
[    1.796260]  put_device+0x13/0x20
[    1.796438]  snd_card_free+0x61/0x90 [snd]
[    1.796659]  snd_hdspm_probe+0x97b/0x1440 [snd_hdspm]
Suggested-by: NTakashi Iwai <tiwai@suse.de>
Signed-off-by: NTong Zhang <ztong0001@gmail.com>
Link: https://lore.kernel.org/r/20210321153840.378226-3-ztong0001@gmail.comSigned-off-by: NTakashi Iwai <tiwai@suse.de>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

edf2f944

ALSA: hdsp: don't disable if not enabled · 1f4e23de

由 Tong Zhang 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit a950cd8cb05d358fcbcd84c1a0c4760351adc82a
bugzilla: 51875
CVE: NA

--------------------------------

[ Upstream commit 507cdb9a ]

hdsp wants to disable a not enabled pci device, which makes kernel
throw a warning. Make sure the device is enabled before calling disable.

[    1.758292] snd_hdsp 0000:00:03.0: disabling already-disabled device
[    1.758327] WARNING: CPU: 0 PID: 180 at drivers/pci/pci.c:2146 pci_disable_device+0x91/0xb0
[    1.766985] Call Trace:
[    1.767121]  snd_hdsp_card_free+0x94/0xf0 [snd_hdsp]
[    1.767388]  release_card_device+0x4b/0x80 [snd]
[    1.767639]  device_release+0x3b/0xa0
[    1.767838]  kobject_put+0x94/0x1b0
[    1.768027]  put_device+0x13/0x20
[    1.768207]  snd_card_free+0x61/0x90 [snd]
[    1.768430]  snd_hdsp_probe+0x524/0x5e0 [snd_hdsp]
Suggested-by: NTakashi Iwai <tiwai@suse.de>
Signed-off-by: NTong Zhang <ztong0001@gmail.com>
Link: https://lore.kernel.org/r/20210321153840.378226-2-ztong0001@gmail.comSigned-off-by: NTakashi Iwai <tiwai@suse.de>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

1f4e23de

i2c: bail out early when RDWR parameters are wrong · b54c5f73

由 Wolfram Sang 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit faed3150a4368d8c199d3d93340410af672c2237
bugzilla: 51875
CVE: NA

--------------------------------

[ Upstream commit 71581562 ]

The buggy parameters currently get caught later, but emit a noisy WARN.
Userspace should not be able to trigger this, so add similar checks much
earlier. Also avoids some unneeded code paths, of course. Apply kernel
coding stlye to a comment while here.

Reported-by: syzbot+ffb0b3ffa6cfbc7d7b3f@syzkaller.appspotmail.com
Tested-by: syzbot+ffb0b3ffa6cfbc7d7b3f@syzkaller.appspotmail.com
Signed-off-by: NWolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: NWolfram Sang <wsa@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

b54c5f73

Bluetooth: Fix incorrect status handling in LE PHY UPDATE event · c50ed066

由 Ayush Garg 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit 18df2bc13b1f0bce0338ccc77b184a2fa6a6645e
bugzilla: 51875
CVE: NA

--------------------------------

[ Upstream commit 87df8bcc ]

Skip updation of tx and rx PHYs values, when PHY Update
event's status is not successful.
Signed-off-by: NAyush Garg <ayush.garg@samsung.com>
Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

c50ed066

ASoC: rsnd: core: Check convert rate in rsnd_hw_params · 0137b9dd

由 Mikhail Durnev 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit 879a96d817ed7268712ed65e6551ed4654d86ce8
bugzilla: 51875
CVE: NA

--------------------------------

[ Upstream commit 19c6a63c ]

snd_pcm_hw_params_set_rate_near can return incorrect sample rate in
some cases, e.g. when the backend output rate is set to some value higher
than 48000 Hz and the input rate is 8000 Hz. So passing the value returned
by snd_pcm_hw_params_set_rate_near to snd_pcm_hw_params will result in
"FSO/FSI ratio error" and playing no audio at all while the userland
is not properly notified about the issue.

If SRC is unable to convert the requested sample rate to the sample rate
the backend is using, then the requested sample rate should be adjusted in
rsnd_hw_params. The userland will be notified about that change in the
returned hw_params structure.
Signed-off-by: NMikhail Durnev <mikhail_durnev@mentor.com>
Link: https://lore.kernel.org/r/1615870055-13954-1-git-send-email-mikhail_durnev@mentor.comSigned-off-by: NMark Brown <broonie@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

0137b9dd

net: stmmac: Set FIFO sizes for ipq806x · deb79cf8

由 Jonathan McDowell 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit a2aeb5de26c1800e530b29e9a157c92c5a827293
bugzilla: 51875
CVE: NA

--------------------------------

[ Upstream commit e127906b ]

Commit eaf4fac4 ("net: stmmac: Do not accept invalid MTU values")
started using the TX FIFO size to verify what counts as a valid MTU
request for the stmmac driver.  This is unset for the ipq806x variant.
Looking at older patches for this it seems the RX + TXs buffers can be
up to 8k, so set appropriately.

(I sent this as an RFC patch in June last year, but received no replies.
I've been running with this on my hardware (a MikroTik RB3011) since
then with larger MTUs to support both the internal qca8k switch and
VLANs with no problems. Without the patch it's impossible to set the
larger MTU required to support this.)
Signed-off-by: NJonathan McDowell <noodles@earth.li>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

deb79cf8

net/mlx5e: Use net_prefetchw instead of prefetchw in MPWQE TX datapath · 22ee8810

由 Maxim Mikityanskiy 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit c0a62a441bbdd2cb90c6e366f185d32f554f840b
bugzilla: 51875
CVE: NA

--------------------------------

[ Upstream commit 991b2654 ]

Commit e20f0dbf ("net/mlx5e: RX, Add a prefetch command for small
L1_CACHE_BYTES") switched to using net_prefetchw at all places in mlx5e.
In the same time frame, commit 5af75c74 ("net/mlx5e: Enhanced TX
MPWQE for SKBs") added one more usage of prefetchw. When these two
changes were merged, this new occurrence of prefetchw wasn't replaced
with net_prefetchw.

This commit fixes this last occurrence of prefetchw in
mlx5e_tx_mpwqe_session_start, making the same change that was done in
mlx5e_xdp_mpwqe_session_start.
Signed-off-by: NMaxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: NSaeed Mahameed <saeedm@nvidia.com>
Reviewed-by: NTariq Toukan <tariqt@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

22ee8810

ASoC: Intel: bytcr_rt5640: Enable jack-detect support on Asus T100TAF · 2f30030d

由 Hans de Goede 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit 2d17c58a3a4f8dc4e7e770ebcdf4041eff67560f
bugzilla: 51875
CVE: NA

--------------------------------

[ Upstream commit b7c7203a ]

The Asus T100TAF uses the same jack-detect settings as the T100TA,
this has been confirmed on actual hardware.

Add these settings to the T100TAF quirks to enable jack-detect support
on the T100TAF.
Signed-off-by: NHans de Goede <hdegoede@redhat.com>
Acked-by: NPierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20210312114850.13832-1-hdegoede@redhat.comSigned-off-by: NMark Brown <broonie@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

2f30030d

tipc: convert dest node's address to network order · ba188e8b

由 Hoang Le 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit 3d1bede85632a6330bacb77a90eeeb5a956a78d0
bugzilla: 51875
CVE: NA

--------------------------------

[ Upstream commit 1980d375 ]

(struct tipc_link_info)->dest is in network order (__be32), so we must
convert the value to network order before assigning. The problem detected
by sparse:

net/tipc/netlink_compat.c:699:24: warning: incorrect type in assignment (different base types)
net/tipc/netlink_compat.c:699:24:    expected restricted __be32 [usertype] dest
net/tipc/netlink_compat.c:699:24:    got int
Acked-by: NJon Maloy <jmaloy@redhat.com>
Signed-off-by: NHoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

ba188e8b

fs: dlm: flush swork on shutdown · 18cb23a0

由 Alexander Aring 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit a407b5881686a3c08902d54d958e28f7bad4070a
bugzilla: 51875
CVE: NA

--------------------------------

[ Upstream commit eec054b5 ]

This patch fixes the flushing of send work before shutdown. The function
cancel_work_sync() is not the right workqueue functionality to use here
as it would cancel the work if the work queues itself. In cases of
EAGAIN in send() for dlm message we need to be sure that everything is
send out before. The function flush_work() will ensure that every send
work is be done inclusive in EAGAIN cases.
Signed-off-by: NAlexander Aring <aahringo@redhat.com>
Signed-off-by: NDavid Teigland <teigland@redhat.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

18cb23a0

fs: dlm: check on minimum msglen size · a4ca1658

由 Alexander Aring 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit ff58d1c72edfc000b3a4ec9d5c963023ef869999
bugzilla: 51875
CVE: NA

--------------------------------

[ Upstream commit 710176e8 ]

This patch adds an additional check for minimum dlm header size which is
an invalid dlm message and signals a broken stream. A msglen field cannot
be less than the dlm header size because the field is inclusive header
lengths.
Signed-off-by: NAlexander Aring <aahringo@redhat.com>
Signed-off-by: NDavid Teigland <teigland@redhat.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

a4ca1658

fs: dlm: add errno handling to check callback · bc131a41

由 Alexander Aring 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit ca973d2aeaf70c15e6663be3f71ba1b17a127051
bugzilla: 51875
CVE: NA

--------------------------------

[ Upstream commit 8aa9540b ]

This allows to return individual errno values for the config attribute
check callback instead of returning invalid argument only.
Signed-off-by: NAlexander Aring <aahringo@redhat.com>
Signed-off-by: NDavid Teigland <teigland@redhat.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

bc131a41

fs: dlm: fix debugfs dump · eb11bdab

由 Alexander Aring 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit 06d59d21cb05765e72a53b53a86c6be106bece88
bugzilla: 51875
CVE: NA

--------------------------------

[ Upstream commit 92c48950 ]

This patch fixes the following message which randomly pops up during
glocktop call:

seq_file: buggy .next function table_seq_next did not update position index

The issue is that seq_read_iter() in fs/seq_file.c also needs an
increment of the index in an non next record case as well which this
patch fixes otherwise seq_read_iter() will print out the above message.
Signed-off-by: NAlexander Aring <aahringo@redhat.com>
Signed-off-by: NDavid Teigland <teigland@redhat.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

eb11bdab

ath11k: fix thermal temperature read · 5bba8c6f

由 Pradeep Kumar Chitrapu 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit bd6017a942b9343c1e6a99eef9c64fa264a1a53b
bugzilla: 51875
CVE: NA

--------------------------------

[ Upstream commit e3de5bb7 ]

Fix dangling pointer in thermal temperature event which causes
incorrect temperature read.

Tested-on: IPQ8074 AHB WLAN.HK.2.4.0.1-00041-QCAHKSWPL_SILICONZ-1
Signed-off-by: NPradeep Kumar Chitrapu <pradeepc@codeaurora.org>
Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210218182708.8844-1-pradeepc@codeaurora.orgSigned-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

5bba8c6f

kvm: Cap halt polling at kvm->max_halt_poll_ns · 9403f6cc

由 David Matlack 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit 21756f878e827784213df136e678fed0ce9f0e30
bugzilla: 51875
CVE: NA

--------------------------------

commit 258785ef upstream.

When growing halt-polling, there is no check that the poll time exceeds
the per-VM limit. It's possible for vcpu->halt_poll_ns to grow past
kvm->max_halt_poll_ns and stay there until a halt which takes longer
than kvm->halt_poll_ns.
Signed-off-by: NDavid Matlack <dmatlack@google.com>
Signed-off-by: NVenkatesh Srinivas <venkateshs@chromium.org>
Message-Id: <20210506152442.4010298-1-venkateshs@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

9403f6cc

cpufreq: intel_pstate: Use HWP if enabled by platform firmware · 4c94d08c

由 Rafael J. Wysocki 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit 53d7eed0315a7e6eaf8664c11c123095cf356ece
bugzilla: 51875
CVE: NA

--------------------------------

commit e5af36b2 upstream.

It turns out that there are systems where HWP is enabled during
initialization by the platform firmware (BIOS), but HWP EPP support
is not advertised.

After commit 7aa10312 ("cpufreq: intel_pstate: Avoid enabling HWP
if EPP is not supported") intel_pstate refuses to use HWP on those
systems, but the fallback PERF_CTL interface does not work on them
either because of enabled HWP, and once enabled, HWP cannot be
disabled.  Consequently, the users of those systems cannot control
CPU performance scaling.

Address this issue by making intel_pstate use HWP unconditionally if
it is enabled already when the driver starts.

Fixes: 7aa10312 ("cpufreq: intel_pstate: Avoid enabling HWP if EPP is not supported")
Reported-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

4c94d08c

PM: runtime: Fix unpaired parent child_count for force_resume · 19e7fce9

由 Tony Lindgren 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit 182f1f72af2e6803f1470a7e16a76ef0c63cc124
bugzilla: 51875
CVE: NA

--------------------------------

commit c745253e upstream.

As pm_runtime_need_not_resume() relies also on usage_count, it can return
a different value in pm_runtime_force_suspend() compared to when called in
pm_runtime_force_resume(). Different return values can happen if anything
calls PM runtime functions in between, and causes the parent child_count
to increase on every resume.

So far I've seen the issue only for omapdrm that does complicated things
with PM runtime calls during system suspend for legacy reasons:

omap_atomic_commit_tail() for omapdrm.0
 dispc_runtime_get()
  wakes up 58000000.dss as it's the dispc parent
   dispc_runtime_resume()
    rpm_resume() increases parent child_count
 dispc_runtime_put() won't idle, PM runtime suspend blocked
pm_runtime_force_suspend() for 58000000.dss, !pm_runtime_need_not_resume()
 __update_runtime_status()
system suspended
pm_runtime_force_resume() for 58000000.dss, pm_runtime_need_not_resume()
 pm_runtime_enable() only called because of pm_runtime_need_not_resume()
omap_atomic_commit_tail() for omapdrm.0
 dispc_runtime_get()
  wakes up 58000000.dss as it's the dispc parent
   dispc_runtime_resume()
    rpm_resume() increases parent child_count
 dispc_runtime_put() won't idle, PM runtime suspend blocked
...
rpm_suspend for 58000000.dss but parent child_count is now unbalanced

Let's fix the issue by adding a flag for needs_force_resume and use it in
pm_runtime_force_resume() instead of pm_runtime_need_not_resume().

Additionally omapdrm system suspend could be simplified later on to avoid
lots of unnecessary PM runtime calls and the complexity it adds. The
driver can just use internal functions that are shared between the PM
runtime and system suspend related functions.

Fixes: 4918e1f8 ("PM / runtime: Rework pm_runtime_force_suspend/resume()")
Signed-off-by: NTony Lindgren <tony@atomide.com>
Reviewed-by: NUlf Hansson <ulf.hansson@linaro.org>
Tested-by: NTomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Cc: 4.16+ <stable@vger.kernel.org> # 4.16+
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

19e7fce9

ACPI: PM: Add ACPI ID of Alder Lake Fan · b379cc8c

由 Sumeet Pawnikar 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit e97da47e9be04b6cc98451bd6cac779d1f1a74dc
bugzilla: 51875
CVE: NA

--------------------------------

commit 2404b874 upstream.

Add a new unique fan ACPI device ID for Alder Lake to
support it in acpi_dev_pm_attach() function.

Fixes: 38748bcb ("ACPI: DPTF: Support Alder Lake")
Signed-off-by: NSumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Acked-by: NZhang Rui <rui.zhang@intel.com>
Cc: 5.10+ <stable@vger.kernel.org> # 5.10+
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

b379cc8c

KVM/VMX: Invoke NMI non-IST entry instead of IST entry · 85348c13

由 Lai Jiangshan 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit bfccc4eade2bec1493f891ebcd3c6751eee971c9
bugzilla: 51875
CVE: NA

--------------------------------

commit a217a659 upstream.

In VMX, the host NMI handler needs to be invoked after NMI VM-Exit.
Before commit 1a5488ef ("KVM: VMX: Invoke NMI handler via indirect
call instead of INTn"), this was done by INTn ("int $2"). But INTn
microcode is relatively expensive, so the commit reworked NMI VM-Exit
handling to invoke the kernel handler by function call.

But this missed a detail. The NMI entry point for direct invocation is
fetched from the IDT table and called on the kernel stack.  But on 64-bit
the NMI entry installed in the IDT expects to be invoked on the IST stack.
It relies on the "NMI executing" variable on the IST stack to work
correctly, which is at a fixed position in the IST stack.  When the entry
point is unexpectedly called on the kernel stack, the RSP-addressed "NMI
executing" variable is obviously also on the kernel stack and is
"uninitialized" and can cause the NMI entry code to run in the wrong way.

Provide a non-ist entry point for VMX which shares the C-function with
the regular NMI entry and invoke the new asm entry point instead.

On 32-bit this just maps to the regular NMI entry point as 32-bit has no
ISTs and is not affected.

[ tglx: Made it independent for backporting, massaged changelog ]

Fixes: 1a5488ef ("KVM: VMX: Invoke NMI handler via indirect call instead of INTn")
Signed-off-by: NLai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Tested-by: NLai Jiangshan <laijs@linux.alibaba.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87r1imi8i1.ffs@nanos.tec.linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

85348c13

KVM: x86/mmu: Remove the defunct update_pte() paging hook · 318c384e

由 Sean Christopherson 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit 21f317826e170c1cf03944d7ce7b9142c238fb71
bugzilla: 51875
CVE: NA

--------------------------------

commit c5e2184d upstream.

Remove the update_pte() shadow paging logic, which was obsoleted by
commit 4731d4c7 ("KVM: MMU: out of sync shadow core"), but never
removed.  As pointed out by Yu, KVM never write protects leaf page
tables for the purposes of shadow paging, and instead marks their
associated shadow page as unsync so that the guest can write PTEs at
will.

The update_pte() path, which predates the unsync logic, optimizes COW
scenarios by refreshing leaf SPTEs when they are written, as opposed to
zapping the SPTE, restarting the guest, and installing the new SPTE on
the subsequent fault.  Since KVM no longer write-protects leaf page
tables, update_pte() is unreachable and can be dropped.
Reported-by: NYu Zhang <yu.c.zhang@intel.com>
Signed-off-by: NSean Christopherson <seanjc@google.com>
Message-Id: <20210115004051.4099250-1-seanjc@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

318c384e

tpm, tpm_tis: Reserve locality in tpm_tis_resume() · 6518d6ad

由 Jarkko Sakkinen 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit 53171e68a509f185d38c6df9fb9727e3ca90348c
bugzilla: 51875
CVE: NA

--------------------------------

commit 8a2d296a upstream.

Reserve locality in tpm_tis_resume(), as it could be unsert after waking
up from a sleep state.

Cc: stable@vger.kernel.org
Cc: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Reported-by: NHans de Goede <hdegoede@redhat.com>
Fixes: a3fbfae8 ("tpm: take TPM chip power gating out of tpm_transmit()")
Signed-off-by: NJarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

6518d6ad

tpm, tpm_tis: Extend locality handling to TPM2 in tpm_tis_gen_interrupt() · 6411963f

由 Jarkko Sakkinen 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit 923866165610d831fe6f5e53379bd57dfa553697
bugzilla: 51875
CVE: NA

--------------------------------

commit e630af7d upstream.

The earlier fix (linked) only partially fixed the locality handling bug
in tpm_tis_gen_interrupt(), i.e. only for TPM 1.x.

Extend the locality handling to cover TPM2.

Cc: Hans de Goede <hdegoede@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/linux-integrity/20210220125534.20707-1-jarkko@kernel.org/
Fixes: a3fbfae8 ("tpm: take TPM chip power gating out of tpm_transmit()")
Reported-by: NLino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: NJarkko Sakkinen <jarkko@kernel.org>
Tested-by: NLino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

6411963f

tpm: fix error return code in tpm2_get_cc_attrs_tbl() · 5963f1b7

由 Zhen Lei 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit 8fe5a459186a2895041e97ae8c265d79725aaed5
bugzilla: 51875
CVE: NA

--------------------------------

commit 1df83992 upstream.

If the total number of commands queried through TPM2_CAP_COMMANDS is
different from that queried through TPM2_CC_GET_CAPABILITY, it indicates
an unknown error. In this case, an appropriate error code -EFAULT should
be returned. However, we currently do not explicitly assign this error
code to 'rc'. As a result, 0 was incorrectly returned.

Cc: stable@vger.kernel.org
Fixes: 58472f5c("tpm: validate TPM 2.0 commands")
Reported-by: NHulk Robot <hulkci@huawei.com>
Signed-off-by: NZhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: NJarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: NJarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

5963f1b7

KEYS: trusted: Fix memory leak on object td · 59da819e

由 Colin Ian King 提交于 5月 25, 2021

stable inclusion
from stable-5.10.38
commit 31c9a4b24d86cbb36ff0d7a085725a3b4f0138c8
bugzilla: 51875
CVE: NA

--------------------------------

commit 83a775d5 upstream.

Two error return paths are neglecting to free allocated object td,
causing a memory leak. Fix this by returning via the error return
path that securely kfree's td.

Fixes clang scan-build warning:
security/keys/trusted-keys/trusted_tpm1.c:496:10: warning: Potential
memory leak [unix.Malloc]

Cc: stable@vger.kernel.org
Fixes: 5df16caa ("KEYS: trusted: Fix incorrect handling of tpm_get_random()")
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Reviewed-by: NNick Desaulniers <ndesaulniers@google.com>
Reviewed-by: NJarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: NJarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

59da819e

sctp: delay auto_asconf init until binding the first addr · 7243647b

由 Xin Long 提交于 5月 24, 2021

stable inclusion
from stable-5.10.37
commit 42f1b8653f85924743ea5b57b051a4e1f05b5e43
bugzilla: 51868
CVE: NA

--------------------------------

commit 34e5b011 upstream.

As Or Cohen described:

  If sctp_destroy_sock is called without sock_net(sk)->sctp.addr_wq_lock
  held and sp->do_auto_asconf is true, then an element is removed
  from the auto_asconf_splist without any proper locking.

  This can happen in the following functions:
  1. In sctp_accept, if sctp_sock_migrate fails.
  2. In inet_create or inet6_create, if there is a bpf program
     attached to BPF_CGROUP_INET_SOCK_CREATE which denies
     creation of the sctp socket.

This patch is to fix it by moving the auto_asconf init out of
sctp_init_sock(), by which inet_create()/inet6_create() won't
need to operate it in sctp_destroy_sock() when calling
sk_common_release().

It also makes more sense to do auto_asconf init while binding the
first addr, as auto_asconf actually requires an ANY addr bind,
see it in sctp_addr_wq_timeout_handler().

This addresses CVE-2021-23133.

Fixes: 61023658 ("bpf: Add new cgroup attach type to enable sock modifications")
Reported-by: NOr Cohen <orcohen@paloaltonetworks.com>
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

7243647b

Revert "net/sctp: fix race condition in sctp_destroy_sock" · 3c989608

由 Xin Long 提交于 5月 24, 2021

stable inclusion
from stable-5.10.37
commit 14919cdf68d03ae59d52fb78e4f998996333e629
bugzilla: 51868
CVE: NA

--------------------------------

commit 01bfe5e8 upstream.

This reverts commit b166a20b.

This one has to be reverted as it introduced a dead lock, as
syzbot reported:

       CPU0                    CPU1
       ----                    ----
  lock(&net->sctp.addr_wq_lock);
                               lock(slock-AF_INET6);
                               lock(&net->sctp.addr_wq_lock);
  lock(slock-AF_INET6);

CPU0 is the thread of sctp_addr_wq_timeout_handler(), and CPU1
is that of sctp_close().

The original issue this commit fixed will be fixed in the next
patch.

Reported-by: syzbot+959223586843e69a2674@syzkaller.appspotmail.com
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

3c989608

smp: Fix smp_call_function_single_async prototype · 6f1d7916

由 Arnd Bergmann 提交于 5月 24, 2021

stable inclusion
from stable-5.10.37
commit 41f1aed56de5b478002e98c3572664e592666f13
bugzilla: 51868
CVE: NA

--------------------------------

commit 1139aeb1 upstream.

As of commit 966a9671 ("smp: Avoid using two cache lines for struct
call_single_data"), the smp code prefers 32-byte aligned call_single_data
objects for performance reasons, but the block layer includes an instance
of this structure in the main 'struct request' that is more senstive
to size than to performance here, see 4ccafe03 ("block: unalign
call_single_data in struct request").

The result is a violation of the calling conventions that clang correctly
points out:

block/blk-mq.c:630:39: warning: passing 8-byte aligned argument to 32-byte aligned parameter 2 of 'smp_call_function_single_async' may result in an unaligned pointer access [-Walign-mismatch]
                smp_call_function_single_async(cpu, &rq->csd);

It does seem that the usage of the call_single_data without cache line
alignment should still be allowed by the smp code, so just change the
function prototype so it accepts both, but leave the default alignment
unchanged for the other users. This seems better to me than adding
a local hack to shut up an otherwise correct warning in the caller.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: NJens Axboe <axboe@kernel.dk>
Link: https://lkml.kernel.org/r/20210505211300.3174456-1-arnd@kernel.org
[nc: Fix conflicts, modify rq_csd_init]
Signed-off-by: NNathan Chancellor <nathan@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

6f1d7916

net: Only allow init netns to set default tcp cong to a restricted algo · 06756b6f

由 Jonathon Reinhart 提交于 5月 24, 2021

stable inclusion
from stable-5.10.37
commit 6c1ea8bee75df8fe2184a50fcd0f70bf82986f42
bugzilla: 51868
CVE: NA

--------------------------------

commit 8d432592 upstream.

tcp_set_default_congestion_control() is netns-safe in that it writes
to &net->ipv4.tcp_congestion_control, but it also sets
ca->flags |= TCP_CONG_NON_RESTRICTED which is not namespaced.
This has the unintended side-effect of changing the global
net.ipv4.tcp_allowed_congestion_control sysctl, despite the fact that it
is read-only: 97684f09 ("net: Make tcp_allowed_congestion_control
readonly in non-init netns")

Resolve this netns "leak" by only allowing the init netns to set the
default algorithm to one that is restricted. This restriction could be
removed if tcp_allowed_congestion_control were namespace-ified in the
future.

This bug was uncovered with
https://github.com/JonathonReinhart/linux-netns-sysctl-verify

Fixes: 6670e152 ("tcp: Namespace-ify sysctl_tcp_default_congestion_control")
Signed-off-by: NJonathon Reinhart <jonathon.reinhart@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

06756b6f

bpf: Prevent writable memory-mapping of read-only ringbuf pages · 1d2f9617

由 Andrii Nakryiko 提交于 5月 24, 2021

stable inclusion
from stable-5.10.37
commit 00d9f429af039a76a301c1eb7b9e617e9caaf7d2
bugzilla: 51868
CVE: NA

--------------------------------

commit 04ea3086 upstream.

Only the very first page of BPF ringbuf that contains consumer position
counter is supposed to be mapped as writeable by user-space. Producer
position is read-only and can be modified only by the kernel code. BPF ringbuf
data pages are read-only as well and are not meant to be modified by
user-code to maintain integrity of per-record headers.

This patch allows to map only consumer position page as writeable and
everything else is restricted to be read-only. remap_vmalloc_range()
internally adds VM_DONTEXPAND, so all the established memory mappings can't be
extended, which prevents any future violations through mremap()'ing.

Fixes: 457f4436 ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: Ryota Shiga (Flatt Security)
Reported-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NAlexei Starovoitov <ast@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

1d2f9617

bpf, ringbuf: Deny reserve of buffers larger than ringbuf · 4fea3225

由 Thadeu Lima de Souza Cascardo 提交于 5月 24, 2021

stable inclusion
from stable-5.10.37
commit 1ca284f0867079a34f52a6f811747695828166c6
bugzilla: 51868
CVE: NA

--------------------------------

commit 4b81cceb upstream.

A BPF program might try to reserve a buffer larger than the ringbuf size.
If the consumer pointer is way ahead of the producer, that would be
successfully reserved, allowing the BPF program to read or write out of
the ringbuf allocated area.

Reported-by: Ryota Shiga (Flatt Security)
Fixes: 457f4436 ("bpf: Implement BPF ring buffer and verifier support for it")
Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NAndrii Nakryiko <andrii@kernel.org>
Acked-by: NAlexei Starovoitov <ast@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

4fea3225

bpf: Fix alu32 const subreg bound tracking on bitwise operations · 74bf2740

由 Daniel Borkmann 提交于 5月 24, 2021

stable inclusion
from stable-5.10.37
commit 282bfc8848eaa195d5e994bb700f2c7afb7eb3e6
bugzilla: 51868
CVE: NA

--------------------------------

commit 049c4e13 upstream.

Fix a bug in the verifier's scalar32_min_max_*() functions which leads to
incorrect tracking of 32 bit bounds for the simulation of and/or/xor bitops.
When both the src & dst subreg is a known constant, then the assumption is
that scalar_min_max_*() will take care to update bounds correctly. However,
this is not the case, for example, consider a register R2 which has a tnum
of 0xffffffff00000000, meaning, lower 32 bits are known constant and in this
case of value 0x00000001. R2 is then and'ed with a register R3 which is a
64 bit known constant, here, 0x100000002.

What can be seen in line '10:' is that 32 bit bounds reach an invalid state
where {u,s}32_min_value > {u,s}32_max_value. The reason is scalar32_min_max_*()
delegates 32 bit bounds updates to scalar_min_max_*(), however, that really
only takes place when both the 64 bit src & dst register is a known constant.
Given scalar32_min_max_*() is intended to be designed as closely as possible
to scalar_min_max_*(), update the 32 bit bounds in this situation through
__mark_reg32_known() which will set all {u,s}32_{min,max}_value to the correct
constant, which is 0x00000000 after the fix (given 0x00000001 & 0x00000002 in
32 bit space). This is possible given var32_off already holds the final value
as dst_reg->var_off is updated before calling scalar32_min_max_*().

Before fix, invalid tracking of R2:

  [...]
  9: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=-9223372036854775807 (0x8000000000000001),smax_value=9223372032559808513 (0x7fffffff00000001),umin_value=1,umax_value=0xffffffff00000001,var_off=(0x1; 0xffffffff00000000),s32_min_value=1,s32_max_value=1,u32_min_value=1,u32_max_value=1) R3_w=inv4294967298 R10=fp0
  9: (5f) r2 &= r3
  10: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=0,smax_value=4294967296 (0x100000000),umin_value=0,umax_value=0x100000000,var_off=(0x0; 0x100000000),s32_min_value=1,s32_max_value=0,u32_min_value=1,u32_max_value=0) R3_w=inv4294967298 R10=fp0
  [...]

After fix, correct tracking of R2:

  [...]
  9: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=-9223372036854775807 (0x8000000000000001),smax_value=9223372032559808513 (0x7fffffff00000001),umin_value=1,umax_value=0xffffffff00000001,var_off=(0x1; 0xffffffff00000000),s32_min_value=1,s32_max_value=1,u32_min_value=1,u32_max_value=1) R3_w=inv4294967298 R10=fp0
  9: (5f) r2 &= r3
  10: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=0,smax_value=4294967296 (0x100000000),umin_value=0,umax_value=0x100000000,var_off=(0x0; 0x100000000),s32_min_value=0,s32_max_value=0,u32_min_value=0,u32_max_value=0) R3_w=inv4294967298 R10=fp0
  [...]

Fixes: 3f50f132 ("bpf: Verifier, do explicit ALU32 bounds tracking")
Fixes: 2921c90d ("bpf: Fix a verifier failure with xor")
Reported-by: Manfred Paul (@_manfp)
Reported-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Reviewed-by: NJohn Fastabend <john.fastabend@gmail.com>
Acked-by: NAlexei Starovoitov <ast@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

74bf2740

afs: Fix speculative status fetches · 7a127256

由 David Howells 提交于 5月 24, 2021

stable inclusion
from stable-5.10.37
commit f76e0829bbabf358ae3309b43ed18e0d32295c86
bugzilla: 51868
CVE: NA

--------------------------------

[ Upstream commit 22650f14 ]

The generic/464 xfstest causes kAFS to emit occasional warnings of the
form:

        kAFS: vnode modified {100055:8a} 30->31 YFS.StoreData64 (c=6015)

This indicates that the data version received back from the server did not
match the expected value (the DV should be incremented monotonically for
each individual modification op committed to a vnode).

What is happening is that a lookup call is doing a bulk status fetch
speculatively on a bunch of vnodes in a directory besides getting the
status of the vnode it's actually interested in.  This is racing with a
StoreData operation (though it could also occur with, say, a MakeDir op).

On the client, a modification operation locks the vnode, but the bulk
status fetch only locks the parent directory, so no ordering is imposed
there (thereby avoiding an avenue to deadlock).

On the server, the StoreData op handler doesn't lock the vnode until it's
received all the request data, and downgrades the lock after committing the
data until it has finished sending change notifications to other clients -
which allows the status fetch to occur before it has finished.

This means that:

 - a status fetch can access the target vnode either side of the exclusive
   section of the modification

 - the status fetch could start before the modification, yet finish after,
   and vice-versa.

 - the status fetch and the modification RPCs can complete in either order.

 - the status fetch can return either the before or the after DV from the
   modification.

 - the status fetch might regress the locally cached DV.

Some of these are handled by the previous fix[1], but that's not sufficient
because it checks the DV it received against the DV it cached at the start
of the op, but the DV might've been updated in the meantime by a locally
generated modification op.

Fix this by the following means:

 (1) Keep track of when we're performing a modification operation on a
     vnode.  This is done by marking vnode parameters with a 'modification'
     note that causes the AFS_VNODE_MODIFYING flag to be set on the vnode
     for the duration.

 (2) Alter the speculation race detection to ignore speculative status
     fetches if either the vnode is marked as being modified or the data
     version number is not what we expected.

Note that whilst the "vnode modified" warning does get recovered from as it
causes the client to refetch the status at the next opportunity, it will
also invalidate the pagecache, so changes might get lost.

Fixes: a9e5c87c ("afs: Fix speculative status fetch going out of order wrt to modifications")
Reported-by: NMarc Dionne <marc.dionne@auristor.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Tested-and-reviewed-by: NMarc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/160605082531.252452.14708077925602709042.stgit@warthog.procyon.org.uk/ [1]
Link: https://lore.kernel.org/linux-fsdevel/161961335926.39335.2552653972195467566.stgit@warthog.procyon.org.uk/ # v1
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

7a127256

mm/memory-failure: unnecessary amount of unmapping · b978cfdc

由 Jane Chu 提交于 5月 24, 2021

stable inclusion
from stable-5.10.37
commit 949e7c5f4957cd19670daa21d0ffc93c5d314446
bugzilla: 51868
CVE: NA

--------------------------------

[ Upstream commit 4d75136b ]

It appears that unmap_mapping_range() actually takes a 'size' as its third
argument rather than a location, the current calling fashion causes
unnecessary amount of unmapping to occur.

Link: https://lkml.kernel.org/r/20210420002821.2749748-1-jane.chu@oracle.com
Fixes: 6100e34b ("mm, memory_failure: Teach memory_failure() about dev_pagemap pages")
Signed-off-by: NJane Chu <jane.chu@oracle.com>
Reviewed-by: NDan Williams <dan.j.williams@intel.com>
Reviewed-by: NNaoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

b978cfdc

mm/sparse: add the missing sparse_buffer_fini() in error branch · 6ae489f1

由 Wang Wensheng 提交于 5月 24, 2021

stable inclusion
from stable-5.10.37
commit 62d96faa74c8b00f79f84ef1d2b7c438735fdcc3
bugzilla: 51868
CVE: NA

--------------------------------

[ Upstream commit 2284f47f ]

sparse_buffer_init() and sparse_buffer_fini() should appear in pair, or a
WARN issue would be through the next time sparse_buffer_init() runs.

Add the missing sparse_buffer_fini() in error branch.

Link: https://lkml.kernel.org/r/20210325113155.118574-1-wangwensheng4@huawei.com
Fixes: 85c77f79 ("mm/sparse: add new sparse_init_nid() and sparse_init()")
Signed-off-by: NWang Wensheng <wangwensheng4@huawei.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Reviewed-by: NOscar Salvador <osalvador@suse.de>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

6ae489f1

mm: memcontrol: slab: fix obtain a reference to a freeing memcg · 197d839d

由 Muchun Song 提交于 5月 24, 2021

stable inclusion
from stable-5.10.37
commit 31df8bc4d3feca9f9c6b2cd06fd64a111ae1a0e6
bugzilla: 51868
CVE: NA

--------------------------------

[ Upstream commit 9f38f03a ]

Patch series "Use obj_cgroup APIs to charge kmem pages", v5.

Since Roman's series "The new cgroup slab memory controller" applied.
All slab objects are charged with the new APIs of obj_cgroup.  The new
APIs introduce a struct obj_cgroup to charge slab objects.  It prevents
long-living objects from pinning the original memory cgroup in the
memory.  But there are still some corner objects (e.g.  allocations
larger than order-1 page on SLUB) which are not charged with the new
APIs.  Those objects (include the pages which are allocated from buddy
allocator directly) are charged as kmem pages which still hold a
reference to the memory cgroup.

E.g.  We know that the kernel stack is charged as kmem pages because the
size of the kernel stack can be greater than 2 pages (e.g.  16KB on
x86_64 or arm64).  If we create a thread (suppose the thread stack is
charged to memory cgroup A) and then move it from memory cgroup A to
memory cgroup B.  Because the kernel stack of the thread hold a
reference to the memory cgroup A.  The thread can pin the memory cgroup
A in the memory even if we remove the cgroup A.  If we want to see this
scenario by using the following script.  We can see that the system has
added 500 dying cgroups (This is not a real world issue, just a script
to show that the large kmallocs are charged as kmem pages which can pin
the memory cgroup in the memory).

	#!/bin/bash

	cat /proc/cgroups | grep memory

	cd /sys/fs/cgroup/memory
	echo 1 > memory.move_charge_at_immigrate

	for i in range{1..500}
	do
		mkdir kmem_test
		echo $$ > kmem_test/cgroup.procs
		sleep 3600 &
		echo $$ > cgroup.procs
		echo `cat kmem_test/cgroup.procs` > cgroup.procs
		rmdir kmem_test
	done

	cat /proc/cgroups | grep memory

This patchset aims to make those kmem pages to drop the reference to
memory cgroup by using the APIs of obj_cgroup.  Finally, we can see that
the number of the dying cgroups will not increase if we run the above test
script.

This patch (of 7):

The rcu_read_lock/unlock only can guarantee that the memcg will not be
freed, but it cannot guarantee the success of css_get (which is in the
refill_stock when cached memcg changed) to memcg.

  rcu_read_lock()
  memcg = obj_cgroup_memcg(old)
  __memcg_kmem_uncharge(memcg)
      refill_stock(memcg)
          if (stock->cached != memcg)
              // css_get can change the ref counter from 0 back to 1.
              css_get(&memcg->css)
  rcu_read_unlock()

This fix is very like the commit:

  eefbfa7f ("mm: memcg/slab: fix use after free in obj_cgroup_charge")

Fix this by holding a reference to the memcg which is passed to the
__memcg_kmem_uncharge() before calling __memcg_kmem_uncharge().

Link: https://lkml.kernel.org/r/20210319163821.20704-1-songmuchun@bytedance.com
Link: https://lkml.kernel.org/r/20210319163821.20704-2-songmuchun@bytedance.com
Fixes: 3de7d4f2 ("mm: memcg/slab: optimize objcg stock draining")
Signed-off-by: NMuchun Song <songmuchun@bytedance.com>
Reviewed-by: NShakeel Butt <shakeelb@google.com>
Acked-by: NRoman Gushchin <guro@fb.com>
Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

197d839d

mm/sl?b.c: remove ctor argument from kmem_cache_flags · a446df0b

由 Nikolay Borisov 提交于 5月 24, 2021

stable inclusion
from stable-5.10.37
commit 2e95bc6cfed1dc5888d8bbc8773a8fa171dbc062
bugzilla: 51868
CVE: NA

--------------------------------

[ Upstream commit 37540008 ]

This argument hasn't been used since e153362a ("slub: Remove objsize
check in kmem_cache_flags()") so simply remove it.

Link: https://lkml.kernel.org/r/20210126095733.974665-1-nborisov@suse.comSigned-off-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com>
Reviewed-by: NVlastimil Babka <vbabka@suse.cz>
Acked-by: NChristoph Lameter <cl@linux.com>
Acked-by: NDavid Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

a446df0b

kfifo: fix ternary sign extension bugs · d4205c6d

由 Dan Carpenter 提交于 5月 24, 2021

stable inclusion
from stable-5.10.37
commit 618fa6a35c798557c63f971cbaac1d9296fd88af
bugzilla: 51868
CVE: NA

--------------------------------

[ Upstream commit 926ee00e ]

The intent with this code was to return negative error codes but instead
it returns positives.

The problem is how type promotion works with ternary operations.  These
functions return long, "ret" is an int and "copied" is a u32.  The
negative error code is first cast to u32 so it becomes a high positive and
then cast to long where it's still a positive.

We could fix this by declaring "ret" as a ssize_t but let's just get rid
of the ternaries instead.

Link: https://lkml.kernel.org/r/YIE+/cK1tBzSuQPU@mwanda
Fixes: 5bf2b193 ("kfifo: add example files to the kernel sample directory")
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Cc: Stefani Seibold <stefani@seibold.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

d4205c6d

ia64: fix EFI_DEBUG build · 2dc4926e

由 Sergei Trofimovich 提交于 5月 24, 2021

stable inclusion
from stable-5.10.37
commit c02dd80655fd76556ebe5ef0288b4e67b38026f7
bugzilla: 51868
CVE: NA

--------------------------------

[ Upstream commit e3db00b7 ]

When enabled local debugging via `#define EFI_DEBUG 1` noticed build
failure:

    arch/ia64/kernel/efi.c:564:8: error: 'i' undeclared (first use in this function)

While at it fixed benign string format mismatches visible only when
EFI_DEBUG is enabled:

    arch/ia64/kernel/efi.c:589:11:
        warning: format '%lx' expects argument of type 'long unsigned int',
        but argument 5 has type 'u64' {aka 'long long unsigned int'} [-Wformat=]

Link: https://lkml.kernel.org/r/20210328212246.685601-1-slyfox@gentoo.org
Fixes: 14fb4209 ("efi: Merge EFI system table revision and vendor checks")
Signed-off-by: NSergei Trofimovich <slyfox@gentoo.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

2dc4926e

perf session: Add swap operation for event TIME_CONV · 0c4abd0e

由 Leo Yan 提交于 5月 24, 2021

stable inclusion
from stable-5.10.37
commit c6b7e0b1ab8781f410b196f6a74a93e3ec90fdcf
bugzilla: 51868
CVE: NA

--------------------------------

[ Upstream commit 050ffc44 ]

Since commit d110162c ("perf tsc: Support cap_user_time_short for
event TIME_CONV"), the event PERF_RECORD_TIME_CONV has extended the data
structure for clock parameters.

To be backwards-compatible, this patch adds a dedicated swap operation
for the event PERF_RECORD_TIME_CONV, based on checking if the event
contains field "time_cycles", it can support both for the old and new
event formats.

Fixes: d110162c ("perf tsc: Support cap_user_time_short for event TIME_CONV")
Signed-off-by: NLeo Yan <leo.yan@linaro.org>
Acked-by: NAdrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steve MacLean <Steve.MacLean@Microsoft.com>
Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Link: https://lore.kernel.org/r/20210428120915.7123-4-leo.yan@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

0c4abd0e

perf jit: Let convert_timestamp() to be backwards-compatible · 19c79997

由 Leo Yan 提交于 5月 24, 2021

stable inclusion
from stable-5.10.37
commit 86941f8bd46ae1ddb41239ab93d0d4959a416260
bugzilla: 51868
CVE: NA

--------------------------------

[ Upstream commit aa616f5a ]

Commit d110162c ("perf tsc: Support cap_user_time_short for
event TIME_CONV") supports the extended parameters for event TIME_CONV,
but it broke the backwards compatibility, so any perf data file with old
event format fails to convert timestamp.

This patch introduces a helper event_contains() to check if an event
contains a specific member or not.  For the backwards-compatibility, if
the event size confirms the extended parameters are supported in the
event TIME_CONV, then copies these parameters.

Committer notes:

To make this compiler backwards compatible add this patch:

  -       struct perf_tsc_conversion tc = { 0 };
  +       struct perf_tsc_conversion tc = { .time_shift = 0, };

Fixes: d110162c ("perf tsc: Support cap_user_time_short for event TIME_CONV")
Signed-off-by: NLeo Yan <leo.yan@linaro.org>
Acked-by: NAdrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steve MacLean <Steve.MacLean@Microsoft.com>
Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Link: https://lore.kernel.org/r/20210428120915.7123-3-leo.yan@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>
Signed-off-by: NChen Jun <chenjun102@huawei.com>
Acked-by: NWeilong Chen <chenweilong@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

19c79997

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功