- 27 January 2021, 40 commits
-
Committed by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 22f2d230
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

When using the new adr_l/ldr_l/str_l macros to refer to external symbols from modules, the linker may emit place relative ELF relocations that need to be fixed up by the module loader. So add support for these.

Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit 22f2d230)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 0b167463
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

Like arm64, ARM supports position independent code sequences that produce symbol references with a greater reach than the ordinary adr/ldr instructions. Since on ARM the adrl pseudo-instruction is only supported in ARM mode (and not at all when using Clang), having an adr_l macro like we do on arm64 is useful, and increases symmetry as well.

Currently, we use open coded instruction sequences involving literals and arithmetic operations. Instead, we can use movw/movt pairs on v7 CPUs, circumventing the D-cache entirely. E.g., on v7+ CPUs, we can emit a PC-relative reference as follows:

       movw         <reg>, #:lower16:<sym> - (1f + 8)
       movt         <reg>, #:upper16:<sym> - (1f + 8)
  1:   add          <reg>, <reg>, pc

For older CPUs, we can emit the literal into a subsection, allowing it to be emitted out of line while retaining the ability to perform arithmetic on label offsets. E.g., on pre-v7 CPUs, we can emit a PC-relative reference as follows:

       ldr          <reg>, 2f
  1:   add          <reg>, <reg>, pc
       .subsection  1
  2:   .long        <sym> - (1b + 8)
       .previous

This is allowed by the assembler because, unlike ordinary sections, subsections are combined into a single section in the object file, and so the label references are not true cross-section references that are visible as relocations. (Subsections have been available in binutils since 2004 at least, so they should not cause any issues with older toolchains.)

So use the above to implement the macros mov_l, adr_l, ldr_l and str_l, all of which will use movw/movt pairs on v7 and later CPUs, and use PC-relative literals otherwise.

Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit 0b167463)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by David Disseldorp
stable inclusion
from stable-5.10.7
commit 6f1e88527c1869de08632efa2cc796e0131850dc
bugzilla: 47429

--------------------------------

commit 2896c938 upstream.

When attempting to match EXTENDED COPY CSCD descriptors with corresponding se_devices, target_xcopy_locate_se_dev_e4() currently iterates over LIO's global devices list which includes all configured backstores. This change ensures that only initiator-accessible backstores are considered during CSCD descriptor lookup, according to the session's se_node_acl LUN list.

To avoid LUN removal race conditions, device pinning is changed from being configfs based to instead using the se_node_acl lun_ref.

Reference: CVE-2020-28374
Fixes: cbf031f4 ("target: Add support for EXTENDED_COPY copy offload emulation")
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Ping-Ke Shih
stable inclusion
from stable-5.10.7
commit 513729aecb53cdd0ba4e5e5aebc8b2fddcb0131e
bugzilla: 47429

--------------------------------

commit 4dfde294 upstream.

request_firmware_nowait(), which schedules another work, is used to load firmware when USB is probing. If the USB device is unplugged before the firmware work runs, the disconnect ops run and then cause a use-after-free. Though we wait for completion of the firmware work before freeing the hw, the firmware callback raises the completion too early. So I move it to the last step.

usb 5-1: Direct firmware load for rtlwifi/rtl8192cufw.bin failed with error -2
rtlwifi: Loading alternative firmware rtlwifi/rtl8192cufw.bin
rtlwifi: Selected firmware is not available

==================================================================
BUG: KASAN: use-after-free in rtl_fw_do_work.cold+0x68/0x6a drivers/net/wireless/realtek/rtlwifi/core.c:93
Write of size 4 at addr ffff8881454cff50 by task kworker/0:6/7379

CPU: 0 PID: 7379 Comm: kworker/0:6 Not tainted 5.10.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events request_firmware_work_func
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0xae/0x4c8 mm/kasan/report.c:385
 __kasan_report mm/kasan/report.c:545 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:562
 rtl_fw_do_work.cold+0x68/0x6a drivers/net/wireless/realtek/rtlwifi/core.c:93
 request_firmware_work_func+0x12c/0x230 drivers/base/firmware_loader/main.c:1079
 process_one_work+0x933/0x1520 kernel/workqueue.c:2272
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2418
 kthread+0x38c/0x460 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

The buggy address belongs to the page:
page:00000000f54435b3 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1454cf
flags: 0x200000000000000()
raw: 0200000000000000 0000000000000000 ffffea00051533c8 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8881454cfe00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff8881454cfe80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff8881454cff00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                                 ^
 ffff8881454cff80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff8881454d0000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

Reported-by: syzbot+65be4277f3c489293939@syzkaller.appspotmail.com
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201214053106.7748-1-pkshih@realtek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Magnus Karlsson
stable inclusion
from stable-5.10.7
commit 0fae7d269ef7343e052bb66d4f79022e4456fe82
bugzilla: 47429

--------------------------------

commit 8bee6833 upstream.

Fix a possible memory leak when a bind of an AF_XDP socket fails. When the fill and completion rings are created, they are tied to the socket. But when the buffer pool is later created at bind time, the ownership of these two rings is transferred to the buffer pool as they might be shared between sockets (and the buffer pool cannot be created until we know what we are binding to). So, before the buffer pool is created, these two rings are cleaned up with the socket, and after they have been transferred they are cleaned up together with the buffer pool.

The problem is that ownership was transferred before it was absolutely certain that the buffer pool could be created and initialized correctly. When one of these errors occurred, the fill and completion rings belonged neither to the socket nor to the pool and were therefore leaked. Solve this by moving the ownership transfer to the point where the buffer pool has been completely set up and there is no way it can fail.

Fixes: 7361f9c3 ("xsk: Move fill and completion rings to buffer pool")
Reported-by: syzbot+cfa88ddd0655afa88763@syzkaller.appspotmail.com
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/20201214085127.3960-1-magnus.karlsson@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
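For illustration, a minimal user-space sketch of the ordering fix described above (hypothetical names, not the actual xsk code): ownership of the rings is handed to the pool only after every fallible setup step has succeeded, so each error path still knows exactly who must free them.

    #include <stdlib.h>

    struct ring { int fd; };
    struct sock_ctx { struct ring *fill, *comp; };
    struct pool { struct ring *fill, *comp; void *umem; };

    /* Hypothetical sketch: transfer ring ownership to the pool only once
     * every fallible setup step has succeeded. */
    struct pool *pool_create(struct sock_ctx *sk)
    {
        struct pool *p = calloc(1, sizeof(*p));
        if (!p)
            return NULL;            /* rings are still owned by the socket */

        p->umem = malloc(4096);     /* stand-in for fallible pool setup */
        if (!p->umem) {
            free(p);
            return NULL;            /* error path: ownership never moved */
        }

        /* Nothing past this point can fail: transfer ownership now. */
        p->fill = sk->fill;
        p->comp = sk->comp;
        sk->fill = sk->comp = NULL; /* the socket will no longer free them */
        return p;
    }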
-
Committed by Paolo Bonzini
stable inclusion
from stable-5.10.7
commit 563135ec664ffb80a2297e94d618b04b228a1262
bugzilla: 47429

--------------------------------

commit 2f80d502 upstream.

Since we know that e >= s, we can reassociate the left shift, changing the shifted number from 1 to 2 in exchange for decreasing the right hand side by 1.

Reported-by: syzbot+e87846c48bf72bc85311@syzkaller.appspotmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
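A hedged reconstruction of the arithmetic (not the literal patch): for bit positions 0 <= s <= e <= 63, a mask built as ((1ULL << (e - s + 1)) - 1) << s performs an undefined 64-bit shift when e - s == 63, while the reassociated form 2ULL << (e - s) computes the same value with a shift count of at most 63 (the e = 63, s = 0 case then relies only on well-defined unsigned wraparound).

    #include <assert.h>
    #include <stdint.h>

    /* Mask of bits s..e, assuming 0 <= s <= e <= 63. The 2ULL form avoids
     * the undefined 64-bit shift that 1ULL << (e - s + 1) hits when
     * e - s == 63. */
    static uint64_t bit_range_mask(unsigned e, unsigned s)
    {
        return ((2ULL << (e - s)) - 1) << s;
    }

    int main(void)
    {
        assert(bit_range_mask(3, 0) == 0xfULL);
        assert(bit_range_mask(63, 0) == ~0ULL);  /* old form would shift by 64 */
        assert(bit_range_mask(51, 12) == 0x000ffffffffff000ULL);
        return 0;
    }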
-
Committed by Ying-Tsun Huang
stable inclusion
from stable-5.10.7
commit 02ccda90ef7e23a225b68789bce9e8353f9caa1f
bugzilla: 47429

--------------------------------

commit cb7f4a8b upstream.

In mtrr_type_lookup(), if the input memory address region is not in the MTRR, over 4GB, and not over the top of memory, a write-back attribute is returned. These condition checks are for ensuring the input memory address region is actually mapped to the physical memory.

However, if the end address is just aligned with the top of memory, the condition check treats the address as over the top of memory, and the write-back attribute is not returned. This hits a real use case with NVDIMM: the nd_pmem module tries to map NVDIMMs as cacheable memories when NVDIMMs are connected. If an NVDIMM is the last of the DIMMs, the performance of this NVDIMM becomes very low since it is aligned with the top of memory and its memory type is uncached-minus.

Move the input end address change to inclusive up into mtrr_type_lookup(), before checking for the top of memory in either mtrr_type_lookup_{variable,fixed}() helpers.

[ bp: Massage commit message. ]

Fixes: 0cc705f5 ("x86/mm/mtrr: Clean up mtrr_type_lookup()")
Signed-off-by: Ying-Tsun Huang <ying-tsun.huang@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20201215070721.4349-1-ying-tsun.huang@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
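A small self-contained sketch of the boundary condition (hypothetical values and check, not the mtrr code itself): with an exclusive end address, a region ending exactly at the top of memory compares as being past the last mapped address; converting the end to inclusive before the comparison, as the patch does, classifies it as mapped.

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t last_mapped = 0xffffffffULL;   /* hypothetical top of memory - 1 */
        uint64_t end = 0x100000000ULL;          /* region's exclusive end == top */

        /* Exclusive end: looks out of range even though fully mapped. */
        printf("exclusive end: %s\n", end > last_mapped ? "over the top" : "mapped");

        end--;  /* convert to inclusive before the check, as the fix does */
        printf("inclusive end: %s\n", end > last_mapped ? "over the top" : "mapped");
        return 0;
    }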
-
Committed by Dan Carpenter
stable inclusion
from stable-5.10.7
commit 6e3c67976eda30959833d852bc13c7d0a342cfa9
bugzilla: 47429

--------------------------------

commit ff58f7dd upstream.

The cleanup loop is off by one: it starts at "i" when it should start at "i - 1", so it touches the element that never registered and never unregisters the zeroth element in the array.

Fixes: c52ca478 ("dmaengine: idxd: add configuration component of driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/X9nFeojulsNqUSnG@mwanda
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
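The off-by-one pattern being fixed, as a minimal sketch with generic names (not the idxd driver's actual code): when registration of element i fails, the unwind must begin at i - 1 and run down through element 0.

    #include <stdio.h>

    #define N 4

    static int register_one(int i) { return i == 2 ? -1 : 0; } /* simulate a failure */
    static void unregister_one(int i) { printf("unregister %d\n", i); }

    static int register_all(void)
    {
        int i;

        for (i = 0; i < N; i++) {
            if (register_one(i) < 0)
                goto cleanup;
        }
        return 0;

    cleanup:
        /* Start at i - 1: element i never registered. A loop that starts
         * at i and stops at 1 also never unregisters element 0. */
        while (--i >= 0)
            unregister_one(i);
        return -1;
    }

    int main(void) { return register_all() ? 1 : 0; }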
-
Committed by Pablo Neira Ayuso
stable inclusion
from stable-5.10.7
commit 8b109f4cd1dc2224f900702483be81d61beab864
bugzilla: 47429

--------------------------------

commit 95cd4bca upstream.

If userspace requests a feature which is not available in the original set definition, then bail out with EOPNOTSUPP. If userspace sends unsupported dynset flags (a new feature not supported by this kernel), then report EOPNOTSUPP to userspace. EINVAL should only be used to report malformed netlink messages from userspace.

Fixes: 22fe54d5 ("netfilter: nf_tables: add support for dynamic set updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Florian Westphal
stable inclusion
from stable-5.10.7
commit 810bc977f8a4ae5c68aa35d75ae52c45ae6db0c7
bugzilla: 47429

--------------------------------

commit 6cb56218 upstream.

syzbot reports:
detected buffer overflow in strlen
[..]
Call Trace:
 strlen include/linux/string.h:325 [inline]
 strlcpy include/linux/string.h:348 [inline]
 xt_rateest_tg_checkentry+0x2a5/0x6b0 net/netfilter/xt_RATEEST.c:143

strlcpy assumes src is a C string, so check info->name before it is used.

Reported-by: syzbot+e86f7c428c8c50db65b4@syzkaller.appspotmail.com
Fixes: 5859034d ("[NETFILTER]: x_tables: add RATEEST target")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
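A hedged sketch of the underlying rule (generic names, not the xt_RATEEST code): strlcpy()-style helpers call strlen() on the source, so a fixed-size buffer received from userspace must be validated with strnlen() before being treated as a C string.

    #include <stdio.h>
    #include <string.h>

    #define NAME_LEN 16

    /* Validate a possibly unterminated fixed-size buffer before using it
     * as a C string: strnlen() never reads past the buffer, strlen() might. */
    static int check_name(const char name[NAME_LEN])
    {
        if (strnlen(name, NAME_LEN) == NAME_LEN)
            return -1;              /* no terminator inside the buffer */
        return 0;
    }

    int main(void)
    {
        char bad[NAME_LEN];

        memset(bad, 'A', sizeof(bad));  /* userspace-controlled, no NUL */
        printf("%s\n", check_name(bad) ? "rejected" : "accepted");
        return 0;
    }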
-
Committed by Vasily Averin
stable inclusion
from stable-5.10.7
commit d17f2ccf6f995c57c25d9e7fb84edbe6e9472e96
bugzilla: 47429

--------------------------------

commit 5c8193f5 upstream.

htable_bits() can call jhash_size(32) and trigger a shift-out-of-bounds:

UBSAN: shift-out-of-bounds in net/netfilter/ipset/ip_set_hash_gen.h:151:6
shift exponent 32 is too large for 32-bit type 'unsigned int'
CPU: 0 PID: 8498 Comm: syz-executor519 Not tainted 5.10.0-rc7-next-20201208-syzkaller #0
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
 htable_bits net/netfilter/ipset/ip_set_hash_gen.h:151 [inline]
 hash_mac_create.cold+0x58/0x9b net/netfilter/ipset/ip_set_hash_gen.h:1524
 ip_set_create+0x610/0x1380 net/netfilter/ipset/ip_set_core.c:1115
 nfnetlink_rcv_msg+0xecc/0x1180 net/netfilter/nfnetlink.c:252
 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494
 nfnetlink_rcv+0x1ac/0x420 net/netfilter/nfnetlink.c:600
 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
 netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330
 netlink_sendmsg+0x907/0xe40 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:672
 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2345
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2399
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2432
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

This patch replaces htable_bits() by a simple fls(hashsize - 1) call: it alone returns a valid nbits both for round and non-round hashsizes. It is fine to set any nbits here because it is validated inside the following htable_size() call, which returns 0 for nbits > 31.

Fixes: 1feab10d ("netfilter: ipset: Unified hash type generation")
Reported-by: syzbot+d66bfadebca46cf61a2b@syzkaller.appspotmail.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
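For illustration, the fls(hashsize - 1) computation in self-contained C (sketched with a compiler builtin rather than the kernel's fls()): for any hashsize >= 2 it yields the number of bits needed to index a table of at least that many buckets, round or not.

    #include <assert.h>

    /* fls(x): 1-based index of the most significant set bit; 0 for x == 0.
     * Sketched with a GCC/Clang builtin, not the kernel's fls(). */
    static int fls32(unsigned int x)
    {
        return x ? 32 - __builtin_clz(x) : 0;
    }

    static int htable_bits(unsigned int hashsize)
    {
        return fls32(hashsize - 1); /* valid for round and non-round sizes */
    }

    int main(void)
    {
        assert(htable_bits(1024) == 10);    /* round: 2^10 == 1024 */
        assert(htable_bits(1000) == 10);    /* non-round: 2^10 >= 1000 */
        assert(htable_bits(1025) == 11);
        return 0;
    }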
-
Committed by Subash Abhinov Kasiviswanathan

stable inclusion
from stable-5.10.7
commit 27bc60d9678a245bce000ba22824f91931fa24f9
bugzilla: 47429

--------------------------------

commit 443d6e86 upstream.

This fixes the dereference to fetch the RCU pointer when holding the appropriate xtables lock.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: cc00bcaa ("netfilter: x_tables: Switch synchronization to RCU")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Aaro Koskinen
stable inclusion
from stable-5.10.7
commit 56429ddfd59c11caf15870971b2c782fae80e1dd
bugzilla: 47429

--------------------------------

commit f1dc15cd upstream.

AES needs to be disabled on Nokia N950/N9 as well (HS devices), otherwise the kernel fails to boot.

Fixes: c312f066 ("ARM: dts: omap3: Migrate AES from hwmods to sysc-omap2")
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Moshe Shemesh
stable inclusion
from stable-5.10.7
commit 00a6b090d5c9eab1faf01077bc39093032eaf482
bugzilla: 47429

--------------------------------

commit b544011f upstream.

In case the WQE includes an inline header, the VLAN is inserted by the driver even if VLAN offload is set. On a geneve-over-VLAN interface, where the software parser is used, the SWP offsets should be updated according to the added VLAN.

Fixes: e3cfc7e6 ("net/mlx5e: TX, Add geneve tunnel stateless offload support")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Coly Li
stable inclusion
from stable-5.10.7
commit a3601005de8fe0b9485f5203ea4dd2fb5b08cafd
bugzilla: 47429

--------------------------------

commit b16671e8 upstream.

When the large bucket feature was added, BCH_FEATURE_INCOMPAT_LARGE_BUCKET was introduced into the incompat feature set. It used bucket_size_hi (which was added at the tail of struct cache_sb_disk) to extend the current 16-bit bucket size to 32 bits, together with the existing bucket_size in struct cache_sb_disk.

This is not a good idea; there are two obvious problems:
- Bucket size is always a power of 2. If we store log2(bucket size) in the existing bucket_size of struct cache_sb_disk, it is unnecessary to add bucket_size_hi.
- Macro csum_set() assumes d[SB_JOURNAL_BUCKETS] is the last member in struct cache_sb_disk; bucket_size_hi was added after d[], which makes csum_set() calculate an unexpected super block checksum.

To fix the above problems, this patch introduces a new incompat feature bit BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE. When this bit is set, it means bucket_size in struct cache_sb_disk stores the order of the power-of-2 bucket size value. When the user specifies a bucket size larger than 32768 sectors, BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE will be set in the incompat feature set, and bucket_size stores log2(bucket size) rather than the real bucket size value.

The obsoleted BCH_FEATURE_INCOMPAT_LARGE_BUCKET won't be used anymore; it is renamed to BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET and is still recognized by the kernel driver only for legacy compatibility purposes. The previous bucket_size_hi is renamed to obso_bucket_size_hi in struct cache_sb_disk and is not used in bcache-tools anymore. For a cache device created with the BCH_FEATURE_INCOMPAT_LARGE_BUCKET feature, bcache-tools and the kernel driver still recognize the feature string and display it as "obso_large_bucket".

With this change, the unnecessary extra space extension of the bcache on-disk super block can be avoided, and csum_set() may generate the expected checksum as well.

Fixes: ffa47032 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
Signed-off-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org # 5.9+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
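A worked sketch of the encoding change (hypothetical field and flag names, not the on-disk format itself): because bucket sizes are always powers of two, any large size fits the existing 16-bit field once the new incompat bit says the field holds log2(size).

    #include <assert.h>
    #include <stdint.h>

    #define FEAT_LOG_LARGE_BUCKET (1u << 1)     /* hypothetical feature bit */

    struct sb { uint16_t bucket_size; uint32_t incompat; };

    static void set_bucket_size(struct sb *sb, uint32_t size)
    {
        assert((size & (size - 1)) == 0);       /* always a power of two */
        if (size > 32768) {
            sb->incompat |= FEAT_LOG_LARGE_BUCKET;
            sb->bucket_size = 31 - __builtin_clz(size);  /* log2(size) */
        } else {
            sb->bucket_size = size;
        }
    }

    static uint32_t get_bucket_size(const struct sb *sb)
    {
        if (sb->incompat & FEAT_LOG_LARGE_BUCKET)
            return 1u << sb->bucket_size;
        return sb->bucket_size;
    }

    int main(void)
    {
        struct sb sb = { 0, 0 };

        set_bucket_size(&sb, 1u << 20);         /* 1M sectors > 32768 */
        assert(get_bucket_size(&sb) == 1u << 20);
        set_bucket_size(&sb, 1024);
        assert(get_bucket_size(&sb) == 1024);
        return 0;
    }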
-
Committed by Coly Li
stable inclusion
from stable-5.10.7
commit a9c413cd0cdf8823e01b59779602c54bc847962b
bugzilla: 47429

--------------------------------

commit 1dfc0686 upstream.

This patch adds a check for features which are incompatible with the currently supported feature sets. Now if a bcache device created by bcache-tools has features that the current kernel doesn't support, read_super() will fail with an error message. E.g. if an unsupported incompatible feature is detected, bcache registration will fail with the dmesg message "bcache: register_bcache() error : Unsupported incompatible feature found".

Fixes: d721a43f ("bcache: increase super block version for cache device and backing device")
Fixes: ffa47032 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
Signed-off-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org # 5.9+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
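The standard incompat-feature gate, sketched generically (not bcache's actual macros): mask off the bits the driver understands and refuse to attach if anything unknown remains.

    #include <stdint.h>
    #include <stdio.h>

    #define INCOMPAT_SUPP 0x0003u   /* bits this hypothetical driver knows */

    static int read_super(uint32_t incompat_features)
    {
        uint32_t unknown = incompat_features & ~INCOMPAT_SUPP;

        if (unknown) {
            fprintf(stderr, "Unsupported incompatible feature found: 0x%x\n",
                    unknown);
            return -1;  /* refuse to attach rather than corrupt data */
        }
        return 0;
    }

    int main(void)
    {
        read_super(0x0001u);        /* only known bits: accepted */
        read_super(0x0009u);        /* bit 3 unknown: rejected */
        return 0;
    }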
-
Committed by Coly Li
stable inclusion
from stable-5.10.7
commit fbb23cd187558a9f1256845ff9c8dd10dbeae101
bugzilla: 47429

--------------------------------

commit f7b4943d upstream.

This patch fixes the following typos:
- from BCH_FEATURE_COMPAT_SUUP to BCH_FEATURE_COMPAT_SUPP
- from BCH_FEATURE_INCOMPAT_SUUP to BCH_FEATURE_INCOMPAT_SUPP
- from BCH_FEATURE_INCOMPAT_SUUP to BCH_FEATURE_RO_COMPAT_SUPP

Fixes: d721a43f ("bcache: increase super block version for cache device and backing device")
Fixes: ffa47032 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
Signed-off-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org # 5.9+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Matthew Auld
stable inclusion
from stable-5.10.7
commit 36d366ace15444dd741d25d8142735da1dac2445
bugzilla: 47429

--------------------------------

commit 641382e9 upstream.

The reloc batch is short lived but can exist in the user visible ppGTT, and since it's backed by an internal object, which lacks page clearing, we should take care to clear it upfront.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224151358.401345-2-matthew.auld@intel.com
Cc: stable@vger.kernel.org
(cherry picked from commit 26ebc511)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Matthew Auld
stable inclusion
from stable-5.10.7
commit 13738d7d5a24a9208241dfbc997b55fcbd18c64d
bugzilla: 47429

--------------------------------

commit 75353bcd upstream.

The shadow batch is an internal object, which doesn't have any page clearing, and since the batch_len can be smaller than the object, we should take care to clear it.

Testcase: igt/gen9_exec_parse/shadow-peek
Fixes: 4f7af194 ("drm/i915: Support ro ppgtt mapped cmdparser shadow buffers")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224151358.401345-1-matthew.auld@intel.com
Cc: stable@vger.kernel.org
(cherry picked from commit eeb52ee6)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Nick Desaulniers
stable inclusion
from stable-5.10.7
commit 1cd7e30a6db615b348f12c4c9fff323ef9a11d4a
bugzilla: 47429

--------------------------------

commit 311bea3c upstream.

With GNU binutils 2.35+, linking with BFD produces warnings for vmlinux:

  aarch64-linux-gnu-ld: warning: -z norelro ignored

BFD can produce this warning when the target emulation mode does not support RELRO program headers, and -z relro or -z norelro is passed.

Alan Modra clarifies:

  The default linker emulation for an aarch64-linux ld.bfd is
  -maarch64linux, the default for an aarch64-elf linker is
  -maarch64elf. They are not equivalent. If you choose -maarch64elf
  you get an emulation that doesn't support -z relro.

The ARCH=arm64 kernel prefers -maarch64elf, but may fall back to -maarch64linux based on the toolchain configuration. LLD will always create RELRO program headers regardless of target emulation.

To avoid the above warning when linking with BFD, pass -z norelro only when linking with LLD or with -maarch64linux.

Fixes: 3b92fa74 ("arm64: link with -z norelro regardless of CONFIG_RELOCATABLE")
Fixes: 3bbd3db8 ("arm64: relocatable: fix inconsistencies in linker script and options")
Cc: <stable@vger.kernel.org> # 5.0.x-
Reported-by: kernelci.org bot <bot@kernelci.org>
Reported-by: Quentin Perret <qperret@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Alan Modra <amodra@gmail.com>
Cc: Fāng-ruì Sòng <maskray@google.com>
Link: https://lore.kernel.org/r/20201218002432.788499-1-ndesaulniers@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Charan Teja Reddy
stable inclusion
from stable-5.10.7
commit a19dae4254c434a1ac8937a809fe08fd15ad3be5
bugzilla: 47429

--------------------------------

commit 05cd8469 upstream.

A use-after-free is observed on the dmabuf's file->f_inode with a race between closing the dmabuf file and reading the dmabuf's debug info. Consider the below scenario where P1 is closing the dma_buf file and P2 is reading the dma_buf's debug info in the system:

P1                                      P2
                                        dma_buf_debug_show()
dma_buf_put()
  __fput()
    file->f_op->release()
    dput()
    ....
      dentry_unlink_inode()
        iput(dentry->d_inode)
        (where the inode is freed)
                                        mutex_lock(&db_list.lock)
                                        read 'dma_buf->file->f_inode'
                                        (the same inode is freed by P1)
                                        mutex_unlock(&db_list.lock)
    dentry->d_op->d_release()-->
      dma_buf_release()
        .....
        mutex_lock(&db_list.lock)
        removes the dmabuf from the list
        mutex_unlock(&db_list.lock)

In the above scenario, when dma_buf_put() is called on a dma_buf, it first frees the dma_buf's file->f_inode (= dentry->d_inode) and then removes this dma_buf from the system db_list. In between, P2 traversing the db_list tries to access this dma_buf's file->f_inode that was freed by P1, which is a use-after-free case.

Since __fput() calls f_op->release first and only later calls d_op->d_release, move the dma_buf's db_list removal from d_release() to f_op->release(). This ensures that the dma_buf's file->f_inode is not accessed after it is released.

Cc: <stable@vger.kernel.org> # 5.4.x-
Fixes: 4ab59c3c ("dma-buf: Move dma_buf_release() from fops to dentry_ops")
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/1609857399-31549-1-git-send-email-charante@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Bard Liao
stable inclusion
from stable-5.10.7
commit 6844bc38c9fe5c20994f4a0819eac1fc9acd80eb
bugzilla: 47429

--------------------------------

commit 47f44699 upstream.

While commit d5dcce0c ("device property: Keep secondary firmware node secondary by type") describes everything correctly in its commit message, the change it made does the opposite, and the original commit c15e1bdd ("device property: Fix the secondary firmware node handling in set_primary_fwnode()") was fully correct. Revert the former one here and improve documentation in the next patch.

Fixes: d5dcce0c ("device property: Keep secondary firmware node secondary by type")
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: 5.10+ <stable@vger.kernel.org> # 5.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Filipe Manana
stable inclusion
from stable-5.10.7
commit 5e84c99055eb84a2a6226bf6164ee70bdcfb996e
bugzilla: 47429

--------------------------------

commit 0b3f407e upstream.

When doing an incremental send, if we have a new inode that happens to have the same number that an old directory inode had in the base snapshot and that old directory has a pending rmdir operation, we end up computing a wrong path for the new inode, causing the receiver to fail.

Example reproducer:

  $ cat test-send-rmdir.sh
  #!/bin/bash

  DEV=/dev/sdi
  MNT=/mnt/sdi

  mkfs.btrfs -f $DEV >/dev/null
  mount $DEV $MNT

  mkdir $MNT/dir
  touch $MNT/dir/file1
  touch $MNT/dir/file2
  touch $MNT/dir/file3

  # Filesystem looks like:
  #
  # .                         (ino 256)
  # |----- dir/               (ino 257)
  #        |----- file1       (ino 258)
  #        |----- file2       (ino 259)
  #        |----- file3       (ino 260)
  #
  btrfs subvolume snapshot -r $MNT $MNT/snap1
  btrfs send -f /tmp/snap1.send $MNT/snap1

  # Now remove our directory and all its files.
  rm -fr $MNT/dir

  # Unmount the filesystem and mount it again. This is to ensure that
  # the next inode that is created ends up with the same inode number
  # that our directory "dir" had, 257, which is the first free "objectid"
  # available after mounting again the filesystem.
  umount $MNT
  mount $DEV $MNT

  # Now create a new file (it could be a directory as well).
  touch $MNT/newfile

  # Filesystem now looks like:
  #
  # .                         (ino 256)
  # |----- newfile            (ino 257)
  #
  btrfs subvolume snapshot -r $MNT $MNT/snap2
  btrfs send -f /tmp/snap2.send -p $MNT/snap1 $MNT/snap2

  # Now unmount the filesystem, create a new one, mount it and try to apply
  # both send streams to recreate both snapshots.
  umount $DEV

  mkfs.btrfs -f $DEV >/dev/null
  mount $DEV $MNT

  btrfs receive -f /tmp/snap1.send $MNT
  btrfs receive -f /tmp/snap2.send $MNT
  umount $MNT

When running the test, the receive operation for the incremental stream fails:

  $ ./test-send-rmdir.sh
  Create a readonly snapshot of '/mnt/sdi' in '/mnt/sdi/snap1'
  At subvol /mnt/sdi/snap1
  Create a readonly snapshot of '/mnt/sdi' in '/mnt/sdi/snap2'
  At subvol /mnt/sdi/snap2
  At subvol snap1
  At snapshot snap2
  ERROR: chown o257-9-0 failed: No such file or directory

So fix this by tracking directories that have a pending rmdir by inode number and generation number, instead of only inode number.

A test case for fstests follows soon.

Reported-by: Massimo B. <massimo.b@gmx.net>
Tested-by: Massimo B. <massimo.b@gmx.net>
Link: https://lore.kernel.org/linux-btrfs/6ae34776e85912960a253a8327068a892998e685.camel@gmx.net/
CC: stable@vger.kernel.org # 4.19+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Qu Wenruo
stable inclusion
from stable-5.10.7
commit 1888e5df8449ef16655f827bd46d0a809b3048a4
bugzilla: 47429

--------------------------------

commit ae5e070e upstream.

There is a chance of racing for qgroup flushing which may lead to deadlock:

  Thread A                        |  Thread B
  (not holding trans handle)      |  (holding a trans handle)
  --------------------------------+--------------------------------
  __btrfs_qgroup_reserve_meta()   | __btrfs_qgroup_reserve_meta()
  |- try_flush_qgroup()           | |- try_flush_qgroup()
     |- QGROUP_FLUSHING bit set   |    |
     |                            |    |- test_and_set_bit()
     |                            |    |- wait_event()
     |- btrfs_join_transaction()  |
     |- btrfs_commit_transaction()|
                    !!! DEAD LOCK !!!

Thread A wants to commit the transaction, but thread B is holding a transaction handle, blocking the commit. At the same time, thread B is waiting for thread A to finish its commit.

This is just a hot fix, and would lead to more EDQUOT when we're near the qgroup limit. The proper fix would be to make all metadata/data reservations happen without holding a transaction handle.

CC: stable@vger.kernel.org # 5.9+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Liu Yi L
stable inclusion
from stable-5.10.7
commit 1c31964eca1397b923ff388866c67a25dc24b0da
bugzilla: 47429

--------------------------------

commit 9ad9f45b upstream.

'struct intel_svm' is shared by all devices bound to a given process, but records only a single pointer to a 'struct intel_iommu'. Consequently, cache invalidations may only be applied to a single DMAR unit, and are erroneously skipped for the other devices.

In preparation for fixing this, rework the structures so that the iommu pointer resides in 'struct intel_svm_dev', allowing 'struct intel_svm' to track them in its device list.

Fixes: 1c4f88b7 ("iommu/vt-d: Shared virtual address in scalable mode")
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Raj Ashok <ashok.raj@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Reported-by: Guo Kaijie <Kaijie.Guo@intel.com>
Reported-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Guo Kaijie <Kaijie.Guo@intel.com>
Signed-off-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Tested-by: Guo Kaijie <Kaijie.Guo@intel.com>
Cc: stable@vger.kernel.org # v5.0+
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1609949037-25291-2-git-send-email-yi.l.liu@intel.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by PeiSen Hou
stable inclusion
from stable-5.10.7
commit a07c54917aad750feb4689972386b2afc19f29c8
bugzilla: 47429

--------------------------------

commit ce2e79b2 upstream.

Add two "Intel Reference board" SSIDs in the alc256. Enable "power saving mode" and enable "headset jack mode".

Signed-off-by: PeiSen Hou <pshou@realtek.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/5978d2267f034c28973d117925ec9c63@realtek.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Kai-Heng Feng
stable inclusion
from stable-5.10.7
commit 41af04d3037a03d68b110ccaf0b6b9d4a850da49
bugzilla: 47429

--------------------------------

commit a598098c upstream.

HP EliteBook 850 G7 uses the same GPIO pins as ALC285_FIXUP_HP_GPIO_LED to enable the mute and micmute LEDs. So apply the quirk to enable the LEDs.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20201230125636.45028-1-kai.heng.feng@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Manuel Jiménez
stable inclusion
from stable-5.10.7
commit 3e1bcaebe8b10aedd94907473fef7d6a04d42ac0
bugzilla: 47429

--------------------------------

commit 48422958 upstream.

HP Pavilion 13-bb0000 (SSID 103c:87c8) needs the same quirk as other models with ALC287.

Signed-off-by: Manuel Jiménez <mjbfm99@me.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/X+s/gKNydVrI6nLj@HP-Pavilion-13
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Kailang Yang
stable inclusion
from stable-5.10.7
commit 582de98b59fd2f05c3f6239184a67dec7f374be8
bugzilla: 47429

--------------------------------

commit f86de9b1 upstream.

The speaker's volume cannot be adjusted on the Lenovo C940. Applying the alc298_fixup_speaker_volume function fixes the issue.

[ Additional note: the C940 has an I2S amp for the speaker and this needs the same initialization as Dell machines. The patch was slightly modified so that the quirk entry is moved next to the corresponding Dell quirk entry. -- tiwai ]

Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/ea25b4e5c468491aa2e9d6cb1f2fced3@realtek.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by bo liu
stable inclusion
from stable-5.10.7
commit 2eda063db9922627ac5501d0187bf92cfc9065a1
bugzilla: 47429

--------------------------------

commit 744a11ab upstream.

The current kernel does not support the cx11970 codec chip. Add a codec configuration entry to the kernel.

[ Minor coding style fix by tiwai ]

Signed-off-by: bo liu <bo.liu@senarytech.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20201229035226.62120-1-bo.liu@senarytech.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Takashi Iwai
stable inclusion
from stable-5.10.7
commit c03f37d5293402f0518860456ee6fe2098f6b637
bugzilla: 47429

--------------------------------

commit 4bfd6247 upstream.

Clevo W35xSS_370SS with the VIA codec has had a runtime PM problem that loses the power state of some nodes after runtime resume. This was worked around by disabling the default runtime PM via a denylist entry. Since 5.10.x made runtime PM applied (casually) even though it's disabled in the denylist, this problem was revisited. The result was that disabling the power_save_node feature suffices for the runtime PM problem.

This patch implements the disablement of the power_save_node feature in the VIA codec for the device. It also drops the former denylist entry, as runtime PM should now work properly on the codec side.

Fixes: b529ef24 ("ALSA: hda: Add Clevo W35xSS_370SS to the power_save blacklist")
Reported-by: Christian Labisch <clnetbox@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210104153046.19993-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Tejun Heo
stable inclusion
from stable-5.10.7
commit cafc6e70a63c5ca30b1cc9ae1bb492fcc54bfd62
bugzilla: 47429

--------------------------------

commit d16baa3f upstream.

When initializing iocost for a queue, its rqos should be registered before the blkcg policy is activated to allow policy data initialization to look up the associated ioc. This unfortunately means that the rqos methods can be called on bios before iocgs are attached to all existing blkgs.

While the race is theoretically possible on ioc_rqos_throttle(), it mostly happened in ioc_rqos_merge() due to the difference in how they look up the ioc. The former determines it from the passed-in @rqos and then bails before dereferencing the iocg if the looked-up ioc is disabled, which most likely is the case if initialization is still in progress. The latter looks up the ioc by dereferencing the possibly NULL iocg, making it a lot more prone to actually triggering the bug.

* Make ioc_rqos_merge() use the same method as ioc_rqos_throttle() to look up the ioc for consistency.

* Make ioc_rqos_throttle() and ioc_rqos_merge() test for a NULL iocg before dereferencing it.

* Explain the danger of NULL iocgs in blk_iocost_init().

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jonathan Lemon <bsd@fb.com>
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Fenghua Yu
stable inclusion
from stable-5.10.7
commit 397e352ca96f3c4d79cba9ce89ea8a8852860b86
bugzilla: 47429

--------------------------------

commit a0195f31 upstream.

Shakeel Butt reported in [1] that a user can request a task to be moved to a resource group even if the task is already in the group. The move operation merely wastes time, and it can be costly since it may send an IPI to a different CPU. Add a sanity check to ensure that the move operation only happens when the task is not already in the resource group.

[1] https://lore.kernel.org/lkml/CALvZod7E9zzHwenzf7objzGKsdBmVwTgEJ0nPgs0LUFU3SN5Pw@mail.gmail.com/

Fixes: e02737d5 ("x86/intel_rdt: Add tasks files")
Reported-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/962ede65d8e95be793cb61102cca37f7bb018e66.1608243147.git.reinette.chatre@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
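A minimal sketch of the added guard (hypothetical struct fields, not the resctrl code): compare the task's current closid/rmid with the target group's and return early on a match, so no work is queued and no IPI is sent.

    #include <stdio.h>

    struct task { int closid, rmid; };
    struct group { int closid, rmid; };

    /* Skip the costly move (and any cross-CPU notification) if the task
     * is already in the target resource group. */
    static int move_task(struct task *t, const struct group *g)
    {
        if (t->closid == g->closid && t->rmid == g->rmid) {
            printf("already in group, nothing to do\n");
            return 0;
        }
        t->closid = g->closid;
        t->rmid = g->rmid;
        printf("moved (would notify the task's CPU here)\n");
        return 0;
    }

    int main(void)
    {
        struct task t = { 1, 1 };
        const struct group g = { 1, 1 };

        return move_task(&t, &g);   /* no-op: IDs already match */
    }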
-
Committed by Fenghua Yu
stable inclusion
from stable-5.10.7
commit 34e4ae4dca72fd0fd9cf6cccc260db1a12ed5a69
bugzilla: 47429

--------------------------------

commit ae28d1aa upstream.

Currently, when moving a task to a resource group the PQR_ASSOC MSR is updated with the new closid and rmid in an added task callback. If the task is running, the work is run as soon as possible. If the task is not running, the work is executed later in the kernel exit path when the kernel returns to the task again.

Updating the PQR_ASSOC MSR as soon as possible on the CPU a moved task is running is the right thing to do. Queueing work for a task that is not running is unnecessary (the PQR_ASSOC MSR is already updated when the task is scheduled in) and causing system resource waste with the way in which it is implemented: Work to update the PQR_ASSOC register is queued every time the user writes a task id to the "tasks" file, even if the task already belongs to the resource group. This could result in multiple pending work items associated with a single task even if they are all identical and even though only a single update with most recent values is needed. Specifically, even if a task is moved between different resource groups while it is sleeping then it is only the last move that is relevant but yet a work item is queued during each move. This unnecessary queueing of work items could result in significant system resource waste, especially on tasks sleeping for a long time. For example, as demonstrated by Shakeel Butt in [1] writing the same task id to the "tasks" file can quickly consume significant memory. The same problem (wasted system resources) occurs when moving a task between different resource groups.

As pointed out by Valentin Schneider in [2] there is an additional issue with the way in which the queueing of work is done in that the task_struct update is currently done after the work is queued, resulting in a race with the register update possibly done before the data needed by the update is available.

To solve these issues, update the PQR_ASSOC MSR in a synchronous way right after the new closid and rmid are ready during the task movement, only if the task is running. If a moved task is not running nothing is done since the PQR_ASSOC MSR will be updated next time the task is scheduled. This is the same way used to update the register when tasks are moved as part of resource group removal.

[1] https://lore.kernel.org/lkml/CALvZod7E9zzHwenzf7objzGKsdBmVwTgEJ0nPgs0LUFU3SN5Pw@mail.gmail.com/
[2] https://lore.kernel.org/lkml/20201123022433.17905-1-valentin.schneider@arm.com

[ bp: Massage commit message and drop the two update_task_closid_rmid() variants. ]

Fixes: e02737d5 ("x86/intel_rdt: Add tasks files")
Reported-by: Shakeel Butt <shakeelb@google.com>
Reported-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: James Morse <james.morse@arm.com>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/17aa2fb38fc12ce7bb710106b3e7c7b45acb9e94.1608243147.git.reinette.chatre@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Ben Gardon
stable inclusion
from stable-5.10.7
commit c3cf9ffe8d9c06269b2051c38f91d11ab16f8e4d
bugzilla: 47429

--------------------------------

commit a889ea54 upstream.

Many TDP MMU functions which need to perform some action on all TDP MMU roots hold a reference on that root so that they can safely drop the MMU lock in order to yield to other threads. However, when releasing the reference on the root, there is a bug: the root will not be freed even if its reference count (root_count) is reduced to 0.

To simplify acquiring and releasing references on TDP MMU root pages, and to ensure that these roots are properly freed, move the get/put operations into another TDP MMU root iterator macro. Moving the get/put operations into an iterator macro also helps simplify control flow when a root does need to be freed. Note that using the list_for_each_entry_safe macro would not have been appropriate in this situation because it could keep a pointer to the next root across an MMU lock release + reacquire, during which time that root could be freed.

Reported-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Fixes: faaf05b0 ("kvm: x86/mmu: Support zapping SPTEs in the TDP MMU")
Fixes: 063afacd ("kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU")
Fixes: a6a0b05d ("kvm: x86/mmu: Support dirty logging for the TDP MMU")
Fixes: 14881998 ("kvm: x86/mmu: Support disabling dirty logging for the tdp MMU")
Signed-off-by: Ben Gardon <bgardon@google.com>
Message-Id: <20210107001935.3732070-1-bgardon@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Lai Jiangshan
stable inclusion
from stable-5.10.7
commit ffee6772c489d8d65d86979d4ccc4286624124b2
bugzilla: 47429

--------------------------------

commit 88bf56d0 upstream.

In kvm_mmu_notifier_invalidate_range_start(), tlbs_dirty is used as:

  need_tlb_flush |= kvm->tlbs_dirty;

with need_tlb_flush's type being int and tlbs_dirty's type being long. It means that tlbs_dirty is always used as an int and the higher 32 bits are useless. We need to check tlbs_dirty in a correct way, and this change checks it directly without propagating it to need_tlb_flush.

Note: it's _extremely_ unlikely this neglecting of higher 32 bits can cause problems in practice. It would require encountering tlbs_dirty on a 4 billion count boundary, and KVM would need to be using shadow paging or be running a nested guest.

Cc: stable@vger.kernel.org
Fixes: a4ee1ca4 ("KVM: MMU: delay flush all tlbs on sync_page path")
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20201217154118.16497-1-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
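The truncation in a self-contained sketch (assumes the usual LP64 model where long is 64 bits): OR-ing a long into an int keeps only the low 32 bits, so a dirty count that is an exact multiple of 2^32 reads as zero.

    #include <stdio.h>

    int main(void)
    {
        long tlbs_dirty = 1L << 32;     /* nonzero only in the high bits */
        int need_tlb_flush = 0;

        /* Implementation-defined narrowing; on common ABIs the high
         * 32 bits are simply dropped. */
        need_tlb_flush |= tlbs_dirty;
        printf("via int flag:     %s\n", need_tlb_flush ? "flush" : "skip!");

        /* The fix: test the long directly instead of funneling it
         * through the int. */
        printf("checked directly: %s\n", tlbs_dirty ? "flush" : "skip");
        return 0;
    }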
-
Committed by Sean Christopherson
stable inclusion
from stable-5.10.7
commit f4064ef40c5c31134d6c360a1f1e9ec64e545ede
bugzilla: 47429

--------------------------------

commit 39b4d43e upstream.

Get the so called "root" level from the low level shadow page table walkers instead of manually attempting to calculate it higher up the stack, e.g. in get_mmio_spte(). When KVM is using PAE shadow paging, the starting level of the walk, from the caller's perspective, is not the CR3 root but rather the PDPTR "root". Checking for reserved bits from the CR3 root causes get_mmio_spte() to consume uninitialized stack data due to indexing into sptes[] for a level that was not filled by get_walk(). This can result in false positives and/or negatives depending on what garbage happens to be on the stack.

Opportunistically nuke a few extra newlines.

Fixes: 95fb5b02 ("kvm: x86/mmu: Support MMIO in the TDP MMU")
Reported-by: Richard Herbert <rherbert@sympatico.ca>
Cc: Ben Gardon <bgardon@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20201218003139.2167891-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Sean Christopherson
stable inclusion
from stable-5.10.7
commit afd621673f03c0eee077288ee984c2ec397e3a85
bugzilla: 47429

--------------------------------

commit 2aa07893 upstream.

Return -1 from the get_walk() helpers if the shadow walk doesn't fill at least one spte, which can theoretically happen if the walk hits a not-present PDPTR. Returning the root level in such a case will cause get_mmio_spte() to return garbage (uninitialized stack data). In practice, such a scenario should be impossible as KVM shouldn't get a reserved-bit page fault with a not-present PDPTR.

Note, using mmu->root_level in get_walk() is wrong for other reasons, too, but that's now a moot point.

Fixes: 95fb5b02 ("kvm: x86/mmu: Support MMIO in the TDP MMU")
Cc: Ben Gardon <bgardon@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20201218003139.2167891-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Dan Williams
stable inclusion
from stable-5.10.7
commit 23220e87c91f9975c45290d167b7ee3d415985dc
bugzilla: 47429

--------------------------------

commit d1c5246e upstream.

Commit 28ee90fe ("x86/mm: implement free pmd/pte page interfaces") introduced a new location where a pmd was released, but neglected to run the pmd page destructor. In fact, this happened previously for a different pmd release path and was fixed by commit c283610e ("x86, mm: do not leak page->ptl for pmd page tables").

This issue was hidden until recently because the failure mode is silent, but commit b2b29d6d ("mm: account PMD tables like PTE tables") turns the failure mode into this signature:

  BUG: Bad page state in process lt-pmem-ns pfn:15943d
  page:000000007262ed7b refcount:0 mapcount:-1024 mapping:0000000000000000 index:0x0 pfn:0x15943d
  flags: 0xaffff800000000()
  raw: 00affff800000000 dead000000000100 0000000000000000 0000000000000000
  raw: 0000000000000000 ffff913a029bcc08 00000000fffffbff 0000000000000000
  page dumped because: nonzero mapcount
  [..]
   dump_stack+0x8b/0xb0
   bad_page.cold+0x63/0x94
   free_pcp_prepare+0x224/0x270
   free_unref_page+0x18/0xd0
   pud_free_pmd_page+0x146/0x160
   ioremap_pud_range+0xe3/0x350
   ioremap_page_range+0x108/0x160
   __ioremap_caller.constprop.0+0x174/0x2b0
   ? memremap+0x7a/0x110
   memremap+0x7a/0x110
   devm_memremap+0x53/0xa0
   pmem_attach_disk+0x4ed/0x530 [nd_pmem]
   ? __devm_release_region+0x52/0x80
   nvdimm_bus_probe+0x85/0x210 [libnvdimm]

Given this is a repeat occurrence it seemed prudent to look for other places where this destructor might be missing and whether a better helper is needed. try_to_free_pmd_page() looks like a candidate, but testing with setting up and tearing down pmd mappings via the dax unit tests is thus far not triggering the failure.

As for a better helper, pmd_free() is close, but it is a messy fit due to requiring an @mm arg. Also, ___pmd_free_tlb() wants to call paravirt_tlb_remove_table() instead of free_page(), so an open-coded pgtable_pmd_page_dtor() seems the best way forward for now.

Debugged together with Matthew Wilcox <willy@infradead.org>.

Fixes: 28ee90fe ("x86/mm: implement free pmd/pte page interfaces")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/160697689204.605323.17629854984697045602.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Committed by Linus Torvalds
stable inclusion
from stable-5.10.7
commit 876195e1c8c6dcd580b648f0a691c93b86ec2042
bugzilla: 47429

--------------------------------

commit c2407cf7 upstream.

Ever since commit 2a9127fc ("mm: rewrite wait_on_page_bit_common() logic") we've had some very occasional reports of BUG_ON(PageWriteback) in write_cache_pages(), which we thought we already fixed in commit 073861ed ("mm: fix VM_BUG_ON(PageTail) and BUG_ON(PageWriteback)").

But syzbot just reported another one, even with that commit in place. And it turns out that there's a simpler way to trigger the BUG_ON() than the one Hugh found with page re-use. It all boils down to the fact that the page writeback is ostensibly serialized by the page lock, but that isn't actually really true.

Yes, the people _setting_ writeback all do so under the page lock, but the actual clearing of the bit - and waking up any waiters - happens without any page lock. This gives us this fairly simple race condition:

  CPU1 = end previous writeback
  CPU2 = start new writeback under page lock
  CPU3 = write_cache_pages()

  CPU1                  CPU2            CPU3
  ----                  ----            ----

  end_page_writeback()
    test_clear_page_writeback(page)
    ... delayed...
                        lock_page();
                        set_page_writeback()
                        unlock_page()
                                        lock_page()
                                        wait_on_page_writeback();

    wake_up_page(page, PG_writeback);
                                        .. wakes up CPU3 ..

                                        BUG_ON(PageWriteback(page));

where the BUG_ON() happens because we woke up the PG_writeback bit because of the _previous_ writeback, but a new one had already been started because the clearing of the bit wasn't actually atomic wrt the actual wakeup or serialized by the page lock. The reason this didn't use to happen was that the old logic in waiting on a page bit would just loop if it ever saw the bit set again.

The nice proper fix would probably be to get rid of the whole "wait for writeback to clear, and then set it" logic in the writeback path, and replace it with an atomic "wait-to-set" (ie the same as we have for page locking: we set the page lock bit with a single "lock_page()", not with "wait for lock bit to clear and then set it").

However, our current model for writeback is that the waiting for the writeback bit is done by the generic VFS code (ie write_cache_pages()), but the actual setting of the writeback bit is done much later by the filesystem ".writepages()" function. IOW, to make the writeback bit have that same kind of "wait-to-set" behavior as we have for page locking, we'd have to change our roughly ~50 different writeback functions. Painful.

Instead, just make "wait_on_page_writeback()" loop on the very unlikely situation that the PG_writeback bit is still set, basically re-instating the old behavior. This is very non-optimal in case of contention, but since we only ever set the bit under the page lock, that situation is controlled.

Reported-by: syzbot+2fc0712f8f8b8b8fa0ef@syzkaller.appspotmail.com
Fixes: 2a9127fc ("mm: rewrite wait_on_page_bit_common() logic")
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
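The loop-until-clear idiom the fix re-instates, modeled in self-contained user-space C (pthread primitives standing in for the kernel's page-bit waitqueues; not the literal patch):

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdbool.h>

    static atomic_bool pg_writeback;
    static pthread_mutex_t wb_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t wb_cond = PTHREAD_COND_INITIALIZER;

    /* Modeled on the fix: re-check the bit after every wakeup, because a
     * new writeback may have set it again between the wakeup and this
     * thread actually running. A single wait could return with the bit
     * set, which is exactly the BUG_ON() scenario above. */
    void wait_on_writeback_model(void)
    {
        pthread_mutex_lock(&wb_lock);
        while (atomic_load(&pg_writeback))
            pthread_cond_wait(&wb_cond, &wb_lock);
        pthread_mutex_unlock(&wb_lock);
    }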
-