- 27 Jan 2021, 40 commits
-
-
Submitted by Jouni K. Seppänen
stable inclusion
from stable-5.10.8
commit b044a949a5c5ddbe61a806bba44aab6148a6f356
bugzilla: 47450

--------------------------------

[ Upstream commit 7a68d725 ]

Aligning to tx_ndp_modulus is not sufficient because the next align call
can be cdc_ncm_align_tail, which can add up to ctx->tx_modulus +
ctx->tx_remainder - 1 bytes. This used to lead to occasional crashes on a
Huawei 909s-120 LTE module as follows:

- the condition marked /* if there is a remaining skb [...] */ is true
  so the swaps happen
- skb_out is set from ctx->tx_curr_skb
- skb_out->len is exactly 0x3f52
- ctx->tx_curr_size is 0x4000 and delayed_ndp_size is 0xac
  (note that the sum of skb_out->len and delayed_ndp_size is 0x3ffe)
- the for loop over n is executed once
- the cdc_ncm_align_tail call marked /* align beginning of next frame */
  increases skb_out->len to 0x3f56 (the sum is now 0x4002)
- the condition marked /* check if we had enough room left [...] */ is
  false so we break out of the loop
- the condition marked /* If requested, put NDP at end of frame. */ is
  true so the NDP is written into skb_out
- now skb_out->len is 0x4002, so padding_count is minus two interpreted
  as an unsigned number, which is used as the length argument to memset,
  leading to a crash with various symptoms but usually including

> Call Trace:
>  <IRQ>
>  cdc_ncm_fill_tx_frame+0x83a/0x970 [cdc_ncm]
>  cdc_mbim_tx_fixup+0x1d9/0x240 [cdc_mbim]
>  usbnet_start_xmit+0x5d/0x720 [usbnet]

The cdc_ncm_align_tail call first aligns on a ctx->tx_modulus boundary
(adding at most ctx->tx_modulus-1 bytes), then adds ctx->tx_remainder
bytes. Alternatively, the next alignment call can occur in cdc_ncm_ndp16
or cdc_ncm_ndp32, in which case at most ctx->tx_ndp_modulus-1 bytes are
added.

A similar problem has occurred before, and the code is nontrivial to
reason about, so add a guard before the crashing call. By that time it is
too late to prevent any memory corruption (we'll have written past the
end of the buffer already) but we can at least try to get a warning
written into an on-disk log by avoiding the hard crash caused by padding
past the buffer with a huge number of zeros.

Signed-off-by: Jouni K. Seppänen <jks@iki.fi>
Fixes: 4a0e3e98 ("cdc_ncm: Add support for moving NDP to end of NCM frame")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=209407
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
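
The crash mechanism is plain unsigned wraparound; a self-contained C
illustration using the values from the analysis above (a hedged demo,
not the kernel code itself):

  #include <stdio.h>

  int main(void)
  {
          /* Values from the crash analysis above. */
          unsigned int tx_curr_size = 0x4000;  /* output buffer size */
          unsigned int skb_len      = 0x4002;  /* frame overran the buffer by two bytes */

          /* "minus two interpreted as an unsigned number": */
          unsigned int padding_count = tx_curr_size - skb_len;

          /* Prints 0xfffffffe -- the huge length handed to memset() in the bug. */
          printf("padding_count = 0x%x\n", padding_count);
          return 0;
  }

A guard comparing skb_out->len against ctx->tx_curr_size before the
padding memset, as the commit describes, turns this into a logged
warning rather than a hard crash.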
-
Submitted by Josef Bacik
stable inclusion
from stable-5.10.8
commit e3b5252b5cdb4458527aa2356277700d21bf625f
bugzilla: 47450

--------------------------------

[ Upstream commit e076ab2a ]

Commit 38d715f4 ("btrfs: use btrfs_start_delalloc_roots in
shrink_delalloc") cleaned up how we do delalloc shrinking by utilizing
some infrastructure we have in place to flush inodes that we use for
device replace and snapshot. However this introduced a pretty serious
performance regression. To reproduce, the user untarred the source
tarball of Firefox (360MiB xz compressed/1.5GiB uncompressed), and would
see it take anywhere from 5 to 20 times as long to untar in 5.10
compared to 5.9. This was observed on fast devices (SSD and better) and
not on HDD.

The root cause is because before we would generally use the normal
writeback path to reclaim delalloc space, and for this we would provide
it with the number of pages we wanted to flush. The referenced commit
changed this to flush that many inodes, which drastically increased the
amount of space we were flushing in certain cases, which severely
affected performance.

We cannot revert this patch unfortunately because of 3d45f221 ("btrfs:
fix deadlock when cloning inline extent and low on free metadata space")
which requires the ability to skip flushing inodes that are being cloned
in certain scenarios, which means we need to keep using our flushing
infrastructure or risk re-introducing the deadlock.

Instead to fix this problem we can go back to providing
btrfs_start_delalloc_roots with a number of pages to flush, and then set
up a writeback_control and utilize sync_inode() to handle the flushing
for us. This gives us the same behavior we had prior to the fix, while
still allowing us to avoid the deadlock that was fixed by Filipe. I redid
the user's original test and got the following results on one of our
test machines (256GiB of ram, 56 cores, 2TiB Intel NVMe drive)

  5.9		0m54.258s
  5.10		1m26.212s
  5.10+patch	0m38.800s

5.10+patch is significantly faster than plain 5.9 because of my patch
series "Change data reservations to use the ticketing infra" which
contained the patch that introduced the regression, but generally
improved the overall ENOSPC flushing mechanisms.

Additional testing on consumer-grade SSD (8GiB ram, 8 CPU) confirm the
results:

  5.10.5            4m00s
  5.10.5+patch      1m08s
  5.11-rc2          5m14s
  5.11-rc2+patch    1m30s

Reported-by: René Rebe <rene@exactcode.de>
Fixes: 38d715f4 ("btrfs: use btrfs_start_delalloc_roots in shrink_delalloc")
CC: stable@vger.kernel.org # 5.10
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Tested-by: David Sterba <dsterba@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add my test results ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
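
A hedged sketch of the described approach (a kernel fragment for
illustration only; the surrounding iteration over delalloc inodes is
assumed, not reproduced):

  /* Flush up to a page budget of one inode's delalloc, restoring the
   * old "flush N pages" semantics instead of "flush N whole inodes". */
  struct writeback_control wbc = {
          .nr_to_write = nr_pages,     /* page budget for this inode */
          .sync_mode   = WB_SYNC_NONE,
          .range_start = 0,
          .range_end   = LLONG_MAX,
  };

  sync_inode(inode, &wbc);             /* kick writeback for just this inode */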
-
Submitted by Filipe Manana
stable inclusion
from stable-5.10.8
commit 17243f73ad742363721e1288fb74e7b151c801f7
bugzilla: 47450

--------------------------------

[ Upstream commit 3d45f221 ]

When cloning an inline extent there are cases where we can not just copy
the inline extent from the source range to the target range (e.g. when
the target range starts at an offset greater than zero). In such cases we
copy the inline extent's data into a page of the destination inode and
then dirty that page. However, after that we will need to start a
transaction for each processed extent and, if we are ever low on
available metadata space, we may need to flush existing delalloc for all
dirty inodes in an attempt to release metadata space - if that happens we
may deadlock:

* the async reclaim task queued a delalloc work to flush delalloc for
  the destination inode of the clone operation;

* the task executing that delalloc work gets blocked waiting for the
  range with the dirty page to be unlocked, which is currently locked
  by the task doing the clone operation;

* the async reclaim task blocks waiting for the delalloc work to
  complete;

* the cloning task is waiting on the waitqueue of its reservation ticket
  while holding the range with the dirty page locked in the inode's
  io_tree;

* if metadata space is not released by some other task (like delalloc
  for some other inode completing for example), the clone task waits
  forever and as a consequence the delalloc work and async reclaim tasks
  will hang forever as well. Releasing more space on the other hand may
  require starting a transaction, which will hang as well when trying to
  reserve metadata space, resulting in a deadlock between all these
  tasks.

When this happens, traces like the following show up in dmesg/syslog:

  [87452.323003] INFO: task kworker/u16:11:1810830 blocked for more than 120 seconds.
  [87452.323644]       Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  [87452.324248] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [87452.324852] task:kworker/u16:11  state:D stack:    0 pid:1810830 ppid:     2 flags:0x00004000
  [87452.325520] Workqueue: btrfs-flush_delalloc btrfs_work_helper [btrfs]
  [87452.326136] Call Trace:
  [87452.326737]  __schedule+0x5d1/0xcf0
  [87452.327390]  schedule+0x45/0xe0
  [87452.328174]  lock_extent_bits+0x1e6/0x2d0 [btrfs]
  [87452.328894]  ? finish_wait+0x90/0x90
  [87452.329474]  btrfs_invalidatepage+0x32c/0x390 [btrfs]
  [87452.330133]  ? __mod_memcg_state+0x8e/0x160
  [87452.330738]  __extent_writepage+0x2d4/0x400 [btrfs]
  [87452.331405]  extent_write_cache_pages+0x2b2/0x500 [btrfs]
  [87452.332007]  ? lock_release+0x20e/0x4c0
  [87452.332557]  ? trace_hardirqs_on+0x1b/0xf0
  [87452.333127]  extent_writepages+0x43/0x90 [btrfs]
  [87452.333653]  ? lock_acquire+0x1a3/0x490
  [87452.334177]  do_writepages+0x43/0xe0
  [87452.334699]  ? __filemap_fdatawrite_range+0xa4/0x100
  [87452.335720]  __filemap_fdatawrite_range+0xc5/0x100
  [87452.336500]  btrfs_run_delalloc_work+0x17/0x40 [btrfs]
  [87452.337216]  btrfs_work_helper+0xf1/0x600 [btrfs]
  [87452.337838]  process_one_work+0x24e/0x5e0
  [87452.338437]  worker_thread+0x50/0x3b0
  [87452.339137]  ? process_one_work+0x5e0/0x5e0
  [87452.339884]  kthread+0x153/0x170
  [87452.340507]  ? kthread_mod_delayed_work+0xc0/0xc0
  [87452.341153]  ret_from_fork+0x22/0x30
  [87452.341806] INFO: task kworker/u16:1:2426217 blocked for more than 120 seconds.
  [87452.342487]       Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  [87452.343274] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [87452.344049] task:kworker/u16:1   state:D stack:    0 pid:2426217 ppid:     2 flags:0x00004000
  [87452.344974] Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
  [87452.345655] Call Trace:
  [87452.346305]  __schedule+0x5d1/0xcf0
  [87452.346947]  ? kvm_clock_read+0x14/0x30
  [87452.347676]  ? wait_for_completion+0x81/0x110
  [87452.348389]  schedule+0x45/0xe0
  [87452.349077]  schedule_timeout+0x30c/0x580
  [87452.349718]  ? _raw_spin_unlock_irqrestore+0x3c/0x60
  [87452.350340]  ? lock_acquire+0x1a3/0x490
  [87452.351006]  ? try_to_wake_up+0x7a/0xa20
  [87452.351541]  ? lock_release+0x20e/0x4c0
  [87452.352040]  ? lock_acquired+0x199/0x490
  [87452.352517]  ? wait_for_completion+0x81/0x110
  [87452.353000]  wait_for_completion+0xab/0x110
  [87452.353490]  start_delalloc_inodes+0x2af/0x390 [btrfs]
  [87452.353973]  btrfs_start_delalloc_roots+0x12d/0x250 [btrfs]
  [87452.354455]  flush_space+0x24f/0x660 [btrfs]
  [87452.355063]  btrfs_async_reclaim_metadata_space+0x1bb/0x480 [btrfs]
  [87452.355565]  process_one_work+0x24e/0x5e0
  [87452.356024]  worker_thread+0x20f/0x3b0
  [87452.356487]  ? process_one_work+0x5e0/0x5e0
  [87452.356973]  kthread+0x153/0x170
  [87452.357434]  ? kthread_mod_delayed_work+0xc0/0xc0
  [87452.357880]  ret_from_fork+0x22/0x30
  (...)
  < stack traces of several tasks waiting for the locks of the inodes of
    the clone operation >
  (...)
  [92867.444138] RSP: 002b:00007ffc3371bbe8 EFLAGS: 00000246 ORIG_RAX: 0000000000000052
  [92867.444624] RAX: ffffffffffffffda RBX: 00007ffc3371bea0 RCX: 00007f61efe73f97
  [92867.445116] RDX: 0000000000000000 RSI: 0000560fbd5d7a40 RDI: 0000560fbd5d8960
  [92867.445595] RBP: 00007ffc3371beb0 R08: 0000000000000001 R09: 0000000000000003
  [92867.446070] R10: 00007ffc3371b996 R11: 0000000000000246 R12: 0000000000000000
  [92867.446820] R13: 000000000000001f R14: 00007ffc3371bea0 R15: 00007ffc3371beb0
  [92867.447361] task:fsstress        state:D stack:    0 pid:2508238 ppid:2508153 flags:0x00004000
  [92867.447920] Call Trace:
  [92867.448435]  __schedule+0x5d1/0xcf0
  [92867.448934]  ? _raw_spin_unlock_irqrestore+0x3c/0x60
  [92867.449423]  schedule+0x45/0xe0
  [92867.449916]  __reserve_bytes+0x4a4/0xb10 [btrfs]
  [92867.450576]  ? finish_wait+0x90/0x90
  [92867.451202]  btrfs_reserve_metadata_bytes+0x29/0x190 [btrfs]
  [92867.451815]  btrfs_block_rsv_add+0x1f/0x50 [btrfs]
  [92867.452412]  start_transaction+0x2d1/0x760 [btrfs]
  [92867.453216]  clone_copy_inline_extent+0x333/0x490 [btrfs]
  [92867.453848]  ? lock_release+0x20e/0x4c0
  [92867.454539]  ? btrfs_search_slot+0x9a7/0xc30 [btrfs]
  [92867.455218]  btrfs_clone+0x569/0x7e0 [btrfs]
  [92867.455952]  btrfs_clone_files+0xf6/0x150 [btrfs]
  [92867.456588]  btrfs_remap_file_range+0x324/0x3d0 [btrfs]
  [92867.457213]  do_clone_file_range+0xd4/0x1f0
  [92867.457828]  vfs_clone_file_range+0x4d/0x230
  [92867.458355]  ? lock_release+0x20e/0x4c0
  [92867.458890]  ioctl_file_clone+0x8f/0xc0
  [92867.459377]  do_vfs_ioctl+0x342/0x750
  [92867.459913]  __x64_sys_ioctl+0x62/0xb0
  [92867.460377]  do_syscall_64+0x33/0x80
  [92867.460842]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
  (...)
  < stack traces of more tasks blocked on metadata reservation like the
    clone task above, because the async reclaim task has deadlocked >
  (...)

Another thing to notice is that the worker task that is deadlocked when
trying to flush the destination inode of the clone operation is at
btrfs_invalidatepage(). This is simply because the clone operation has a
destination offset greater than the i_size and we only update the i_size
of the destination file after cloning an extent (just like we do in the
buffered write path).

Since the async reclaim path uses btrfs_start_delalloc_roots() to trigger
the flushing of delalloc for all inodes that have delalloc, add a runtime
flag to an inode to signal it should not be flushed, and for inodes with
that flag set, start_delalloc_inodes() will simply skip them. When the
cloning code needs to dirty a page to copy an inline extent, set that
flag on the inode and then clear it when the clone operation finishes.

This could be sporadically triggered with test case generic/269 from
fstests, which exercises many fsstress processes running in parallel with
several dd processes filling up the entire filesystem.

CC: stable@vger.kernel.org # 5.9+
Fixes: 05a5a762 ("Btrfs: implement full reflink support for inline extents")
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
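
A hedged sketch of the skip logic described above (the flag name follows
the upstream patch, but treat the exact identifiers as illustrative):

  /* Clone path: mark the inode before dirtying the page for the
   * inline extent, and unmark it when the clone operation finishes. */
  set_bit(BTRFS_INODE_NO_DELALLOC_FLUSH, &inode->runtime_flags);
  /* ... copy the inline extent's data into the page and dirty it ... */
  clear_bit(BTRFS_INODE_NO_DELALLOC_FLUSH, &inode->runtime_flags);

  /* start_delalloc_inodes(), when invoked from the reclaim path: */
  if (in_reclaim_context &&
      test_bit(BTRFS_INODE_NO_DELALLOC_FLUSH, &inode->runtime_flags))
          continue;   /* flushing this inode would deadlock on the locked range */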
-
Submitted by Filipe Manana
stable inclusion
from stable-5.10.8
commit 87738164592fdd531b068d069911aaa9f3c41c9d
bugzilla: 47450

--------------------------------

[ Upstream commit f2f121ab ]

Every time we log an inode we lookup in the fs/subvol tree for xattrs and
if we have any, log them into the log tree. However it is very common to
have inodes without any xattrs, so doing the search wastes time, but more
importantly it adds contention on the fs/subvol tree locks, either making
the logging code block and wait for tree locks or making other concurrent
operations block and wait.

The most typical use cases where xattrs are used are when capabilities or
ACLs are defined for an inode, or when SELinux is enabled.

This change makes the logging code detect when an inode does not have
xattrs and skip the xattrs search the next time the inode is logged,
unless the inode is evicted and loaded again or a xattr is added to the
inode. Therefore skipping the search for xattrs on inodes that don't ever
have xattrs and are fsynced with some frequency.

The following script that calls dbench was used to measure the impact of
this change on a VM with 8 CPUs, 16Gb of ram, using a raw NVMe device
directly (no intermediary filesystem on the host) and using a non-debug
kernel (default configuration on Debian distributions):

  $ cat test.sh
  #!/bin/bash

  DEV=/dev/sdk
  MNT=/mnt/sdk
  MOUNT_OPTIONS="-o ssd"

  mkfs.btrfs -f -m single -d single $DEV
  mount $MOUNT_OPTIONS $DEV $MNT

  dbench -D $MNT -t 200 40

  umount $MNT

The results before this change:

 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX    5761605     0.172   312.057
 Close        4232452     0.002    10.927
 Rename        243937     1.406   277.344
 Unlink       1163456     0.631   298.402
 Deltree          160    11.581   221.107
 Mkdir             80     0.003     0.005
 Qpathinfo    5221410     0.065   122.309
 Qfileinfo     915432     0.001     3.333
 Qfsinfo       957555     0.003     3.992
 Sfileinfo     469244     0.023    20.494
 Find         2018865     0.448   123.659
 WriteX       2874851     0.049   118.529
 ReadX        9030579     0.004    21.654
 LockX          18754     0.003     4.423
 UnlockX        18754     0.002     0.331
 Flush         403792    10.944   359.494

Throughput 908.444 MB/sec  40 clients  40 procs  max_latency=359.500 ms

The results after this change:

 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX    6442521     0.159   230.693
 Close        4732357     0.002    10.972
 Rename        272809     1.293   227.398
 Unlink       1301059     0.563   218.500
 Deltree          160     7.796    54.887
 Mkdir             80     0.008     0.478
 Qpathinfo    5839452     0.047   124.330
 Qfileinfo    1023199     0.001     4.996
 Qfsinfo      1070760     0.003     5.709
 Sfileinfo     524790     0.033    21.765
 Find         2257658     0.314   125.611
 WriteX       3211520     0.040   232.135
 ReadX       10098969     0.004    25.340
 LockX          20974     0.003     1.569
 UnlockX        20974     0.002     3.475
 Flush         451553    10.287   331.037

Throughput 1011.77 MB/sec  40 clients  40 procs  max_latency=331.045 ms

+10.8% throughput, -8.2% max latency

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
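
A hedged sketch of the caching idea (upstream implements it as a runtime
inode flag; the names below follow that convention but are illustrative):

  /* btrfs_log_all_xattrs(): skip the fs/subvol tree search entirely when
   * a previous log already established that the inode has no xattrs. */
  if (test_bit(BTRFS_INODE_NO_XATTRS, &inode->runtime_flags))
          return 0;

  found = copy_inode_xattrs_to_log(trans, inode);   /* illustrative helper */
  if (!found)
          set_bit(BTRFS_INODE_NO_XATTRS, &inode->runtime_flags);

  /* btrfs_setxattr(): adding a xattr invalidates the cached fact.
   * Eviction clears it implicitly: the in-memory inode is dropped. */
  clear_bit(BTRFS_INODE_NO_XATTRS, &inode->runtime_flags);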
-
Submitted by Arnd Bergmann
stable inclusion
from stable-5.10.8
commit e28ace868c1e945f8c61cee147168e26d6c9f2d6
bugzilla: 47450

--------------------------------

[ Upstream commit 4c60244d ]

clang complains about a possible code path in which a variable is used
without an initialization:

  drivers/scsi/ufs/ufshcd.c:7690:3: error: variable 'sdp' is used
  uninitialized whenever 'if' condition is false
  [-Werror,-Wsometimes-uninitialized]
                  BUG_ON(1);
                  ^~~~~~~~~
  include/asm-generic/bug.h:63:36: note: expanded from macro 'BUG_ON'
   #define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0)
                                     ^~~~~~~~~~~~~~~~~~~

Turn the BUG_ON(1) into an unconditional BUG() that makes it clear to
clang that this code path is never hit.

Link: https://lore.kernel.org/r/20201203223137.1205933-1-arnd@kernel.org
Fixes: 4f3e900b ("scsi: ufs: Clear UAC for FFU and RPMB LUNs")
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
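
The change itself is a one-liner; a sketch of the before/after shape
(surrounding ufshcd context assumed):

  /* Before: clang cannot tell that this branch never falls through,
   * so 'sdp' looks possibly uninitialized at its later use. */
  else
          BUG_ON(1);

  /* After: an unconditional BUG() marks the path as a dead end, so on
   * every reachable path 'sdp' has been assigned. */
  else
          BUG();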
-
Submitted by Matthew Wilcox (Oracle)
stable inclusion
from stable-5.10.8
commit 458b40598dc0ccbbb1d3522f56a287ea0a127165
bugzilla: 47450

--------------------------------

[ Upstream commit 3e2224c5 ]

alloc_fixed_file_ref_node() currently returns an ERR_PTR on failure.
io_sqe_files_unregister() expects it to return NULL and since it can
only return -ENOMEM, it makes more sense to change
alloc_fixed_file_ref_node() to behave that way.

Fixes: 1ffc5422 ("io_uring: fix io_sqe_files_unregister() hangs")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
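
A sketch of the convention change (bodies abridged; only the error style
is the point):

  /* alloc_fixed_file_ref_node() can only fail with -ENOMEM, so return
   * NULL on failure and let callers use a plain NULL test... */
  ref_node = kzalloc(sizeof(*ref_node), GFP_KERNEL);
  if (!ref_node)
          return NULL;                 /* was: return ERR_PTR(-ENOMEM); */

  /* ...which matches what io_sqe_files_unregister() already expects: */
  ref_node = alloc_fixed_file_ref_node(ctx);
  if (!ref_node)                       /* was: if (IS_ERR(ref_node)) */
          return -ENOMEM;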
-
Submitted by Steven Price
stable inclusion
from stable-5.10.8
commit 51495b719515ddae417e4bafc7e100c34833af4b
bugzilla: 47450

--------------------------------

[ Upstream commit a17d609e ]

The mutex within the panfrost_queue_state should have the lifetime of
the queue, however it was erroneously initialised/destroyed during
panfrost_job_{open,close} which is called every time a client
opens/closes the drm node.

Move the initialisation/destruction to panfrost_job_{init,fini} where it
belongs.

Fixes: 1a11a88c ("drm/panfrost: Fix job timeout handling")
Signed-off-by: Steven Price <steven.price@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201029170047.30564-1-steven.price@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
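
The lifetime rule being restored, as a hedged sketch (field names are
assumed for illustration; NUM_JOB_SLOTS follows the driver's naming):

  /* panfrost_job_init(): runs once per device, so queue-lifetime state
   * such as the per-queue mutex belongs here... */
  for (j = 0; j < NUM_JOB_SLOTS; j++)
          mutex_init(&pfdev->js->queue[j].lock);

  /* ...with the matching teardown in panfrost_job_fini(), rather than
   * in panfrost_job_{open,close}, which run once per client. */
  for (j = 0; j < NUM_JOB_SLOTS; j++)
          mutex_destroy(&pfdev->js->queue[j].lock);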
-
Submitted by Bjorn Andersson
stable inclusion
from stable-5.10.8
commit 9d7751a39a19b0090300b2b0498e397f9047e125
bugzilla: 47450

--------------------------------

[ Upstream commit aded8c7c ]

On SM8150 it's occasionally observed that the boot hangs in between the
writing of SMEs and context banks in arm_smmu_device_reset().

The problem seems to coincide with a display refresh happening after
updating the stream mapping, but before clearing - and thereby disabling
translation - the context bank picked to emulate translation bypass.

Resolve this by explicitly disabling the bypass context already in
cfg_probe.

Fixes: f9081b8f ("iommu/arm-smmu-qcom: Implement S2CR quirk")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210106005038.4152731-1-bjorn.andersson@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Weihang Li
stable inclusion
from stable-5.10.8
commit 85bbe2e64ab430af3c27a0bc4e22dae04a5e10e6
bugzilla: 47450

--------------------------------

[ Upstream commit 94a8c4df ]

Only the low 12 bits of vlan_id is valid, and service level has been
filled in Address Vector. So there is no need to fill sl in vlan_id in
Address Vector.

Fixes: 7406c003 ("RDMA/hns: Only record vlan info for HIP08")
Link: https://lore.kernel.org/r/1607650657-35992-5-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Pavel Begunkov
stable inclusion
from stable-5.10.8
commit 85e25e2370a20352b72af34940fb32746a64fc28
bugzilla: 47450

--------------------------------

commit 6c503150 upstream

IOPOLL skips completion locking but keeps it under uring_lock, thus
io_cqring_overflow_flush() and so io_cqring_events() need additional
locking with uring_lock in some cases for IOPOLL.

Remove __io_cqring_overflow_flush() from io_cqring_events(), introduce a
wrapper around flush doing needed synchronisation and call it by hand.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
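
A hedged sketch of the wrapper described above (identifiers follow
io_uring of that era but should be treated as illustrative):

  /* Flush overflowed CQEs. IOPOLL completions are synchronised by
   * uring_lock rather than completion_lock, so take it around the flush. */
  static void io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
  {
          bool iopoll = ctx->flags & IORING_SETUP_IOPOLL;

          if (iopoll)
                  mutex_lock(&ctx->uring_lock);
          __io_cqring_overflow_flush(ctx, force);
          if (iopoll)
                  mutex_unlock(&ctx->uring_lock);
  }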
-
Submitted by Pavel Begunkov
stable inclusion
from stable-5.10.8
commit bc924dd21ecf8a8363091ef02fdac3115d024b91
bugzilla: 47450

--------------------------------

commit 89448c47 upstream

We don't need to take uring_lock for SQPOLL|IOPOLL to do
io_cqring_overflow_flush() when cq_overflow_list is empty, remove it
from the hot path.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Pavel Begunkov
stable inclusion
from stable-5.10.8
commit 1d5e50da5cc7483849b815ee34559be4f3902a3b
bugzilla: 47450

--------------------------------

commit 81b6d05c upstream

io_req_task_submit() might be called for IOPOLL, do the fail path under
uring_lock to comply with IOPOLL synchronisation based solely on it.

Cc: stable@vger.kernel.org # 5.5+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Christophe Leroy
stable inclusion
from stable-5.10.8
commit bca9ca5a603f6c5586a7dfd35e06abe6d5fcd559
bugzilla: 47450

--------------------------------

[ Upstream commit 98bf2d3f ]

When we have VMAP stack, exception prolog 1 sets r1, not r11.

When it is not an RTAS machine check, don't trash r1 because it is
needed by prolog 1.

Fixes: da7bb43a ("powerpc/32: Fix vmap stack - Properly set r1 before activating MMU")
Fixes: d2e00603 ("powerpc/32: Use SPRN_SPRG_SCRATCH2 in exception prologs")
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Squash in fixup for RTAS machine check from Christophe]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/bc77d61d1c18940e456a2dee464f1e2eda65a3f0.1608621048.git.christophe.leroy@csgroup.eu
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 6c7a6d22
category: bugfix
bugzilla: 46882
CVE: NA

--------------------------------

Commit aaac3733 ("ARM: kvm: replace open coded VA->PA calculations with
adr_l call") removed all uses of .L__boot_cpu_mode_offset, so there is
no longer a need to define it.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit aaac3733
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

Replace the open coded calculations of the actual physical address of
the KVM stub vector table with a single adr_l invocation.

Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit aaac3733)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 3bcf906b
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

Replace the open coded arithmetic with a simple adr_l/sub pair. This
removes some open coded arithmetic involving virtual addresses, avoids
literal pools on v7+, and slightly reduces the footprint of the code.

Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit 3bcf906b)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit d74d2b22
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

Replace the open coded PC relative offset calculations with adr_l and
ldr_l invocations. This removes some open coded PC relative arithmetic,
avoids literal pools on v7+, and slightly reduces the footprint of the
code. Note that ALT_SMP() expects a single instruction so move the macro
invocation after it.

Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit d74d2b22)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 59d2f282
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

Now that calling __do_fixup_smp_on_up() can be done without passing the
physical-to-virtual offset in r3, we can replace the open coded PC
relative offset calculations with a pair of adr_l invocations. This
removes some open coded arithmetic involving virtual addresses, avoids
literal pools on v7+, and slightly reduces the footprint of the code.

Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit 59d2f282)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 450abd38
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

Currently, the .alt.smp.init section contains the virtual addresses of
the patch sites. Since patching may occur both before and after
switching into virtual mode, this requires some manual handling of the
address when applying the UP alternative.

Let's simplify this by using relative offsets in the table entries: this
allows us to simply add each entry's address to its contents, regardless
of whether we are running in virtual mode or not.

Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit 450abd38)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 91580f0d
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

Replace the open coded PC relative offset calculations with adr_l and
ldr_l invocations. This removes some open coded arithmetic involving
virtual addresses, avoids literal pools on v7+, and slightly reduces the
footprint of the code. Note that it also removes a stale comment about
the contents of r6.

Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit 91580f0d)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 172c34c9
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

Replace the open coded PC relative offset calculations involving
__turn_mmu_on and __turn_mmu_on_end with a pair of adr_l invocations.
This removes some open coded arithmetic involving virtual addresses,
avoids literal pools on v7+, and slightly reduces the footprint of the
code.

Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit 172c34c9)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 62c4a2e2
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

Replace the open coded PC relative offset calculations with a pair of
adr_l invocations. This removes some open coded arithmetic involving
virtual addresses, avoids literal pools on v7+, and slightly reduces the
footprint of the code.

Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit 62c4a2e2)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 67e3f828
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

The ARM 'adrl' pseudo instruction is a bit problematic, as it does not
exist in Thumb mode, and it is not implemented by Clang either. Since
the Thumb variant has a slightly bigger range, it is sometimes necessary
to emit the 'adrl' variant in ARM mode where Thumb mode can use adr just
fine. However, that still leaves the Clang issue, which does not appear
to be supporting this any time soon.

So let's switch to the adr_l macro, which works for both ARM and Thumb,
and has unlimited range.

Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit 67e3f828)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 9443076e
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

The ARM kernel's linear map starts at PAGE_OFFSET, which maps to a
physical address (PHYS_OFFSET) that is platform specific, and is
discovered at boot. Since we don't want to slow down translations
between physical and virtual addresses by keeping the offset in a
variable in memory, we implement this by patching the code performing
the translation, and putting the offset between PAGE_OFFSET and the
start of physical RAM directly into the instruction opcodes.

As we only patch up to 8 bits of offset, yielding 4 GiB >> 8 == 16 MiB
of granularity, we have to round up PHYS_OFFSET to the next multiple if
the start of physical RAM is not a multiple of 16 MiB. This wastes some
physical RAM, since the memory that was skipped will now live below
PAGE_OFFSET, making it inaccessible to the kernel.

We can improve this by changing the patchable sequences and the patching
logic to carry more bits of offset: 11 bits gives us 4 GiB >> 11 ==
2 MiB of granularity, and so we will never waste more than that amount
by rounding up the physical start of DRAM to the next multiple of 2 MiB.
(Note that 2 MiB granularity guarantees that the linear mapping can be
created efficiently, whereas less than 2 MiB may result in the linear
mapping needing another level of page tables)

This helps Zhen Lei's scenario, where the start of DRAM is known to be
occupied. It also helps EFI boot, which relies on the firmware's page
allocator to allocate space for the decompressed kernel as low as
possible. And if the KASLR patches ever land for 32-bit, it will give us
3 more bits of randomization of the placement of the kernel inside the
linear region.

For the ARM code path, it simply comes down to using two add/sub
instructions instead of one for the carryless version, and patching each
of them with the correct immediate depending on the rotation field. For
the LPAE calculation, which has to deal with a carry, it patches the
MOVW instruction with up to 12 bits of offset (but we only need 11 bits
anyway)

For the Thumb2 code path, patching more than 11 bits of displacement
would be somewhat cumbersome, but the 11 bits we need fit nicely into
the second word of the u16[2] opcode, so we simply update the immediate
assignment and the left shift to create an addend of the right
magnitude.

Suggested-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit 9443076e)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit e8e00f5a
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

In preparation for reducing the phys-to-virt minimum relative alignment
from 16 MiB to 2 MiB, switch to patchable sequences involving MOVW
instructions that can more easily be manipulated to carry a 12-bit
immediate. Note that the non-LPAE ARM sequence is not updated: MOVW may
not be supported on non-LPAE platforms, and the sequence itself can be
updated more easily to apply the 12 bits of displacement.

For Thumb2, which has many more versions of opcodes, switch to a
sequence that can be patched by the same patching code for both
versions. Note that the Thumb2 opcodes for MOVW and MVN are unambiguous,
and have no rotation bits in their immediate fields, so there is no need
to use placeholder constants in the asm blocks.

While at it, drop the 'volatile' qualifiers from the asm blocks: the
code does not have any side effects that are invisible to the compiler,
so it is free to omit these sequences if the outputs are not used.

Suggested-by: Russell King <linux@armlinux.org.uk>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit e8e00f5a)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 0e3db6c9
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

Declutter the code in __fixup_pv_table() by using the new adr_l/str_l
macros to take PC relative references to external symbols, and by using
the value of PHYS_OFFSET passed in r8 to calculate the p2v offset.

Acked-by: Nicolas Pitre <nico@fluxnic.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit 0e3db6c9)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 2730e8ea
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

Free up a register in the p2v patching code by switching to relative
references, which don't require keeping the phys-to-virt displacement
live in a register.

Acked-by: Nicolas Pitre <nico@fluxnic.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit 2730e8ea)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 0869f3b9
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

We always pass the same value for 'type' so pull it into the __pv_stub
macro itself.

Acked-by: Nicolas Pitre <nico@fluxnic.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit 0869f3b9)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 7a94849e
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

The big and little endian versions of the ARM p2v patching routine only
differ in the values of the constants, so factor those out into macros
so that we only have one version of the logic sequence to maintain.

Acked-by: Nicolas Pitre <nico@fluxnic.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit 7a94849e)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 4b16421c
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

The ARM and Thumb2 versions of the p2v patching loop have some overlap
at the end of the loop, so factor that out. As numeric labels are not
required to be unique, and may therefore be ambiguous, use named local
labels for the start and end of the loop instead.

Acked-by: Nicolas Pitre <nico@fluxnic.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit 4b16421c)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit eae78e1a
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

Move the phys2virt patching code into a separate .S file before doing
some work on it.

Suggested-by: Nicolas Pitre <nico@fluxnic.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit eae78e1a)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 22f2d230
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

When using the new adr_l/ldr_l/str_l macros to refer to external symbols
from modules, the linker may emit place relative ELF relocations that
need to be fixed up by the module loader. So add support for these.

Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit 22f2d230)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ard Biesheuvel
mainline inclusion
from mainline-5.11-rc1
commit 0b167463
category: bugfix
bugzilla: 46882
CVE: NA

-------------------------------------------------

Like arm64, ARM supports position independent code sequences that
produce symbol references with a greater reach than the ordinary adr/ldr
instructions. Since on ARM, the adrl pseudo-instruction is only
supported in ARM mode (and not at all when using Clang), having a adr_l
macro like we do on arm64 is useful, and increases symmetry as well.

Currently, we use open coded instruction sequences involving literals
and arithmetic operations. Instead, we can use movw/movt pairs on v7
CPUs, circumventing the D-cache entirely.

E.g., on v7+ CPUs, we can emit a PC-relative reference as follows:

       movw         <reg>, #:lower16:<sym> - (1f + 8)
       movt         <reg>, #:upper16:<sym> - (1f + 8)
  1:   add          <reg>, <reg>, pc

For older CPUs, we can emit the literal into a subsection, allowing it
to be emitted out of line while retaining the ability to perform
arithmetic on label offsets. E.g., on pre-v7 CPUs, we can emit a
PC-relative reference as follows:

       ldr          <reg>, 2f
  1:   add          <reg>, <reg>, pc
       .subsection  1
  2:   .long        <sym> - (1b + 8)
       .previous

This is allowed by the assembler because, unlike ordinary sections,
subsections are combined into a single section in the object file, and
so the label references are not true cross-section references that are
visible as relocations. (Subsections have been available in binutils
since 2004 at least, so they should not cause any issues with older
toolchains.)

So use the above to implement the macros mov_l, adr_l, ldr_l and str_l,
all of which will use movw/movt pairs on v7 and later CPUs, and use
PC-relative literals otherwise.

Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
(cherry picked from commit 0b167463)
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by David Disseldorp
stable inclusion
from stable-5.10.7
commit 6f1e88527c1869de08632efa2cc796e0131850dc
bugzilla: 47429

--------------------------------

commit 2896c938 upstream.

When attempting to match EXTENDED COPY CSCD descriptors with
corresponding se_devices, target_xcopy_locate_se_dev_e4() currently
iterates over LIO's global devices list which includes all configured
backstores.

This change ensures that only initiator-accessible backstores are
considered during CSCD descriptor lookup, according to the session's
se_node_acl LUN list.

To avoid LUN removal race conditions, device pinning is changed from
being configfs based to instead using the se_node_acl lun_ref.

Reference: CVE-2020-28374
Fixes: cbf031f4 ("target: Add support for EXTENDED_COPY copy offload emulation")
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
Submitted by Ping-Ke Shih
stable inclusion
from stable-5.10.7
commit 513729aecb53cdd0ba4e5e5aebc8b2fddcb0131e
bugzilla: 47429

--------------------------------

commit 4dfde294 upstream.

request_firmware_nowait() which schedules another work is used to load
firmware when USB is probing. If USB is unplugged before running the
firmware work, it goes disconnect ops, and then causes use-after-free.
Though we wait for completion of firmware work before freeing the hw,
firmware callback rises completion too early. So I move it to the last
step.

usb 5-1: Direct firmware load for rtlwifi/rtl8192cufw.bin failed with error -2
rtlwifi: Loading alternative firmware rtlwifi/rtl8192cufw.bin
rtlwifi: Selected firmware is not available

==================================================================
BUG: KASAN: use-after-free in rtl_fw_do_work.cold+0x68/0x6a drivers/net/wireless/realtek/rtlwifi/core.c:93
Write of size 4 at addr ffff8881454cff50 by task kworker/0:6/7379

CPU: 0 PID: 7379 Comm: kworker/0:6 Not tainted 5.10.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events request_firmware_work_func
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0xae/0x4c8 mm/kasan/report.c:385
 __kasan_report mm/kasan/report.c:545 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:562
 rtl_fw_do_work.cold+0x68/0x6a drivers/net/wireless/realtek/rtlwifi/core.c:93
 request_firmware_work_func+0x12c/0x230 drivers/base/firmware_loader/main.c:1079
 process_one_work+0x933/0x1520 kernel/workqueue.c:2272
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2418
 kthread+0x38c/0x460 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

The buggy address belongs to the page:
page:00000000f54435b3 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1454cf
flags: 0x200000000000000()
raw: 0200000000000000 0000000000000000 ffffea00051533c8 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8881454cfe00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff8881454cfe80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff8881454cff00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                                 ^
 ffff8881454cff80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff8881454d0000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

Reported-by: syzbot+65be4277f3c489293939@syzkaller.appspotmail.com
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201214053106.7748-1-pkshih@realtek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
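
A hedged sketch of the reordering (the callback shape is assumed; the
point is only where the completion fires):

  static void rtl_fw_do_work(const struct firmware *firmware, void *context)
  {
          struct ieee80211_hw *hw = context;
          struct rtl_priv *rtlpriv = rtl_priv(hw);

          /* ... validate the firmware and copy it into rtlpriv ... */

          /* Moved to the very last step: once this fires, a concurrent
           * disconnect may free the hw, so nothing below may touch it. */
          complete(&rtlpriv->firmware_loading_complete);
  }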
-
Submitted by Magnus Karlsson
stable inclusion
from stable-5.10.7
commit 0fae7d269ef7343e052bb66d4f79022e4456fe82
bugzilla: 47429

--------------------------------

commit 8bee6833 upstream.

Fix a possible memory leak when a bind of an AF_XDP socket fails. When
the fill and completion rings are created, they are tied to the socket.
But when the buffer pool is later created at bind time, the ownership of
these two rings are transferred to the buffer pool as they might be
shared between sockets (and the buffer pool cannot be created until we
know what we are binding to). So, before the buffer pool is created,
these two rings are cleaned up with the socket, and after they have been
transferred they are cleaned up together with the buffer pool.

The problem is that ownership was transferred before it was absolutely
certain that the buffer pool could be created and initialized correctly
and when one of these errors occurred, the fill and completion rings did
neither belong to the socket nor the pool and were therefore leaked.
Solve this by moving the ownership transfer to the point where the
buffer pool has been completely set up and there is no way it can fail.

Fixes: 7361f9c3 ("xsk: Move fill and completion rings to buffer pool")
Reported-by: syzbot+cfa88ddd0655afa88763@syzkaller.appspotmail.com
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/20201214085127.3960-1-magnus.karlsson@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
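
A hedged sketch of the ordering in the bind path (the fq_tmp/cq_tmp
naming follows the upstream fix; details are illustrative):

  /* While anything can still fail, the rings stay owned by the socket
   * (xs->fq_tmp / xs->cq_tmp) and are freed with it -- no leak. */
  xs->pool = xp_create_and_assign_umem(xs, xs->umem);
  if (!xs->pool) {
          err = -ENOMEM;
          goto out_unlock;
  }

  /* Only once the pool is completely set up is ownership transferred: */
  xs->pool->fq = xs->fq_tmp;
  xs->pool->cq = xs->cq_tmp;
  xs->fq_tmp = NULL;
  xs->cq_tmp = NULL;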
-
Submitted by Paolo Bonzini
stable inclusion
from stable-5.10.7
commit 563135ec664ffb80a2297e94d618b04b228a1262
bugzilla: 47429

--------------------------------

commit 2f80d502 upstream.

Since we know that e >= s, we can reassociate the left shift, changing
the shifted number from 1 to 2 in exchange for decreasing the right hand
side by 1.

Reported-by: syzbot+e87846c48bf72bc85311@syzkaller.appspotmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
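
The reassociation relies on 1 << (k + 1) == 2 << k, and the rewritten
form keeps the shift count at most 63 for 64-bit operands. A
self-contained illustration (whether this is the exact expression in the
patched function is an assumption; the transformation matches the commit
text):

  #include <stdio.h>
  #include <stdint.h>

  /* Mask with bits s..e set, for e >= s. The naive form
   *   ((1ULL << (e - s + 1)) - 1) << s
   * shifts by 64 when e - s == 63, which is undefined behaviour. */
  static uint64_t bits_mask(unsigned int e, unsigned int s)
  {
          return ((2ULL << (e - s)) - 1) << s;   /* shift count <= 63 */
  }

  int main(void)
  {
          printf("%#llx\n", (unsigned long long)bits_mask(63, 0)); /* all ones, no UB */
          printf("%#llx\n", (unsigned long long)bits_mask(11, 4)); /* 0xff0 */
          return 0;
  }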
-
Submitted by Ying-Tsun Huang
stable inclusion
from stable-5.10.7
commit 02ccda90ef7e23a225b68789bce9e8353f9caa1f
bugzilla: 47429

--------------------------------

commit cb7f4a8b upstream.

In mtrr_type_lookup(), if the input memory address region is not in the
MTRR, over 4GB, and not over the top of memory, a write-back attribute
is returned. These condition checks are for ensuring the input memory
address region is actually mapped to the physical memory.

However, if the end address is just aligned with the top of memory, the
condition check treats the address as over the top of memory, and the
write-back attribute is not returned.

And this hits in a real use case with NVDIMM: the nd_pmem module tries
to map NVDIMMs as cacheable memories when NVDIMMs are connected. If a
NVDIMM is the last of the DIMMs, the performance of this NVDIMM becomes
very low since it is aligned with the top of memory and its memory type
is uncached-minus.

Move the input end address change to inclusive up into
mtrr_type_lookup(), before checking for the top of memory in either
mtrr_type_lookup_{variable,fixed}() helpers.

[ bp: Massage commit message. ]

Fixes: 0cc705f5 ("x86/mm/mtrr: Clean up mtrr_type_lookup()")
Signed-off-by: Ying-Tsun Huang <ying-tsun.huang@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20201215070721.4349-1-ying-tsun.huang@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
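
The boundary problem is plain interval arithmetic; a self-contained
illustration (the addresses are hypothetical):

  #include <stdio.h>
  #include <stdint.h>

  int main(void)
  {
          uint64_t top = 0x100000000ULL;  /* one past the last byte of memory */
          uint64_t end = 0x100000000ULL;  /* exclusive end of the last region */

          /* With an exclusive end, the last region looks out of range... */
          printf("exclusive end in range: %s\n", end < top ? "yes" : "no");

          /* ...but its inclusive end (end - 1) is the last mapped byte. */
          printf("inclusive end in range: %s\n", end - 1 < top ? "yes" : "no");
          return 0;
  }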
-
Submitted by Dan Carpenter
stable inclusion
from stable-5.10.7
commit 6e3c67976eda30959833d852bc13c7d0a342cfa9
bugzilla: 47429

--------------------------------

commit ff58f7dd upstream.

The clean up is off by one so this will start at "i" and it should start
with "i - 1" and then it doesn't unregister the zeroeth elements in the
array.

Fixes: c52ca478 ("dmaengine: idxd: add configuration component of driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/X9nFeojulsNqUSnG@mwanda
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
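
The unwind pattern in question, as a self-contained illustration (the
idxd specifics are abstracted away):

  #include <stdio.h>

  int main(void)
  {
          int registered[5] = { 1, 1, 1, 0, 0 }; /* registration failed at i == 3 */
          int i = 3;

          /* Start the cleanup at i - 1: element i was never registered,
           * and the loop must run all the way down to the zeroeth one. */
          while (--i >= 0) {
                  registered[i] = 0;
                  printf("unregistered element %d\n", i);
          }
          return 0;
  }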
-
Submitted by Pablo Neira Ayuso
stable inclusion
from stable-5.10.7
commit 8b109f4cd1dc2224f900702483be81d61beab864
bugzilla: 47429

--------------------------------

commit 95cd4bca upstream.

If userspace requests a feature which is not available in the original
set definition, then bail out with EOPNOTSUPP. If userspace sends
unsupported dynset flags (new feature not supported by this kernel),
then report EOPNOTSUPP to userspace. EINVAL should be only used to
report malformed netlink messages from userspace.

Fixes: 22fe54d5 ("netfilter: nf_tables: add support for dynamic set updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-