- 03 11月, 2022 40 次提交
-
-
由 Dokyung Song 提交于
maillist inclusion category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5YGD6 CVE: CVE-2022-3628 Reference: https://patchwork.kernel.org/project/linux-wireless/patch/20221021061359.GA550858@laguna/ -------------------------------- This patch fixes an intra-object buffer overflow in brcmfmac that occurs when the device provides a 'bsscfgidx' equal to or greater than the buffer size. The patch adds a check that leads to a safe failure if that is the case. This fixes CVE-2022-3628. UBSAN: array-index-out-of-bounds in drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c index 52 is out of range for type 'brcmf_if *[16]' CPU: 0 PID: 1898 Comm: kworker/0:2 Tainted: G O 5.14.0+ #132 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Workqueue: events brcmf_fweh_event_worker Call Trace: dump_stack_lvl+0x57/0x7d ubsan_epilogue+0x5/0x40 __ubsan_handle_out_of_bounds+0x69/0x80 ? memcpy+0x39/0x60 brcmf_fweh_event_worker+0xae1/0xc00 ? brcmf_fweh_call_event_handler.isra.0+0x100/0x100 ? rcu_read_lock_sched_held+0xa1/0xd0 ? rcu_read_lock_bh_held+0xb0/0xb0 ? lockdep_hardirqs_on_prepare+0x273/0x3e0 process_one_work+0x873/0x13e0 ? lock_release+0x640/0x640 ? pwq_dec_nr_in_flight+0x320/0x320 ? rwlock_bug.part.0+0x90/0x90 worker_thread+0x8b/0xd10 ? __kthread_parkme+0xd9/0x1d0 ? process_one_work+0x13e0/0x13e0 kthread+0x379/0x450 ? _raw_spin_unlock_irq+0x24/0x30 ? set_kthread_struct+0x100/0x100 ret_from_fork+0x1f/0x30 ================================================================================ general protection fault, probably for non-canonical address 0xe5601c0020023fff: 0000 [#1] SMP KASAN KASAN: maybe wild-memory-access in range [0x2b0100010011fff8-0x2b0100010011ffff] CPU: 0 PID: 1898 Comm: kworker/0:2 Tainted: G O 5.14.0+ #132 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Workqueue: events brcmf_fweh_event_worker RIP: 0010:brcmf_fweh_call_event_handler.isra.0+0x42/0x100 Code: 89 f5 53 48 89 fb 48 83 ec 08 e8 79 0b 38 fe 48 85 ed 74 7e e8 6f 0b 38 fe 48 89 ea 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 8b 00 00 00 4c 8b 7d 00 44 89 e0 48 ba 00 00 00 RSP: 0018:ffffc9000259fbd8 EFLAGS: 00010207 RAX: dffffc0000000000 RBX: ffff888115d8cd50 RCX: 0000000000000000 RDX: 0560200020023fff RSI: ffffffff8304bc91 RDI: ffff888115d8cd50 RBP: 2b0100010011ffff R08: ffff888112340050 R09: ffffed1023549809 R10: ffff88811aa4c047 R11: ffffed1023549808 R12: 0000000000000045 R13: ffffc9000259fca0 R14: ffff888112340050 R15: ffff888112340000 FS: 0000000000000000(0000) GS:ffff88811aa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000004053ccc0 CR3: 0000000112740000 CR4: 0000000000750ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: brcmf_fweh_event_worker+0x117/0xc00 ? brcmf_fweh_call_event_handler.isra.0+0x100/0x100 ? rcu_read_lock_sched_held+0xa1/0xd0 ? rcu_read_lock_bh_held+0xb0/0xb0 ? lockdep_hardirqs_on_prepare+0x273/0x3e0 process_one_work+0x873/0x13e0 ? lock_release+0x640/0x640 ? pwq_dec_nr_in_flight+0x320/0x320 ? rwlock_bug.part.0+0x90/0x90 worker_thread+0x8b/0xd10 ? __kthread_parkme+0xd9/0x1d0 ? process_one_work+0x13e0/0x13e0 kthread+0x379/0x450 ? _raw_spin_unlock_irq+0x24/0x30 ? set_kthread_struct+0x100/0x100 ret_from_fork+0x1f/0x30 Modules linked in: 88XXau(O) 88x2bu(O) ---[ end trace 41d302138f3ff55a ]--- RIP: 0010:brcmf_fweh_call_event_handler.isra.0+0x42/0x100 Code: 89 f5 53 48 89 fb 48 83 ec 08 e8 79 0b 38 fe 48 85 ed 74 7e e8 6f 0b 38 fe 48 89 ea 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 8b 00 00 00 4c 8b 7d 00 44 89 e0 48 ba 00 00 00 RSP: 0018:ffffc9000259fbd8 EFLAGS: 00010207 RAX: dffffc0000000000 RBX: ffff888115d8cd50 RCX: 0000000000000000 RDX: 0560200020023fff RSI: ffffffff8304bc91 RDI: ffff888115d8cd50 RBP: 2b0100010011ffff R08: ffff888112340050 R09: ffffed1023549809 R10: ffff88811aa4c047 R11: ffffed1023549808 R12: 0000000000000045 R13: ffffc9000259fca0 R14: ffff888112340050 R15: ffff888112340000 FS: 0000000000000000(0000) GS:ffff88811aa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000004053ccc0 CR3: 0000000112740000 CR4: 0000000000750ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Kernel panic - not syncing: Fatal exception Reported-by: NDokyung Song <dokyungs@yonsei.ac.kr> Reported-by: NJisoo Jang <jisoo.jang@yonsei.ac.kr> Reported-by: NMinsuk Kang <linuxlovemin@yonsei.ac.kr> Reviewed-by: NArend van Spriel <aspriel@gmail.com> Signed-off-by: NDokyung Song <dokyung.song@gmail.com> Signed-off-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Ryusuke Konishi 提交于
mainline inclusion from mainline-v6.0-rc3 commit 21a87d88 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5X1Z4 CVE: CVE-2022-3621 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=21a87d88c2253350e115029f14fe2a10a7e6c856 -------------------------------- If the i_mode field in inode of metadata files is corrupted on disk, it can cause the initialization of bmap structure, which should have been called from nilfs_read_inode_common(), not to be called. This causes a lockdep warning followed by a NULL pointer dereference at nilfs_bmap_lookup_at_level(). This patch fixes these issues by adding a missing sanitiy check for the i_mode field of metadata file's inode. Link: https://lkml.kernel.org/r/20221002030804.29978-1-konishi.ryusuke@gmail.comSigned-off-by: NRyusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+2b32eb36c1a825b7a74c@syzkaller.appspotmail.com Reported-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Tested-by: NRyusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLong Li <leo.lilong@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Baolin Wang 提交于
mainline inclusion from mainline-v6.1-rc1 commit fac35ba7 category: bugfix bugzilla: 187864, https://gitee.com/src-openeuler/kernel/issues/I5X1Z9 CVE: CVE-2022-3623 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=fac35ba763ed07ba93154c95ffc0c4a55023707f -------------------------------- On some architectures (like ARM64), it can support CONT-PTE/PMD size hugetlb, which means it can support not only PMD/PUD size hugetlb (2M and 1G), but also CONT-PTE/PMD size(64K and 32M) if a 4K page size specified. So when looking up a CONT-PTE size hugetlb page by follow_page(), it will use pte_offset_map_lock() to get the pte entry lock for the CONT-PTE size hugetlb in follow_page_pte(). However this pte entry lock is incorrect for the CONT-PTE size hugetlb, since we should use huge_pte_lock() to get the correct lock, which is mm->page_table_lock. That means the pte entry of the CONT-PTE size hugetlb under current pte lock is unstable in follow_page_pte(), we can continue to migrate or poison the pte entry of the CONT-PTE size hugetlb, which can cause some potential race issues, even though they are under the 'pte lock'. For example, suppose thread A is trying to look up a CONT-PTE size hugetlb page by move_pages() syscall under the lock, however antoher thread B can migrate the CONT-PTE hugetlb page at the same time, which will cause thread A to get an incorrect page, if thread A also wants to do page migration, then data inconsistency error occurs. Moreover we have the same issue for CONT-PMD size hugetlb in follow_huge_pmd(). To fix above issues, rename the follow_huge_pmd() as follow_huge_pmd_pte() to handle PMD and PTE level size hugetlb, which uses huge_pte_lock() to get the correct pte entry lock to make the pte entry stable. Mike said: Support for CONT_PMD/_PTE was added with bb9dd3df ("arm64: hugetlb: refactor find_num_contig()"). Patch series "Support for contiguous pte hugepages", v4. However, I do not believe these code paths were executed until migration support was added with 5480280d ("arm64/mm: enable HugeTLB migration for contiguous bit HugeTLB pages") I would go with 5480280d for the Fixes: targe. Link: https://lkml.kernel.org/r/635f43bdd85ac2615a58405da82b4d33c6e5eb05.1662017562.git.baolin.wang@linux.alibaba.com Fixes: 5480280d ("arm64/mm: enable HugeTLB migration for contiguous bit HugeTLB pages") Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Suggested-by: NMike Kravetz <mike.kravetz@oracle.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: David Hildenbrand <david@redhat.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Conflicts: mm/hugetlb.c Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Tadeusz Struk 提交于
stable inclusion from stable-v5.10.148 commit 1b257f97fec43d7a8a4c9ada8538d14421861b0a category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5XTU4 CVE: CVE-2022-43750 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1b257f97fec43d7a8a4c9ada8538d14421861b0a -------------------------------- commit a659daf6 upstream. Syzbot found an issue in usbmon module, where the user space client can corrupt the monitor's internal memory, causing the usbmon module to crash the kernel with segfault, UAF, etc. The reproducer mmaps the /dev/usbmon memory to user space, and overwrites it with arbitrary data, which causes all kinds of issues. Return an -EPERM error from mon_bin_mmap() if the flag VM_WRTIE is set. Also clear VM_MAYWRITE to make it impossible to change it to writable later. Cc: "Dmitry Vyukov" <dvyukov@google.com> Cc: stable <stable@kernel.org> Fixes: 6f23ee1f ("USB: add binary API to usbmon") Suggested-by: PaX Team <pageexec@freemail.hu> # for the VM_MAYRITE portion Link: https://syzkaller.appspot.com/bug?id=2eb1f35d6525fa4a74d75b4244971e5b1411c95a Reported-by: syzbot+23f57c5ae902429285d7@syzkaller.appspotmail.com Signed-off-by: NTadeusz Struk <tadeusz.struk@linaro.org> Link: https://lore.kernel.org/r/20220919215957.205681-1-tadeusz.struk@linaro.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZhao Wenhui <zhaowenhui8@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 David Jeffery 提交于
mainline inclusion from mainline-v5.18-rc1 commit 8f5fea65 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YREM CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?8f5fea65b06de1cc51d4fc23fb4d378d1abd6ed7 -------------------------------- When blk_mq_delay_run_hw_queues sets an hctx to run in the future, it can reset the delay length for an already pending delayed work run_work. This creates a scenario where multiple hctx may have their queues set to run, but if one runs first and finds nothing to do, it can reset the delay of another hctx and stall the other hctx's ability to run requests. To avoid this I/O stall when an hctx's run_work is already pending, leave it untouched to run at its current designated time rather than extending its delay. The work will still run which keeps closed the race calling blk_mq_delay_run_hw_queues is needed for while also avoiding the I/O stall. Signed-off-by: NDavid Jeffery <djeffery@redhat.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220131203337.GA17666@redhatSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yu Kuai 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YRAC CVE: NA -------------------------------- This reverts commit 84f7a9de. Because it introduces a problem that rq->__data_len is set to the wrong value. before the patch: 1) nr_bytes = rq->__data_len 2) rq->__data_len = sdp->sector_size 3) scsi_init_io() 4) rq->__data_len = nr_bytes after the patch: 1) rq->__data_len = sdp->sector_size 2) scsi_init_io() 3) rq->__data_len = rq->__data_len -> __data_len is wrong It will cause that io can only complete one segment each time, and the io will requeue in scsi_io_completion_action(), which will cause severe performance degradation. Scsi write same is removed in commit e383e16e ("scsi: sd: Remove WRITE_SAME support") from mainline, hence this patch is only needed for stable kernels. Fixes: 84f7a9de ("scsi: sd: Remove a local variable") Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Ido Schimmel 提交于
stable inclusion from stable-v5.10.138 commit 0e28678a770df7989108327cfe86f835d8760c33 category: bugfix bugzilla: 187888 CVE: CVE-2022-3625 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=0e28678a770df7989108327cfe86f835d8760c33 -------------------------------- commit 6b4db2e5 upstream. After a failed devlink reload, devlink parameters are still registered, which means user space can set and get their values. In the case of the mlxsw "acl_region_rehash_interval" parameter, these operations will trigger a use-after-free [1]. Fix this by rejecting set and get operations while in the failed state. Return the "-EOPNOTSUPP" error code which does not abort the parameters dump, but instead causes it to skip over the problematic parameter. Another possible fix is to perform these checks in the mlxsw parameter callbacks, but other drivers might be affected by the same problem and I am not aware of scenarios where these stricter checks will cause a regression. [1] mlxsw_spectrum3 0000:00:10.0: Port 125: Failed to register netdev mlxsw_spectrum3 0000:00:10.0: Failed to create ports ================================================================== BUG: KASAN: use-after-free in mlxsw_sp_acl_tcam_vregion_rehash_intrvl_get+0xbd/0xd0 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c:904 Read of size 4 at addr ffff8880099dcfd8 by task kworker/u4:4/777 CPU: 1 PID: 777 Comm: kworker/u4:4 Not tainted 5.19.0-rc7-custom-126601-gfe26f28c586d #1 Hardware name: QEMU MSN4700, BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: netns cleanup_net Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x92/0xbd lib/dump_stack.c:106 print_address_description mm/kasan/report.c:313 [inline] print_report.cold+0x5e/0x5cf mm/kasan/report.c:429 kasan_report+0xb9/0xf0 mm/kasan/report.c:491 __asan_report_load4_noabort+0x14/0x20 mm/kasan/report_generic.c:306 mlxsw_sp_acl_tcam_vregion_rehash_intrvl_get+0xbd/0xd0 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c:904 mlxsw_sp_acl_region_rehash_intrvl_get+0x49/0x60 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c:1106 mlxsw_sp_params_acl_region_rehash_intrvl_get+0x33/0x80 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:3854 devlink_param_get net/core/devlink.c:4981 [inline] devlink_nl_param_fill+0x238/0x12d0 net/core/devlink.c:5089 devlink_param_notify+0xe5/0x230 net/core/devlink.c:5168 devlink_ns_change_notify net/core/devlink.c:4417 [inline] devlink_ns_change_notify net/core/devlink.c:4396 [inline] devlink_reload+0x15f/0x700 net/core/devlink.c:4507 devlink_pernet_pre_exit+0x112/0x1d0 net/core/devlink.c:12272 ops_pre_exit_list net/core/net_namespace.c:152 [inline] cleanup_net+0x494/0xc00 net/core/net_namespace.c:582 process_one_work+0x9fc/0x1710 kernel/workqueue.c:2289 worker_thread+0x675/0x10b0 kernel/workqueue.c:2436 kthread+0x30c/0x3d0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306 </TASK> The buggy address belongs to the physical page: page:ffffea0000267700 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x99dc flags: 0x100000000000000(node=0|zone=1) raw: 0100000000000000 0000000000000000 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8880099dce80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff8880099dcf00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff >ffff8880099dcf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ffff8880099dd000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff8880099dd080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ================================================================== Fixes: 98bbf70c ("mlxsw: spectrum: add "acl_region_rehash_interval" devlink param") Signed-off-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NJiri Pirko <jiri@nvidia.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NXu Jia <xujia39@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Alistair Popple 提交于
mainline inclusion from mainline-v6.1-rc2 commit 97061d44 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5VZ0L Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=97061d441110528dc02972818f2f1dad485107f9 -------------------------------- Commit 16ce101d ("mm/memory.c: fix race when faulting a device private page") changed the migrate_to_ram() callback to take a reference on the device page to ensure it can't be freed while handling the fault. Unfortunately the corresponding update to Nouveau to accommodate this change was inadvertently dropped from that patch causing GPU to CPU migration to fail so add it here. Link: https://lkml.kernel.org/r/20221019122934.866205-1-apopple@nvidia.com Fixes: 16ce101d ("mm/memory.c: fix race when faulting a device private page") Signed-off-by: NAlistair Popple <apopple@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Lyude Paul <lyude@redhat.com> Cc: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NMa Wupeng <mawupeng1@huawei.com> Reviewed-by: Ntong tiangen <tongtiangen@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Alistair Popple 提交于
mainline inclusion from mainline-v6.1-rc1 commit 16ce101d category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5VZ0L CVE: CVE-2022-3523 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=16ce101db85db694a91380aa4c89b25530871d33 -------------------------------- Patch series "Fix several device private page reference counting issues", v2 This series aims to fix a number of page reference counting issues in drivers dealing with device private ZONE_DEVICE pages. These result in use-after-free type bugs, either from accessing a struct page which no longer exists because it has been removed or accessing fields within the struct page which are no longer valid because the page has been freed. During normal usage it is unlikely these will cause any problems. However without these fixes it is possible to crash the kernel from userspace. These crashes can be triggered either by unloading the kernel module or unbinding the device from the driver prior to a userspace task exiting. In modules such as Nouveau it is also possible to trigger some of these issues by explicitly closing the device file-descriptor prior to the task exiting and then accessing device private memory. This involves some minor changes to both PowerPC and AMD GPU code. Unfortunately I lack hardware to test either of those so any help there would be appreciated. The changes mimic what is done in for both Nouveau and hmm-tests though so I doubt they will cause problems. This patch (of 8): When the CPU tries to access a device private page the migrate_to_ram() callback associated with the pgmap for the page is called. However no reference is taken on the faulting page. Therefore a concurrent migration of the device private page can free the page and possibly the underlying pgmap. This results in a race which can crash the kernel due to the migrate_to_ram() function pointer becoming invalid. It also means drivers can't reliably read the zone_device_data field because the page may have been freed with memunmap_pages(). Close the race by getting a reference on the page while holding the ptl to ensure it has not been freed. Unfortunately the elevated reference count will cause the migration required to handle the fault to fail. To avoid this failure pass the faulting page into the migrate_vma functions so that if an elevated reference count is found it can be checked to see if it's expected or not. [mpe@ellerman.id.au: fix build] Link: https://lkml.kernel.org/r/87fsgbf3gh.fsf@mpe.ellerman.id.au Link: https://lkml.kernel.org/r/cover.60659b549d8509ddecafad4f498ee7f03bb23c69.1664366292.git-series.apopple@nvidia.com Link: https://lkml.kernel.org/r/d3e813178a59e565e8d78d9b9a4e2562f6494f90.1664366292.git-series.apopple@nvidia.comSigned-off-by: NAlistair Popple <apopple@nvidia.com> Acked-by: NFelix Kuehling <Felix.Kuehling@amd.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Lyude Paul <lyude@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Alex Sierra <alex.sierra@amd.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Yang Shi <shy828301@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Conflicts: arch/powerpc/kvm/book3s_hv_uvmem.c include/linux/migrate.h lib/test_hmm.c mm/migrate.c Signed-off-by: NMa Wupeng <mawupeng1@huawei.com> Reviewed-by: Ntong tiangen <tongtiangen@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Peilin Ye 提交于
stable inclusion from stable-v5.10.138 commit 38ddccbda5e8b762c8ee06670bb1f64f1be5ee50 category: bugfix bugzilla: 187891, https://gitee.com/src-openeuler/kernel/issues/I5WYLL CVE: CVE-2022-3629 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=38ddccbda5e8b762c8ee06670bb1f64f1be5ee50 -------------------------------- commit 7e97cfed upstream. An O_NONBLOCK vsock_connect() request may try to reschedule @connect_work. Imagine the following sequence of vsock_connect() requests: 1. The 1st, non-blocking request schedules @connect_work, which will expire after 200 jiffies. Socket state is now SS_CONNECTING; 2. Later, the 2nd, blocking request gets interrupted by a signal after a few jiffies while waiting for the connection to be established. Socket state is back to SS_UNCONNECTED, but @connect_work is still pending, and will expire after 100 jiffies. 3. Now, the 3rd, non-blocking request tries to schedule @connect_work again. Since @connect_work is already scheduled, schedule_delayed_work() silently returns. sock_hold() is called twice, but sock_put() will only be called once in vsock_connect_timeout(), causing a memory leak reported by syzbot: BUG: memory leak unreferenced object 0xffff88810ea56a40 (size 1232): comm "syz-executor756", pid 3604, jiffies 4294947681 (age 12.350s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 28 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00 (..@............ backtrace: [<ffffffff837c830e>] sk_prot_alloc+0x3e/0x1b0 net/core/sock.c:1930 [<ffffffff837cbe22>] sk_alloc+0x32/0x2e0 net/core/sock.c:1989 [<ffffffff842ccf68>] __vsock_create.constprop.0+0x38/0x320 net/vmw_vsock/af_vsock.c:734 [<ffffffff842ce8f1>] vsock_create+0xc1/0x2d0 net/vmw_vsock/af_vsock.c:2203 [<ffffffff837c0cbb>] __sock_create+0x1ab/0x2b0 net/socket.c:1468 [<ffffffff837c3acf>] sock_create net/socket.c:1519 [inline] [<ffffffff837c3acf>] __sys_socket+0x6f/0x140 net/socket.c:1561 [<ffffffff837c3bba>] __do_sys_socket net/socket.c:1570 [inline] [<ffffffff837c3bba>] __se_sys_socket net/socket.c:1568 [inline] [<ffffffff837c3bba>] __x64_sys_socket+0x1a/0x20 net/socket.c:1568 [<ffffffff84512815>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<ffffffff84512815>] do_syscall_64+0x35/0x80 arch/x86/entry/common.c:80 [<ffffffff84600068>] entry_SYSCALL_64_after_hwframe+0x44/0xae <...> Use mod_delayed_work() instead: if @connect_work is already scheduled, reschedule it, and undo sock_hold() to keep the reference count balanced. Reported-and-tested-by: syzbot+b03f55bf128f9a38f064@syzkaller.appspotmail.com Fixes: d021c344 ("VSOCK: Introduce VM Sockets") Co-developed-by: NStefano Garzarella <sgarzare@redhat.com> Signed-off-by: NStefano Garzarella <sgarzare@redhat.com> Reviewed-by: NStefano Garzarella <sgarzare@redhat.com> Signed-off-by: NPeilin Ye <peilin.ye@bytedance.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NDong Chenchen <dongchenchen2@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Baokun Li 提交于
hulk inclusion category: bugfix bugzilla: 187821,https://gitee.com/openeuler/kernel/issues/I5X9U0 CVE: NA -------------------------------- We got a issue as fllows: ================================================================== kernel BUG at fs/ext4/extents_status.c:203! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 1 PID: 945 Comm: cat Not tainted 6.0.0-next-20221007-dirty #349 RIP: 0010:ext4_es_end.isra.0+0x34/0x42 RSP: 0018:ffffc9000143b768 EFLAGS: 00010203 RAX: 0000000000000000 RBX: ffff8881769cd0b8 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff8fc27cf7 RDI: 00000000ffffffff RBP: ffff8881769cd0bc R08: 0000000000000000 R09: ffffc9000143b5f8 R10: 0000000000000001 R11: 0000000000000001 R12: ffff8881769cd0a0 R13: ffff8881768e5668 R14: 00000000768e52f0 R15: 0000000000000000 FS: 00007f359f7f05c0(0000)GS:ffff88842fd00000(0000)knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f359f5a2000 CR3: 000000017130c000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __es_tree_search.isra.0+0x6d/0xf5 ext4_es_cache_extent+0xfa/0x230 ext4_cache_extents+0xd2/0x110 ext4_find_extent+0x5d5/0x8c0 ext4_ext_map_blocks+0x9c/0x1d30 ext4_map_blocks+0x431/0xa50 ext4_mpage_readpages+0x48e/0xe40 ext4_readahead+0x47/0x50 read_pages+0x82/0x530 page_cache_ra_unbounded+0x199/0x2a0 do_page_cache_ra+0x47/0x70 page_cache_ra_order+0x242/0x400 ondemand_readahead+0x1e8/0x4b0 page_cache_sync_ra+0xf4/0x110 filemap_get_pages+0x131/0xb20 filemap_read+0xda/0x4b0 generic_file_read_iter+0x13a/0x250 ext4_file_read_iter+0x59/0x1d0 vfs_read+0x28f/0x460 ksys_read+0x73/0x160 __x64_sys_read+0x1e/0x30 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd </TASK> ================================================================== In the above issue, ioctl invokes the swap_inode_boot_loader function to swap inode<5> and inode<12>. However, inode<5> contain incorrect imode and disordered extents, and i_nlink is set to 1. The extents check for inode in the ext4_iget function can be bypassed bacause 5 is EXT4_BOOT_LOADER_INO. While links_count is set to 1, the extents are not initialized in swap_inode_boot_loader. After the ioctl command is executed successfully, the extents are swapped to inode<12>, in this case, run the `cat` command to view inode<12>. And Bug_ON is triggered due to the incorrect extents. When the boot loader inode is not initialized, its imode can be one of the following: 1) the imode is a bad type, which is marked as bad_inode in ext4_iget and set to S_IFREG. 2) the imode is good type but not S_IFREG. 3) the imode is S_IFREG. The BUG_ON may be triggered by bypassing the check in cases 1 and 2. Therefore, when the boot loader inode is bad_inode or its imode is not S_IFREG, initialize the inode to avoid triggering the BUG. Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Baokun Li 提交于
hulk inclusion category: bugfix bugzilla: 187821,https://gitee.com/openeuler/kernel/issues/I5X9U0 CVE: NA -------------------------------- There are many places that will get unhappy (and crash) when ext4_iget() returns a bad inode. However, if iget the boot loader inode, allows a bad inode to be returned, because the inode may not be initialized. This mechanism can be used to bypass some checks and cause panic. To solve this problem, we add a special iget flag EXT4_IGET_BAD. Only with this flag we'd be returning bad inode from ext4_iget(), otherwise we always return the error code if the inode is bad inode.(suggested by Jan Kara) Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Baokun Li 提交于
hulk inclusion category: bugfix bugzilla: 187821,https://gitee.com/openeuler/kernel/issues/I5X9U0 CVE: NA -------------------------------- Before quota is enabled, a check on the preset quota inums in ext4_super_block is added to prevent wrong quota inodes from being loaded. In addition, when the quota fails to be enabled, the quota type and quota inum are printed to facilitate fault locating. Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Baokun Li 提交于
hulk inclusion category: bugfix bugzilla: 187821,https://gitee.com/openeuler/kernel/issues/I5X9U0 CVE: NA -------------------------------- We got a issue as fllows: ================================================================== kernel BUG at fs/ext4/extents_status.c:202! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 1 PID: 810 Comm: mount Not tainted 6.1.0-rc1-next-g9631525255e3 #352 RIP: 0010:__es_tree_search.isra.0+0xb8/0xe0 RSP: 0018:ffffc90001227900 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 0000000077512a0f RCX: 0000000000000000 RDX: 0000000000000002 RSI: 0000000000002a10 RDI: ffff8881004cd0c8 RBP: ffff888177512ac8 R08: 47ffffffffffffff R09: 0000000000000001 R10: 0000000000000001 R11: 00000000000679af R12: 0000000000002a10 R13: ffff888177512d88 R14: 0000000077512a10 R15: 0000000000000000 FS: 00007f4bd76dbc40(0000)GS:ffff88842fd00000(0000)knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005653bf993cf8 CR3: 000000017bfdf000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ext4_es_cache_extent+0xe2/0x210 ext4_cache_extents+0xd2/0x110 ext4_find_extent+0x5d5/0x8c0 ext4_ext_map_blocks+0x9c/0x1d30 ext4_map_blocks+0x431/0xa50 ext4_getblk+0x82/0x340 ext4_bread+0x14/0x110 ext4_quota_read+0xf0/0x180 v2_read_header+0x24/0x90 v2_check_quota_file+0x2f/0xa0 dquot_load_quota_sb+0x26c/0x760 dquot_load_quota_inode+0xa5/0x190 ext4_enable_quotas+0x14c/0x300 __ext4_fill_super+0x31cc/0x32c0 ext4_fill_super+0x115/0x2d0 get_tree_bdev+0x1d2/0x360 ext4_get_tree+0x19/0x30 vfs_get_tree+0x26/0xe0 path_mount+0x81d/0xfc0 do_mount+0x8d/0xc0 __x64_sys_mount+0xc0/0x160 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd </TASK> ================================================================== Above issue may happen as follows: ------------------------------------- ext4_fill_super ext4_orphan_cleanup ext4_enable_quotas ext4_quota_enable ext4_iget --> get error inode <5> ext4_ext_check_inode --> Wrong imode makes it escape inspection make_bad_inode(inode) --> EXT4_BOOT_LOADER_INO set imode dquot_load_quota_inode vfs_setup_quota_inode --> check pass dquot_load_quota_sb v2_check_quota_file v2_read_header ext4_quota_read ext4_bread ext4_getblk ext4_map_blocks ext4_ext_map_blocks ext4_find_extent ext4_cache_extents ext4_es_cache_extent __es_tree_search.isra.0 ext4_es_end --> Wrong extents trigger BUG_ON In the above issue, s_usr_quota_inum is set to 5, but inode<5> contains incorrect imode and disordered extents. Because 5 is EXT4_BOOT_LOADER_INO, the ext4_ext_check_inode check in the ext4_iget function can be bypassed, finally, the extents that are not checked trigger the BUG_ON in the __es_tree_search function. To solve this issue, check whether the inode is bad_inode in vfs_setup_quota_inode(). Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Pablo Neira Ayuso 提交于
stable inclusion from stable-v5.10.146 commit 5d75fef3e61e797fab5c3fbba88caa74ab92ad47 category: bugfix bugzilla: 187890, https://gitee.com/src-openeuler/kernel/issues/I5X2IJ CVE: CVE-2022-42432 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=5d75fef3e61e797fab5c3fbba88caa74ab92ad47 -------------------------------- [ Upstream commit 559c36c5 ] nf_osf_find() incorrectly returns true on mismatch, this leads to copying uninitialized memory area in nft_osf which can be used to leak stale kernel stack data to userspace. Fixes: 22c7652c ("netfilter: nft_osf: Add version option support") Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NLu Wei <luwei32@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Ryusuke Konishi 提交于
mainline inclusion from mainline-v6.1-rc1 commit d0d51a97 category: bugfix bugzilla: 187884,https://gitee.com/src-openeuler/kernel/issues/I5X2OB CVE: CVE-2022-3646 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d0d51a97063db4704a5ef6bc978dddab1636a306 -------------------------------- If nilfs_attach_log_writer() failed to create a log writer thread, it frees a data structure of the log writer without any cleanup. After commit e912a5b6 ("nilfs2: use root object to get ifile"), this causes a leak of struct nilfs_root, which started to leak an ifile metadata inode and a kobject on that struct. In addition, if the kernel is booted with panic_on_warn, the above ifile metadata inode leak will cause the following panic when the nilfs2 kernel module is removed: kmem_cache_destroy nilfs2_inode_cache: Slab cache still has objects when called from nilfs_destroy_cachep+0x16/0x3a [nilfs2] WARNING: CPU: 8 PID: 1464 at mm/slab_common.c:494 kmem_cache_destroy+0x138/0x140 ... RIP: 0010:kmem_cache_destroy+0x138/0x140 Code: 00 20 00 00 e8 a9 55 d8 ff e9 76 ff ff ff 48 8b 53 60 48 c7 c6 20 70 65 86 48 c7 c7 d8 69 9c 86 48 8b 4c 24 28 e8 ef 71 c7 00 <0f> 0b e9 53 ff ff ff c3 48 81 ff ff 0f 00 00 77 03 31 c0 c3 53 48 ... Call Trace: <TASK> ? nilfs_palloc_freev.cold.24+0x58/0x58 [nilfs2] nilfs_destroy_cachep+0x16/0x3a [nilfs2] exit_nilfs_fs+0xa/0x1b [nilfs2] __x64_sys_delete_module+0x1d9/0x3a0 ? __sanitizer_cov_trace_pc+0x1a/0x50 ? syscall_trace_enter.isra.19+0x119/0x190 do_syscall_64+0x34/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd ... </TASK> Kernel panic - not syncing: panic_on_warn set ... This patch fixes these issues by calling nilfs_detach_log_writer() cleanup function if spawning the log writer thread fails. Link: https://lkml.kernel.org/r/20221007085226.57667-1-konishi.ryusuke@gmail.com Fixes: e912a5b6 ("nilfs2: use root object to get ifile") Signed-off-by: NRyusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+7381dc4ad60658ca4c05@syzkaller.appspotmail.com Tested-by: NRyusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Russell King (Oracle) 提交于
mainline inclusion from mainline-v6.0-rc7 commit 0152dfee category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5W7AX CVE: CVE-2022-3535 -------------------------------- When mvpp2 is unloaded, the driver specific debugfs directory is not removed, which technically leads to a memory leak. However, this directory is only created when the first device is probed, so the hardware is present. Removing the module is only something a developer would to when e.g. testing out changes, so the module would be reloaded. So this memory leak is minor. The original attempt in commit fe2c9c61 ("net: mvpp2: debugfs: fix memory leak when using debugfs_lookup()") that was labelled as a memory leak fix was not, it fixed a refcount leak, but in doing so created a problem when the module is reloaded - the directory already exists, but mvpp2_root is NULL, so we lose all debugfs entries. This fix has been reverted. This is the alternative fix, where we remove the offending directory whenever the driver is unloaded. Fixes: 21da57a2 ("net: mvpp2: add a debugfs interface for the Header Parser") Signed-off-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: NMarcin Wojtas <mw@semihalf.com> Link: https://lore.kernel.org/r/E1ofOAB-00CzkG-UO@rmk-PC.armlinux.org.ukSigned-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: Nzhangqiao <zhangqiao22@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Chen Zhongjin 提交于
mainline inclusion from mainline-v6.0-rc3 commit fc2e426b category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q4RA CVE: NA -------------------------------- When meeting ftrace trampolines in ORC unwinding, unwinder uses address of ftrace_{regs_}call address to find the ORC entry, which gets next frame at sp+176. If there is an IRQ hitting at sub $0xa8,%rsp, the next frame should be sp+8 instead of 176. It makes unwinder skip correct frame and throw warnings such as "wrong direction" or "can't access registers", etc, depending on the content of the incorrect frame address. By adding the base address ftrace_{regs_}caller with the offset *ip - ops->trampoline*, we can get the correct address to find the ORC entry. Also change "caller" to "tramp_addr" to make variable name conform to its content. [ mingo: Clarified the changelog a bit. ] Fixes: 6be7fa3c ("ftrace, orc, x86: Handle ftrace dynamically allocated trampolines") Signed-off-by: NChen Zhongjin <chenzhongjin@huawei.com> Signed-off-by: NIngo Molnar <mingo@kernel.org> Reviewed-by: NSteven Rostedt (Google) <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220819084334.244016-1-chenzhongjin@huawei.comSigned-off-by: NChen Zhongjin <chenzhongjin@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kuniyuki Iwashima 提交于
mainline inclusion from mainline-v6.0-rc3 commit 9c80e799 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q4Z0 CVE: NA -------------------------------- The assumption in __disable_kprobe() is wrong, and it could try to disarm an already disarmed kprobe and fire the WARN_ONCE() below. [0] We can easily reproduce this issue. 1. Write 0 to /sys/kernel/debug/kprobes/enabled. # echo 0 > /sys/kernel/debug/kprobes/enabled 2. Run execsnoop. At this time, one kprobe is disabled. # /usr/share/bcc/tools/execsnoop & [1] 2460 PCOMM PID PPID RET ARGS # cat /sys/kernel/debug/kprobes/list ffffffff91345650 r __x64_sys_execve+0x0 [FTRACE] ffffffff91345650 k __x64_sys_execve+0x0 [DISABLED][FTRACE] 3. Write 1 to /sys/kernel/debug/kprobes/enabled, which changes kprobes_all_disarmed to false but does not arm the disabled kprobe. # echo 1 > /sys/kernel/debug/kprobes/enabled # cat /sys/kernel/debug/kprobes/list ffffffff91345650 r __x64_sys_execve+0x0 [FTRACE] ffffffff91345650 k __x64_sys_execve+0x0 [DISABLED][FTRACE] 4. Kill execsnoop, when __disable_kprobe() calls disarm_kprobe() for the disabled kprobe and hits the WARN_ONCE() in __disarm_kprobe_ftrace(). # fg /usr/share/bcc/tools/execsnoop ^C Actually, WARN_ONCE() is fired twice, and __unregister_kprobe_top() misses some cleanups and leaves the aggregated kprobe in the hash table. Then, __unregister_trace_kprobe() initialises tk->rp.kp.list and creates an infinite loop like this. aggregated kprobe.list -> kprobe.list -. ^ | '.__.' In this situation, these commands fall into the infinite loop and result in RCU stall or soft lockup. cat /sys/kernel/debug/kprobes/list : show_kprobe_addr() enters into the infinite loop with RCU. /usr/share/bcc/tools/execsnoop : warn_kprobe_rereg() holds kprobe_mutex, and __get_valid_kprobe() is stuck in the loop. To avoid the issue, make sure we don't call disarm_kprobe() for disabled kprobes. [0] Failed to disarm kprobe-ftrace at __x64_sys_execve+0x0/0x40 (error -2) WARNING: CPU: 6 PID: 2460 at kernel/kprobes.c:1130 __disarm_kprobe_ftrace.isra.19 (kernel/kprobes.c:1129) Modules linked in: ena CPU: 6 PID: 2460 Comm: execsnoop Not tainted 5.19.0+ #28 Hardware name: Amazon EC2 c5.2xlarge/, BIOS 1.0 10/16/2017 RIP: 0010:__disarm_kprobe_ftrace.isra.19 (kernel/kprobes.c:1129) Code: 24 8b 02 eb c1 80 3d c4 83 f2 01 00 75 d4 48 8b 75 00 89 c2 48 c7 c7 90 fa 0f 92 89 04 24 c6 05 ab 83 01 e8 e4 94 f0 ff <0f> 0b 8b 04 24 eb b1 89 c6 48 c7 c7 60 fa 0f 92 89 04 24 e8 cc 94 RSP: 0018:ffff9e6ec154bd98 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffffffff930f7b00 RCX: 0000000000000001 RDX: 0000000080000001 RSI: ffffffff921461c5 RDI: 00000000ffffffff RBP: ffff89c504286da8 R08: 0000000000000000 R09: c0000000fffeffff R10: 0000000000000000 R11: ffff9e6ec154bc28 R12: ffff89c502394e40 R13: ffff89c502394c00 R14: ffff9e6ec154bc00 R15: 0000000000000000 FS: 00007fe800398740(0000) GS:ffff89c812d80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c00057f010 CR3: 0000000103b54006 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> __disable_kprobe (kernel/kprobes.c:1716) disable_kprobe (kernel/kprobes.c:2392) __disable_trace_kprobe (kernel/trace/trace_kprobe.c:340) disable_trace_kprobe (kernel/trace/trace_kprobe.c:429) perf_trace_event_unreg.isra.2 (./include/linux/tracepoint.h:93 kernel/trace/trace_event_perf.c:168) perf_kprobe_destroy (kernel/trace/trace_event_perf.c:295) _free_event (kernel/events/core.c:4971) perf_event_release_kernel (kernel/events/core.c:5176) perf_release (kernel/events/core.c:5186) __fput (fs/file_table.c:321) task_work_run (./include/linux/sched.h:2056 (discriminator 1) kernel/task_work.c:179 (discriminator 1)) exit_to_user_mode_prepare (./include/linux/resume_user_mode.h:49 kernel/entry/common.c:169 kernel/entry/common.c:201) syscall_exit_to_user_mode (./arch/x86/include/asm/jump_label.h:55 ./arch/x86/include/asm/nospec-branch.h:384 ./arch/x86/include/asm/entry-common.h:94 kernel/entry/common.c:133 kernel/entry/common.c:296) do_syscall_64 (arch/x86/entry/common.c:87) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) RIP: 0033:0x7fe7ff210654 Code: 15 79 89 20 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb be 0f 1f 00 8b 05 9a cd 20 00 48 63 ff 85 c0 75 11 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3a f3 c3 48 83 ec 18 48 89 7c 24 08 e8 34 fc RSP: 002b:00007ffdbd1d3538 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 RAX: 0000000000000000 RBX: 0000000000000008 RCX: 00007fe7ff210654 RDX: 0000000000000000 RSI: 0000000000002401 RDI: 0000000000000008 RBP: 0000000000000000 R08: 94ae31d6fda838a4 R0900007fe8001c9d30 R10: 00007ffdbd1d34b0 R11: 0000000000000246 R12: 00007ffdbd1d3600 R13: 0000000000000000 R14: fffffffffffffffc R15: 00007ffdbd1d3560 </TASK> Link: https://lkml.kernel.org/r/20220813020509.90805-1-kuniyu@amazon.com Fixes: 69d54b91 ("kprobes: makes kprobes/enabled works correctly for optimized kprobes.") Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.com> Reported-by: NAyushman Dutta <ayudutta@amazon.com> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: Kuniyuki Iwashima <kuni1840@gmail.com> Cc: Ayushman Dutta <ayudutta@amazon.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NChen Zhongjin <chenzhongjin@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Zhang Wensheng 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5X6RT CVE: NA -------------------------------- After introducing commit 5b18b5a7 ("block: delete part_round_stats and switch to less precise counting"), '%util' accounted by iostat will be over reality data. In fact, the device is quite idle, but iostat may show '%util' as a big number (e.g. 50%). It can produce by fio: fio --name=1 --direct=1 --bs=4k --rw=read --filename=/dev/sda \ --thinktime=4ms --runtime=180 We fix this by using a switch(precise_iostat=1) to control whether or not acconut ioticks precisely. fixes: 5b18b5a7 ("block: delete part_round_stats and switch to less precise counting") Signed-off-by: NZhang Wensheng <zhangwensheng5@huawei.com> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Mark Rutland 提交于
mainline inclusion from mainline-v6.0-rc3 commit 2e8cff0a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5PPS3?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2e8cff0a0eee87b27f0cf87ad8310eb41b5886ab -------------------------------- On arm64, "rodata=full" has been suppored (but not documented) since commit: c55191e9 ("arm64: mm: apply r/o permissions of VM areas to its linear alias as well") As it's necessary to determine the rodata configuration early during boot, arm64 has an early_param() handler for this, whereas init/main.c has a __setup() handler which is run later. Unfortunately, this split meant that since commit: f9a40b08 ("init/main.c: return 1 from handled __setup() functions") ... passing "rodata=full" would result in a spurious warning from the __setup() handler (though RO permissions would be configured appropriately). Further, "rodata=full" has been broken since commit: 0d6ea3ac ("lib/kstrtox.c: add "false"/"true" support to kstrtobool()") ... which caused strtobool() to parse "full" as false (in addition to many other values not documented for the "rodata=" kernel parameter. This patch fixes this breakage by: * Moving the core parameter parser to an __early_param(), such that it is available early. * Adding an (optional) arch hook which arm64 can use to parse "full". * Updating the documentation to mention that "full" is valid for arm64. * Having the core parameter parser handle "on" and "off" explicitly, such that any undocumented values (e.g. typos such as "ful") are reported as errors rather than being silently accepted. Note that __setup() and early_param() have opposite conventions for their return values, where __setup() uses 1 to indicate a parameter was handled and early_param() uses 0 to indicate a parameter was handled. Fixes: f9a40b08 ("init/main.c: return 1 from handled __setup() functions") Fixes: 0d6ea3ac ("lib/kstrtox.c: add "false"/"true" support to kstrtobool()") Signed-off-by: NMark Rutland <mark.rutland@arm.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jagdish Gediya <jvgediya@linux.ibm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: NArd Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220817154022.3974645-1-mark.rutland@arm.comSigned-off-by: NWill Deacon <will@kernel.org> conflicts: Documentation/admin-guide/kernel-parameters.txt arch/arm64/include/asm/setup.h arch/arm64/mm/mmu.c init/main.c Signed-off-by: NXia Longlong <xialonglong1@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yu Kuai 提交于
hulk inclusion category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA -------------------------------- Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yu Kuai 提交于
hulk inclusion category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA -------------------------------- Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yu Kuai 提交于
hulk inclusion category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA -------------------------------- Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 0994c64e category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0994c64eb4159ba019e7fedc7ba0dd6a69235b40 -------------------------------- Since it is now possible for a tagset to share a single set of tags, the iter function should not re-iter the tags for the count of #hw queues in that case. Rather it should just iter once. Fixes: e155b0c2 ("blk-mq: Use shared tags for shared sbitmap support") Reported-by: NKashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Tested-by: NKashyap Desai <kashyap.desai@broadcom.com> Link: https://lore.kernel.org/r/1634550083-202815-1-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NZhang Wensheng <zhangwensheng5@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 8bdf7b3f category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8bdf7b3fe1f48a2c1c212d4685903bba01409c0e -------------------------------- We should not reference the queue tagset in blk_mq_sched_tags_teardown() (see function comment) for the blk-mq flags, so use the passed flags instead. This solves a use-after-free, similarly fixed earlier (and since broken again) in commit f0c1c4d2 ("blk-mq: fix use-after-free in blk_mq_exit_sched"). Reported-by: NLinux Kernel Functional Testing <lkft@linaro.org> Reported-by: NNaresh Kamboju <naresh.kamboju@linaro.org> Tested-by: NAnders Roxell <anders.roxell@linaro.org> Fixes: e155b0c2 ("blk-mq: Use shared tags for shared sbitmap support") Signed-off-by: NJohn Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1634890340-15432-1-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Conflict: block/blk-mq-sched.c Signed-off-by: NZhang Wensheng <zhangwensheng5@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.17-rc1 commit fea9f92f category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fea9f92f1748083cb82049ed503be30c3d3a9b69 -------------------------------- Kashyap reports high CPU usage in blk_mq_queue_tag_busy_iter() and callees using megaraid SAS RAID card since moving to shared tags [0]. Previously, when shared tags was shared sbitmap, this function was less than optimum since we would iter through all tags for all hctx's, yet only ever match upto tagset depth number of rqs. Since the change to shared tags, things are even less efficient if we have parallel callers of blk_mq_queue_tag_busy_iter(). This is because in bt_iter() -> blk_mq_find_and_get_req() there would be more contention on accessing each request ref and tags->lock since they are now shared among all HW queues. Optimise by having separate calls to bt_for_each() for when we're using shared tags. In this case no longer pass a hctx, as it is no longer relevant, and teach bt_iter() about this. Ming suggested something along the lines of this change, apart from a different implementation. [0] https://lore.kernel.org/linux-block/e4e92abbe9d52bcba6b8cc6c91c442cc@mail.gmail.com/Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reported-and-tested-by: NKashyap Desai <kashyap.desai@broadcom.com> Fixes: e155b0c2 ("blk-mq: Use shared tags for shared sbitmap support") Link: https://lore.kernel.org/r/1638794990-137490-4-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Conflict: block/blk-mq-tag.c Signed-off-by: NZhang Wensheng <zhangwensheng5@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit ae0f1a73 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ae0f1a732f4a5db284e2af02c305255734efd19c -------------------------------- Now that we use shared tags for shared sbitmap support, we don't require the tags sbitmap pointers, so drop them. This essentially reverts commit 222a5ae0 ("blk-mq: Use pointers for blk_mq_tags bitmap tags"). Function blk_mq_init_bitmap_tags() is removed also, since it would be only a wrappper for blk_mq_init_bitmaps(). Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NJohn Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1633429419-228500-14-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit e155b0c2 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e155b0c238b20f0a866f4334d292656665836c8a -------------------------------- Currently we use separate sbitmap pairs and active_queues atomic_t for shared sbitmap support. However a full sets of static requests are used per HW queue, which is quite wasteful, considering that the total number of requests usable at any given time across all HW queues is limited by the shared sbitmap depth. As such, it is considerably more memory efficient in the case of shared sbitmap to allocate a set of static rqs per tag set or request queue, and not per HW queue. So replace the sbitmap pairs and active_queues atomic_t with a shared tags per tagset and request queue, which will hold a set of shared static rqs. Since there is now no valid HW queue index to be passed to the blk_mq_ops .init and .exit_request callbacks, pass an invalid index token. This changes the semantics of the APIs, such that the callback would need to validate the HW queue index before using it. Currently no user of shared sbitmap actually uses the HW queue index (as would be expected). Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1633429419-228500-13-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Conflict: c6f9c0e2 ("blk-mq: allow hardware queue to get more tag while sharing a tag set") is merged, which will cause lots of conflicts for this patch, and in the meantime, the functionality will need to be adapted in that patch. Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Nikolay Borisov 提交于
mainline inclusion from mainline-v5.13-rc1 commit 39aa56db category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=39aa56db50b9ca5cad597e561b4b160b6cbbb65b -------------------------------- Signed-off-by: NNikolay Borisov <nborisov@suse.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NHimanshu Madhani <himanshu.madhani@oracle.com> Link: https://lore.kernel.org/r/20210311081713.2763171-1-nborisov@suse.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 645db34e category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=645db34e50501aac141713fb47a315e5202ff890 -------------------------------- Refactor blk_mq_free_map_and_requests() such that it can be used at many sites at which the tag map and rqs are freed. Also rename to blk_mq_free_map_and_rqs(), which is shorter and matches the alloc equivalent. Suggested-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-12-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Conflict: commit a846a8e6 ("blk-mq: don't free tags if the tag_set is used by other device in queue initialztion") is already backported, blk_mq_free_map_and_rqs() is moved to __blk_mq_update_nr_hw_queues() instead of blk_mq_realloc_hw_ctxs(). Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 63064be1 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=63064be150e4b1ba1e4af594ef5aa81adf21a52d -------------------------------- Add a function to combine allocating tags and the associated requests, and factor out common patterns to use this new function. Some function only call blk_mq_alloc_map_and_rqs() now, but more functionality will be added later. Also make blk_mq_alloc_rq_map() and blk_mq_alloc_rqs() static since they are only used in blk-mq.c, and finally rename some functions for conciseness and consistency with other function names: - __blk_mq_alloc_map_and_{request -> rqs}() - blk_mq_alloc_{map_and_requests -> set_map_and_rqs}() Suggested-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1633429419-228500-11-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit a7e7388d category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a7e7388dced47a10ca13ae95ca975ea2830f196b -------------------------------- Put the functionality to update the sched shared sbitmap size in a common function. Since the same formula is always used to resize, and it can be got from the request queue argument, so just pass the request queue pointer. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-10-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 4f245d5b category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4f245d5bf0f7432c881e22a77066160a6cba8e03 -------------------------------- Function blk_mq_clear_rq_mapping() is required to clear the sched tags mappings in driver tags rqs[]. But there is no need for a driver tags to clear its own mapping, so skip clearing the mapping in this scenario. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1633429419-228500-9-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit f32e4eaf category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f32e4eafaf29aa493bdea3123dac09ad135b599b -------------------------------- Function blk_mq_clear_rq_mapping() will be used for shared sbitmap tags in future, so pass a driver tags pointer instead of the tagset container and HW queue index. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1633429419-228500-8-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 1820f4f0 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1820f4f0a5e75633ff384167027bf45ed1b10234 -------------------------------- To be more concise and consistent in naming, rename blk_mq_sched_free_requests() -> blk_mq_sched_free_rqs(). Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1633429419-228500-7-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit d99a6bb3 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d99a6bb337677d812d5bef7795c9fcf17f1ccebe -------------------------------- Function blk_mq_sched_alloc_tags() does same as __blk_mq_alloc_map_and_request(), so give a similar name to be consistent. Similarly rename label err_free_tags -> err_free_map_and_rqs. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-6-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.19-rc1 commit f6adcef5 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f6adcef5f317c78a899ca9766e90e47026834158 -------------------------------- It's easier to read: if (x) X; else Y; over: if (!x) Y; else X; No functional change intended. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-5-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 8fa04464 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8fa044640f128c0cd015f72cda42989a664889fc -------------------------------- For shared sbitmap, if the call to blk_mq_tag_update_depth() was successful for any hctx when hctx->sched_tags is not set, then it would be successful for all (due to nature in which blk_mq_tag_update_depth() fails). As such, there is no need to call blk_mq_tag_resize_shared_sbitmap() for each hctx. So relocate the call until after the hctx iteration under the !q->elevator check, which is equivalent (to !hctx->sched_tags). Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-4-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 65de57bb category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=65de57bb2e66f1fbede166c1307570ebd09eae83 -------------------------------- The original code in commit 24d2f903 ("blk-mq: split out tag initialization, support shared tags") would check tags->rqs is non-NULL and then dereference tags->rqs[]. Then in commit 2af8cbe3 ("blk-mq: split tag ->rqs[] into two"), we started to dereference tags->static_rqs[], but continued to check non-NULL tags->rqs. Check tags->static_rqs as non-NULL instead, which is more logical. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-2-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-