- 03 11月, 2022 29 次提交
-
-
由 Pablo Neira Ayuso 提交于
stable inclusion from stable-v5.10.146 commit 5d75fef3e61e797fab5c3fbba88caa74ab92ad47 category: bugfix bugzilla: 187890, https://gitee.com/src-openeuler/kernel/issues/I5X2IJ CVE: CVE-2022-42432 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=5d75fef3e61e797fab5c3fbba88caa74ab92ad47 -------------------------------- [ Upstream commit 559c36c5 ] nf_osf_find() incorrectly returns true on mismatch, this leads to copying uninitialized memory area in nft_osf which can be used to leak stale kernel stack data to userspace. Fixes: 22c7652c ("netfilter: nft_osf: Add version option support") Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NLu Wei <luwei32@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Ryusuke Konishi 提交于
mainline inclusion from mainline-v6.1-rc1 commit d0d51a97 category: bugfix bugzilla: 187884,https://gitee.com/src-openeuler/kernel/issues/I5X2OB CVE: CVE-2022-3646 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d0d51a97063db4704a5ef6bc978dddab1636a306 -------------------------------- If nilfs_attach_log_writer() failed to create a log writer thread, it frees a data structure of the log writer without any cleanup. After commit e912a5b6 ("nilfs2: use root object to get ifile"), this causes a leak of struct nilfs_root, which started to leak an ifile metadata inode and a kobject on that struct. In addition, if the kernel is booted with panic_on_warn, the above ifile metadata inode leak will cause the following panic when the nilfs2 kernel module is removed: kmem_cache_destroy nilfs2_inode_cache: Slab cache still has objects when called from nilfs_destroy_cachep+0x16/0x3a [nilfs2] WARNING: CPU: 8 PID: 1464 at mm/slab_common.c:494 kmem_cache_destroy+0x138/0x140 ... RIP: 0010:kmem_cache_destroy+0x138/0x140 Code: 00 20 00 00 e8 a9 55 d8 ff e9 76 ff ff ff 48 8b 53 60 48 c7 c6 20 70 65 86 48 c7 c7 d8 69 9c 86 48 8b 4c 24 28 e8 ef 71 c7 00 <0f> 0b e9 53 ff ff ff c3 48 81 ff ff 0f 00 00 77 03 31 c0 c3 53 48 ... Call Trace: <TASK> ? nilfs_palloc_freev.cold.24+0x58/0x58 [nilfs2] nilfs_destroy_cachep+0x16/0x3a [nilfs2] exit_nilfs_fs+0xa/0x1b [nilfs2] __x64_sys_delete_module+0x1d9/0x3a0 ? __sanitizer_cov_trace_pc+0x1a/0x50 ? syscall_trace_enter.isra.19+0x119/0x190 do_syscall_64+0x34/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd ... </TASK> Kernel panic - not syncing: panic_on_warn set ... This patch fixes these issues by calling nilfs_detach_log_writer() cleanup function if spawning the log writer thread fails. Link: https://lkml.kernel.org/r/20221007085226.57667-1-konishi.ryusuke@gmail.com Fixes: e912a5b6 ("nilfs2: use root object to get ifile") Signed-off-by: NRyusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+7381dc4ad60658ca4c05@syzkaller.appspotmail.com Tested-by: NRyusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Russell King (Oracle) 提交于
mainline inclusion from mainline-v6.0-rc7 commit 0152dfee category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5W7AX CVE: CVE-2022-3535 -------------------------------- When mvpp2 is unloaded, the driver specific debugfs directory is not removed, which technically leads to a memory leak. However, this directory is only created when the first device is probed, so the hardware is present. Removing the module is only something a developer would to when e.g. testing out changes, so the module would be reloaded. So this memory leak is minor. The original attempt in commit fe2c9c61 ("net: mvpp2: debugfs: fix memory leak when using debugfs_lookup()") that was labelled as a memory leak fix was not, it fixed a refcount leak, but in doing so created a problem when the module is reloaded - the directory already exists, but mvpp2_root is NULL, so we lose all debugfs entries. This fix has been reverted. This is the alternative fix, where we remove the offending directory whenever the driver is unloaded. Fixes: 21da57a2 ("net: mvpp2: add a debugfs interface for the Header Parser") Signed-off-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: NMarcin Wojtas <mw@semihalf.com> Link: https://lore.kernel.org/r/E1ofOAB-00CzkG-UO@rmk-PC.armlinux.org.ukSigned-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: Nzhangqiao <zhangqiao22@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Chen Zhongjin 提交于
mainline inclusion from mainline-v6.0-rc3 commit fc2e426b category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q4RA CVE: NA -------------------------------- When meeting ftrace trampolines in ORC unwinding, unwinder uses address of ftrace_{regs_}call address to find the ORC entry, which gets next frame at sp+176. If there is an IRQ hitting at sub $0xa8,%rsp, the next frame should be sp+8 instead of 176. It makes unwinder skip correct frame and throw warnings such as "wrong direction" or "can't access registers", etc, depending on the content of the incorrect frame address. By adding the base address ftrace_{regs_}caller with the offset *ip - ops->trampoline*, we can get the correct address to find the ORC entry. Also change "caller" to "tramp_addr" to make variable name conform to its content. [ mingo: Clarified the changelog a bit. ] Fixes: 6be7fa3c ("ftrace, orc, x86: Handle ftrace dynamically allocated trampolines") Signed-off-by: NChen Zhongjin <chenzhongjin@huawei.com> Signed-off-by: NIngo Molnar <mingo@kernel.org> Reviewed-by: NSteven Rostedt (Google) <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220819084334.244016-1-chenzhongjin@huawei.comSigned-off-by: NChen Zhongjin <chenzhongjin@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kuniyuki Iwashima 提交于
mainline inclusion from mainline-v6.0-rc3 commit 9c80e799 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q4Z0 CVE: NA -------------------------------- The assumption in __disable_kprobe() is wrong, and it could try to disarm an already disarmed kprobe and fire the WARN_ONCE() below. [0] We can easily reproduce this issue. 1. Write 0 to /sys/kernel/debug/kprobes/enabled. # echo 0 > /sys/kernel/debug/kprobes/enabled 2. Run execsnoop. At this time, one kprobe is disabled. # /usr/share/bcc/tools/execsnoop & [1] 2460 PCOMM PID PPID RET ARGS # cat /sys/kernel/debug/kprobes/list ffffffff91345650 r __x64_sys_execve+0x0 [FTRACE] ffffffff91345650 k __x64_sys_execve+0x0 [DISABLED][FTRACE] 3. Write 1 to /sys/kernel/debug/kprobes/enabled, which changes kprobes_all_disarmed to false but does not arm the disabled kprobe. # echo 1 > /sys/kernel/debug/kprobes/enabled # cat /sys/kernel/debug/kprobes/list ffffffff91345650 r __x64_sys_execve+0x0 [FTRACE] ffffffff91345650 k __x64_sys_execve+0x0 [DISABLED][FTRACE] 4. Kill execsnoop, when __disable_kprobe() calls disarm_kprobe() for the disabled kprobe and hits the WARN_ONCE() in __disarm_kprobe_ftrace(). # fg /usr/share/bcc/tools/execsnoop ^C Actually, WARN_ONCE() is fired twice, and __unregister_kprobe_top() misses some cleanups and leaves the aggregated kprobe in the hash table. Then, __unregister_trace_kprobe() initialises tk->rp.kp.list and creates an infinite loop like this. aggregated kprobe.list -> kprobe.list -. ^ | '.__.' In this situation, these commands fall into the infinite loop and result in RCU stall or soft lockup. cat /sys/kernel/debug/kprobes/list : show_kprobe_addr() enters into the infinite loop with RCU. /usr/share/bcc/tools/execsnoop : warn_kprobe_rereg() holds kprobe_mutex, and __get_valid_kprobe() is stuck in the loop. To avoid the issue, make sure we don't call disarm_kprobe() for disabled kprobes. [0] Failed to disarm kprobe-ftrace at __x64_sys_execve+0x0/0x40 (error -2) WARNING: CPU: 6 PID: 2460 at kernel/kprobes.c:1130 __disarm_kprobe_ftrace.isra.19 (kernel/kprobes.c:1129) Modules linked in: ena CPU: 6 PID: 2460 Comm: execsnoop Not tainted 5.19.0+ #28 Hardware name: Amazon EC2 c5.2xlarge/, BIOS 1.0 10/16/2017 RIP: 0010:__disarm_kprobe_ftrace.isra.19 (kernel/kprobes.c:1129) Code: 24 8b 02 eb c1 80 3d c4 83 f2 01 00 75 d4 48 8b 75 00 89 c2 48 c7 c7 90 fa 0f 92 89 04 24 c6 05 ab 83 01 e8 e4 94 f0 ff <0f> 0b 8b 04 24 eb b1 89 c6 48 c7 c7 60 fa 0f 92 89 04 24 e8 cc 94 RSP: 0018:ffff9e6ec154bd98 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffffffff930f7b00 RCX: 0000000000000001 RDX: 0000000080000001 RSI: ffffffff921461c5 RDI: 00000000ffffffff RBP: ffff89c504286da8 R08: 0000000000000000 R09: c0000000fffeffff R10: 0000000000000000 R11: ffff9e6ec154bc28 R12: ffff89c502394e40 R13: ffff89c502394c00 R14: ffff9e6ec154bc00 R15: 0000000000000000 FS: 00007fe800398740(0000) GS:ffff89c812d80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c00057f010 CR3: 0000000103b54006 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> __disable_kprobe (kernel/kprobes.c:1716) disable_kprobe (kernel/kprobes.c:2392) __disable_trace_kprobe (kernel/trace/trace_kprobe.c:340) disable_trace_kprobe (kernel/trace/trace_kprobe.c:429) perf_trace_event_unreg.isra.2 (./include/linux/tracepoint.h:93 kernel/trace/trace_event_perf.c:168) perf_kprobe_destroy (kernel/trace/trace_event_perf.c:295) _free_event (kernel/events/core.c:4971) perf_event_release_kernel (kernel/events/core.c:5176) perf_release (kernel/events/core.c:5186) __fput (fs/file_table.c:321) task_work_run (./include/linux/sched.h:2056 (discriminator 1) kernel/task_work.c:179 (discriminator 1)) exit_to_user_mode_prepare (./include/linux/resume_user_mode.h:49 kernel/entry/common.c:169 kernel/entry/common.c:201) syscall_exit_to_user_mode (./arch/x86/include/asm/jump_label.h:55 ./arch/x86/include/asm/nospec-branch.h:384 ./arch/x86/include/asm/entry-common.h:94 kernel/entry/common.c:133 kernel/entry/common.c:296) do_syscall_64 (arch/x86/entry/common.c:87) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) RIP: 0033:0x7fe7ff210654 Code: 15 79 89 20 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb be 0f 1f 00 8b 05 9a cd 20 00 48 63 ff 85 c0 75 11 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3a f3 c3 48 83 ec 18 48 89 7c 24 08 e8 34 fc RSP: 002b:00007ffdbd1d3538 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 RAX: 0000000000000000 RBX: 0000000000000008 RCX: 00007fe7ff210654 RDX: 0000000000000000 RSI: 0000000000002401 RDI: 0000000000000008 RBP: 0000000000000000 R08: 94ae31d6fda838a4 R0900007fe8001c9d30 R10: 00007ffdbd1d34b0 R11: 0000000000000246 R12: 00007ffdbd1d3600 R13: 0000000000000000 R14: fffffffffffffffc R15: 00007ffdbd1d3560 </TASK> Link: https://lkml.kernel.org/r/20220813020509.90805-1-kuniyu@amazon.com Fixes: 69d54b91 ("kprobes: makes kprobes/enabled works correctly for optimized kprobes.") Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.com> Reported-by: NAyushman Dutta <ayudutta@amazon.com> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: Kuniyuki Iwashima <kuni1840@gmail.com> Cc: Ayushman Dutta <ayudutta@amazon.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NChen Zhongjin <chenzhongjin@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Zhang Wensheng 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5X6RT CVE: NA -------------------------------- After introducing commit 5b18b5a7 ("block: delete part_round_stats and switch to less precise counting"), '%util' accounted by iostat will be over reality data. In fact, the device is quite idle, but iostat may show '%util' as a big number (e.g. 50%). It can produce by fio: fio --name=1 --direct=1 --bs=4k --rw=read --filename=/dev/sda \ --thinktime=4ms --runtime=180 We fix this by using a switch(precise_iostat=1) to control whether or not acconut ioticks precisely. fixes: 5b18b5a7 ("block: delete part_round_stats and switch to less precise counting") Signed-off-by: NZhang Wensheng <zhangwensheng5@huawei.com> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Mark Rutland 提交于
mainline inclusion from mainline-v6.0-rc3 commit 2e8cff0a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5PPS3?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2e8cff0a0eee87b27f0cf87ad8310eb41b5886ab -------------------------------- On arm64, "rodata=full" has been suppored (but not documented) since commit: c55191e9 ("arm64: mm: apply r/o permissions of VM areas to its linear alias as well") As it's necessary to determine the rodata configuration early during boot, arm64 has an early_param() handler for this, whereas init/main.c has a __setup() handler which is run later. Unfortunately, this split meant that since commit: f9a40b08 ("init/main.c: return 1 from handled __setup() functions") ... passing "rodata=full" would result in a spurious warning from the __setup() handler (though RO permissions would be configured appropriately). Further, "rodata=full" has been broken since commit: 0d6ea3ac ("lib/kstrtox.c: add "false"/"true" support to kstrtobool()") ... which caused strtobool() to parse "full" as false (in addition to many other values not documented for the "rodata=" kernel parameter. This patch fixes this breakage by: * Moving the core parameter parser to an __early_param(), such that it is available early. * Adding an (optional) arch hook which arm64 can use to parse "full". * Updating the documentation to mention that "full" is valid for arm64. * Having the core parameter parser handle "on" and "off" explicitly, such that any undocumented values (e.g. typos such as "ful") are reported as errors rather than being silently accepted. Note that __setup() and early_param() have opposite conventions for their return values, where __setup() uses 1 to indicate a parameter was handled and early_param() uses 0 to indicate a parameter was handled. Fixes: f9a40b08 ("init/main.c: return 1 from handled __setup() functions") Fixes: 0d6ea3ac ("lib/kstrtox.c: add "false"/"true" support to kstrtobool()") Signed-off-by: NMark Rutland <mark.rutland@arm.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jagdish Gediya <jvgediya@linux.ibm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: NArd Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220817154022.3974645-1-mark.rutland@arm.comSigned-off-by: NWill Deacon <will@kernel.org> conflicts: Documentation/admin-guide/kernel-parameters.txt arch/arm64/include/asm/setup.h arch/arm64/mm/mmu.c init/main.c Signed-off-by: NXia Longlong <xialonglong1@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yu Kuai 提交于
hulk inclusion category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA -------------------------------- Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yu Kuai 提交于
hulk inclusion category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA -------------------------------- Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yu Kuai 提交于
hulk inclusion category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA -------------------------------- Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 0994c64e category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0994c64eb4159ba019e7fedc7ba0dd6a69235b40 -------------------------------- Since it is now possible for a tagset to share a single set of tags, the iter function should not re-iter the tags for the count of #hw queues in that case. Rather it should just iter once. Fixes: e155b0c2 ("blk-mq: Use shared tags for shared sbitmap support") Reported-by: NKashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Tested-by: NKashyap Desai <kashyap.desai@broadcom.com> Link: https://lore.kernel.org/r/1634550083-202815-1-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NZhang Wensheng <zhangwensheng5@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 8bdf7b3f category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8bdf7b3fe1f48a2c1c212d4685903bba01409c0e -------------------------------- We should not reference the queue tagset in blk_mq_sched_tags_teardown() (see function comment) for the blk-mq flags, so use the passed flags instead. This solves a use-after-free, similarly fixed earlier (and since broken again) in commit f0c1c4d2 ("blk-mq: fix use-after-free in blk_mq_exit_sched"). Reported-by: NLinux Kernel Functional Testing <lkft@linaro.org> Reported-by: NNaresh Kamboju <naresh.kamboju@linaro.org> Tested-by: NAnders Roxell <anders.roxell@linaro.org> Fixes: e155b0c2 ("blk-mq: Use shared tags for shared sbitmap support") Signed-off-by: NJohn Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1634890340-15432-1-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Conflict: block/blk-mq-sched.c Signed-off-by: NZhang Wensheng <zhangwensheng5@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.17-rc1 commit fea9f92f category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fea9f92f1748083cb82049ed503be30c3d3a9b69 -------------------------------- Kashyap reports high CPU usage in blk_mq_queue_tag_busy_iter() and callees using megaraid SAS RAID card since moving to shared tags [0]. Previously, when shared tags was shared sbitmap, this function was less than optimum since we would iter through all tags for all hctx's, yet only ever match upto tagset depth number of rqs. Since the change to shared tags, things are even less efficient if we have parallel callers of blk_mq_queue_tag_busy_iter(). This is because in bt_iter() -> blk_mq_find_and_get_req() there would be more contention on accessing each request ref and tags->lock since they are now shared among all HW queues. Optimise by having separate calls to bt_for_each() for when we're using shared tags. In this case no longer pass a hctx, as it is no longer relevant, and teach bt_iter() about this. Ming suggested something along the lines of this change, apart from a different implementation. [0] https://lore.kernel.org/linux-block/e4e92abbe9d52bcba6b8cc6c91c442cc@mail.gmail.com/Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reported-and-tested-by: NKashyap Desai <kashyap.desai@broadcom.com> Fixes: e155b0c2 ("blk-mq: Use shared tags for shared sbitmap support") Link: https://lore.kernel.org/r/1638794990-137490-4-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Conflict: block/blk-mq-tag.c Signed-off-by: NZhang Wensheng <zhangwensheng5@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit ae0f1a73 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ae0f1a732f4a5db284e2af02c305255734efd19c -------------------------------- Now that we use shared tags for shared sbitmap support, we don't require the tags sbitmap pointers, so drop them. This essentially reverts commit 222a5ae0 ("blk-mq: Use pointers for blk_mq_tags bitmap tags"). Function blk_mq_init_bitmap_tags() is removed also, since it would be only a wrappper for blk_mq_init_bitmaps(). Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NJohn Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1633429419-228500-14-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit e155b0c2 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e155b0c238b20f0a866f4334d292656665836c8a -------------------------------- Currently we use separate sbitmap pairs and active_queues atomic_t for shared sbitmap support. However a full sets of static requests are used per HW queue, which is quite wasteful, considering that the total number of requests usable at any given time across all HW queues is limited by the shared sbitmap depth. As such, it is considerably more memory efficient in the case of shared sbitmap to allocate a set of static rqs per tag set or request queue, and not per HW queue. So replace the sbitmap pairs and active_queues atomic_t with a shared tags per tagset and request queue, which will hold a set of shared static rqs. Since there is now no valid HW queue index to be passed to the blk_mq_ops .init and .exit_request callbacks, pass an invalid index token. This changes the semantics of the APIs, such that the callback would need to validate the HW queue index before using it. Currently no user of shared sbitmap actually uses the HW queue index (as would be expected). Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1633429419-228500-13-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Conflict: c6f9c0e2 ("blk-mq: allow hardware queue to get more tag while sharing a tag set") is merged, which will cause lots of conflicts for this patch, and in the meantime, the functionality will need to be adapted in that patch. Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Nikolay Borisov 提交于
mainline inclusion from mainline-v5.13-rc1 commit 39aa56db category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=39aa56db50b9ca5cad597e561b4b160b6cbbb65b -------------------------------- Signed-off-by: NNikolay Borisov <nborisov@suse.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NHimanshu Madhani <himanshu.madhani@oracle.com> Link: https://lore.kernel.org/r/20210311081713.2763171-1-nborisov@suse.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 645db34e category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=645db34e50501aac141713fb47a315e5202ff890 -------------------------------- Refactor blk_mq_free_map_and_requests() such that it can be used at many sites at which the tag map and rqs are freed. Also rename to blk_mq_free_map_and_rqs(), which is shorter and matches the alloc equivalent. Suggested-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-12-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Conflict: commit a846a8e6 ("blk-mq: don't free tags if the tag_set is used by other device in queue initialztion") is already backported, blk_mq_free_map_and_rqs() is moved to __blk_mq_update_nr_hw_queues() instead of blk_mq_realloc_hw_ctxs(). Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 63064be1 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=63064be150e4b1ba1e4af594ef5aa81adf21a52d -------------------------------- Add a function to combine allocating tags and the associated requests, and factor out common patterns to use this new function. Some function only call blk_mq_alloc_map_and_rqs() now, but more functionality will be added later. Also make blk_mq_alloc_rq_map() and blk_mq_alloc_rqs() static since they are only used in blk-mq.c, and finally rename some functions for conciseness and consistency with other function names: - __blk_mq_alloc_map_and_{request -> rqs}() - blk_mq_alloc_{map_and_requests -> set_map_and_rqs}() Suggested-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1633429419-228500-11-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit a7e7388d category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a7e7388dced47a10ca13ae95ca975ea2830f196b -------------------------------- Put the functionality to update the sched shared sbitmap size in a common function. Since the same formula is always used to resize, and it can be got from the request queue argument, so just pass the request queue pointer. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-10-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 4f245d5b category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4f245d5bf0f7432c881e22a77066160a6cba8e03 -------------------------------- Function blk_mq_clear_rq_mapping() is required to clear the sched tags mappings in driver tags rqs[]. But there is no need for a driver tags to clear its own mapping, so skip clearing the mapping in this scenario. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1633429419-228500-9-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit f32e4eaf category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f32e4eafaf29aa493bdea3123dac09ad135b599b -------------------------------- Function blk_mq_clear_rq_mapping() will be used for shared sbitmap tags in future, so pass a driver tags pointer instead of the tagset container and HW queue index. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1633429419-228500-8-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 1820f4f0 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1820f4f0a5e75633ff384167027bf45ed1b10234 -------------------------------- To be more concise and consistent in naming, rename blk_mq_sched_free_requests() -> blk_mq_sched_free_rqs(). Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1633429419-228500-7-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit d99a6bb3 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d99a6bb337677d812d5bef7795c9fcf17f1ccebe -------------------------------- Function blk_mq_sched_alloc_tags() does same as __blk_mq_alloc_map_and_request(), so give a similar name to be consistent. Similarly rename label err_free_tags -> err_free_map_and_rqs. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-6-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.19-rc1 commit f6adcef5 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f6adcef5f317c78a899ca9766e90e47026834158 -------------------------------- It's easier to read: if (x) X; else Y; over: if (!x) Y; else X; No functional change intended. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-5-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 8fa04464 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8fa044640f128c0cd015f72cda42989a664889fc -------------------------------- For shared sbitmap, if the call to blk_mq_tag_update_depth() was successful for any hctx when hctx->sched_tags is not set, then it would be successful for all (due to nature in which blk_mq_tag_update_depth() fails). As such, there is no need to call blk_mq_tag_resize_shared_sbitmap() for each hctx. So relocate the call until after the hctx iteration under the !q->elevator check, which is equivalent (to !hctx->sched_tags). Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-4-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 65de57bb category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=65de57bb2e66f1fbede166c1307570ebd09eae83 -------------------------------- The original code in commit 24d2f903 ("blk-mq: split out tag initialization, support shared tags") would check tags->rqs is non-NULL and then dereference tags->rqs[]. Then in commit 2af8cbe3 ("blk-mq: split tag ->rqs[] into two"), we started to dereference tags->static_rqs[], but continued to check non-NULL tags->rqs. Check tags->static_rqs as non-NULL instead, which is more logical. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-2-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Zhang Wensheng 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA -------------------------------- This reverts commit 26eda960. The related feilds will be modified in later patches which are backported from mainline. Signed-off-by: NZhang Wensheng <zhangwensheng5@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Zhang Wensheng 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5N162 CVE: NA -------------------------------- In the use of q_usage_counter of request_queue, blk_cleanup_queue using "wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter))" to wait q_usage_counter becoming zero. however, if the q_usage_counter becoming zero quickly, and percpu_ref_exit will execute and ref->data will be freed, maybe another process will cause a null-defef problem like below: CPU0 CPU1 blk_cleanup_queue blk_freeze_queue blk_mq_freeze_queue_wait scsi_end_request percpu_ref_get ... percpu_ref_put atomic_long_sub_and_test percpu_ref_exit ref->data -> NULL ref->data->release(ref) -> null-deref Fix it by setting flag(QUEUE_FLAG_USAGE_COUNT_SYNC) to add synchronization mechanism, when ref->data->release is called, the flag will be setted, and the "wait_event" in blk_mq_freeze_queue_wait must wait flag becoming true as well, which will limit percpu_ref_exit to execute ahead of time. Signed-off-by: NZhang Wensheng <zhangwensheng5@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @x56Jason This PR is to enable IPI virtualization support for Intel SPR, meanwhile fix kabi changes. ## Intel-Kernel Issue #I5ODSC ## Test 1, build and boot succeed with CONFIG_SMP enabled or disabled. 2, Use IPI benchmark (https://lore.kernel.org/kvm/20171219085010.4081-1-ynorov@caviumnetworks.com) to do unicast IPI testing ("Normal IPI" case in the benchmark): - With host disabled IPI virtualization, and guest enable x2apic mode, we can see a lot of MSR_WR_VMEXITs - With host enabled IPI virtualization, and guest enable x2apic mode, MSR_WR_VMEXITs caused by IPI disappears. ## Known Issue N/A ## Default Config Change N/A Link:https://gitee.com/openeuler/kernel/pulls/161 Reviewed-by: Chen Wei <chenwei@xfusion.com> Reviewed-by: Kevin Zhu <zhukeqian1@huawei.com> Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
-
- 02 11月, 2022 11 次提交
-
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit 6849ed81a33ae616bfadf40e5c68af265d106dff category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6849ed81a33ae616bfadf40e5c68af265d106dff -------------------------------- commit 68cf4f2a upstream. In order to further enable commit: bbe2df3f ("x86/alternative: Try inline spectre_v2=retpoline,amd") add the new GCC flag -mindirect-branch-cs-prefix: https://gcc.gnu.org/g:2196a681d7810ad8b227bf983f38ba716620545e https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952 https://bugs.llvm.org/show_bug.cgi?id=52323 to RETPOLINE=y builds. This should allow fully inlining retpoline,amd for GCC builds. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NKees Cook <keescook@chromium.org> Acked-by: NNick Desaulniers <ndesaulniers@google.com> Link: https://lkml.kernel.org/r/20211119165630.276205624@infradead.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Jiri Slaby 提交于
stable inclusion from stable-v5.10.133 commit ecc0d92a9f6cc3f74b67d2c9887d0c800018e661 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ecc0d92a9f6cc3f74b67d2c9887d0c800018e661 -------------------------------- commit 3131ef39 upstream. The build on x86_32 currently fails after commit 9bb2ec60 (objtool: Update Retpoline validation) with: arch/x86/kernel/../../x86/xen/xen-head.S:35: Error: no such instruction: `annotate_unret_safe' ANNOTATE_UNRET_SAFE is defined in nospec-branch.h. And head_32.S is missing this include. Fix this. Fixes: 9bb2ec60 ("objtool: Update Retpoline validation") Signed-off-by: NJiri Slaby <jslaby@suse.cz> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/63e23f80-033f-f64e-7522-2816debbc367@kernel.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Mikulas Patocka 提交于
stable inclusion from stable-v5.10.133 commit 236b959da9d145c311f9daa65e3fbbe1a3c9c9af category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=236b959da9d145c311f9daa65e3fbbe1a3c9c9af -------------------------------- commit 22682a07 upstream. Commit c087c6e7 ("objtool: Fix type of reloc::addend") failed to appreciate cross building from ILP32 hosts, where 'int' == 'long' and the issue persists. As such, use s64/int64_t/Elf64_Sxword for this field and suffer the pain that is ISO C99 printf formats for it. Fixes: c087c6e7 ("objtool: Fix type of reloc::addend") Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> [peterz: reword changelog, s/long long/s64/] Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/alpine.LRH.2.02.2205161041260.11556@file01.intranet.prod.int.rdu2.redhat.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit e1db6c8a69ec74aa0ecff19857895d10f39c6d98 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e1db6c8a69ec74aa0ecff19857895d10f39c6d98 -------------------------------- commit ead165fa upstream. Nathan reported objtool failing with the following messages: warning: objtool: no non-local symbols !? warning: objtool: gelf_update_symshndx: invalid section index The problem is due to commit 4abff6d4 ("objtool: Fix code relocs vs weak symbols") failing to consider the case where an object would have no non-local symbols. The problem that commit tries to address is adding a STB_LOCAL symbol to the symbol table in light of the ELF spec's requirement that: In each symbol table, all symbols with STB_LOCAL binding preced the weak and global symbols. As ``Sections'' above describes, a symbol table section's sh_info section header member holds the symbol table index for the first non-local symbol. The approach taken is to find this first non-local symbol, move that to the end and then re-use the freed spot to insert a new local symbol and increment sh_info. Except it never considered the case of object files without global symbols and got a whole bunch of details wrong -- so many in fact that it is a wonder it ever worked :/ Specifically: - It failed to re-hash the symbol on the new index, so a subsequent find_symbol_by_index() would not find it at the new location and a query for the old location would now return a non-deterministic choice between the old and new symbol. - It failed to appreciate that the GElf wrappers are not a valid disk format (it works because GElf is basically Elf64 and we only support x86_64 atm.) - It failed to fully appreciate how horrible the libelf API really is and got the gelf_update_symshndx() call pretty much completely wrong; with the direct consequence that if inserting a second STB_LOCAL symbol would require moving the same STB_GLOBAL symbol again it would completely come unstuck. Write a new elf_update_symbol() function that wraps all the magic required to update or create a new symbol at a given index. Specifically, gelf_update_sym*() require an @ndx argument that is relative to the @data argument; this means you have to manually iterate the section data descriptor list and update @ndx. Fixes: 4abff6d4 ("objtool: Fix code relocs vs weak symbols") Reported-by: NNathan Chancellor <nathan@kernel.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NBorislav Petkov <bp@suse.de> Acked-by: NJosh Poimboeuf <jpoimboe@kernel.org> Tested-by: NNathan Chancellor <nathan@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/YoPCTEYjoPqE4ZxB@hirez.programming.kicks-ass.netSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 5.10: elf_hash_add() takes a hash table pointer, not just a name] Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit 3e8afd072d098958a507fb10251a201c9899150c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3e8afd072d098958a507fb10251a201c9899150c -------------------------------- commit c087c6e7 upstream. Elf{32,64}_Rela::r_addend is of type: Elf{32,64}_Sword, that means that our reloc::addend needs to be long or face tuncation issues when we do elf_rebuild_reloc_section(): - 107: 48 b8 00 00 00 00 00 00 00 00 movabs $0x0,%rax 109: R_X86_64_64 level4_kernel_pgt+0x80000067 + 107: 48 b8 00 00 00 00 00 00 00 00 movabs $0x0,%rax 109: R_X86_64_64 level4_kernel_pgt-0x7fffff99 Fixes: 627fce14 ("objtool: Add ORC unwind table generation") Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20220419203807.596871927@infradead.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit 42ec4d71353f4c6da1d94baf64acbb1badc16b11 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=42ec4d71353f4c6da1d94baf64acbb1badc16b11 -------------------------------- commit 4abff6d4 upstream. Occasionally objtool driven code patching (think .static_call_sites .retpoline_sites etc..) goes sideways and it tries to patch an instruction that doesn't match. Much head-scatching and cursing later the problem is as outlined below and affects every section that objtool generates for us, very much including the ORC data. The below uses .static_call_sites because it's convenient for demonstration purposes, but as mentioned the ORC sections, .retpoline_sites and __mount_loc are all similarly affected. Consider: foo-weak.c: extern void __SCT__foo(void); __attribute__((weak)) void foo(void) { return __SCT__foo(); } foo.c: extern void __SCT__foo(void); extern void my_foo(void); void foo(void) { my_foo(); return __SCT__foo(); } These generate the obvious code (gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -c foo*.c): foo-weak.o: 0000000000000000 <foo>: 0: e9 00 00 00 00 jmpq 5 <foo+0x5> 1: R_X86_64_PLT32 __SCT__foo-0x4 foo.o: 0000000000000000 <foo>: 0: 48 83 ec 08 sub $0x8,%rsp 4: e8 00 00 00 00 callq 9 <foo+0x9> 5: R_X86_64_PLT32 my_foo-0x4 9: 48 83 c4 08 add $0x8,%rsp d: e9 00 00 00 00 jmpq 12 <foo+0x12> e: R_X86_64_PLT32 __SCT__foo-0x4 Now, when we link these two files together, you get something like (ld -r -o foos.o foo-weak.o foo.o): foos.o: 0000000000000000 <foo-0x10>: 0: e9 00 00 00 00 jmpq 5 <foo-0xb> 1: R_X86_64_PLT32 __SCT__foo-0x4 5: 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:0x0(%rax,%rax,1) f: 90 nop 0000000000000010 <foo>: 10: 48 83 ec 08 sub $0x8,%rsp 14: e8 00 00 00 00 callq 19 <foo+0x9> 15: R_X86_64_PLT32 my_foo-0x4 19: 48 83 c4 08 add $0x8,%rsp 1d: e9 00 00 00 00 jmpq 22 <foo+0x12> 1e: R_X86_64_PLT32 __SCT__foo-0x4 Noting that ld preserves the weak function text, but strips the symbol off of it (hence objdump doing that funny negative offset thing). This does lead to 'interesting' unused code issues with objtool when ran on linked objects, but that seems to be working (fingers crossed). So far so good.. Now lets consider the objtool static_call output section (readelf output, old binutils): foo-weak.o: Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 .text + 0 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 foo.o: Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 .text + d 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 foos.o: Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000100000002 R_X86_64_PC32 0000000000000000 .text + 0 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 0000000000000008 0000000100000002 R_X86_64_PC32 0000000000000000 .text + 1d 000000000000000c 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 So we have two patch sites, one in the dead code of the weak foo and one in the real foo. All is well. *HOWEVER*, when the toolchain strips unused section symbols it generates things like this (using new enough binutils): foo-weak.o: Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 foo + 0 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 foo.o: Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 foo + d 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 foos.o: Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000100000002 R_X86_64_PC32 0000000000000000 foo + 0 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 0000000000000008 0000000100000002 R_X86_64_PC32 0000000000000000 foo + d 000000000000000c 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 And now we can see how that foos.o .static_call_sites goes side-ways, we now have _two_ patch sites in foo. One for the weak symbol at foo+0 (which is no longer a static_call site!) and one at foo+d which is in fact the right location. This seems to happen when objtool cannot find a section symbol, in which case it falls back to any other symbol to key off of, however in this case that goes terribly wrong! As such, teach objtool to create a section symbol when there isn't one. Fixes: 44f6a7c0 ("objtool: Fix seg fault with Clang non-section symbols") Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20220419203807.655552918@infradead.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit 38a80a3ca2cb069dd5608703b015a206a672aae5 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=38a80a3ca2cb069dd5608703b015a206a672aae5 -------------------------------- commit d4b5a5c9 upstream. Make sure we can see the text changes when booting with 'debug-alternative'. Example output: [ ] SMP alternatives: retpoline at: __traceiter_initcall_level+0x1f/0x30 (ffffffff8100066f) len: 5 to: __x86_indirect_thunk_rax+0x0/0x20 [ ] SMP alternatives: ffffffff82603e58: [2:5) optimized NOPs: ff d0 0f 1f 00 [ ] SMP alternatives: ffffffff8100066f: orig: e8 cc 30 00 01 [ ] SMP alternatives: ffffffff8100066f: repl: ff d0 0f 1f 00 Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NBorislav Petkov <bp@suse.de> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Tested-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.422273830@infradead.orgSigned-off-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit 3d13ee0d411a078ca1538d823c2c759b8b266fb1 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3d13ee0d411a078ca1538d823c2c759b8b266fb1 -------------------------------- commit bbe2df3f upstream. Try and replace retpoline thunk calls with: LFENCE CALL *%\reg for spectre_v2=retpoline,amd. Specifically, the sequence above is 5 bytes for the low 8 registers, but 6 bytes for the high 8 registers. This means that unless the compilers prefix stuff the call with higher registers this replacement will fail. Luckily GCC strongly favours RAX for the indirect calls and most (95%+ for defconfig-x86_64) will be converted. OTOH clang strongly favours R11 and almost nothing gets converted. Note: it will also generate a correct replacement for the Jcc.d32 case, except unless the compilers start to prefix stuff that, it'll never fit. Specifically: Jncc.d8 1f LFENCE JMP *%\reg 1: is 7-8 bytes long, where the original instruction in unpadded form is only 6 bytes. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NBorislav Petkov <bp@suse.de> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Tested-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.359986601@infradead.org [cascardo: RETPOLINE_AMD was renamed to RETPOLINE_LFENCE] Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit b0e2dc950654162bc68cec530156251e7ad3f03a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b0e2dc950654162bc68cec530156251e7ad3f03a -------------------------------- commit 2f0cbb2a upstream. Handle the rare cases where the compiler (clang) does an indirect conditional tail-call using: Jcc __x86_indirect_thunk_\reg For the !RETPOLINE case this can be rewritten to fit the original (6 byte) instruction like: Jncc.d8 1f JMP *%\reg NOP 1: Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NBorislav Petkov <bp@suse.de> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Tested-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.296470217@infradead.orgSigned-off-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Borislav Petkov 提交于
stable inclusion from stable-v5.10.133 commit e6f8dc86a1c15b862486a61abcb54b88e8c177e3 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e6f8dc86a1c15b862486a61abcb54b88e8c177e3 -------------------------------- commit 6e8c83d2 upstream. Now that the different instruction-inspecting functions return a value, test that and return early from callers if error has been encountered. While at it, do not call insn_get_modrm() when calling insn_get_displacement() because latter will make sure to call insn_get_modrm() if ModRM hasn't been parsed yet. Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210304174237.31945-6-bp@alien8.deSigned-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Juergen Gross 提交于
stable inclusion from stable-v5.10.132 commit 06a5dc3911a3b29acefd53470bdeccb88deb155e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YS3T Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=06a5dc3911a3b29acefd53470bdeccb88deb155e -------------------------------- commit 230ec83d upstream. x86_has_pat_wp() is using a wrong test, as it relies on the normal PAT configuration used by the kernel. In case the PAT MSR has been setup by another entity (e.g. Xen hypervisor) it might return false even if the PAT configuration is allowing WP mappings. This due to the fact that when running as Xen PV guest the PAT MSR is setup by the hypervisor and cannot be changed by the guest. This results in the WP related entry to be at a different position when running as Xen PV guest compared to the bare metal or fully virtualized case. The correct way to test for WP support is: 1. Get the PTE protection bits needed to select WP mode by reading __cachemode2pte_tbl[_PAGE_CACHE_MODE_WP] (depending on the PAT MSR setting this might return protection bits for a stronger mode, e.g. UC-) 2. Translate those bits back into the real cache mode selected by those PTE bits by reading __pte2cachemode_tbl[__pte2cm_idx(prot)] 3. Test for the cache mode to be _PAGE_CACHE_MODE_WP Fixes: f88a68fa ("x86/mm: Extend early_memremap() support with additional attrs") Signed-off-by: NJuergen Gross <jgross@suse.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> # 4.14 Link: https://lore.kernel.org/r/20220503132207.17234-1-jgross@suse.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-