- 03 11月, 2022 25 次提交
-
-
由 Kuniyuki Iwashima 提交于
mainline inclusion from mainline-v6.0-rc3 commit 9c80e799 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q4Z0 CVE: NA -------------------------------- The assumption in __disable_kprobe() is wrong, and it could try to disarm an already disarmed kprobe and fire the WARN_ONCE() below. [0] We can easily reproduce this issue. 1. Write 0 to /sys/kernel/debug/kprobes/enabled. # echo 0 > /sys/kernel/debug/kprobes/enabled 2. Run execsnoop. At this time, one kprobe is disabled. # /usr/share/bcc/tools/execsnoop & [1] 2460 PCOMM PID PPID RET ARGS # cat /sys/kernel/debug/kprobes/list ffffffff91345650 r __x64_sys_execve+0x0 [FTRACE] ffffffff91345650 k __x64_sys_execve+0x0 [DISABLED][FTRACE] 3. Write 1 to /sys/kernel/debug/kprobes/enabled, which changes kprobes_all_disarmed to false but does not arm the disabled kprobe. # echo 1 > /sys/kernel/debug/kprobes/enabled # cat /sys/kernel/debug/kprobes/list ffffffff91345650 r __x64_sys_execve+0x0 [FTRACE] ffffffff91345650 k __x64_sys_execve+0x0 [DISABLED][FTRACE] 4. Kill execsnoop, when __disable_kprobe() calls disarm_kprobe() for the disabled kprobe and hits the WARN_ONCE() in __disarm_kprobe_ftrace(). # fg /usr/share/bcc/tools/execsnoop ^C Actually, WARN_ONCE() is fired twice, and __unregister_kprobe_top() misses some cleanups and leaves the aggregated kprobe in the hash table. Then, __unregister_trace_kprobe() initialises tk->rp.kp.list and creates an infinite loop like this. aggregated kprobe.list -> kprobe.list -. ^ | '.__.' In this situation, these commands fall into the infinite loop and result in RCU stall or soft lockup. cat /sys/kernel/debug/kprobes/list : show_kprobe_addr() enters into the infinite loop with RCU. /usr/share/bcc/tools/execsnoop : warn_kprobe_rereg() holds kprobe_mutex, and __get_valid_kprobe() is stuck in the loop. To avoid the issue, make sure we don't call disarm_kprobe() for disabled kprobes. [0] Failed to disarm kprobe-ftrace at __x64_sys_execve+0x0/0x40 (error -2) WARNING: CPU: 6 PID: 2460 at kernel/kprobes.c:1130 __disarm_kprobe_ftrace.isra.19 (kernel/kprobes.c:1129) Modules linked in: ena CPU: 6 PID: 2460 Comm: execsnoop Not tainted 5.19.0+ #28 Hardware name: Amazon EC2 c5.2xlarge/, BIOS 1.0 10/16/2017 RIP: 0010:__disarm_kprobe_ftrace.isra.19 (kernel/kprobes.c:1129) Code: 24 8b 02 eb c1 80 3d c4 83 f2 01 00 75 d4 48 8b 75 00 89 c2 48 c7 c7 90 fa 0f 92 89 04 24 c6 05 ab 83 01 e8 e4 94 f0 ff <0f> 0b 8b 04 24 eb b1 89 c6 48 c7 c7 60 fa 0f 92 89 04 24 e8 cc 94 RSP: 0018:ffff9e6ec154bd98 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffffffff930f7b00 RCX: 0000000000000001 RDX: 0000000080000001 RSI: ffffffff921461c5 RDI: 00000000ffffffff RBP: ffff89c504286da8 R08: 0000000000000000 R09: c0000000fffeffff R10: 0000000000000000 R11: ffff9e6ec154bc28 R12: ffff89c502394e40 R13: ffff89c502394c00 R14: ffff9e6ec154bc00 R15: 0000000000000000 FS: 00007fe800398740(0000) GS:ffff89c812d80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c00057f010 CR3: 0000000103b54006 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> __disable_kprobe (kernel/kprobes.c:1716) disable_kprobe (kernel/kprobes.c:2392) __disable_trace_kprobe (kernel/trace/trace_kprobe.c:340) disable_trace_kprobe (kernel/trace/trace_kprobe.c:429) perf_trace_event_unreg.isra.2 (./include/linux/tracepoint.h:93 kernel/trace/trace_event_perf.c:168) perf_kprobe_destroy (kernel/trace/trace_event_perf.c:295) _free_event (kernel/events/core.c:4971) perf_event_release_kernel (kernel/events/core.c:5176) perf_release (kernel/events/core.c:5186) __fput (fs/file_table.c:321) task_work_run (./include/linux/sched.h:2056 (discriminator 1) kernel/task_work.c:179 (discriminator 1)) exit_to_user_mode_prepare (./include/linux/resume_user_mode.h:49 kernel/entry/common.c:169 kernel/entry/common.c:201) syscall_exit_to_user_mode (./arch/x86/include/asm/jump_label.h:55 ./arch/x86/include/asm/nospec-branch.h:384 ./arch/x86/include/asm/entry-common.h:94 kernel/entry/common.c:133 kernel/entry/common.c:296) do_syscall_64 (arch/x86/entry/common.c:87) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) RIP: 0033:0x7fe7ff210654 Code: 15 79 89 20 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb be 0f 1f 00 8b 05 9a cd 20 00 48 63 ff 85 c0 75 11 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3a f3 c3 48 83 ec 18 48 89 7c 24 08 e8 34 fc RSP: 002b:00007ffdbd1d3538 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 RAX: 0000000000000000 RBX: 0000000000000008 RCX: 00007fe7ff210654 RDX: 0000000000000000 RSI: 0000000000002401 RDI: 0000000000000008 RBP: 0000000000000000 R08: 94ae31d6fda838a4 R0900007fe8001c9d30 R10: 00007ffdbd1d34b0 R11: 0000000000000246 R12: 00007ffdbd1d3600 R13: 0000000000000000 R14: fffffffffffffffc R15: 00007ffdbd1d3560 </TASK> Link: https://lkml.kernel.org/r/20220813020509.90805-1-kuniyu@amazon.com Fixes: 69d54b91 ("kprobes: makes kprobes/enabled works correctly for optimized kprobes.") Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.com> Reported-by: NAyushman Dutta <ayudutta@amazon.com> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: Kuniyuki Iwashima <kuni1840@gmail.com> Cc: Ayushman Dutta <ayudutta@amazon.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NChen Zhongjin <chenzhongjin@huawei.com> Reviewed-by: NKuohai Xu <xukuohai@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Zhang Wensheng 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5X6RT CVE: NA -------------------------------- After introducing commit 5b18b5a7 ("block: delete part_round_stats and switch to less precise counting"), '%util' accounted by iostat will be over reality data. In fact, the device is quite idle, but iostat may show '%util' as a big number (e.g. 50%). It can produce by fio: fio --name=1 --direct=1 --bs=4k --rw=read --filename=/dev/sda \ --thinktime=4ms --runtime=180 We fix this by using a switch(precise_iostat=1) to control whether or not acconut ioticks precisely. fixes: 5b18b5a7 ("block: delete part_round_stats and switch to less precise counting") Signed-off-by: NZhang Wensheng <zhangwensheng5@huawei.com> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Mark Rutland 提交于
mainline inclusion from mainline-v6.0-rc3 commit 2e8cff0a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5PPS3?from=project-issue CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2e8cff0a0eee87b27f0cf87ad8310eb41b5886ab -------------------------------- On arm64, "rodata=full" has been suppored (but not documented) since commit: c55191e9 ("arm64: mm: apply r/o permissions of VM areas to its linear alias as well") As it's necessary to determine the rodata configuration early during boot, arm64 has an early_param() handler for this, whereas init/main.c has a __setup() handler which is run later. Unfortunately, this split meant that since commit: f9a40b08 ("init/main.c: return 1 from handled __setup() functions") ... passing "rodata=full" would result in a spurious warning from the __setup() handler (though RO permissions would be configured appropriately). Further, "rodata=full" has been broken since commit: 0d6ea3ac ("lib/kstrtox.c: add "false"/"true" support to kstrtobool()") ... which caused strtobool() to parse "full" as false (in addition to many other values not documented for the "rodata=" kernel parameter. This patch fixes this breakage by: * Moving the core parameter parser to an __early_param(), such that it is available early. * Adding an (optional) arch hook which arm64 can use to parse "full". * Updating the documentation to mention that "full" is valid for arm64. * Having the core parameter parser handle "on" and "off" explicitly, such that any undocumented values (e.g. typos such as "ful") are reported as errors rather than being silently accepted. Note that __setup() and early_param() have opposite conventions for their return values, where __setup() uses 1 to indicate a parameter was handled and early_param() uses 0 to indicate a parameter was handled. Fixes: f9a40b08 ("init/main.c: return 1 from handled __setup() functions") Fixes: 0d6ea3ac ("lib/kstrtox.c: add "false"/"true" support to kstrtobool()") Signed-off-by: NMark Rutland <mark.rutland@arm.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jagdish Gediya <jvgediya@linux.ibm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: NArd Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220817154022.3974645-1-mark.rutland@arm.comSigned-off-by: NWill Deacon <will@kernel.org> conflicts: Documentation/admin-guide/kernel-parameters.txt arch/arm64/include/asm/setup.h arch/arm64/mm/mmu.c init/main.c Signed-off-by: NXia Longlong <xialonglong1@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yu Kuai 提交于
hulk inclusion category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA -------------------------------- Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yu Kuai 提交于
hulk inclusion category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA -------------------------------- Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Yu Kuai 提交于
hulk inclusion category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA -------------------------------- Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 0994c64e category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0994c64eb4159ba019e7fedc7ba0dd6a69235b40 -------------------------------- Since it is now possible for a tagset to share a single set of tags, the iter function should not re-iter the tags for the count of #hw queues in that case. Rather it should just iter once. Fixes: e155b0c2 ("blk-mq: Use shared tags for shared sbitmap support") Reported-by: NKashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Tested-by: NKashyap Desai <kashyap.desai@broadcom.com> Link: https://lore.kernel.org/r/1634550083-202815-1-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NZhang Wensheng <zhangwensheng5@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 8bdf7b3f category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8bdf7b3fe1f48a2c1c212d4685903bba01409c0e -------------------------------- We should not reference the queue tagset in blk_mq_sched_tags_teardown() (see function comment) for the blk-mq flags, so use the passed flags instead. This solves a use-after-free, similarly fixed earlier (and since broken again) in commit f0c1c4d2 ("blk-mq: fix use-after-free in blk_mq_exit_sched"). Reported-by: NLinux Kernel Functional Testing <lkft@linaro.org> Reported-by: NNaresh Kamboju <naresh.kamboju@linaro.org> Tested-by: NAnders Roxell <anders.roxell@linaro.org> Fixes: e155b0c2 ("blk-mq: Use shared tags for shared sbitmap support") Signed-off-by: NJohn Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1634890340-15432-1-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Conflict: block/blk-mq-sched.c Signed-off-by: NZhang Wensheng <zhangwensheng5@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.17-rc1 commit fea9f92f category: bugfix bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fea9f92f1748083cb82049ed503be30c3d3a9b69 -------------------------------- Kashyap reports high CPU usage in blk_mq_queue_tag_busy_iter() and callees using megaraid SAS RAID card since moving to shared tags [0]. Previously, when shared tags was shared sbitmap, this function was less than optimum since we would iter through all tags for all hctx's, yet only ever match upto tagset depth number of rqs. Since the change to shared tags, things are even less efficient if we have parallel callers of blk_mq_queue_tag_busy_iter(). This is because in bt_iter() -> blk_mq_find_and_get_req() there would be more contention on accessing each request ref and tags->lock since they are now shared among all HW queues. Optimise by having separate calls to bt_for_each() for when we're using shared tags. In this case no longer pass a hctx, as it is no longer relevant, and teach bt_iter() about this. Ming suggested something along the lines of this change, apart from a different implementation. [0] https://lore.kernel.org/linux-block/e4e92abbe9d52bcba6b8cc6c91c442cc@mail.gmail.com/Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reported-and-tested-by: NKashyap Desai <kashyap.desai@broadcom.com> Fixes: e155b0c2 ("blk-mq: Use shared tags for shared sbitmap support") Link: https://lore.kernel.org/r/1638794990-137490-4-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Conflict: block/blk-mq-tag.c Signed-off-by: NZhang Wensheng <zhangwensheng5@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit ae0f1a73 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ae0f1a732f4a5db284e2af02c305255734efd19c -------------------------------- Now that we use shared tags for shared sbitmap support, we don't require the tags sbitmap pointers, so drop them. This essentially reverts commit 222a5ae0 ("blk-mq: Use pointers for blk_mq_tags bitmap tags"). Function blk_mq_init_bitmap_tags() is removed also, since it would be only a wrappper for blk_mq_init_bitmaps(). Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NJohn Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1633429419-228500-14-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit e155b0c2 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e155b0c238b20f0a866f4334d292656665836c8a -------------------------------- Currently we use separate sbitmap pairs and active_queues atomic_t for shared sbitmap support. However a full sets of static requests are used per HW queue, which is quite wasteful, considering that the total number of requests usable at any given time across all HW queues is limited by the shared sbitmap depth. As such, it is considerably more memory efficient in the case of shared sbitmap to allocate a set of static rqs per tag set or request queue, and not per HW queue. So replace the sbitmap pairs and active_queues atomic_t with a shared tags per tagset and request queue, which will hold a set of shared static rqs. Since there is now no valid HW queue index to be passed to the blk_mq_ops .init and .exit_request callbacks, pass an invalid index token. This changes the semantics of the APIs, such that the callback would need to validate the HW queue index before using it. Currently no user of shared sbitmap actually uses the HW queue index (as would be expected). Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1633429419-228500-13-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Conflict: c6f9c0e2 ("blk-mq: allow hardware queue to get more tag while sharing a tag set") is merged, which will cause lots of conflicts for this patch, and in the meantime, the functionality will need to be adapted in that patch. Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Nikolay Borisov 提交于
mainline inclusion from mainline-v5.13-rc1 commit 39aa56db category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=39aa56db50b9ca5cad597e561b4b160b6cbbb65b -------------------------------- Signed-off-by: NNikolay Borisov <nborisov@suse.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NHimanshu Madhani <himanshu.madhani@oracle.com> Link: https://lore.kernel.org/r/20210311081713.2763171-1-nborisov@suse.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 645db34e category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=645db34e50501aac141713fb47a315e5202ff890 -------------------------------- Refactor blk_mq_free_map_and_requests() such that it can be used at many sites at which the tag map and rqs are freed. Also rename to blk_mq_free_map_and_rqs(), which is shorter and matches the alloc equivalent. Suggested-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-12-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Conflict: commit a846a8e6 ("blk-mq: don't free tags if the tag_set is used by other device in queue initialztion") is already backported, blk_mq_free_map_and_rqs() is moved to __blk_mq_update_nr_hw_queues() instead of blk_mq_realloc_hw_ctxs(). Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 63064be1 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=63064be150e4b1ba1e4af594ef5aa81adf21a52d -------------------------------- Add a function to combine allocating tags and the associated requests, and factor out common patterns to use this new function. Some function only call blk_mq_alloc_map_and_rqs() now, but more functionality will be added later. Also make blk_mq_alloc_rq_map() and blk_mq_alloc_rqs() static since they are only used in blk-mq.c, and finally rename some functions for conciseness and consistency with other function names: - __blk_mq_alloc_map_and_{request -> rqs}() - blk_mq_alloc_{map_and_requests -> set_map_and_rqs}() Suggested-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1633429419-228500-11-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit a7e7388d category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a7e7388dced47a10ca13ae95ca975ea2830f196b -------------------------------- Put the functionality to update the sched shared sbitmap size in a common function. Since the same formula is always used to resize, and it can be got from the request queue argument, so just pass the request queue pointer. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-10-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 4f245d5b category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4f245d5bf0f7432c881e22a77066160a6cba8e03 -------------------------------- Function blk_mq_clear_rq_mapping() is required to clear the sched tags mappings in driver tags rqs[]. But there is no need for a driver tags to clear its own mapping, so skip clearing the mapping in this scenario. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1633429419-228500-9-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit f32e4eaf category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f32e4eafaf29aa493bdea3123dac09ad135b599b -------------------------------- Function blk_mq_clear_rq_mapping() will be used for shared sbitmap tags in future, so pass a driver tags pointer instead of the tagset container and HW queue index. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1633429419-228500-8-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 1820f4f0 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1820f4f0a5e75633ff384167027bf45ed1b10234 -------------------------------- To be more concise and consistent in naming, rename blk_mq_sched_free_requests() -> blk_mq_sched_free_rqs(). Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1633429419-228500-7-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit d99a6bb3 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d99a6bb337677d812d5bef7795c9fcf17f1ccebe -------------------------------- Function blk_mq_sched_alloc_tags() does same as __blk_mq_alloc_map_and_request(), so give a similar name to be consistent. Similarly rename label err_free_tags -> err_free_map_and_rqs. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-6-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.19-rc1 commit f6adcef5 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f6adcef5f317c78a899ca9766e90e47026834158 -------------------------------- It's easier to read: if (x) X; else Y; over: if (!x) Y; else X; No functional change intended. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-5-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 8fa04464 category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8fa044640f128c0cd015f72cda42989a664889fc -------------------------------- For shared sbitmap, if the call to blk_mq_tag_update_depth() was successful for any hctx when hctx->sched_tags is not set, then it would be successful for all (due to nature in which blk_mq_tag_update_depth() fails). As such, there is no need to call blk_mq_tag_resize_shared_sbitmap() for each hctx. So relocate the call until after the hctx iteration under the !q->elevator check, which is equivalent (to !hctx->sched_tags). Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-4-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-v5.16-rc1 commit 65de57bb category: performance bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=65de57bb2e66f1fbede166c1307570ebd09eae83 -------------------------------- The original code in commit 24d2f903 ("blk-mq: split out tag initialization, support shared tags") would check tags->rqs is non-NULL and then dereference tags->rqs[]. Then in commit 2af8cbe3 ("blk-mq: split tag ->rqs[] into two"), we started to dereference tags->static_rqs[], but continued to check non-NULL tags->rqs. Check tags->static_rqs as non-NULL instead, which is more logical. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/1633429419-228500-2-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Zhang Wensheng 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5N1S5 CVE: NA -------------------------------- This reverts commit 26eda960. The related feilds will be modified in later patches which are backported from mainline. Signed-off-by: NZhang Wensheng <zhangwensheng5@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Zhang Wensheng 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5N162 CVE: NA -------------------------------- In the use of q_usage_counter of request_queue, blk_cleanup_queue using "wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter))" to wait q_usage_counter becoming zero. however, if the q_usage_counter becoming zero quickly, and percpu_ref_exit will execute and ref->data will be freed, maybe another process will cause a null-defef problem like below: CPU0 CPU1 blk_cleanup_queue blk_freeze_queue blk_mq_freeze_queue_wait scsi_end_request percpu_ref_get ... percpu_ref_put atomic_long_sub_and_test percpu_ref_exit ref->data -> NULL ref->data->release(ref) -> null-deref Fix it by setting flag(QUEUE_FLAG_USAGE_COUNT_SYNC) to add synchronization mechanism, when ref->data->release is called, the flag will be setted, and the "wait_event" in blk_mq_freeze_queue_wait must wait flag becoming true as well, which will limit percpu_ref_exit to execute ahead of time. Signed-off-by: NZhang Wensheng <zhangwensheng5@huawei.com> Reviewed-by: NYu Kuai <yukuai3@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @x56Jason This PR is to enable IPI virtualization support for Intel SPR, meanwhile fix kabi changes. ## Intel-Kernel Issue #I5ODSC ## Test 1, build and boot succeed with CONFIG_SMP enabled or disabled. 2, Use IPI benchmark (https://lore.kernel.org/kvm/20171219085010.4081-1-ynorov@caviumnetworks.com) to do unicast IPI testing ("Normal IPI" case in the benchmark): - With host disabled IPI virtualization, and guest enable x2apic mode, we can see a lot of MSR_WR_VMEXITs - With host enabled IPI virtualization, and guest enable x2apic mode, MSR_WR_VMEXITs caused by IPI disappears. ## Known Issue N/A ## Default Config Change N/A Link:https://gitee.com/openeuler/kernel/pulls/161 Reviewed-by: Chen Wei <chenwei@xfusion.com> Reviewed-by: Kevin Zhu <zhukeqian1@huawei.com> Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
-
- 02 11月, 2022 15 次提交
-
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit 6849ed81a33ae616bfadf40e5c68af265d106dff category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6849ed81a33ae616bfadf40e5c68af265d106dff -------------------------------- commit 68cf4f2a upstream. In order to further enable commit: bbe2df3f ("x86/alternative: Try inline spectre_v2=retpoline,amd") add the new GCC flag -mindirect-branch-cs-prefix: https://gcc.gnu.org/g:2196a681d7810ad8b227bf983f38ba716620545e https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952 https://bugs.llvm.org/show_bug.cgi?id=52323 to RETPOLINE=y builds. This should allow fully inlining retpoline,amd for GCC builds. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NKees Cook <keescook@chromium.org> Acked-by: NNick Desaulniers <ndesaulniers@google.com> Link: https://lkml.kernel.org/r/20211119165630.276205624@infradead.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Jiri Slaby 提交于
stable inclusion from stable-v5.10.133 commit ecc0d92a9f6cc3f74b67d2c9887d0c800018e661 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ecc0d92a9f6cc3f74b67d2c9887d0c800018e661 -------------------------------- commit 3131ef39 upstream. The build on x86_32 currently fails after commit 9bb2ec60 (objtool: Update Retpoline validation) with: arch/x86/kernel/../../x86/xen/xen-head.S:35: Error: no such instruction: `annotate_unret_safe' ANNOTATE_UNRET_SAFE is defined in nospec-branch.h. And head_32.S is missing this include. Fix this. Fixes: 9bb2ec60 ("objtool: Update Retpoline validation") Signed-off-by: NJiri Slaby <jslaby@suse.cz> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/63e23f80-033f-f64e-7522-2816debbc367@kernel.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Mikulas Patocka 提交于
stable inclusion from stable-v5.10.133 commit 236b959da9d145c311f9daa65e3fbbe1a3c9c9af category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=236b959da9d145c311f9daa65e3fbbe1a3c9c9af -------------------------------- commit 22682a07 upstream. Commit c087c6e7 ("objtool: Fix type of reloc::addend") failed to appreciate cross building from ILP32 hosts, where 'int' == 'long' and the issue persists. As such, use s64/int64_t/Elf64_Sxword for this field and suffer the pain that is ISO C99 printf formats for it. Fixes: c087c6e7 ("objtool: Fix type of reloc::addend") Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> [peterz: reword changelog, s/long long/s64/] Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/alpine.LRH.2.02.2205161041260.11556@file01.intranet.prod.int.rdu2.redhat.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit e1db6c8a69ec74aa0ecff19857895d10f39c6d98 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e1db6c8a69ec74aa0ecff19857895d10f39c6d98 -------------------------------- commit ead165fa upstream. Nathan reported objtool failing with the following messages: warning: objtool: no non-local symbols !? warning: objtool: gelf_update_symshndx: invalid section index The problem is due to commit 4abff6d4 ("objtool: Fix code relocs vs weak symbols") failing to consider the case where an object would have no non-local symbols. The problem that commit tries to address is adding a STB_LOCAL symbol to the symbol table in light of the ELF spec's requirement that: In each symbol table, all symbols with STB_LOCAL binding preced the weak and global symbols. As ``Sections'' above describes, a symbol table section's sh_info section header member holds the symbol table index for the first non-local symbol. The approach taken is to find this first non-local symbol, move that to the end and then re-use the freed spot to insert a new local symbol and increment sh_info. Except it never considered the case of object files without global symbols and got a whole bunch of details wrong -- so many in fact that it is a wonder it ever worked :/ Specifically: - It failed to re-hash the symbol on the new index, so a subsequent find_symbol_by_index() would not find it at the new location and a query for the old location would now return a non-deterministic choice between the old and new symbol. - It failed to appreciate that the GElf wrappers are not a valid disk format (it works because GElf is basically Elf64 and we only support x86_64 atm.) - It failed to fully appreciate how horrible the libelf API really is and got the gelf_update_symshndx() call pretty much completely wrong; with the direct consequence that if inserting a second STB_LOCAL symbol would require moving the same STB_GLOBAL symbol again it would completely come unstuck. Write a new elf_update_symbol() function that wraps all the magic required to update or create a new symbol at a given index. Specifically, gelf_update_sym*() require an @ndx argument that is relative to the @data argument; this means you have to manually iterate the section data descriptor list and update @ndx. Fixes: 4abff6d4 ("objtool: Fix code relocs vs weak symbols") Reported-by: NNathan Chancellor <nathan@kernel.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NBorislav Petkov <bp@suse.de> Acked-by: NJosh Poimboeuf <jpoimboe@kernel.org> Tested-by: NNathan Chancellor <nathan@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/YoPCTEYjoPqE4ZxB@hirez.programming.kicks-ass.netSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 5.10: elf_hash_add() takes a hash table pointer, not just a name] Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit 3e8afd072d098958a507fb10251a201c9899150c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3e8afd072d098958a507fb10251a201c9899150c -------------------------------- commit c087c6e7 upstream. Elf{32,64}_Rela::r_addend is of type: Elf{32,64}_Sword, that means that our reloc::addend needs to be long or face tuncation issues when we do elf_rebuild_reloc_section(): - 107: 48 b8 00 00 00 00 00 00 00 00 movabs $0x0,%rax 109: R_X86_64_64 level4_kernel_pgt+0x80000067 + 107: 48 b8 00 00 00 00 00 00 00 00 movabs $0x0,%rax 109: R_X86_64_64 level4_kernel_pgt-0x7fffff99 Fixes: 627fce14 ("objtool: Add ORC unwind table generation") Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20220419203807.596871927@infradead.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit 42ec4d71353f4c6da1d94baf64acbb1badc16b11 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=42ec4d71353f4c6da1d94baf64acbb1badc16b11 -------------------------------- commit 4abff6d4 upstream. Occasionally objtool driven code patching (think .static_call_sites .retpoline_sites etc..) goes sideways and it tries to patch an instruction that doesn't match. Much head-scatching and cursing later the problem is as outlined below and affects every section that objtool generates for us, very much including the ORC data. The below uses .static_call_sites because it's convenient for demonstration purposes, but as mentioned the ORC sections, .retpoline_sites and __mount_loc are all similarly affected. Consider: foo-weak.c: extern void __SCT__foo(void); __attribute__((weak)) void foo(void) { return __SCT__foo(); } foo.c: extern void __SCT__foo(void); extern void my_foo(void); void foo(void) { my_foo(); return __SCT__foo(); } These generate the obvious code (gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -c foo*.c): foo-weak.o: 0000000000000000 <foo>: 0: e9 00 00 00 00 jmpq 5 <foo+0x5> 1: R_X86_64_PLT32 __SCT__foo-0x4 foo.o: 0000000000000000 <foo>: 0: 48 83 ec 08 sub $0x8,%rsp 4: e8 00 00 00 00 callq 9 <foo+0x9> 5: R_X86_64_PLT32 my_foo-0x4 9: 48 83 c4 08 add $0x8,%rsp d: e9 00 00 00 00 jmpq 12 <foo+0x12> e: R_X86_64_PLT32 __SCT__foo-0x4 Now, when we link these two files together, you get something like (ld -r -o foos.o foo-weak.o foo.o): foos.o: 0000000000000000 <foo-0x10>: 0: e9 00 00 00 00 jmpq 5 <foo-0xb> 1: R_X86_64_PLT32 __SCT__foo-0x4 5: 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:0x0(%rax,%rax,1) f: 90 nop 0000000000000010 <foo>: 10: 48 83 ec 08 sub $0x8,%rsp 14: e8 00 00 00 00 callq 19 <foo+0x9> 15: R_X86_64_PLT32 my_foo-0x4 19: 48 83 c4 08 add $0x8,%rsp 1d: e9 00 00 00 00 jmpq 22 <foo+0x12> 1e: R_X86_64_PLT32 __SCT__foo-0x4 Noting that ld preserves the weak function text, but strips the symbol off of it (hence objdump doing that funny negative offset thing). This does lead to 'interesting' unused code issues with objtool when ran on linked objects, but that seems to be working (fingers crossed). So far so good.. Now lets consider the objtool static_call output section (readelf output, old binutils): foo-weak.o: Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 .text + 0 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 foo.o: Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 .text + d 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 foos.o: Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000100000002 R_X86_64_PC32 0000000000000000 .text + 0 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 0000000000000008 0000000100000002 R_X86_64_PC32 0000000000000000 .text + 1d 000000000000000c 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 So we have two patch sites, one in the dead code of the weak foo and one in the real foo. All is well. *HOWEVER*, when the toolchain strips unused section symbols it generates things like this (using new enough binutils): foo-weak.o: Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 foo + 0 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 foo.o: Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 foo + d 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 foos.o: Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000100000002 R_X86_64_PC32 0000000000000000 foo + 0 0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 0000000000000008 0000000100000002 R_X86_64_PC32 0000000000000000 foo + d 000000000000000c 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1 And now we can see how that foos.o .static_call_sites goes side-ways, we now have _two_ patch sites in foo. One for the weak symbol at foo+0 (which is no longer a static_call site!) and one at foo+d which is in fact the right location. This seems to happen when objtool cannot find a section symbol, in which case it falls back to any other symbol to key off of, however in this case that goes terribly wrong! As such, teach objtool to create a section symbol when there isn't one. Fixes: 44f6a7c0 ("objtool: Fix seg fault with Clang non-section symbols") Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20220419203807.655552918@infradead.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit 38a80a3ca2cb069dd5608703b015a206a672aae5 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=38a80a3ca2cb069dd5608703b015a206a672aae5 -------------------------------- commit d4b5a5c9 upstream. Make sure we can see the text changes when booting with 'debug-alternative'. Example output: [ ] SMP alternatives: retpoline at: __traceiter_initcall_level+0x1f/0x30 (ffffffff8100066f) len: 5 to: __x86_indirect_thunk_rax+0x0/0x20 [ ] SMP alternatives: ffffffff82603e58: [2:5) optimized NOPs: ff d0 0f 1f 00 [ ] SMP alternatives: ffffffff8100066f: orig: e8 cc 30 00 01 [ ] SMP alternatives: ffffffff8100066f: repl: ff d0 0f 1f 00 Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NBorislav Petkov <bp@suse.de> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Tested-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.422273830@infradead.orgSigned-off-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit 3d13ee0d411a078ca1538d823c2c759b8b266fb1 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3d13ee0d411a078ca1538d823c2c759b8b266fb1 -------------------------------- commit bbe2df3f upstream. Try and replace retpoline thunk calls with: LFENCE CALL *%\reg for spectre_v2=retpoline,amd. Specifically, the sequence above is 5 bytes for the low 8 registers, but 6 bytes for the high 8 registers. This means that unless the compilers prefix stuff the call with higher registers this replacement will fail. Luckily GCC strongly favours RAX for the indirect calls and most (95%+ for defconfig-x86_64) will be converted. OTOH clang strongly favours R11 and almost nothing gets converted. Note: it will also generate a correct replacement for the Jcc.d32 case, except unless the compilers start to prefix stuff that, it'll never fit. Specifically: Jncc.d8 1f LFENCE JMP *%\reg 1: is 7-8 bytes long, where the original instruction in unpadded form is only 6 bytes. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NBorislav Petkov <bp@suse.de> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Tested-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.359986601@infradead.org [cascardo: RETPOLINE_AMD was renamed to RETPOLINE_LFENCE] Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit b0e2dc950654162bc68cec530156251e7ad3f03a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b0e2dc950654162bc68cec530156251e7ad3f03a -------------------------------- commit 2f0cbb2a upstream. Handle the rare cases where the compiler (clang) does an indirect conditional tail-call using: Jcc __x86_indirect_thunk_\reg For the !RETPOLINE case this can be rewritten to fit the original (6 byte) instruction like: Jncc.d8 1f JMP *%\reg NOP 1: Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NBorislav Petkov <bp@suse.de> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Tested-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.296470217@infradead.orgSigned-off-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Borislav Petkov 提交于
stable inclusion from stable-v5.10.133 commit e6f8dc86a1c15b862486a61abcb54b88e8c177e3 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e6f8dc86a1c15b862486a61abcb54b88e8c177e3 -------------------------------- commit 6e8c83d2 upstream. Now that the different instruction-inspecting functions return a value, test that and return early from callers if error has been encountered. While at it, do not call insn_get_modrm() when calling insn_get_displacement() because latter will make sure to call insn_get_modrm() if ModRM hasn't been parsed yet. Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210304174237.31945-6-bp@alien8.deSigned-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Juergen Gross 提交于
stable inclusion from stable-v5.10.132 commit 06a5dc3911a3b29acefd53470bdeccb88deb155e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YS3T Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=06a5dc3911a3b29acefd53470bdeccb88deb155e -------------------------------- commit 230ec83d upstream. x86_has_pat_wp() is using a wrong test, as it relies on the normal PAT configuration used by the kernel. In case the PAT MSR has been setup by another entity (e.g. Xen hypervisor) it might return false even if the PAT configuration is allowing WP mappings. This due to the fact that when running as Xen PV guest the PAT MSR is setup by the hypervisor and cannot be changed by the guest. This results in the WP related entry to be at a different position when running as Xen PV guest compared to the bare metal or fully virtualized case. The correct way to test for WP support is: 1. Get the PTE protection bits needed to select WP mode by reading __cachemode2pte_tbl[_PAGE_CACHE_MODE_WP] (depending on the PAT MSR setting this might return protection bits for a stronger mode, e.g. UC-) 2. Translate those bits back into the real cache mode selected by those PTE bits by reading __pte2cachemode_tbl[__pte2cm_idx(prot)] 3. Test for the cache mode to be _PAGE_CACHE_MODE_WP Fixes: f88a68fa ("x86/mm: Extend early_memremap() support with additional attrs") Signed-off-by: NJuergen Gross <jgross@suse.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> # 4.14 Link: https://lore.kernel.org/r/20220503132207.17234-1-jgross@suse.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Ilpo Järvinen 提交于
stable inclusion from stable-v5.10.132 commit d9cb6fabc90102f9e61fe35bd0160db88f4f53b4 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YS3T Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d9cb6fabc90102f9e61fe35bd0160db88f4f53b4 -------------------------------- commit f9b11229 upstream. When console is enabled, univ8250_console_setup() calls serial8250_console_setup() before .dev is set to uart_port. Therefore, it will not call pm_runtime_get_sync(). Later, when the actual driver is going to take over univ8250_console_exit() is called. As .dev is already set, serial8250_console_exit() makes pm_runtime_put_sync() call with usage count being zero triggering PM usage count warning (extra debug for univ8250_console_setup(), univ8250_console_exit(), and serial8250_register_ports()): [ 0.068987] univ8250_console_setup ttyS0 nodev [ 0.499670] printk: console [ttyS0] enabled [ 0.717955] printk: console [ttyS0] printing thread started [ 1.960163] serial8250_register_ports assigned dev for ttyS0 [ 1.976830] printk: console [ttyS0] disabled [ 1.976888] printk: console [ttyS0] printing thread stopped [ 1.977073] univ8250_console_exit ttyS0 usage:0 [ 1.977075] serial8250 serial8250: Runtime PM usage count underflow! [ 1.977429] dw-apb-uart.6: ttyS0 at MMIO 0x4010006000 (irq = 33, base_baud = 115200) is a 16550A [ 1.977812] univ8250_console_setup ttyS0 usage:2 [ 1.978167] printk: console [ttyS0] printing thread started [ 1.978203] printk: console [ttyS0] enabled To fix the issue, call pm_runtime_get_sync() in serial8250_register_ports() as soon as .dev is set for an uart_port if it has console enabled. This problem became apparent only recently because 82586a72 ("PM: runtime: Avoid device usage count underflows") added the warning printout. I confirmed this problem also occurs with v5.18 (w/o the warning printout, obviously). Fixes: bedb404e ("serial: 8250_port: Don't use power management for kernel console") Cc: stable <stable@kernel.org> Tested-by: NTony Lindgren <tony@atomide.com> Reviewed-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: NTony Lindgren <tony@atomide.com> Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/b4f428e9-491f-daf2-2232-819928dc276e@linux.intel.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Ilpo Järvinen 提交于
stable inclusion from stable-v5.10.132 commit e1bd94dd9e5c63ac7280d67e730061cf684125ba category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YS3T Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e1bd94dd9e5c63ac7280d67e730061cf684125ba -------------------------------- commit 211565b1 upstream. The driver must provide throttle and unthrottle in uart_ops when it sets UPSTAT_AUTORTS. Add them using existing stop_rx & enable_interrupts functions. Fixes: 2a76fa28 (serial: pl011: Adopt generic flag to store auto RTS status) Cc: stable <stable@kernel.org> Cc: Lukas Wunner <lukas@wunner.de> Reported-by: NNuno Gonçalves <nunojpg@gmail.com> Tested-by: NNuno Gonçalves <nunojpg@gmail.com> Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20220614075637.8558-1-ilpo.jarvinen@linux.intel.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Ilpo Järvinen 提交于
stable inclusion from stable-v5.10.132 commit b8c4661126568b8cd3e18908b920c3f83eba28e1 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YS3T Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b8c4661126568b8cd3e18908b920c3f83eba28e1 -------------------------------- commit 5c5f44e3 upstream. The code lacks clearing of previous DEAT/DEDT values. Thus, changing values on the fly results in garbage delays tending towards the maximum value as more and more bits are ORed together. (Leaving RS485 mode would have cleared the old values though). Fixes: 1bcda09d ("serial: stm32: add support for RS485 hardware control mode") Cc: stable <stable@kernel.org> Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20220627150753.34510-1-ilpo.jarvinen@linux.intel.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Yi Yang 提交于
stable inclusion from stable-v5.10.132 commit 039ffe436ae55dabfaa4f46be39a7214c910cacc category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YS3T Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=039ffe436ae55dabfaa4f46be39a7214c910cacc -------------------------------- commit 6e690d54 upstream. If port->mapbase = NULL in serial8250_request_std_resource() , it need return a error code instead of 0. If uart_set_info() fail to request new regions by serial8250_request_std_resource() but the return value of serial8250_request_std_resource() is 0, The system incorrectly considers that the resource application is successful and does not attempt to restore the old setting. A null pointer reference is triggered when the port resource is later invoked. Signed-off-by: NYi Yang <yiyang13@huawei.com> Cc: stable <stable@kernel.org> Link: https://lore.kernel.org/r/20220628083515.64138-1-yiyang13@huawei.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-