- 11 10月, 2022 12 次提交
-
-
由 Hao Lan 提交于
mainline inclusion from mainline-v6.0-rc2 commit 5c4f7284 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5QIIF CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5c4f72842d1d ---------------------------------------------------------------------- This patch supports llrs fec mode in speed 200G for some new devices, and suppoprts querying llrs fec ability from firmware. Signed-off-by: NHao Lan <lanhao@huawei.com> Signed-off-by: NGuangbin Huang <huangguangbin2@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NJiantao Xiao <xiaojiantao1@h-partners.com> Reviewed-by: NJian Shen <shenjian15@huawei.com> Reviewed-by: NYueHaibing <yuehaibing@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Guangbin Huang 提交于
mainline inclusion from mainline-v6.0-rc2 commit eaf83ae5 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5QIIF CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=eaf83ae59e18 ---------------------------------------------------------------------- For some new devices, driver can queries fec ability from firmware to decide which FEC mode can be supported. If devices of old version which not support querying fec ability, driver sets fixed ability according to current speed. Signed-off-by: NGuangbin Huang <huangguangbin2@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NJiantao Xiao <xiaojiantao1@h-partners.com> Reviewed-by: NJian Shen <shenjian15@huawei.com> Reviewed-by: NYueHaibing <yuehaibing@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Guangbin Huang 提交于
mainline inclusion from mainline-v6.0-rc2 commit 507e46ae category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5QIIF CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=507e46ae26ea ---------------------------------------------------------------------- As some new devices may not support GRO offload and flow table director, to support these devices, driver needs to querying capabilities of GRO offload and flow table director from firmware. Whether the driver supports these two features depends on capabilities. For old device of version HNAE3_DEVICE_VERSION_V2, driver sets their capabilities of these two features to fixed value. Setting default features of netdev and debugfs also need to identify whether support these two features. Signed-off-by: NGuangbin Huang <huangguangbin2@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NJiantao Xiao <xiaojiantao1@h-partners.com> Reviewed-by: NJian Shen <shenjian15@huawei.com> Reviewed-by: NYueHaibing <yuehaibing@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wenpeng Liang 提交于
mainline inclusion from mainline-for-next commit 3d67e7e2 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5TAQ5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=3d67e7e236ad ---------------------------------------------------------------------- The MR raw restrack attributes come from the queue context maintained by the ROCEE. For example: $ rdma res show mr dev hns_0 mrn 6 -dd -jp -r [ { "ifindex": 4, "ifname": "hns_0", "data": [ 1,0,0,0,2,0,0,0,0,3,0,0,0,0,2,0,0,0,0,0,32,0,0,0,2,0,0,0, 2,0,0,0,0,0,0,0 ] } ] Link: https://lore.kernel.org/r/20220822104455.2311053-8-liangwenpeng@huawei.comSigned-off-by: NWenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NZhengfeng Luo <luozhengfeng@h-partners.com> Reviewed-by: NYangyang Li <liyangyang20@huawei.com> Reviewed-by: NYueHaibing <yuehaibing@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wenpeng Liang 提交于
mainline inclusion from mainline-for-next commit dc9981ef category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5TAQ5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=dc9981ef17c6 ---------------------------------------------------------------------- The MR restrack attributes come from the queue information maintained by the driver. For example: $ rdma res show mr dev hns_0 mrn 6 -dd -jp [ { "ifindex": 4, "ifname": "hns_0", "mrn": 6, "rkey": "300", "lkey": "300", "mrlen": 131072, "pdn": 8, "pid": 1524, "comm": "ib_send_bw" }, "drv_pbl_hop_num": 2, "drv_ba_pg_shift": 14, "drv_buf_pg_shift": 12 } Link: https://lore.kernel.org/r/20220822104455.2311053-7-liangwenpeng@huawei.comSigned-off-by: NWenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NZhengfeng Luo <luozhengfeng@h-partners.com> Reviewed-by: NYangyang Li <liyangyang20@huawei.com> Reviewed-by: NYueHaibing <yuehaibing@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wenpeng Liang 提交于
mainline inclusion from mainline-for-next commit 3e89d78b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5TAQ5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=3e89d78b21a8 ---------------------------------------------------------------------- The QP raw restrack attributes come from the queue context maintained by the ROCEE. For example: $ rdma res show qp link hns_0 -jp -dd -r [ { "ifindex": 4, "ifname": "hns_0", "data": [ 2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,0, 5,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,255,156,0,0,63,156,0,0, 7,0,0,0,1,0,0,0,9,0,0,0,0,0,0,0,2,0,0,0,2,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,63,156,0, 0,0,0,0,0 ] } ] Link: https://lore.kernel.org/r/20220822104455.2311053-6-liangwenpeng@huawei.comSigned-off-by: NWenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NZhengfeng Luo <luozhengfeng@h-partners.com> Reviewed-by: NYangyang Li <liyangyang20@huawei.com> Reviewed-by: NYueHaibing <yuehaibing@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wenpeng Liang 提交于
mainline inclusion from mainline-for-next commit e198d65d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5TAQ5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=e198d65d76e9 ---------------------------------------------------------------------- The QP restrack attributes come from the queue information maintained by the driver. For example: $ rdma res show qp link hns_0 lqpn 41 -jp -dd [ { "ifindex": 4, "ifname": "hns_0", "port": 1, "lqpn": 41, "rqpn": 40, "type": "RC", "state": "RTR", "rq-psn": 12474738, "sq-psn": 0, "path-mig-state": "ARMED", "pdn": 9, "pid": 1523, "comm": "ib_send_bw" }, "drv_sq_wqe_cnt": 128, "drv_sq_max_gs": 1, "drv_rq_wqe_cnt": 512, "drv_rq_max_gs": 2, "drv_ext_sge_sge_cnt": 0 } Link: https://lore.kernel.org/r/20220822104455.2311053-5-liangwenpeng@huawei.comSigned-off-by: NWenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NZhengfeng Luo <luozhengfeng@h-partners.com> Reviewed-by: NYangyang Li <liyangyang20@huawei.com> Reviewed-by: NYueHaibing <yuehaibing@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wenpeng Liang 提交于
mainline inclusion from mainline-for-next commit f2b070f3 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5TAQ5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=f2b070f36d1b ---------------------------------------------------------------------- The CQ raw restrack attributes come from the queue context maintained by the ROCEE. For example: $ rdma res show cq dev hns_0 cqn 14 -dd -jp -r [ { "ifindex": 4, "ifname": "hns_0", "data": [ 1,0,0,0,7,0,0,0,0,0,0,0,0,82,6,0,0,82,6,0,0,82,6,0, 1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0, 6,0,0,0,0,0,0,0 ] } ] Link: https://lore.kernel.org/r/20220822104455.2311053-4-liangwenpeng@huawei.comSigned-off-by: NWenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NZhengfeng Luo <luozhengfeng@h-partners.com> Reviewed-by: NYangyang Li <liyangyang20@huawei.com> Reviewed-by: NYueHaibing <yuehaibing@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wenpeng Liang 提交于
mainline inclusion from mainline-for-next commit eb00b9a0 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5TAQ5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=eb00b9a08b9d ---------------------------------------------------------------------- Remove the resttrack attributes from the queue context held by ROCEE, and add the resttrack attributes from the queue information maintained by the driver. For example: $ rdma res show cq dev hns_0 cqn 14 -dd -jp [ { "ifindex": 4, "ifname": "hns_0", "cqn": 14, "cqe": 127, "users": 1, "adaptive-moderation": false, "ctxn": 8, "pid": 1524, "comm": "ib_send_bw" }, "drv_cq_depth": 128, "drv_cons_index": 0, "drv_cqe_size": 32, "drv_arm_sn": 1 } Link: https://lore.kernel.org/r/20220822104455.2311053-3-liangwenpeng@huawei.comSigned-off-by: NWenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NZhengfeng Luo <luozhengfeng@h-partners.com> Reviewed-by: NYangyang Li <liyangyang20@huawei.com> Reviewed-by: NYueHaibing <yuehaibing@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wenpeng Liang 提交于
mainline inclusion from mainline-for-next commit 40b4b79c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5TAQ5 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=40b4b79c866f ---------------------------------------------------------------------- There is no need to use a dedicated DXF file and DFX structure to manage the interface of the query queue context. Link: https://lore.kernel.org/r/20220822104455.2311053-2-liangwenpeng@huawei.comSigned-off-by: NWenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: NLeon Romanovsky <leon@kernel.org> Signed-off-by: NZhengfeng Luo <luozhengfeng@h-partners.com> Reviewed-by: NYangyang Li <liyangyang20@huawei.com> Reviewed-by: NYueHaibing <yuehaibing@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @leoliu-oc Add support for more Zhaoxin processors. And improve the uncore code to provide more functions and support. ### Issue https://gitee.com/openeuler/kernel/issues/I5SRF7 ### Test `perf list`will display more infomations. ### Knowe Issue N/A ### Default config change N/A Link:https://gitee.com/openeuler/kernel/pulls/129 Reviewed-by: Jiao Fenfang <jiaofenfang@uniontech.com> Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @xin3liang Enable NVMe over TCP for arm64. Link:https://gitee.com/openeuler/kernel/pulls/156 Reviewed-by: Liu Chao <liuchao173@huawei.com> Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
-
- 08 10月, 2022 1 次提交
-
-
由 Xinliang Liu 提交于
openEuler inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5SBW1 CVE: NA -------------------------------- Enable NVMe over TCP for arm64. Signed-off-by: NXinliang Liu <xinliang.liu@linaro.org>
-
- 30 9月, 2022 27 次提交
-
-
由 Lin Shengwang 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA -------------------------------------------------------------------------- Signed-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Lin Shengwang 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA -------------------------------------------------------------------------- Due to the conflict between Core Scheduling and SMT expeller, so SCHED_CORE and QOS_SCHED_SMT_EXPELLER are mutually exclusive. Signed-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Cruz Zhao 提交于
mainline inclusion from mainline-v6.0-rc1 commit 91caa5ae category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=91caa5ae242465c3ab9fd473e50170faa7e944f4 -------------------------------------------------------------------------- In function sched_core_update_cookie(), a task will enqueue into the core tree only when it enqueued before, that is, if an uncookied task is cookied, it will not enqueue into the core tree until it enqueue again, which will result in unnecessary force idle. Here follows the scenario: CPU x and CPU y are a pair of SMT siblings. 1. Start task a running on CPU x without sleeping, and task b and task c running on CPU y without sleeping. 2. We create a cookie and share it to task a and task b, and then we create another cookie and share it to task c. 3. Simpling core_forceidle_sum of task a and b from /proc/PID/sched And we will find out that core_forceidle_sum of task a takes 30% time of the sampling period, which shouldn't happen as task a and b have the same cookie. Then we migrate task a to CPU x', migrate task b and c to CPU y', where CPU x' and CPU y' are a pair of SMT siblings, and sampling again, we will found out that core_forceidle_sum of task a and b are almost zero. To solve this problem, we enqueue the task into the core tree if it's on rq. Fixes: 6e33cad0("sched: Trivial core scheduling cookie management") Signed-off-by: NCruz Zhao <CruzZhao@linux.alibaba.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1656403045-100840-2-git-send-email-CruzZhao@linux.alibaba.com Conflicts: kernel/sched/core_sched.c [Feature 4feee7d1("sched/core: Forced idle accounting") is not applied.] Signed-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Hao Jia 提交于
mainline inclusion from mainline-v5.19-rc1 commit 2679a837 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2679a83731d51a744657f718fc02c3b077e47562 -------------------------------------------------------------------------- When we use raw_spin_rq_lock() to acquire the rq lock and have to update the rq clock while holding the lock, the kernel may issue a WARN_DOUBLE_CLOCK warning. Since we directly use raw_spin_rq_lock() to acquire rq lock instead of rq_lock(), there is no corresponding change to rq->clock_update_flags. In particular, we have obtained the rq lock of other CPUs, the rq->clock_update_flags of this CPU may be RQCF_UPDATED at this time, and then calling update_rq_clock() will trigger the WARN_DOUBLE_CLOCK warning. So we need to clear RQCF_UPDATED of rq->clock_update_flags to avoid the WARN_DOUBLE_CLOCK warning. For the sched_rt_period_timer() and migrate_task_rq_dl() cases we simply replace raw_spin_rq_lock()/raw_spin_rq_unlock() with rq_lock()/rq_unlock(). For the {pull,push}_{rt,dl}_task() cases, we add the double_rq_clock_clear_update() function to clear RQCF_UPDATED of rq->clock_update_flags, and call double_rq_clock_clear_update() before double_lock_balance()/double_rq_lock() returns to avoid the WARN_DOUBLE_CLOCK warning. Some call trace reports: Call Trace 1: <IRQ> sched_rt_period_timer+0x10f/0x3a0 ? enqueue_top_rt_rq+0x110/0x110 __hrtimer_run_queues+0x1a9/0x490 hrtimer_interrupt+0x10b/0x240 __sysvec_apic_timer_interrupt+0x8a/0x250 sysvec_apic_timer_interrupt+0x9a/0xd0 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x12/0x20 Call Trace 2: <TASK> activate_task+0x8b/0x110 push_rt_task.part.108+0x241/0x2c0 push_rt_tasks+0x15/0x30 finish_task_switch+0xaa/0x2e0 ? __switch_to+0x134/0x420 __schedule+0x343/0x8e0 ? hrtimer_start_range_ns+0x101/0x340 schedule+0x4e/0xb0 do_nanosleep+0x8e/0x160 hrtimer_nanosleep+0x89/0x120 ? hrtimer_init_sleeper+0x90/0x90 __x64_sys_nanosleep+0x96/0xd0 do_syscall_64+0x34/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae Call Trace 3: <TASK> deactivate_task+0x93/0xe0 pull_rt_task+0x33e/0x400 balance_rt+0x7e/0x90 __schedule+0x62f/0x8e0 do_task_dead+0x3f/0x50 do_exit+0x7b8/0xbb0 do_group_exit+0x2d/0x90 get_signal+0x9df/0x9e0 ? preempt_count_add+0x56/0xa0 ? __remove_hrtimer+0x35/0x70 arch_do_signal_or_restart+0x36/0x720 ? nanosleep_copyout+0x39/0x50 ? do_nanosleep+0x131/0x160 ? audit_filter_inodes+0xf5/0x120 exit_to_user_mode_prepare+0x10f/0x1e0 syscall_exit_to_user_mode+0x17/0x30 do_syscall_64+0x40/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae Call Trace 4: update_rq_clock+0x128/0x1a0 migrate_task_rq_dl+0xec/0x310 set_task_cpu+0x84/0x1e4 try_to_wake_up+0x1d8/0x5c0 wake_up_process+0x1c/0x30 hrtimer_wakeup+0x24/0x3c __hrtimer_run_queues+0x114/0x270 hrtimer_interrupt+0xe8/0x244 arch_timer_handler_phys+0x30/0x50 handle_percpu_devid_irq+0x88/0x140 generic_handle_domain_irq+0x40/0x60 gic_handle_irq+0x48/0xe0 call_on_irq_stack+0x2c/0x60 do_interrupt_handler+0x80/0x84 Steps to reproduce: 1. Enable CONFIG_SCHED_DEBUG when compiling the kernel 2. echo 1 > /sys/kernel/debug/clear_warn_once echo "WARN_DOUBLE_CLOCK" > /sys/kernel/debug/sched/features echo "NO_RT_PUSH_IPI" > /sys/kernel/debug/sched/features 3. Run some rt/dl tasks that periodically work and sleep, e.g. Create 2*n rt or dl (90% running) tasks via rt-app (on a system with n CPUs), and Dietmar Eggemann reports Call Trace 4 when running on PREEMPT_RT kernel. Signed-off-by: NHao Jia <jiahao.os@bytedance.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NDietmar Eggemann <dietmar.eggemann@arm.com> Link: https://lore.kernel.org/r/20220430085843.62939-2-jiahao.os@bytedance.comSigned-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Phil Auld 提交于
mainline inclusion from mainline-v5.18-rc2 commit 5524cbb1 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5524cbb1bfcdff0cad0aaa9f94e6092002a07259 -------------------------------------------------------------------------- Arm64 systems rely on store_cpu_topology() to call update_siblings_masks() to transfer the toplogy to the various cpu masks. This needs to be done before the call to notify_cpu_starting() which tells the scheduler about each cpu found, otherwise the core scheduling data structures are setup in a way that does not match the actual topology. With smt_mask not setup correctly we bail on `cpumask_weight(smt_mask) == 1` for !leaders in: notify_cpu_starting() cpuhp_invoke_callback_range() sched_cpu_starting() sched_core_cpu_starting() which leads to rq->core not being correctly set for !leader-rq's. Without this change stress-ng (which enables core scheduling in its prctl tests in newer versions -- i.e. with PR_SCHED_CORE support) causes a warning and then a crash (trimmed for legibility): [ 1853.805168] ------------[ cut here ]------------ [ 1853.809784] task_rq(b)->core != rq->core [ 1853.809792] WARNING: CPU: 117 PID: 0 at kernel/sched/fair.c:11102 cfs_prio_less+0x1b4/0x1c4 ... [ 1854.015210] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010 ... [ 1854.231256] Call trace: [ 1854.233689] pick_next_task+0x3dc/0x81c [ 1854.237512] __schedule+0x10c/0x4cc [ 1854.240988] schedule_idle+0x34/0x54 Fixes: 9edeaea1 ("sched: Core-wide rq->lock") Signed-off-by: NPhil Auld <pauld@redhat.com> Reviewed-by: NDietmar Eggemann <dietmar.eggemann@arm.com> Tested-by: NDietmar Eggemann <dietmar.eggemann@arm.com> Link: https://lore.kernel.org/r/20220331153926.25742-1-pauld@redhat.comSigned-off-by: NWill Deacon <will@kernel.org> Signed-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
mainline inclusion from mainline-v5.18-rc2 commit 386ef214 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=386ef214c3c6ab111d05e1790e79475363abaa05 -------------------------------------------------------------------------- try_steal_cookie() looks at task_struct::cpus_mask to decide if the task could be moved to `this' CPU. It ignores that the task might be in a migration disabled section while not on the CPU. In this case the task must not be moved otherwise per-CPU assumption are broken. Use is_cpu_allowed(), as suggested by Peter Zijlstra, to decide if the a task can be moved. Fixes: d2dfa17b ("sched: Trivial forced-newidle balancer") Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/YjNK9El+3fzGmswf@linutronix.deSigned-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Peter Zijlstra 提交于
mainline inclusion from mainline-v5.18-rc2 commit 5b6547ed category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5b6547ed97f4f5dfc23f8e3970af6d11d7b7ed7e -------------------------------------------------------------------------- Steve reported that ChromeOS encounters the forceidle balancer being ran from rt_mutex_setprio()'s balance_callback() invocation and explodes. Now, the forceidle balancer gets queued every time the idle task gets selected, set_next_task(), which is strictly too often. rt_mutex_setprio() also uses set_next_task() in the 'change' pattern: queued = task_on_rq_queued(p); /* p->on_rq == TASK_ON_RQ_QUEUED */ running = task_current(rq, p); /* rq->curr == p */ if (queued) dequeue_task(...); if (running) put_prev_task(...); /* change task properties */ if (queued) enqueue_task(...); if (running) set_next_task(...); However, rt_mutex_setprio() will explicitly not run this pattern on the idle task (since priority boosting the idle task is quite insane). Most other 'change' pattern users are pidhash based and would also not apply to idle. Also, the change pattern doesn't contain a __balance_callback() invocation and hence we could have an out-of-band balance-callback, which *should* trigger the WARN in rq_pin_lock() (which guards against this exact anti-pattern). So while none of that explains how this happens, it does indicate that having it in set_next_task() might not be the most robust option. Instead, explicitly queue the forceidle balancer from pick_next_task() when it does indeed result in forceidle selection. Having it here, ensures it can only be triggered under the __schedule() rq->lock instance, and hence must be ran from that context. This also happens to clean up the code a little, so win-win. Fixes: d2dfa17b ("sched: Trivial forced-newidle balancer") Reported-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Tested-by: NT.J. Alumbaugh <talumbau@chromium.org> Link: https://lkml.kernel.org/r/20220330160535.GN8939@worktop.programming.kicks-ass.netSigned-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Shaokun Zhang 提交于
mainline inclusion from mainline-v5.16-rc1 commit d07b2eee category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d07b2eee4501c393cbf5bfcad36143310cfd72f9 -------------------------------------------------------------------------- Make cookie functions static as these are no longer invoked directly by other code. No functional change intended. Signed-off-by: NShaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210922085735.52812-1-zhangshaokun@hisilicon.comSigned-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Li Zhijian 提交于
mainline inclusion from mainline-v5.16-rc1 commit 1c36432b category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1c36432b278cecf1499f21fae19836e614954309 -------------------------------------------------------------------------- Previously, 'make -C sched run_tests' will block forever when it occurs something wrong where the *selftests framework* is waiting for its child processes to exit. [root@iaas-rpma sched]# ./cs_prctl_test ## Create a thread/process/process group hiearchy Not a core sched system tid=74985, / tgid=74985 / pgid=74985: ffffffffffffffff Not a core sched system tid=74986, / tgid=74986 / pgid=74985: ffffffffffffffff Not a core sched system tid=74988, / tgid=74986 / pgid=74985: ffffffffffffffff Not a core sched system tid=74989, / tgid=74986 / pgid=74985: ffffffffffffffff Not a core sched system tid=74990, / tgid=74986 / pgid=74985: ffffffffffffffff Not a core sched system tid=74987, / tgid=74987 / pgid=74985: ffffffffffffffff Not a core sched system tid=74991, / tgid=74987 / pgid=74985: ffffffffffffffff Not a core sched system tid=74992, / tgid=74987 / pgid=74985: ffffffffffffffff Not a core sched system tid=74993, / tgid=74987 / pgid=74985: ffffffffffffffff Not a core sched system (268) FAILED: get_cs_cookie(0) == 0 ## Set a cookie on entire process group -1 = prctl(62, 1, 0, 2, 0) core_sched create failed -- PGID: Invalid argument (cs_prctl_test.c:272) - [root@iaas-rpma sched]# ps PID TTY TIME CMD 4605 pts/2 00:00:00 bash 74986 pts/2 00:00:00 cs_prctl_test 74987 pts/2 00:00:00 cs_prctl_test 74999 pts/2 00:00:00 ps Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NLi Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NChris Hyser <chris.hyser@oracle.com> Link: https://lore.kernel.org/r/20210902024333.75983-1-lizhijian@cn.fujitsu.comSigned-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Eugene Syromiatnikov 提交于
mainline inclusion from mainline-v5.16-rc1 commit 61bc346c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=61bc346ce64a3864ac55f5d18bdc1572cda4fb18 -------------------------------------------------------------------------- Commit 7ac592aa ("sched: prctl() core-scheduling interface") made use of enum pid_type in prctl's arg4; this type and the associated enumeration definitions are not exposed to userspace. Christian has suggested to provide additional macro definitions that convey the meaning of the type argument more in alignment with its actual usage, and this patch does exactly that. Link: https://lore.kernel.org/r/20210825170613.GA3884@asgard.redhat.comSuggested-by: NChristian Brauner <christian.brauner@ubuntu.com> Acked-by: NChristian Brauner <christian.brauner@ubuntu.com> Signed-off-by: NEugene Syromiatnikov <esyr@redhat.com> Complements: 7ac592aa ("sched: prctl() core-scheduling interface") Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com> Signed-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Peter Zijlstra 提交于
mainline inclusion from mainline-v5.16-rc1 commit bc9ffef3 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bc9ffef31bf59819c9fc032178534ff9ed7c4981 -------------------------------------------------------------------------- Tao suggested a two-pass task selection to avoid the retry loop. Not only does it avoid the retry loop, it results in *much* simpler code. This also fixes an issue spotted by Josh Don where, for SMT3+, we can forget to update max on the first pass and get to do an extra round. Suggested-by: NTao Zhou <tao.zhou@linux.dev> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NJosh Don <joshdon@google.com> Reviewed-by: NVineeth Pillai (Microsoft) <vineethrp@gmail.com> Link: https://lkml.kernel.org/r/YSS9+k1teA9oPEKl@hirez.programming.kicks-ass.netSigned-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Peter Zijlstra 提交于
mainline inclusion from mainline-v5.14 commit 3c474b32 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3c474b3239f12fe0b00d7e82481f36a1f31e79ab -------------------------------------------------------------------------- Eugene tripped over the case where rq_lock(), as called in a for_each_possible_cpu() loop came apart because rq->core hadn't been setup yet. This is a somewhat unusual, but valid case. Rework things such that rq->core is initialized to point at itself. IOW initialize each CPU as a single threaded Core. CPU online will then join the new CPU (thread) to an existing Core where needed. For completeness sake, have CPU offline fully undo the state so as to not presume the topology will match the next time it comes online. Fixes: 9edeaea1 ("sched: Core-wide rq->lock") Reported-by: NEugene Syromiatnikov <esyr@redhat.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NJosh Don <joshdon@google.com> Tested-by: NEugene Syromiatnikov <esyr@redhat.com> Link: https://lkml.kernel.org/r/YR473ZGeKqMs6kw+@hirez.programming.kicks-ass.net Conflicts: kernel/sched/core.c [Bugfix ed3cd45f("Merge tag 'v5.11' into sched/core, to pick up fixes & refresh the branch") is not applied.] Signed-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Fabio M. De Francesco 提交于
mainline inclusion from mainline-v5.15-rc1 commit ce48ee81 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce48ee81a1930b2218bea23490adb6673c88bf70 -------------------------------------------------------------------------- Rephrase the "For MDS" section in core-scheduling.rst for the purpose of making it clearer what is meant by "kernel memory is still considered untrusted". Suggested-by: NVineeth Pillai <Vineeth.Pillai@microsoft.com> Signed-off-by: NFabio M. De Francesco <fmdefrancesco@gmail.com> Reviewed-by: NJoel Fernandes (Google) <joelaf@google.com> Link: https://lore.kernel.org/r/20210721190250.26095-1-fmdefrancesco@gmail.comSigned-off-by: NJonathan Corbet <corbet@lwn.net> Signed-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Ingo Molnar 提交于
mainline inclusion from mainline-v5.14-rc1 commit d2343cb8 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d2343cb8d154fe20c4499711bb3a9af2095b2b4b -------------------------------------------------------------------------- This option at minimum adds extra code to the scheduler - even if it's default unused - and most users wouldn't want it. Reported-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Joel Fernandes (Google) 提交于
mainline inclusion from mainline-v5.14-rc1 commit 0159bb02 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0159bb020ca9a43b17aa9149f1199643c1d49426 -------------------------------------------------------------------------- Now that core scheduling is merged, update the documentation. Co-developed-by: NChris Hyser <chris.hyser@oracle.com> Signed-off-by: NChris Hyser <chris.hyser@oracle.com> Co-developed-by: NJosh Don <joshdon@google.com> Signed-off-by: NJosh Don <joshdon@google.com> Signed-off-by: NJoel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210603013136.370918-1-joel@joelfernandes.orgSigned-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Peter Zijlstra 提交于
mainline inclusion from mainline-v5.14-rc1 commit 7b419f47 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7b419f47facd286c6723daca6ad69ec355473f78 -------------------------------------------------------------------------- Hugh noted that the SCHED_CORE Kconfig option could do with a help text. Requested-by: NHugh Dickins <hughd@google.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NRandy Dunlap <rdunlap@infradead.org> Acked-by: NHugh Dickins <hughd@google.com> Link: https://lkml.kernel.org/r/YKyhtwhEgvtUDOyl@hirez.programming.kicks-ass.netSigned-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Ingo Molnar 提交于
mainline inclusion from mainline-v5.14-rc1 commit cc00c198 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cc00c1988801dc71f63bb7bad019e85046865095 -------------------------------------------------------------------------- A few more snuck in. Also capitalize 'CPU' while at it. Signed-off-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Arnaldo Carvalho de Melo 提交于
mainline inclusion from mainline-v5.16-rc1 commit 49024204 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=49024204322cbfff892a28a67ad813cd41b6be81 -------------------------------------------------------------------------- To pick the changes in: 61bc346c ("uapi/linux/prctl: provide macro definitions for the PR_SCHED_CORE type argument") That don't result in any changes in tooling: $ tools/perf/trace/beauty/prctl_option.sh > before $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h $ tools/perf/trace/beauty/prctl_option.sh > after $ diff -u before after $ Just silences this perf tools build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h' diff -u tools/include/uapi/linux/prctl.h include/uapi/linux/prctl.h Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Eugene Syromiatnikov <esyr@redhat.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Chris Hyser 提交于
mainline inclusion from mainline-v5.14-rc1 commit 9f269900 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9f26990074931bbf797373e53104216059b300b1 -------------------------------------------------------------------------- Provides a selftest and examples of using the interface. [peterz: updated to not use sched_debug] Signed-off-by: NChris Hyser <chris.hyser@oracle.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Tested-by: NDon Hiatt <dhiatt@digitalocean.com> Tested-by: NHongyu Ning <hongyu.ning@linux.intel.com> Tested-by: NVincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20210422123309.100860030@infradead.orgSigned-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Chris Hyser 提交于
mainline inclusion from mainline-v5.14-rc1 commit 7ac592aa category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7ac592aa35a684ff1858fb9ec282886b9e3575ac -------------------------------------------------------------------------- This patch provides support for setting and copying core scheduling 'task cookies' between threads (PID), processes (TGID), and process groups (PGID). The value of core scheduling isn't that tasks don't share a core, 'nosmt' can do that. The value lies in exploiting all the sharing opportunities that exist to recover possible lost performance and that requires a degree of flexibility in the API. From a security perspective (and there are others), the thread, process and process group distinction is an existent hierarchal categorization of tasks that reflects many of the security concerns about 'data sharing'. For example, protecting against cache-snooping by a thread that can just read the memory directly isn't all that useful. With this in mind, subcommands to CREATE/SHARE (TO/FROM) provide a mechanism to create and share cookies. CREATE/SHARE_TO specify a target pid with enum pidtype used to specify the scope of the targeted tasks. For example, PIDTYPE_TGID will share the cookie with the process and all of it's threads as typically desired in a security scenario. API: prctl(PR_SCHED_CORE, PR_SCHED_CORE_GET, tgtpid, pidtype, &cookie) prctl(PR_SCHED_CORE, PR_SCHED_CORE_CREATE, tgtpid, pidtype, NULL) prctl(PR_SCHED_CORE, PR_SCHED_CORE_SHARE_TO, tgtpid, pidtype, NULL) prctl(PR_SCHED_CORE, PR_SCHED_CORE_SHARE_FROM, srcpid, pidtype, NULL) where 'tgtpid/srcpid == 0' implies the current process and pidtype is kernel enum pid_type {PIDTYPE_PID, PIDTYPE_TGID, PIDTYPE_PGID, ...}. For return values, EINVAL, ENOMEM are what they say. ESRCH means the tgtpid/srcpid was not found. EPERM indicates lack of PTRACE permission access to tgtpid/srcpid. ENODEV indicates your machines lacks SMT. [peterz: complete rewrite] Signed-off-by: NChris Hyser <chris.hyser@oracle.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Tested-by: NDon Hiatt <dhiatt@digitalocean.com> Tested-by: NHongyu Ning <hongyu.ning@linux.intel.com> Tested-by: NVincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20210422123309.039845339@infradead.orgSigned-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Peter Zijlstra 提交于
mainline inclusion from mainline-v5.14-rc1 commit 85dd3f61 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=85dd3f61203c5cfa72b308ff327b5fbf3fc1ce5e -------------------------------------------------------------------------- Note that sched_core_fork() is called from under tasklist_lock, and not from sched_fork() earlier. This avoids a few races later. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Tested-by: NDon Hiatt <dhiatt@digitalocean.com> Tested-by: NHongyu Ning <hongyu.ning@linux.intel.com> Tested-by: NVincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20210422123308.980003687@infradead.orgSigned-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Peter Zijlstra 提交于
mainline inclusion from mainline-v5.14-rc1 commit 6e33cad0 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6e33cad0af49336952e5541464bd02f5b5fd433e -------------------------------------------------------------------------- In order to not have to use pid_struct, create a new, smaller, structure to manage task cookies for core scheduling. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Tested-by: NDon Hiatt <dhiatt@digitalocean.com> Tested-by: NHongyu Ning <hongyu.ning@linux.intel.com> Tested-by: NVincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20210422123308.919768100@infradead.orgSigned-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Aubrey Li 提交于
mainline inclusion from mainline-v5.14-rc1 commit 97886d9d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=97886d9dcd86820bdbc1fa73b455982809cbc8c2 -------------------------------------------------------------------------- - Don't migrate if there is a cookie mismatch Load balance tries to move task from busiest CPU to the destination CPU. When core scheduling is enabled, if the task's cookie does not match with the destination CPU's core cookie, this task may be skipped by this CPU. This mitigates the forced idle time on the destination CPU. - Select cookie matched idle CPU In the fast path of task wakeup, select the first cookie matched idle CPU instead of the first idle CPU. - Find cookie matched idlest CPU In the slow path of task wakeup, find the idlest CPU whose core cookie matches with task's cookie Signed-off-by: NAubrey Li <aubrey.li@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Tested-by: NDon Hiatt <dhiatt@digitalocean.com> Tested-by: NHongyu Ning <hongyu.ning@linux.intel.com> Tested-by: NVincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20210422123308.860083871@infradead.orgSigned-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Peter Zijlstra 提交于
mainline inclusion from mainline-v5.14-rc1 commit d2dfa17b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d2dfa17bc7de67e99685c4d6557837bf801a102c -------------------------------------------------------------------------- When a sibling is forced-idle to match the core-cookie; search for matching tasks to fill the core. rcu_read_unlock() can incur an infrequent deadlock in sched_core_balance(). Fix this by using the RCU-sched flavor instead. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Tested-by: NDon Hiatt <dhiatt@digitalocean.com> Tested-by: NHongyu Ning <hongyu.ning@linux.intel.com> Tested-by: NVincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20210422123308.800048269@infradead.orgSigned-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Joel Fernandes (Google) 提交于
mainline inclusion from mainline-v5.14-rc1 commit c6047c2e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c6047c2e3af68dae23ad884249e0d42ff28d2d1b -------------------------------------------------------------------------- During force-idle, we end up doing cross-cpu comparison of vruntimes during pick_next_task. If we simply compare (vruntime-min_vruntime) across CPUs, and if the CPUs only have 1 task each, we will always end up comparing 0 with 0 and pick just one of the tasks all the time. This starves the task that was not picked. To fix this, take a snapshot of the min_vruntime when entering force idle and use it for comparison. This min_vruntime snapshot will only be used for cross-CPU vruntime comparison, and nothing else. A note about the min_vruntime snapshot and force idling: During selection: When we're not fi, we need to update snapshot. when we're fi and we were not fi, we must update snapshot. When we're fi and we were already fi, we must not update snapshot. Which gives: fib fi update 0 0 1 0 1 1 1 0 1 1 1 0 Where: fi: force-idled now fib: force-idled before So the min_vruntime snapshot needs to be updated when: !(fib && fi). Also, the cfs_prio_less() function needs to be aware of whether the core is in force idle or not, since it will be use this information to know whether to advance a cfs_rq's min_vruntime_fi in the hierarchy. So pass this information along via pick_task() -> prio_less(). Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NJoel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Tested-by: NDon Hiatt <dhiatt@digitalocean.com> Tested-by: NHongyu Ning <hongyu.ning@linux.intel.com> Tested-by: NVincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20210422123308.738542617@infradead.orgSigned-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Joel Fernandes (Google) 提交于
mainline inclusion from mainline-v5.14-rc1 commit 7afbba11 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7afbba119f0da09824d723f8081608ea1f74ff57 -------------------------------------------------------------------------- The rationale is as follows. In the core-wide pick logic, even if need_sync == false, we need to go look at other CPUs (non-local CPUs) to see if they could be running RT. Say the RQs in a particular core look like this: Let CFS1 and CFS2 be 2 tagged CFS tags. Let RT1 be an untagged RT task. rq0 rq1 CFS1 (tagged) RT1 (no tag) CFS2 (tagged) Say schedule() runs on rq0. Now, it will enter the above loop and pick_task(RT) will return NULL for 'p'. It will enter the above if() block and see that need_sync == false and will skip RT entirely. The end result of the selection will be (say prio(CFS1) > prio(CFS2)): rq0 rq1 CFS1 IDLE When it should have selected: rq0 rq1 IDLE RT Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NJoel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Tested-by: NDon Hiatt <dhiatt@digitalocean.com> Tested-by: NHongyu Ning <hongyu.ning@linux.intel.com> Tested-by: NVincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20210422123308.678425748@infradead.orgSigned-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Vineeth Pillai 提交于
mainline inclusion from mainline-v5.14-rc1 commit 8039e96f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8039e96fcc1de30d5bcaf05da9ca2de46a800826 -------------------------------------------------------------------------- If there is only one long running local task and the sibling is forced idle, it might not get a chance to run until a schedule event happens on any cpu in the core. So we check for this condition during a tick to see if a sibling is starved and then give it a chance to schedule. Signed-off-by: NVineeth Pillai <viremana@linux.microsoft.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Tested-by: NDon Hiatt <dhiatt@digitalocean.com> Tested-by: NHongyu Ning <hongyu.ning@linux.intel.com> Tested-by: NVincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20210422123308.617407840@infradead.orgSigned-off-by: NLin Shengwang <linshengwang1@huawei.com> Reviewed-by: Nlihua <hucool.lihua@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-