- 30 6月, 2023 1 次提交
-
-
由 Hui Tang 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7HFZV CVE: NA ---------------------------------------- There may be NULL pointer derefrence when hotplug running and creating taskgroup concurrently. sched_autogroup_create_attach -> sched_create_group -> alloc_fair_sched_group -> init_auto_affinity -> init_affinity_domains -> cpumask_copy(xx, sched_domain_span(tmp)) { tmp may be free due rcu lock missing } { hotplug will rebuild sched domain } sched_cpu_activate -> build_sched_domains -> cpuset_cpu_active -> partition_sched_domains -> build_sched_domains -> cpu_attach_domain -> destroy_sched_domains -> call_rcu(&sd->rcu, destroy_sched_domains_rcu) So sd should be protect with rcu lock in entire critical zone. [ 599.811593] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 600.112821] pc : init_affinity_domains+0xf4/0x200 [ 600.125918] lr : init_affinity_domains+0xd4/0x200 [ 600.331355] Call trace: [ 600.338734] init_affinity_domains+0xf4/0x200 [ 600.347955] init_auto_affinity+0x78/0xc0 [ 600.356622] alloc_fair_sched_group+0xd8/0x210 [ 600.365594] sched_create_group+0x48/0xc0 [ 600.373970] sched_autogroup_create_attach+0x54/0x190 [ 600.383311] ksys_setsid+0x110/0x130 [ 600.391014] __arm64_sys_setsid+0x18/0x24 [ 600.399156] el0_svc_common+0x118/0x170 [ 600.406818] el0_svc_handler+0x3c/0x80 [ 600.414188] el0_svc+0x8/0x640 [ 600.420719] Code: b40002c0 9104e002 f9402061 a9401444 (a9001424) [ 600.430504] SMP: stopping secondary CPUs [ 600.441751] Starting crashdump kernel... Fixes: 713cfd26 ("sched: Introduce smart grid scheduling strategy for cfs") Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com>
-
- 25 6月, 2023 2 次提交
-
-
由 Hui Tang 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7FBJM CVE: NA ---------------------------------------- Free ad->domains_orig[] in 'free_affinity_domains', otherwise the memory will leak. Fixes: 713cfd26 ("sched: Introduce smart grid scheduling strategy for cfs") Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com>
-
由 Hui Tang 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7F7KV CVE: NA ------------------------------- Delete redundant updates to p->prefer_cpus when smart grid used. Add missed check for p->prefer_cpus when !CONFIG_QOS_SCHED_SMART_GRID. Fixes: 21e5d85e ("sched: Fix possible deadlock in tg_set_dynamic_affinity_mode") Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com>
-
- 20 6月, 2023 4 次提交
-
-
由 Hui Tang 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7EEF3 CVE: NA ------------------------------- Adjust few parameters range for smart grid. Fixes: 713cfd26 ("sched: Introduce smart grid scheduling strategy for cfs") Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com>
-
由 Hui Tang 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7EBNA CVE: NA ------------------------------- Fix memory leak on error branch for smart grid. Fixes: 713cfd26 ("sched: Introduce smart grid scheduling strategy for cfs") Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com>
-
由 Hui Tang 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7EA1X CVE: NA ------------------------------- tg->auto_affinity is NULL if init_auto_affinity() failed. So add checking for tg->auto_affinity before derefrence. Fixes: 713cfd26 ("sched: Introduce smart grid scheduling strategy for cfs") Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com>
-
由 Hui Tang 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7DSX6 CVE: NA ------------------------------- Timer storm may be triggered if !cpumask_weight(ad->domains[i]) which is set in cpu offline. Fixes: 713cfd26 ("sched: Introduce smart grid scheduling strategy for cfs") Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com>
-
- 19 6月, 2023 1 次提交
-
-
由 Hui Tang 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7DX9Y CVE: NA ------------------------------- A warn may be triggered during reboot, as follows: reboot ->kernel_restart ->machine_restart ->smp_send_stop --- ipi handler set_cpu_online(cpu, false) balance_callback -> __balance_callback ->push_rt_task -> find_lock_lowest_rq <从vec->mask获取的rq> -> find_lowest_rq -> cpupri_find -> cpupri_find_fitness -> __cpupri_find [cpumask_and(..., vec->mask)] -> set_task_cpu(next_task, lowest_rq->cpu) --- WARN_ON(!oneline(cpu) So add !cpu_online(lowest_rq->cpu) check before set_task_cpu(). The fix does not completely fix the problem, since cpu_online_mask may be cleared after check. Fixes: 4ff9083b ("sched/core: WARN() when migrating to an offline CPU") Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 15 6月, 2023 5 次提交
-
-
由 Hui Tang 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7DA63 CVE: NA -------------------------------- Add mutex lock to prevent negative count for jump label. [28612.530675] ------------[ cut here ]------------ [28612.532708] jump label: negative count! [28612.535031] WARNING: CPU: 4 PID: 3899 at kernel/jump_label.c:202 __static_key_slow_dec_cpuslocked+0x204/0x240 [28612.538216] Kernel panic - not syncing: panic_on_warn set ... [28612.538216] [28612.540487] CPU: 4 PID: 3899 Comm: sh Kdump: loaded Not tainted [28612.542788] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) [28612.546455] Call Trace: [28612.547339] dump_stack+0xc6/0x11e [28612.548546] ? __static_key_slow_dec_cpuslocked+0x200/0x240 [28612.550352] panic+0x1d6/0x46b [28612.551375] ? refcount_error_report+0x2a5/0x2a5 [28612.552915] ? kmsg_dump_rewind_nolock+0xde/0xde [28612.554358] ? sched_clock_cpu+0x18/0x1b0 [28612.555699] ? __warn+0x1d1/0x210 [28612.556799] ? __static_key_slow_dec_cpuslocked+0x204/0x240 [28612.558548] __warn+0x1ec/0x210 [28612.559621] ? __static_key_slow_dec_cpuslocked+0x204/0x240 [28612.561536] report_bug+0x1ee/0x2b0 [28612.562706] fixup_bug.part.4+0x37/0x80 [28612.563937] do_error_trap+0x21c/0x260 [28612.565109] ? fixup_bug.part.4+0x80/0x80 [28612.566453] ? check_preemption_disabled+0x34/0x1f0 [28612.567991] ? trace_hardirqs_off_thunk+0x1a/0x1c [28612.569534] ? lockdep_hardirqs_off+0x1cb/0x2b0 [28612.570993] ? error_entry+0x9a/0x130 [28612.572138] ? trace_hardirqs_off_caller+0x59/0x1a0 [28612.573710] ? trace_hardirqs_off_thunk+0x1a/0x1c [28612.575232] invalid_op+0x14/0x20 [root@lo[ca2lh8ost6 12.576387] ? vprintk_func+0x68/0x1a0 [28612.577827] ? __static_key_slow_dec_cpuslocked+0x204/0x240 smartg[ri2d]8# 612.579662] ? __static_key_slow_dec_cpuslocked+0x204/0x240 [28612.581781] ? static_key_disable+0x30/0x30 [28612.583248] ? s tatic_key_slow_dec+0x57/0x90 [28612.584997] ? tg_set_dynamic_affinity_mode+0x42/0x70 [28612.586714] ? cgroup_file_write+0x471/0x6a0 [28612.588162] ? cgroup_css.part.4+0x100/0x100 [28612.589579] ? cgroup_css.part.4+0x100/0x100 [28612.591031] ? kernfs_fop_write+0x2af/0x430 [28612.592625] ? kernfs_vma_page_mkwrite+0x230/0x230 [28612.594274] ? __vfs_write+0xef/0x680 [28612.595590] ? kernel_read+0x110/0x110 ea8612.596899] ? check_preemption_disabled+0x3mkd4ir/: 0canxno1t fcr0 Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com>
-
由 Hui Tang 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7CGD0 CVE: NA ---------------------------------------- Deadlock occurs in two situations as follows: The first case: tg_set_dynamic_affinity_mode --- raw_spin_lock_irq(&auto_affi->lock); ->start_auto_affintiy --- trigger timer ->tg_update_task_prefer_cpus >css_task_inter_next ->raw_spin_unlock_irq hr_timer_run_queues ->sched_auto_affi_period_timer --- try spin lock (&auto_affi->lock) The second case as follows: [ 291.470810] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 291.472715] rcu: 1-...0: (0 ticks this GP) idle=a6a/1/0x4000000000000002 softirq=78516/78516 fqs=5249 [ 291.475268] rcu: (detected by 6, t=21006 jiffies, g=202169, q=9862) [ 291.477038] Sending NMI from CPU 6 to CPUs 1: [ 291.481268] NMI backtrace for cpu 1 [ 291.481273] CPU: 1 PID: 1923 Comm: sh Kdump: loaded Not tainted 4.19.90+ #150 [ 291.481278] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 [ 291.481281] RIP: 0010:queued_spin_lock_slowpath+0x136/0x9a0 [ 291.481289] Code: c0 74 3f 49 89 dd 48 89 dd 48 b8 00 00 00 00 00 fc ff df 49 c1 ed 03 83 e5 07 49 01 c5 83 c5 03 48 83 05 c4 66 b9 05 01 f3 90 <41> 0f b6 45 00 40 38 c5 7c 08 84 c0 0f 85 ad 07 00 00 0 [ 291.481292] RSP: 0018:ffff88801de87cd8 EFLAGS: 00000002 [ 291.481297] RAX: 0000000000000101 RBX: ffff888001be0a28 RCX: ffffffffb8090f7d [ 291.481301] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff888001be0a28 [ 291.481304] RBP: 0000000000000003 R08: ffffed100037c146 R09: ffffed100037c146 [ 291.481307] R10: 000000001106b143 R11: ffffed100037c145 R12: 1ffff11003bd0f9c [ 291.481311] R13: ffffed100037c145 R14: fffffbfff7a38dee R15: dffffc0000000000 [ 291.481315] FS: 00007fac4f306740(0000) GS:ffff88801de80000(0000) knlGS:0000000000000000 [ 291.481318] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 291.481321] CR2: 00007fac4f4bb650 CR3: 00000000046b6000 CR4: 00000000000006e0 [ 291.481323] Call Trace: [ 291.481324] <IRQ> [ 291.481326] ? osq_unlock+0x2a0/0x2a0 [ 291.481329] ? check_preemption_disabled+0x4c/0x290 [ 291.481331] ? rcu_accelerate_cbs+0x33/0xed0 [ 291.481333] _raw_spin_lock_irqsave+0x83/0xa0 [ 291.481336] sched_auto_affi_period_timer+0x251/0x820 [ 291.481338] ? __remove_hrtimer+0x151/0x200 [ 291.481340] __hrtimer_run_queues+0x39d/0xa50 [ 291.481343] ? tg_update_affinity_domain_down+0x460/0x460 [ 291.481345] ? enqueue_hrtimer+0x2e0/0x2e0 [ 291.481348] ? ktime_get_update_offsets_now+0x1d7/0x2c0 [ 291.481350] hrtimer_run_queues+0x243/0x470 [ 291.481352] run_local_timers+0x5e/0x150 [ 291.481354] update_process_times+0x36/0xb0 [ 291.481357] tick_sched_handle.isra.4+0x7c/0x180 [ 291.481359] tick_nohz_handler+0xd1/0x1d0 [ 291.481365] smp_apic_timer_interrupt+0x12c/0x4e0 [ 291.481368] apic_timer_interrupt+0xf/0x20 [ 291.481370] </IRQ> [ 291.481372] ? smp_call_function_many+0x68c/0x840 [ 291.481375] ? smp_call_function_many+0x6ab/0x840 [ 291.481377] ? arch_unregister_cpu+0x60/0x60 [ 291.481379] ? native_set_fixmap+0x100/0x180 [ 291.481381] ? arch_unregister_cpu+0x60/0x60 [ 291.481384] ? set_task_select_cpus+0x116/0x940 [ 291.481386] ? smp_call_function+0x53/0xc0 [ 291.481388] ? arch_unregister_cpu+0x60/0x60 [ 291.481390] ? on_each_cpu+0x49/0xf0 [ 291.481393] ? set_task_select_cpus+0x115/0x940 [ 291.481395] ? text_poke_bp+0xff/0x180 [ 291.481397] ? poke_int3_handler+0xc0/0xc0 [ 291.481400] ? __set_prefer_cpus_ptr.constprop.4+0x1cd/0x900 [ 291.481402] ? hrtick+0x1b0/0x1b0 [ 291.481404] ? set_task_select_cpus+0x115/0x940 [ 291.481407] ? __jump_label_transform.isra.0+0x3a1/0x470 [ 291.481409] ? kernel_init+0x280/0x280 [ 291.481411] ? kasan_check_read+0x1d/0x30 [ 291.481413] ? mutex_lock+0x96/0x100 [ 291.481415] ? __mutex_lock_slowpath+0x30/0x30 [ 291.481418] ? arch_jump_label_transform+0x52/0x80 [ 291.481420] ? set_task_select_cpus+0x115/0x940 [ 291.481422] ? __jump_label_update+0x1a1/0x1e0 [ 291.481424] ? jump_label_update+0x2ee/0x3b0 [ 291.481427] ? static_key_slow_inc_cpuslocked+0x1c8/0x2d0 [ 291.481430] ? start_auto_affinity+0x190/0x200 [ 291.481432] ? tg_set_dynamic_affinity_mode+0xad/0xf0 [ 291.481435] ? cpu_affinity_mode_write_u64+0x22/0x30 [ 291.481437] ? cgroup_file_write+0x46f/0x660 [ 291.481439] ? cgroup_init_cftypes+0x300/0x300 [ 291.481441] ? __mutex_lock_slowpath+0x30/0x30 Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com>
-
由 Hui Tang 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I7BQZ0 CVE: NA ---------------------------------------- The WARNING report when run: echo 1 > /sys/fs/cgroup/cpu/cpu.dynamic_affinity_mode [ 147.276757] WARNING: CPU: 5 PID: 1770 at kernel/cpu.c:326 \ lockdep_assert_cpus_held+0xac/0xd0 [ 147.279670] Kernel panic - not syncing: panic_on_warn set ... [ 147.279670] [ 147.282211] CPU: 5 PID: 1770 Comm: bash Kdump: loaded Not tainted 4.19 [ 147.284796] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996).. [ 147.290963] Call Trace: [ 147.292459] dump_stack+0xc6/0x11e [ 147.294295] ? lockdep_assert_cpus_held+0xa0/0xd0 [ 147.296876] panic+0x1d6/0x46b [ 147.298591] ? refcount_error_report+0x2a5/0x2a5 [ 147.301131] ? kmsg_dump_rewind_nolock+0xde/0xde [ 147.303738] ? sched_clock_cpu+0x18/0x1b0 [ 147.305943] ? __warn+0x1d1/0x210 [ 147.307831] ? lockdep_assert_cpus_held+0xac/0xd0 [ 147.310469] __warn+0x1ec/0x210 [ 147.312271] ? lockdep_assert_cpus_held+0xac/0xd0 [ 147.314838] report_bug+0x1ee/0x2b0 [ 147.316798] fixup_bug.part.4+0x37/0x80 [ 147.318946] do_error_trap+0x21c/0x260 [ 147.321062] ? fixup_bug.part.4+0x80/0x80 [ 147.323253] ? check_preemption_disabled+0x34/0x1f0 [ 147.324886] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 147.326277] ? lockdep_hardirqs_off+0x1cb/0x2b0 [ 147.327505] ? error_entry+0x9a/0x130 [ 147.328523] ? trace_hardirqs_off_caller+0x59/0x1a0 [ 147.329844] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 147.331124] invalid_op+0x14/0x20 [ 147.332057] ? vprintk_func+0x68/0x1a0 [ 147.333082] ? lockdep_assert_cpus_held+0xac/0xd0 [ 147.334355] ? lockdep_assert_cpus_held+0xac/0xd0 [ 147.335624] ? static_key_slow_inc_cpuslocked+0x5a/0x230 [ 147.337079] ? tg_set_dynamic_affinity_mode+0x4f/0x70 [ 147.338444] ? cgroup_file_write+0x471/0x6a0 [ 147.339604] ? cgroup_css.part.4+0x100/0x100 [ 147.340782] ? cgroup_css.part.4+0x100/0x100 [ 147.341943] ? kernfs_fop_write+0x2af/0x430 [ 147.343083] ? kernfs_vma_page_mkwrite+0x230/0x230 [ 147.344401] ? __vfs_write+0xef/0x680 [ 147.345404] ? kernel_read+0x110/0x110 Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com>
-
由 Hui Tang 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7D98G CVE: NA ---------------------------------------- smart_grid_usage_dec() should called when free taskgroup if the mode is auto. Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com>
-
由 Hui Tang 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I7A718 -------------------------------- Add static key to reduce noise when not enable dynamic affinity. There are better performance in some case, such for lmbench. Fixes: 243865da ("cpuset: Introduce new interface for scheduler ...") Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com>
-
- 09 6月, 2023 2 次提交
-
-
由 Wang ShaoBo 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I7BQZ0 CVE: NA ---------------------------------------- As smart grid scheduling (SGS) may shrink resources and affect task QOS, We provide methods for evaluating task QOS in divided grid, we mainly focus on the following two aspects: 1. Evaluate whether (such as CPU or memory) resources meet our demand 2. Ensure the least impact when working with (cpufreq and cpuidle) governors For tackling this questions, we have summarized several sampling methods to obtain tasks' characteristics at same time reducing scheduling noise as much as possible: 1. we detected the key factors that how sensitive a process is in cpufreq or cpuidle adjustment, and to guide the cpufreq/cpuidle governor 2. We dynamically monitor process memory bandwidth and adjust memory allocation to minimize cross-remote memory access 3. We provide a variety of load tracking mechanisms to adapt to different types of task's load change --------------------------------- ----------------- | class A | | class B | | -------- -------- | | -------- | | | group0 | | group1 | |---| | group2 | |----------+ | -------- -------- | | -------- | | | CPU/memory sensitive type | | balance type | | ----------------+---------------- --------+-------- | v v | (target cpufreq) ------------------------------------------------------- | (sensitivity) | Not satisfied with QOS? | | --------------------------+---------------------------- | v v ------------------------------------------------------- ---------------- | expand or shrink resource |<--| energy model | ----------------------------+-------------------------- ---------------- v | ----------- ----------- ------------ v | | | | | | --------------- | GRID0 +--------+ GRID1 +--------+ GRID2 |<-- | governor | | | | | | | --------------- ----------- ----------- ------------ \ | / \ ------------------- / | pages migration | ------------------- We will introduce the energy model in the follow-up implementation, and consider the dynamic affinity adjustment between each divided grid in the runtime. Signed-off-by: NWang ShaoBo <bobo.shaobowang@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: NXie XiuQi <xiexiuqi@huawei.com> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com>
-
由 Hui Tang 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I7BQZ0 CVE: NA ---------------------------------------- We want to dynamically expand or shrink the affinity range of tasks based on the CPU topology level while meeting the minimum resource requirements of tasks. We divide several level of affinity domains according to sched domains: level4 * SOCKET [ ] level3 * DIE [ ] level2 * MC [ ] [ ] level1 * SMT [ ] [ ] [ ] [ ] level0 * CPU 0 1 2 3 4 5 6 7 Whether users tend to choose power saving or performance will affect strategy of adjusting affinity, when selecting the power saving mode, we will choose a more appropriate affinity based on the energy model to reduce power consumption, while considering the QOS of resources such as CPU and memory consumption, for instance, if the current task CPU load is less than required, smart grid will judge whether to aggregate tasks together into a smaller range or not according to energy model. The main difference from EAS is that we pay more attention to the impact of power consumption brought by such as cpuidle and DVFS, and classify tasks to reduce interference and ensure resource QOS in each divided unit, which are more suitable for general-purpose on non-heterogeneous CPUs. -------- -------- -------- | group0 | | group1 | | group2 | -------- -------- -------- | | | v | v ---------------------+----- ----------------- | ---v-- | | | DIE0 | MC1 | | | DIE1 | ------ | | --------------------------- ----------------- We regularly count the resource satisfaction of groups, and adjust the affinity, scheduling balance and migrating memory will be considered based on memory location for better meetting resource requirements. Signed-off-by: NHui Tang <tanghui20@huawei.com> Signed-off-by: NWang ShaoBo <bobo.shaobowang@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com>
-
- 08 5月, 2023 1 次提交
-
-
由 Linus Torvalds 提交于
stable inclusion from stable-v4.19.280 commit 178ff87d2a0c2d3d74081e1c2efbb33b3487267d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I715PM CVE: NA -------------------------------- [ Upstream commit 6015b1ac ] The getaffinity() system call uses 'cpumask_size()' to decide how big the CPU mask is - so far so good. It is indeed the allocation size of a cpumask. But the code also assumes that the whole allocation is initialized without actually doing so itself. That's wrong, because we might have fixed-size allocations (making copying and clearing more efficient), but not all of it is then necessarily used if 'nr_cpu_ids' is smaller. Having checked other users of 'cpumask_size()', they all seem to be ok, either using it purely for the allocation size, or explicitly zeroing the cpumask before using the size in bytes to copy it. See for example the ublk_ctrl_get_queue_affinity() function that uses the proper 'zalloc_cpumask_var()' to make sure that the whole mask is cleared, whether the storage is on the stack or if it was an external allocation. Fix this by just zeroing the allocation before using it. Do the same for the compat version of sched_getaffinity(), which had the same logic. Also, for consistency, make sched_getaffinity() use 'cpumask_bits()' to access the bits. For a cpumask_var_t, it ends up being a pointer to the same data either way, but it's just a good idea to treat it like you would a 'cpumask_t'. The compat case already did that. Reported-by: NRyan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/lkml/7d026744-6bd6-6827-0471-b5e8eae0be3f@arm.com/ Cc: Yury Norov <yury.norov@gmail.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 06 4月, 2023 3 次提交
-
-
由 Vincent Guittot 提交于
mainline inclusion from mainline-v6.3-rc4 commit a53ce18c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6TE76 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=a53ce18cacb477dd0513c607f187d16f0fa96f71 -------------------------------- Commit 829c1651 ("sched/fair: sanitize vruntime of entity being placed") fixes an overflowing bug, but ignore a case that se->exec_start is reset after a migration. For fixing this case, we delay the reset of se->exec_start after placing the entity which se->exec_start to detect long sleeping task. In order to take into account a possible divergence between the clock_task of 2 rqs, we increase the threshold to around 104 days. Fixes: 829c1651 ("sched/fair: sanitize vruntime of entity being placed") Originally-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NVincent Guittot <vincent.guittot@linaro.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Tested-by: NZhang Qiao <zhangqiao22@huawei.com> Link: https://lore.kernel.org/r/20230317160810.107988-1-vincent.guittot@linaro.orgSigned-off-by: NSongping Yu <yusongping@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Zhang Qiao 提交于
mainline inclusion from mainline-v6.3-rc4 commit 829c1651 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6TE76 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=829c1651e9c4a6f78398d3e67651cef9bb6b42cc -------------------------------- When a scheduling entity is placed onto cfs_rq, its vruntime is pulled to the base level (around cfs_rq->min_vruntime), so that the entity doesn't gain extra boost when placed backwards. However, if the entity being placed wasn't executed for a long time, its vruntime may get too far behind (e.g. while cfs_rq was executing a low-weight hog), which can inverse the vruntime comparison due to s64 overflow. This results in the entity being placed with its original vruntime way forwards, so that it will effectively never get to the cpu. To prevent that, ignore the vruntime of the entity being placed if it didn't execute for much longer than the characteristic sheduler time scale. [rkagan: formatted, adjusted commit log, comments, cutoff value] Signed-off-by: NZhang Qiao <zhangqiao22@huawei.com> Co-developed-by: NRoman Kagan <rkagan@amazon.de> Signed-off-by: NRoman Kagan <rkagan@amazon.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230130122216.3555094-1-rkagan@amazon.deSigned-off-by: NSongping Yu <yusongping@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Reviewed-by: Nchenhui <judy.chenhui@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Songping Yu 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6TE76 CVE: NA -------------------------------- This reverts commit 3a44e838. Signed-off-by: NSongping Yu <yusongping@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Reviewed-by: Nchenhui <judy.chenhui@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 09 2月, 2023 1 次提交
-
-
由 Konstantin Khlebnikov 提交于
mainline inclusion from mainline-v5.7-rc1 commit b4fb015e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6DZRO CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b4fb015eeff7f3e5518a7dbe8061169a3e2f2bc7 -------------------------------- Group RT scheduler contains protection against setting zero runtime for cgroup with RT tasks. Right now function tg_set_rt_bandwidth() iterates over all CPU cgroups and calls tg_has_rt_tasks() for any cgroup which runtime is zero (not only for changed one). Default RT runtime is zero, thus tg_has_rt_tasks() will is called for almost at CPU cgroups. This protection already is slightly racy: runtime limit could be changed between cpu_cgroup_can_attach() and cpu_cgroup_attach() because changing cgroup attribute does not lock cgroup_mutex while attach does not lock rt_constraints_mutex. Changing task scheduler class also races with changing rt runtime: check in __sched_setscheduler() isn't protected. Function tg_has_rt_tasks() iterates over all threads in the system. This gives NR_CGROUPS * NR_TASKS operations under single tasklist_lock locked for read tg_set_rt_bandwidth(). Any concurrent attempt of locking tasklist_lock for write (for example fork) will stuck with disabled irqs. This patch makes two optimizations: 1) Remove locking tasklist_lock and iterate only tasks in cgroup 2) Call tg_has_rt_tasks() iff rt runtime changes from non-zero to zero All changed code is under CONFIG_RT_GROUP_SCHED. Testcase: # mkdir /sys/fs/cgroup/cpu/test{1..10000} # echo 0 | tee /sys/fs/cgroup/cpu/test*/cpu.rt_runtime_us At the same time without patch fork time will be >100ms: # perf trace -e clone --duration 100 stress-ng --fork 1 Also remote ping will show timings >100ms caused by irq latency. Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NIngo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/157996383820.4651.11292439232549211693.stgit@buzzSigned-off-by: NZhao Wenhui <zhaowenhui8@huawei.com> Reviewed-by: Nchenhui <judy.chenhui@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 24 12月, 2022 1 次提交
-
-
由 Zhang Qiao 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I67BL1 CVE: NA ------------------------------- If a task sleep for long time, it maybe cause a s64 overflow issue at max_vruntime() and the task will be set an incorrect vruntime, lead to the task be starve. For fix it, we set the task's vruntime as cfs_rq->min_vruntime when wakeup. Signed-off-by: NZhang Qiao <zhangqiao22@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Reviewed-by: Nsongping yu <yusongping@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 16 12月, 2022 1 次提交
-
-
由 Zhang Qiao 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I64OUS CVE: NA ------------------------------- When a cfs_rq throttled by qos, mark cfs_rq->throttled as 1, and cfs bw will unthrottled this cfs_rq by mistake, it cause a list_del_valid warning. So add macro QOS_THROTTLED(=2), when a cfs_rq is throttled by qos, we mark the cfs_rq->throttled as QOS_THROTTLED, will check the value of cfs_rq->throttled before unthrottle a cfs_rq. Signed-off-by: NZhang Qiao <zhangqiao22@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Reviewed-by: Nzheng zucheng <zhengzucheng@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 07 9月, 2022 1 次提交
-
-
由 Nicolas Saenz Julienne 提交于
stable inclusion from stable-v4.19.256 commit 952146183df68a7fb29c3c29d56ffdc941fbdc39 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q0SQ CVE: NA -------------------------------- [ Upstream commit 5c66d1b9 ] dequeue_task_rt() only decrements 'rt_rq->rt_nr_running' after having called sched_update_tick_dependency() preventing it from re-enabling the tick on systems that no longer have pending SCHED_RT tasks but have multiple runnable SCHED_OTHER tasks: dequeue_task_rt() dequeue_rt_entity() dequeue_rt_stack() dequeue_top_rt_rq() sub_nr_running() // decrements rq->nr_running sched_update_tick_dependency() sched_can_stop_tick() // checks rq->rt.rt_nr_running, ... __dequeue_rt_entity() dec_rt_tasks() // decrements rq->rt.rt_nr_running ... Every other scheduler class performs the operation in the opposite order, and sched_update_tick_dependency() expects the values to be updated as such. So avoid the misbehaviour by inverting the order in which the above operations are performed in the RT scheduler. Fixes: 76d92ac3 ("sched: Migrate sched to use new tick dependency mask model") Signed-off-by: NNicolas Saenz Julienne <nsaenzju@redhat.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NValentin Schneider <vschneid@redhat.com> Reviewed-by: NPhil Auld <pauld@redhat.com> Link: https://lore.kernel.org/r/20220628092259.330171-1-nsaenzju@redhat.comSigned-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 15 8月, 2022 1 次提交
-
-
由 Hui Tang 提交于
hulk inclusion category: bugfix bugzilla: 187419, https://gitee.com/openeuler/kernel/issues/I5LIPL CVE: NA ------------------------------- do_el0_svc+0x50/0x11c arch/arm64/kernel/syscall.c:217 el0_svc+0x20/0x30 arch/arm64/kernel/entry-common.c:353 el0_sync_handler+0xe4/0x1e0 arch/arm64/kernel/entry-common.c:369 el0_sync+0x148/0x180 arch/arm64/kernel/entry.S:683 ================================================================== BUG: KASAN: null-ptr-deref in rq_of kernel/sched/sched.h:1118 [inline] BUG: KASAN: null-ptr-deref in unthrottle_qos_sched_group kernel/sched/fair.c:7619 [inline] BUG: KASAN: null-ptr-deref in free_fair_sched_group+0x124/0x320 kernel/sched/fair.c:12131 Read of size 8 at addr 0000000000000130 by task syz-executor100/223 CPU: 3 PID: 223 Comm: syz-executor100 Not tainted 5.10.0 #6 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0x0/0x40c arch/arm64/kernel/stacktrace.c:132 show_stack+0x30/0x40 arch/arm64/kernel/stacktrace.c:196 __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b4/0x248 lib/dump_stack.c:118 __kasan_report mm/kasan/report.c:551 [inline] kasan_report+0x18c/0x210 mm/kasan/report.c:564 check_memory_region_inline mm/kasan/generic.c:187 [inline] __asan_load8+0x98/0xc0 mm/kasan/generic.c:253 rq_of kernel/sched/sched.h:1118 [inline] unthrottle_qos_sched_group kernel/sched/fair.c:7619 [inline] free_fair_sched_group+0x124/0x320 kernel/sched/fair.c:12131 sched_free_group kernel/sched/core.c:7767 [inline] sched_create_group+0x48/0xc0 kernel/sched/core.c:7798 cpu_cgroup_css_alloc+0x18/0x40 kernel/sched/core.c:7930 css_create+0x7c/0x4a0 kernel/cgroup/cgroup.c:5328 cgroup_apply_control_enable+0x288/0x340 kernel/cgroup/cgroup.c:3135 cgroup_apply_control kernel/cgroup/cgroup.c:3217 [inline] cgroup_subtree_control_write+0x668/0x8b0 kernel/cgroup/cgroup.c:3375 cgroup_file_write+0x1a8/0x37c kernel/cgroup/cgroup.c:3909 kernfs_fop_write_iter+0x220/0x2f4 fs/kernfs/file.c:296 call_write_iter include/linux/fs.h:1960 [inline] new_sync_write+0x260/0x370 fs/read_write.c:515 vfs_write+0x3dc/0x4ac fs/read_write.c:602 ksys_write+0xfc/0x200 fs/read_write.c:655 __do_sys_write fs/read_write.c:667 [inline] __se_sys_write fs/read_write.c:664 [inline] __arm64_sys_write+0x50/0x60 fs/read_write.c:664 __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline] invoke_syscall arch/arm64/kernel/syscall.c:48 [inline] el0_svc_common.constprop.0+0xf4/0x414 arch/arm64/kernel/syscall.c:155 do_el0_svc+0x50/0x11c arch/arm64/kernel/syscall.c:217 el0_svc+0x20/0x30 arch/arm64/kernel/entry-common.c:353 el0_sync_handler+0xe4/0x1e0 arch/arm64/kernel/entry-common.c:369 el0_sync+0x148/0x180 arch/arm64/kernel/entry.S:683 So add check for tg->cfs_rq[i] before unthrottle_qos_sched_group() called. Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 21 7月, 2022 4 次提交
-
-
由 Hui Tang 提交于
hulk inclusion category: feature bugzilla: 187173, https://gitee.com/openeuler/kernel/issues/I5G4IH CVE: NA -------------------------------- Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Hui Tang 提交于
hulk inclusion category: feature bugzilla: 187173, https://gitee.com/openeuler/kernel/issues/I5G4IH CVE: NA -------------------------------- Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Hui Tang 提交于
hulk inclusion category: feature bugzilla: 187173, https://gitee.com/openeuler/kernel/issues/I5G4IH CVE: NA -------------------------------- Compare taskgroup 'util_avg' in perferred cpu with capacity preferred cpu, dynamicly adjust cpu range for task wakeup process. Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Hui Tang 提交于
hulk inclusion category: feature bugzilla: 187173, https://gitee.com/openeuler/kernel/issues/I5G4IH CVE: NA -------------------------------- Add 'prefer_cpus' sysfs and related interface in cgroup cpuset. Signed-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 15 6月, 2022 2 次提交
-
-
由 Zhang Qiao 提交于
hulk inclusion category: bugfix bugzilla: 186973, https://gitee.com/openeuler/kernel/issues/I5CA6K CVE: NA -------------------------------- This reverts commit 74bd9b82. Signed-off-by: NZhang Qiao <zhangqiao22@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Zhang Qiao 提交于
hulk inclusion category: bugfix bugzilla: 186973, https://gitee.com/openeuler/kernel/issues/I5CA6K CVE: NA -------------------------------- This reverts commit af98db5f. the patch af98db5f("sched: Fix yet more sched_fork()") may be cause a process sleep at cgroup_post_fork()->freezer_fork() while taking group_threadgroup_rwsem lock long time, it cause a problem that other tasks will wait while fork child process and the system will stall. Signed-off-by: NZhang Qiao <zhangqiao22@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 01 6月, 2022 2 次提交
-
-
由 Phil Auld 提交于
mainline inclusion from mainline-v5.r7-rc6 commit b34cb07d category: bugfix bugzilla: 91404, https://gitee.com/openeuler/kernel/issues/I59VLJ CVE: NA -------------------------------- The recent patch, fe61468b (sched/fair: Fix enqueue_task_fair warning) did not fully resolve the issues with the rq->tmp_alone_branch != &rq->leaf_cfs_rq_list warning in enqueue_task_fair. There is a case where the first for_each_sched_entity loop exits due to on_rq, having incompletely updated the list. In this case the second for_each_sched_entity loop can further modify se. The later code to fix up the list management fails to do what is needed because se does not point to the sched_entity which broke out of the first loop. The list is not fixed up because the throttled parent was already added back to the list by a task enqueue in a parallel child hierarchy. Address this by calling list_add_leaf_cfs_rq if there are throttled parents while doing the second for_each_sched_entity loop. Fixes: fe61468b ("sched/fair: Fix enqueue_task_fair warning") Suggested-by: NVincent Guittot <vincent.guittot@linaro.org> Signed-off-by: NPhil Auld <pauld@redhat.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NDietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: NVincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20200512135222.GC2201@lorien.usersys.redhat.comSigned-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Vincent Guittot 提交于
mainline inclusion from mainline-v5.6-rc4 commit fe61468b category: bugfix bugzilla: 93902, https://gitee.com/openeuler/kernel/issues/I59VLJ CVE: NA -------------------------------- When a cfs rq is throttled, the latter and its child are removed from the leaf list but their nr_running is not changed which includes staying higher than 1. When a task is enqueued in this throttled branch, the cfs rqs must be added back in order to ensure correct ordering in the list but this can only happens if nr_running == 1. When cfs bandwidth is used, we call unconditionnaly list_add_leaf_cfs_rq() when enqueuing an entity to make sure that the complete branch will be added. Similarly unthrottle_cfs_rq() can stop adding cfs in the list when a parent is throttled. Iterate the remaining entity to ensure that the complete branch will be added in the list. Reported-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NVincent Guittot <vincent.guittot@linaro.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NDietmar Eggemann <dietmar.eggemann@arm.com> Tested-by: NChristian Borntraeger <borntraeger@de.ibm.com> Tested-by: NDietmar Eggemann <dietmar.eggemann@arm.com> Cc: stable@vger.kernel.org Cc: stable@vger.kernel.org #v5.1+ Link: https://lkml.kernel.org/r/20200306135257.25044-1-vincent.guittot@linaro.orgSigned-off-by: NHui Tang <tanghui20@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 23 5月, 2022 2 次提交
-
-
由 Zhang Qiao 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4VZJT CVE: NA -------------------------------- 1. Qos throttle reuse tg_{throttle,unthrottle}_{up,down} that can write some cfs-bandwidth fields, it may cause some unknown data error. So add qos_tg_{throttle,unthrottle}_{up,down} for qos throttle. 2. walk_tg_tree_from() caller must hold rcu_lock, currently there is none, so add it now. Signed-off-by: NZhang Qiao <zhangqiao22@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Zhang Qiao 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4VZJT CVE: NA -------------------------------- Before, when detect the cpu is overloaded, we throttle offline tasks at exit_to_user_mode_loop() before returning to user mode. Some architects(e.g.,arm64) do not support QOS scheduler because a task do not via exit_to_user_mode_loop() return to userspace at these platforms. In order to slove this problem and support qos scheduler on all architectures, if we require throttling offline tasks, we set flag TIF_NOTIFY_RESUME to an offline task when it is picked and throttle it at tracehook_notify_resume(). Signed-off-by: NZhang Qiao <zhangqiao22@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 19 4月, 2022 2 次提交
-
-
由 Peter Zijlstra 提交于
mainline inclusion from mainline-v5.17-rc5 commit b1e82065 category: bugfix bugzilla: 186609, https://gitee.com/openeuler/kernel/issues/I532B0 CVE: NA -------------------------------- Where commit 4ef0c5c6 ("kernel/sched: Fix sched_fork() access an invalid sched_task_group") fixed a fork race vs cgroup, it opened up a race vs syscalls by not placing the task on the runqueue before it gets exposed through the pidhash. Commit 13765de8 ("sched/fair: Fix fault in reweight_entity") is trying to fix a single instance of this, instead fix the whole class of issues, effectively reverting this commit. Fixes: 4ef0c5c6 ("kernel/sched: Fix sched_fork() access an invalid sched_task_group") Reported-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Tested-by: NTadeusz Struk <tadeusz.struk@linaro.org> Tested-by: NZhang Qiao <zhangqiao22@huawei.com> Tested-by: NDietmar Eggemann <dietmar.eggemann@arm.com> Link: https://lkml.kernel.org/r/YgoeCbwj5mbCR0qA@hirez.programming.kicks-ass.net conflict: include/linux/sched/task.h kernel/fork.c kernel/sched/core.c Signed-off-by: NZhang Qiao <zhangqiao22@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
由 Xunlei Pang 提交于
mainline inclusion from mainline-v5.10-rc1 commit df3cb4ea category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I52QKB CVE: NA -------------------------------- We've met problems that occasionally tasks with full cpumask (e.g. by putting it into a cpuset or setting to full affinity) were migrated to our isolated cpus in production environment. After some analysis, we found that it is due to the current select_idle_smt() not considering the sched_domain mask. Steps to reproduce on my 31-CPU hyperthreads machine: 1. with boot parameter: "isolcpus=domain,2-31" (thread lists: 0,16 and 1,17) 2. cgcreate -g cpu:test; cgexec -g cpu:test "test_threads" 3. some threads will be migrated to the isolated cpu16~17. Fix it by checking the valid domain mask in select_idle_smt(). Fixes: 10e2f1ac ("sched/core: Rewrite and improve select_idle_siblings()) Reported-by: NWetp Zhang <wetp.zy@linux.alibaba.com> Signed-off-by: NXunlei Pang <xlpang@linux.alibaba.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NJiang Biao <benbjiang@tencent.com> Reviewed-by: NVincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/1600930127-76857-1-git-send-email-xlpang@linux.alibaba.com Conflicts: kernel/sched/fair.c Signed-off-by: NYu jiahua <yujiahua1@huawei.com> Reviewed-by: Nzheng zucheng <zhengzucheng@huawei.com> Reviewed-by: Nzheng zucheng <zhengzucheng@huawei.com> Reviewed-by: NZhang Qiao <zhangqiao22@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 02 4月, 2022 1 次提交
-
-
由 Zhang Qiao 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I50PPU CVE: NA ----------------------------------------------------------------- when unthrottle a cfs_rq at distribute_cfs_runtime(), another cpu may re-throttle this cfs_rq at qos_throttle_cfs_rq() before access the cfs_rq->throttle_list.next, but meanwhile, qos throttle will attach the cfs_rq throttle_list node to percpu qos_throttled_cfs_rq, it will change cfs_rq->throttle_list.next and cause panic or hardlockup at distribute_cfs_runtime(). Fix it by adding a qos_throttle_list node in struct cfs_rq, and qos throttle disuse the cfs_rq->throttle_list. Signed-off-by: NZhang Qiao <zhangqiao22@huawei.com> Reviewed-by: Nzheng zucheng <zhengzucheng@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com>
-
- 14 3月, 2022 2 次提交
-
-
由 Andrey Ryabinin 提交于
stable inclusion from linux-4.19.226 commit 952514c8565cf72a966993b473fae1708c3684f3 -------------------------------- commit 9731698e upstream. cpuacct.stat in no-root cgroups shows user time without guest time included int it. This doesn't match with user time shown in root cpuacct.stat and /proc/<pid>/stat. This also affects cgroup2's cpu.stat in the same way. Make account_guest_time() to add user time to cgroup's cpustat to fix this. Fixes: ef12fefa ("cpuacct: add per-cgroup utime/stime statistics") Signed-off-by: NAndrey Ryabinin <arbn@yandex-team.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NDaniel Jordan <daniel.m.jordan@oracle.com> Acked-by: NTejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20211115164607.23784-1-arbn@yandex-team.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com> Signed-off-by: NLaibin Qiu <qiulaibin@huawei.com>
-
由 Li Hua 提交于
stable inclusion from linux-4.19.226 commit e6bc7279b16517fab9ed3cdbd58ad7b08060c246 -------------------------------- [ Upstream commit 9b58e976 ] When rt_runtime is modified from -1 to a valid control value, it may cause the task to be throttled all the time. Operations like the following will trigger the bug. E.g: 1. echo -1 > /proc/sys/kernel/sched_rt_runtime_us 2. Run a FIFO task named A that executes while(1) 3. echo 950000 > /proc/sys/kernel/sched_rt_runtime_us When rt_runtime is -1, The rt period timer will not be activated when task A enqueued. And then the task will be throttled after setting rt_runtime to 950,000. The task will always be throttled because the rt period timer is not activated. Fixes: d0b27fa7 ("sched: rt-group: synchonised bandwidth period") Reported-by: NHulk Robot <hulkci@huawei.com> Signed-off-by: NLi Hua <hucool.lihua@huawei.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20211203033618.11895-1-hucool.lihua@huawei.comSigned-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYongqiang Liu <liuyongqiang13@huawei.com> Signed-off-by: NLaibin Qiu <qiulaibin@huawei.com>
-
- 11 3月, 2022 1 次提交
-
-
由 Zhang Qiao 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4WOPM CVE: NA -------------------------------- cfs_bandwidth_usage_inc() need hold jump_label_mutex and might sleep, so we can not call it in atomic context. Fix this by moving cfs_bandwidth_usage_{inc,dec}() out of rcu read critical section. Fixes: f7b390cd ("sched: Change cgroup task scheduler policy") Signed-off-by: NZhang Qiao <zhangqiao22@huawei.com> Reviewed-by: NChen Hui <judy.chenhui@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NLaibin Qiu <qiulaibin@huawei.com>
-