1. 27 Jun, 2023 3 commits
  2. 26 Jun, 2023 2 commits
  3. 25 Jun, 2023 2 commits
  4. 21 Jun, 2023 5 commits
  5. 20 Jun, 2023 6 commits
  6. 19 Jun, 2023 2 commits
  7. 16 Jun, 2023 2 commits
  8. 15 Jun, 2023 9 commits
    • sched: Fix negative count for jump label · cde6dbb8
      Committed by Hui Tang
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7DA63
      CVE: NA
      
      --------------------------------
      
      Add mutex lock to prevent negative count for jump label.
      
      [28612.530675] ------------[ cut here ]------------
      [28612.532708] jump label: negative count!
      [28612.535031] WARNING: CPU: 4 PID: 3899 at kernel/jump_label.c:202
      	__static_key_slow_dec_cpuslocked+0x204/0x240
      [28612.538216] Kernel panic - not syncing: panic_on_warn set ...
      [28612.538216]
      [28612.540487] CPU: 4 PID: 3899 Comm: sh Kdump: loaded Not tainted
      [28612.542788] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
      [28612.546455] Call Trace:
      [28612.547339]  dump_stack+0xc6/0x11e
      [28612.548546]  ? __static_key_slow_dec_cpuslocked+0x200/0x240
      [28612.550352]  panic+0x1d6/0x46b
      [28612.551375]  ? refcount_error_report+0x2a5/0x2a5
      [28612.552915]  ? kmsg_dump_rewind_nolock+0xde/0xde
      [28612.554358]  ? sched_clock_cpu+0x18/0x1b0
      [28612.555699]  ? __warn+0x1d1/0x210
      [28612.556799]  ? __static_key_slow_dec_cpuslocked+0x204/0x240
      [28612.558548]  __warn+0x1ec/0x210
      [28612.559621]  ? __static_key_slow_dec_cpuslocked+0x204/0x240
      [28612.561536]  report_bug+0x1ee/0x2b0
      [28612.562706]  fixup_bug.part.4+0x37/0x80
      [28612.563937]  do_error_trap+0x21c/0x260
      [28612.565109]  ? fixup_bug.part.4+0x80/0x80
      [28612.566453]  ? check_preemption_disabled+0x34/0x1f0
      [28612.567991]  ? trace_hardirqs_off_thunk+0x1a/0x1c
      [28612.569534]  ? lockdep_hardirqs_off+0x1cb/0x2b0
      [28612.570993]  ? error_entry+0x9a/0x130
      [28612.572138]  ? trace_hardirqs_off_caller+0x59/0x1a0
      [28612.573710]  ? trace_hardirqs_off_thunk+0x1a/0x1c
      [28612.575232]  invalid_op+0x14/0x20
      [28612.576387]  ? vprintk_func+0x68/0x1a0
      [28612.577827]  ? __static_key_slow_dec_cpuslocked+0x204/0x240
      [28612.579662]  ? __static_key_slow_dec_cpuslocked+0x204/0x240
      [28612.581781]  ? static_key_disable+0x30/0x30
      [28612.583248]  ? static_key_slow_dec+0x57/0x90
      [28612.584997]  ? tg_set_dynamic_affinity_mode+0x42/0x70
      [28612.586714]  ? cgroup_file_write+0x471/0x6a0
      [28612.588162]  ? cgroup_css.part.4+0x100/0x100
      [28612.589579]  ? cgroup_css.part.4+0x100/0x100
      [28612.591031]  ? kernfs_fop_write+0x2af/0x430
      [28612.592625]  ? kernfs_vma_page_mkwrite+0x230/0x230
      [28612.594274]  ? __vfs_write+0xef/0x680
      [28612.595590]  ? kernel_read+0x110/0x110
      [28612.596899]  ? check_preemption_disabled+0x34/0x1f0
      Signed-off-by: Hui Tang <tanghui20@huawei.com>
      Reviewed-by: Zhang Qiao <zhangqiao22@huawei.com>
      Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
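The page does not show the diff for this commit, so the following is only a user-space sketch of the idea in the message: serialize the "check the mode, then adjust the count" sequence with a mutex so that two concurrent writers cannot both decrement. All names here (`task_group_model`, `set_affinity_mode`, `smart_grid_users`) are hypothetical stand-ins, and a plain integer stands in for the static key's enable count.

```c
#include <assert.h>
#include <pthread.h>

enum affinity_mode { MODE_MANUAL = 0, MODE_AUTO = 1 };

/* Hypothetical stand-in for the per-group affinity mode. */
struct task_group_model {
	enum affinity_mode mode;
};

/* Stand-in for the static key's enable count; it must never go negative. */
static int smart_grid_users;
static pthread_mutex_t mode_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Switch mode; returns 1 if the mode changed, 0 if it was already set. */
int set_affinity_mode(struct task_group_model *tg, enum affinity_mode mode)
{
	int changed = 0;

	/* The compare and the count update form one critical section. */
	pthread_mutex_lock(&mode_mutex);
	if (tg->mode != mode) {
		if (mode == MODE_AUTO)
			smart_grid_users++;
		else
			smart_grid_users--;
		assert(smart_grid_users >= 0);
		tg->mode = mode;
		changed = 1;
	}
	pthread_mutex_unlock(&mode_mutex);
	return changed;
}

/* Read the current count under the same lock. */
int smart_grid_user_count(void)
{
	int n;

	pthread_mutex_lock(&mode_mutex);
	n = smart_grid_users;
	pthread_mutex_unlock(&mode_mutex);
	return n;
}
```

Without the lock, two writers switching the same group to manual mode could both see the old mode and both decrement, which is the kind of sequence that would produce the "negative count" warning above.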
    • sched: Fix possible deadlock in tg_set_dynamic_affinity_mode · 21e5d85e
      Committed by Hui Tang
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7CGD0
      CVE: NA
      
      ----------------------------------------
      
      Deadlock occurs in two situations as follows:
      
      The first case:
      
      tg_set_dynamic_affinity_mode    --- raw_spin_lock_irq(&auto_affi->lock);
      	->start_auto_affinity   --- trigger timer
      		->tg_update_task_prefer_cpus
      			->css_task_iter_next
      				->raw_spin_unlock_irq
      
      hrtimer_run_queues
        ->sched_auto_affi_period_timer --- try spin lock (&auto_affi->lock)
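The hazard in the first case can be modelled in user space. This is a sketch of the failure mode only, not of the fix (the diff is not shown here). It relies on one known property of the kernel API: the `_irq` unlock variants re-enable local interrupts unconditionally, while the `_irqsave`/`_irqrestore` pair restores the previous state. The `cpu_model` type and the `model_*` helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU: interrupt state plus nested lock depth. */
struct cpu_model {
	bool irqs_enabled;
	int locks_held;
};

/* Models spin_lock_irq(): disable local interrupts, then take the lock. */
void model_lock_irq(struct cpu_model *cpu)
{
	cpu->irqs_enabled = false;
	cpu->locks_held++;
}

/* Models spin_unlock_irq(): release, then enable interrupts unconditionally. */
void model_unlock_irq(struct cpu_model *cpu)
{
	assert(cpu->locks_held > 0);
	cpu->locks_held--;
	cpu->irqs_enabled = true;
}

/* Models spin_lock_irqsave(): returns the previous interrupt state. */
bool model_lock_irqsave(struct cpu_model *cpu)
{
	bool flags = cpu->irqs_enabled;

	cpu->irqs_enabled = false;
	cpu->locks_held++;
	return flags;
}

/* Models spin_unlock_irqrestore(): restores the saved interrupt state. */
void model_unlock_irqrestore(struct cpu_model *cpu, bool flags)
{
	assert(cpu->locks_held > 0);
	cpu->locks_held--;
	cpu->irqs_enabled = flags;
}

/* True if a timer interrupt could run on this CPU while a lock is held. */
bool timer_can_interrupt_lock_holder(const struct cpu_model *cpu)
{
	return cpu->irqs_enabled && cpu->locks_held > 0;
}
```

Taking an outer lock with `model_lock_irq()`, then an inner lock, then releasing the inner one with `model_unlock_irq()` leaves interrupts enabled while the outer lock is still held. In the call chain above, that is the window in which the timer handler can run on the same CPU and spin on `auto_affi->lock` forever.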
      
      The second case is as follows:
      
      [  291.470810] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
      [  291.472715] rcu:     1-...0: (0 ticks this GP) idle=a6a/1/0x4000000000000002 softirq=78516/78516 fqs=5249
      [  291.475268] rcu:     (detected by 6, t=21006 jiffies, g=202169, q=9862)
      [  291.477038] Sending NMI from CPU 6 to CPUs 1:
      [  291.481268] NMI backtrace for cpu 1
      [  291.481273] CPU: 1 PID: 1923 Comm: sh Kdump: loaded Not tainted 4.19.90+ #150
      [  291.481278] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
      [  291.481281] RIP: 0010:queued_spin_lock_slowpath+0x136/0x9a0
      [  291.481289] Code: c0 74 3f 49 89 dd 48 89 dd 48 b8 00 00 00 00 00 fc ff df 49 c1 ed 03 83 e5 07 49 01 c5 83 c5 03 48 83 05 c4 66 b9 05 01 f3 90 <41> 0f b6 45 00 40 38 c5 7c 08 84 c0 0f 85 ad 07 00 00 0
      [  291.481292] RSP: 0018:ffff88801de87cd8 EFLAGS: 00000002
      [  291.481297] RAX: 0000000000000101 RBX: ffff888001be0a28 RCX: ffffffffb8090f7d
      [  291.481301] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff888001be0a28
      [  291.481304] RBP: 0000000000000003 R08: ffffed100037c146 R09: ffffed100037c146
      [  291.481307] R10: 000000001106b143 R11: ffffed100037c145 R12: 1ffff11003bd0f9c
      [  291.481311] R13: ffffed100037c145 R14: fffffbfff7a38dee R15: dffffc0000000000
      [  291.481315] FS:  00007fac4f306740(0000) GS:ffff88801de80000(0000) knlGS:0000000000000000
      [  291.481318] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  291.481321] CR2: 00007fac4f4bb650 CR3: 00000000046b6000 CR4: 00000000000006e0
      [  291.481323] Call Trace:
      [  291.481324]  <IRQ>
      [  291.481326]  ? osq_unlock+0x2a0/0x2a0
      [  291.481329]  ? check_preemption_disabled+0x4c/0x290
      [  291.481331]  ? rcu_accelerate_cbs+0x33/0xed0
      [  291.481333]  _raw_spin_lock_irqsave+0x83/0xa0
      [  291.481336]  sched_auto_affi_period_timer+0x251/0x820
      [  291.481338]  ? __remove_hrtimer+0x151/0x200
      [  291.481340]  __hrtimer_run_queues+0x39d/0xa50
      [  291.481343]  ? tg_update_affinity_domain_down+0x460/0x460
      [  291.481345]  ? enqueue_hrtimer+0x2e0/0x2e0
      [  291.481348]  ? ktime_get_update_offsets_now+0x1d7/0x2c0
      [  291.481350]  hrtimer_run_queues+0x243/0x470
      [  291.481352]  run_local_timers+0x5e/0x150
      [  291.481354]  update_process_times+0x36/0xb0
      [  291.481357]  tick_sched_handle.isra.4+0x7c/0x180
      [  291.481359]  tick_nohz_handler+0xd1/0x1d0
      [  291.481365]  smp_apic_timer_interrupt+0x12c/0x4e0
      [  291.481368]  apic_timer_interrupt+0xf/0x20
      [  291.481370]  </IRQ>
      [  291.481372]  ? smp_call_function_many+0x68c/0x840
      [  291.481375]  ? smp_call_function_many+0x6ab/0x840
      [  291.481377]  ? arch_unregister_cpu+0x60/0x60
      [  291.481379]  ? native_set_fixmap+0x100/0x180
      [  291.481381]  ? arch_unregister_cpu+0x60/0x60
      [  291.481384]  ? set_task_select_cpus+0x116/0x940
      [  291.481386]  ? smp_call_function+0x53/0xc0
      [  291.481388]  ? arch_unregister_cpu+0x60/0x60
      [  291.481390]  ? on_each_cpu+0x49/0xf0
      [  291.481393]  ? set_task_select_cpus+0x115/0x940
      [  291.481395]  ? text_poke_bp+0xff/0x180
      [  291.481397]  ? poke_int3_handler+0xc0/0xc0
      [  291.481400]  ? __set_prefer_cpus_ptr.constprop.4+0x1cd/0x900
      [  291.481402]  ? hrtick+0x1b0/0x1b0
      [  291.481404]  ? set_task_select_cpus+0x115/0x940
      [  291.481407]  ? __jump_label_transform.isra.0+0x3a1/0x470
      [  291.481409]  ? kernel_init+0x280/0x280
      [  291.481411]  ? kasan_check_read+0x1d/0x30
      [  291.481413]  ? mutex_lock+0x96/0x100
      [  291.481415]  ? __mutex_lock_slowpath+0x30/0x30
      [  291.481418]  ? arch_jump_label_transform+0x52/0x80
      [  291.481420]  ? set_task_select_cpus+0x115/0x940
      [  291.481422]  ? __jump_label_update+0x1a1/0x1e0
      [  291.481424]  ? jump_label_update+0x2ee/0x3b0
      [  291.481427]  ? static_key_slow_inc_cpuslocked+0x1c8/0x2d0
      [  291.481430]  ? start_auto_affinity+0x190/0x200
      [  291.481432]  ? tg_set_dynamic_affinity_mode+0xad/0xf0
      [  291.481435]  ? cpu_affinity_mode_write_u64+0x22/0x30
      [  291.481437]  ? cgroup_file_write+0x46f/0x660
      [  291.481439]  ? cgroup_init_cftypes+0x300/0x300
      [  291.481441]  ? __mutex_lock_slowpath+0x30/0x30
      Signed-off-by: Hui Tang <tanghui20@huawei.com>
      Reviewed-by: Zhang Qiao <zhangqiao22@huawei.com>
      Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
    • sched: fix WARN found by deadlock detect · 217edab9
      Committed by Hui Tang
      hulk inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7BQZ0
      CVE: NA
      
      ----------------------------------------
      
      The following WARNING is reported when running:
      echo 1 > /sys/fs/cgroup/cpu/cpu.dynamic_affinity_mode
      
      [  147.276757] WARNING: CPU: 5 PID: 1770 at kernel/cpu.c:326 \
      	lockdep_assert_cpus_held+0xac/0xd0
      [  147.279670] Kernel panic - not syncing: panic_on_warn set ...
      [  147.279670]
      [  147.282211] CPU: 5 PID: 1770 Comm: bash Kdump: loaded Not tainted 4.19
      [  147.284796] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)..
      [  147.290963] Call Trace:
      [  147.292459]  dump_stack+0xc6/0x11e
      [  147.294295]  ? lockdep_assert_cpus_held+0xa0/0xd0
      [  147.296876]  panic+0x1d6/0x46b
      [  147.298591]  ? refcount_error_report+0x2a5/0x2a5
      [  147.301131]  ? kmsg_dump_rewind_nolock+0xde/0xde
      [  147.303738]  ? sched_clock_cpu+0x18/0x1b0
      [  147.305943]  ? __warn+0x1d1/0x210
      [  147.307831]  ? lockdep_assert_cpus_held+0xac/0xd0
      [  147.310469]  __warn+0x1ec/0x210
      [  147.312271]  ? lockdep_assert_cpus_held+0xac/0xd0
      [  147.314838]  report_bug+0x1ee/0x2b0
      [  147.316798]  fixup_bug.part.4+0x37/0x80
      [  147.318946]  do_error_trap+0x21c/0x260
      [  147.321062]  ? fixup_bug.part.4+0x80/0x80
      [  147.323253]  ? check_preemption_disabled+0x34/0x1f0
      [  147.324886]  ? trace_hardirqs_off_thunk+0x1a/0x1c
      [  147.326277]  ? lockdep_hardirqs_off+0x1cb/0x2b0
      [  147.327505]  ? error_entry+0x9a/0x130
      [  147.328523]  ? trace_hardirqs_off_caller+0x59/0x1a0
      [  147.329844]  ? trace_hardirqs_off_thunk+0x1a/0x1c
      [  147.331124]  invalid_op+0x14/0x20
      [  147.332057]  ? vprintk_func+0x68/0x1a0
      [  147.333082]  ? lockdep_assert_cpus_held+0xac/0xd0
      [  147.334355]  ? lockdep_assert_cpus_held+0xac/0xd0
      [  147.335624]  ? static_key_slow_inc_cpuslocked+0x5a/0x230
      [  147.337079]  ? tg_set_dynamic_affinity_mode+0x4f/0x70
      [  147.338444]  ? cgroup_file_write+0x471/0x6a0
      [  147.339604]  ? cgroup_css.part.4+0x100/0x100
      [  147.340782]  ? cgroup_css.part.4+0x100/0x100
      [  147.341943]  ? kernfs_fop_write+0x2af/0x430
      [  147.343083]  ? kernfs_vma_page_mkwrite+0x230/0x230
      [  147.344401]  ? __vfs_write+0xef/0x680
      [  147.345404]  ? kernel_read+0x110/0x110
      Signed-off-by: Hui Tang <tanghui20@huawei.com>
      Reviewed-by: Zhang Qiao <zhangqiao22@huawei.com>
      Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
    • sched: fix smart grid usage count · d9099163
      Committed by Hui Tang
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7D98G
      CVE: NA
      
      ----------------------------------------
      
      smart_grid_usage_dec() should be called when freeing a task group
      if the mode is auto.
      Signed-off-by: Hui Tang <tanghui20@huawei.com>
      Reviewed-by: Zhang Qiao <zhangqiao22@huawei.com>
      Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
    • sched: Add static key to reduce noise · 373fd236
      Committed by Hui Tang
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7A718
      
      --------------------------------
      
      Add a static key to reduce noise when dynamic affinity is not enabled.
      This gives better performance in some cases, such as lmbench.
      
      Fixes: 243865da ("cpuset: Introduce new interface for scheduler ...")
      Signed-off-by: Hui Tang <tanghui20@huawei.com>
      Reviewed-by: Zhang Qiao <zhangqiao22@huawei.com>
      Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
    • net: nsh: Use correct mac_offset to unwind gso skb in nsh_gso_segment() · 822fb46b
      Committed by Dong Chenchen
      stable inclusion
      from stable-v4.19.283
      commit d2309e0cb27b6871b273fbc1725e93be62570d86
      category: bugfix
      bugzilla: 188702, https://gitee.com/openeuler/kernel/issues/I7DUPI
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d2309e0cb27b6871b273fbc1725e93be62570d86
      
      --------------------------------
      
      [ Upstream commit c83b4938 ]
      
      As the call trace shows, skb_panic was caused by wrong skb->mac_header
      in nsh_gso_segment():
      
      invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
      CPU: 3 PID: 2737 Comm: syz Not tainted 6.3.0-next-20230505 #1
      RIP: 0010:skb_panic+0xda/0xe0
      Call Trace:
       skb_push+0x91/0xa0
       nsh_gso_segment+0x4f3/0x570
       skb_mac_gso_segment+0x19e/0x270
       __skb_gso_segment+0x1e8/0x3c0
       validate_xmit_skb+0x452/0x890
       validate_xmit_skb_list+0x99/0xd0
       sch_direct_xmit+0x294/0x7c0
       __dev_queue_xmit+0x16f0/0x1d70
       packet_xmit+0x185/0x210
       packet_snd+0xc15/0x1170
       packet_sendmsg+0x7b/0xa0
       sock_sendmsg+0x14f/0x160
      
      The root cause is:
      nsh_gso_segment() uses skb->network_header - nhoff to reset mac_header
      in skb_gso_error_unwind() if the inner-layer protocol GSO fails.
      However, skb->network_header may be reset by the inner-layer protocol
      GSO function, e.g. mpls_gso_segment(). An skb->mac_header reset from the
      inaccurate network_header will be larger than the skb headroom.
      
      nsh_gso_segment
          nhoff = skb->network_header - skb->mac_header;
          __skb_pull(skb,nsh_len)
          skb_mac_gso_segment
              mpls_gso_segment
                  skb_reset_network_header(skb);//skb->network_header+=nsh_len
                  return -EINVAL;
          skb_gso_error_unwind
              skb_push(skb, nsh_len);
              skb->mac_header = skb->network_header - nhoff;
              // skb->mac_header > skb->headroom, cause skb_push panic
      
      Use correct mac_offset to restore mac_header and get rid of nhoff.
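The arithmetic behind the call chain above can be checked with a small user-space model. This is a simplified sketch, not the kernel code: `skb_model` tracks only three offsets from the buffer head, the helper names are hypothetical, and the inner handler always fails after resetting the network header, as mpls_gso_segment does in the trace.

```c
#include <assert.h>

/* Hypothetical model of an skb: offsets measured from the buffer head. */
struct skb_model {
	int data;            /* current data pointer */
	int mac_header;
	int network_header;
};

static void pull(struct skb_model *skb, int len)
{
	skb->data += len;
}

static void push(struct skb_model *skb, int len)
{
	skb->data -= len;
	assert(skb->data >= 0);
}

/* Inner GSO handler: resets network_header to data, then reports failure. */
static int inner_segment_fails(struct skb_model *skb)
{
	skb->network_header = skb->data;
	return -1;
}

/* Old unwind: derives mac_header from a network_header that has moved. */
int unwind_with_nhoff(struct skb_model *skb, int nsh_len)
{
	int nhoff = skb->network_header - skb->mac_header;

	pull(skb, nsh_len);
	if (inner_segment_fails(skb) < 0) {
		push(skb, nsh_len);
		skb->mac_header = skb->network_header - nhoff;
	}
	return skb->mac_header;
}

/* New unwind: restores the mac offset saved before anything was changed. */
int unwind_with_mac_offset(struct skb_model *skb, int nsh_len)
{
	int mac_offset = skb->mac_header;

	pull(skb, nsh_len);
	if (inner_segment_fails(skb) < 0) {
		push(skb, nsh_len);
		skb->mac_header = mac_offset;
	}
	return skb->mac_header;
}
```

With an assumed layout of mac_header = 64, network_header = data = 78 and nsh_len = 8, the old unwind yields 86 - 14 = 72, which is nsh_len bytes past the real MAC header, while the saved-offset version returns 64.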
      
      Fixes: c411ed85 ("nsh: add GSO support")
      Reported-by: syzbot+632b5d9964208bfef8c0@syzkaller.appspotmail.com
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
      Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • !1134 [openEuler-1.0-LTS] cpufreq: conservative: Fix load in fast_dbs_update() · f59c1a3e
      Committed by openeuler-ci-bot
      Merge Pull Request from: @xuesinian 
       
      Remove the dbs_update(policy) call used to get the load in fast_dbs_update(); the load is now passed in from cs_dbs_update().
      
      Load results are inaccurate after two consecutive updates, resulting in inaccurate frequency scaling.
      
      Related issue: #I7DJU2
       
      Link: https://gitee.com/openeuler/kernel/pulls/1134
      
      Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> 
      Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> 
    • firewire: fix potential uaf in outbound_phy_packet_callback() · 160e0014
      Committed by Chengfeng Ye
      stable inclusion
      from stable-v4.19.242
      commit 34380b5647f13fecb458fea9a3eb3d8b3a454709
      category: bugfix
      bugzilla: https://gitee.com/src-openeuler/kernel/issues/I7BYU9
      CVE: CVE-2023-3159
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=34380b5647f1
      
      --------------------------------
      
      commit b7c81f80 upstream.
      
      &e->event and e point to the same address, and &e->event could
      be freed in queue_event. So there is a potential uaf issue if
      we dereference e after calling queue_event(). Fix this by adding
      a temporary variable to maintain e->client in advance, this can
      avoid the potential uaf issue.
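The pattern described above can be sketched in user space as follows. This is a model under assumptions, not the driver code: the struct layouts and the `*_model` names are hypothetical, and `queue_event_model()` frees the event immediately to stand in for the worst case where the event is consumed and freed before the callback returns.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the client and the outbound event. */
struct client_model {
	int refcount;
};

struct event_model {
	struct client_model *client;
	int rcode;
};

/* Models queue_event(): after this call the event may already be freed. */
static void queue_event_model(struct event_model *e)
{
	free(e);
}

/* Models dropping one reference on the client. */
static void client_put_model(struct client_model *c)
{
	assert(c->refcount > 0);
	c->refcount--;
}

/* Callback: copy e->client to a local before handing e to the queue. */
void packet_callback_model(struct event_model *e, int rcode)
{
	struct client_model *client = e->client;

	e->rcode = rcode;
	queue_event_model(e);
	/* e must not be dereferenced from here on; use the saved pointer. */
	client_put_model(client);
}
```

Writing `client_put_model(e->client)` after the queue call would read freed memory in this model, which is the use-after-free the commit message describes.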
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Chengfeng Ye <cyeaa@connect.ust.hk>
      Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
      Link: https://lore.kernel.org/r/20220409041243.603210-2-o-takashi@sakamocchi.jp
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Wei Li <liwei391@huawei.com>
      Reviewed-by: Wang Weiyang <wangweiyang2@huawei.com>
      Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • cpufreq: conservative: fix load in fast_dbs_update() · 0dfa77a2
      Committed by XueSinian
      driver inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7DJU2
      CVE: NA
      
      ----------------------------------------
      
      Remove the dbs_update(policy) call used to get the load in
      fast_dbs_update(); the load is now passed in from cs_dbs_update().
      
      Reason:
      Load results are inaccurate after two consecutive updates, resulting
      in inaccurate frequency scaling.
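One way to see why two back-to-back updates distort the load is a toy sampler that, by assumption, measures load over the interval since its previous call. This is an illustrative model only: `load_sampler` and `sample_load()` are hypothetical, and the real governor code may treat very short windows differently.

```c
#include <assert.h>

/* Hypothetical sampler state: counters seen at the previous sample. */
struct load_sampler {
	unsigned long prev_wall;
	unsigned long prev_idle;
};

/*
 * Return the busy percentage since the previous call and advance the
 * sampling window. Calling it twice in a row leaves the second call with
 * an almost empty window.
 */
unsigned int sample_load(struct load_sampler *s, unsigned long wall,
			 unsigned long idle)
{
	unsigned long wall_delta = wall - s->prev_wall;
	unsigned long idle_delta = idle - s->prev_idle;
	unsigned int load = 0;

	assert(idle_delta <= wall_delta);
	if (wall_delta > 0)
		load = (unsigned int)(100 * (wall_delta - idle_delta) / wall_delta);

	s->prev_wall = wall;
	s->prev_idle = idle;
	return load;
}
```

Sampling once over a window of 1000 ticks with 400 idle ticks reports 60. Sampling again immediately, with the same counters, reports 0 in this model because the window has already been consumed. Passing the first result along instead of sampling a second time avoids that.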
      
      Fixes: 75704b66 ("cpufreq: conservative: Add a switch to enable fast mode")
      Signed-off-by: Xue Sinian <xuesinian@huawei.com>
  9. 12 Jun, 2023 5 commits
  10. 09 Jun, 2023 2 commits
    • sched: smart grid: init sched_grid_qos structure on QOS purpose · ce35ded5
      Committed by Wang ShaoBo
      hulk inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7BQZ0
      CVE: NA
      
      ----------------------------------------
      
      As smart grid scheduling (SGS) may shrink resources and affect task QOS,
      we provide methods for evaluating task QOS in each divided grid. We mainly
      focus on the following two aspects:
      
         1. Evaluate whether resources (such as CPU or memory) meet our demand
         2. Ensure the least impact when working with the cpufreq and cpuidle governors
      
      To tackle these questions, we have summarized several sampling methods
      that obtain tasks' characteristics while reducing scheduling noise
      as much as possible:
      
        1. We detect the key factors that determine how sensitive a process is to
           cpufreq or cpuidle adjustment, and use them to guide the cpufreq/cpuidle
           governor
        2. We dynamically monitor process memory bandwidth and adjust memory
           allocation to minimize remote memory access
        3. We provide a variety of load tracking mechanisms to adapt to different
           types of task load change
      
           ---------------------------------     -----------------
          |            class A              |   |     class B     |
          |    --------        --------     |   |     --------    |
          |   | group0 |      | group1 |    |---|    | group2 |   |----------+
          |    --------        --------     |   |     --------    |          |
          |    CPU/memory sensitive type    |   |   balance type  |          |
           ----------------+----------------     --------+--------           |
                           v                             v                   | (target cpufreq)
           -------------------------------------------------------           | (sensitivity)
          |              Not satisfied with QOS?                  |          |
           --------------------------+----------------------------           |
                                     v                                       v
           -------------------------------------------------------     ----------------
          |              expand or shrink resource                |<--|  energy model  |
           ----------------------------+--------------------------     ----------------
                                       v                                     |
           -----------          -----------          ------------            v
          |           |        |           |        |            |     ---------------
          |   GRID0   +--------+   GRID1   +--------+   GRID2    |<-- |   governor    |
          |           |        |           |        |            |     ---------------
           -----------          -----------          ------------
                         \            |            /
                          \  -------------------  /
                            |  pages migration  |
                             -------------------
      
      We will introduce the energy model in a follow-up implementation and consider
      dynamic affinity adjustment between the divided grids at runtime.
      Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
      Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
      Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
    • sched: Introduce smart grid scheduling strategy for cfs · 713cfd26
      Committed by Hui Tang
      hulk inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I7BQZ0
      CVE: NA
      
      ----------------------------------------
      
      We want to dynamically expand or shrink the affinity range of tasks
      based on the CPU topology level while meeting the minimum resource
      requirements of tasks.
      
      We divide affinity domains into several levels according to the sched domains:
      
      level4   * SOCKET  [                                                  ]
      level3   * DIE     [                             ]
      level2   * MC      [             ] [             ]
      level1   * SMT     [     ] [     ] [     ] [     ]
      level0   * CPU      0   1   2   3   4   5   6   7
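As a rough illustration of the level diagram, the sketch below maps a CPU and a level to the contiguous CPU range that would form its affinity domain. It is a sketch under stated assumptions only: the spans of 1, 2, 4 and 8 CPUs are read off the diagram, the SOCKET span of 16 is a guess, and `affinity_range()` and `level_span` are hypothetical names. The kernel derives these ranges from the actual sched domain masks rather than from fixed sizes.

```c
#include <assert.h>

#define NR_LEVELS 5

/* Assumed CPUs per domain at each level: CPU, SMT, MC, DIE, SOCKET. */
static const int level_span[NR_LEVELS] = { 1, 2, 4, 8, 16 };

/* A contiguous range of CPU ids. */
struct cpu_range {
	int first;
	int count;
};

/* Return the range of CPUs sharing a level-'level' domain with 'cpu'. */
struct cpu_range affinity_range(int cpu, int level)
{
	struct cpu_range r;

	assert(cpu >= 0 && level >= 0 && level < NR_LEVELS);
	r.count = level_span[level];
	/* Round the CPU id down to the start of its domain. */
	r.first = cpu - (cpu % r.count);
	return r;
}
```

Under these assumptions, expanding CPU 5 from level1 to level2 widens its range from CPUs 4-5 to CPUs 4-7, and shrinking reverses it.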
      
      Whether users prefer power saving or performance affects the strategy
      for adjusting affinity. When the power saving mode is selected, we
      choose a more appropriate affinity based on the energy model to reduce
      power consumption, while considering the QOS of resources such as CPU
      and memory consumption. For instance, if the current task CPU load is
      less than required, smart grid judges, according to the energy model,
      whether to aggregate tasks together into a smaller range.
      
      The main difference from EAS is that we pay more attention to the impact
      on power consumption of mechanisms such as cpuidle and DVFS, and classify
      tasks to reduce interference and ensure resource QOS in each divided unit,
      which is more suitable for general-purpose use on non-heterogeneous CPUs.
      
              --------        --------        --------
             | group0 |      | group1 |      | group2 |
              --------        --------        --------
                 |                |              |
                 v                |              v
             ---------------------+-----     -----------------
            |                  ---v--   |   |
            |       DIE0      |  MC1 |  |   |   DIE1
            |                  ------   |   |
             ---------------------------     -----------------
      
      We regularly measure the resource satisfaction of groups and adjust the
      affinity accordingly. Scheduling balance and memory migration will be
      considered based on memory location to better meet resource requirements.
      Signed-off-by: Hui Tang <tanghui20@huawei.com>
      Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
      Reviewed-by: Chen Hui <judy.chenhui@huawei.com>
      Reviewed-by: Zhang Qiao <zhangqiao22@huawei.com>
      Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
  11. 08 Jun, 2023 2 commits