- 30 Sep, 2022 34 commits
-
-
Committed by Phil Auld
mainline inclusion
from mainline-v5.18-rc2
commit 5524cbb1
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5524cbb1bfcdff0cad0aaa9f94e6092002a07259

--------------------------------------------------------------------------

Arm64 systems rely on store_cpu_topology() to call update_siblings_masks() to transfer the topology to the various cpu masks. This needs to be done before the call to notify_cpu_starting() which tells the scheduler about each cpu found, otherwise the core scheduling data structures are set up in a way that does not match the actual topology.

With smt_mask not set up correctly we bail on `cpumask_weight(smt_mask) == 1` for !leaders in:

    notify_cpu_starting()
      cpuhp_invoke_callback_range()
        sched_cpu_starting()
          sched_core_cpu_starting()

which leads to rq->core not being correctly set for !leader-rq's.

Without this change stress-ng (which enables core scheduling in its prctl tests in newer versions -- i.e. with PR_SCHED_CORE support) causes a warning and then a crash (trimmed for legibility):

    [ 1853.805168] ------------[ cut here ]------------
    [ 1853.809784] task_rq(b)->core != rq->core
    [ 1853.809792] WARNING: CPU: 117 PID: 0 at kernel/sched/fair.c:11102 cfs_prio_less+0x1b4/0x1c4
    ...
    [ 1854.015210] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
    ...
    [ 1854.231256] Call trace:
    [ 1854.233689]  pick_next_task+0x3dc/0x81c
    [ 1854.237512]  __schedule+0x10c/0x4cc
    [ 1854.240988]  schedule_idle+0x34/0x54

Fixes: 9edeaea1 ("sched: Core-wide rq->lock")
Signed-off-by: Phil Auld <pauld@redhat.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lore.kernel.org/r/20220331153926.25742-1-pauld@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
mainline inclusion
from mainline-v5.18-rc2
commit 386ef214
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=386ef214c3c6ab111d05e1790e79475363abaa05

--------------------------------------------------------------------------

try_steal_cookie() looks at task_struct::cpus_mask to decide if the task could be moved to `this' CPU. It ignores that the task might be in a migration disabled section while not on the CPU. In this case the task must not be moved, otherwise per-CPU assumptions are broken.

Use is_cpu_allowed(), as suggested by Peter Zijlstra, to decide if a task can be moved.

Fixes: d2dfa17b ("sched: Trivial forced-newidle balancer")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/YjNK9El+3fzGmswf@linutronix.de
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.18-rc2
commit 5b6547ed
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5b6547ed97f4f5dfc23f8e3970af6d11d7b7ed7e

--------------------------------------------------------------------------

Steve reported that ChromeOS encounters the forceidle balancer being run from rt_mutex_setprio()'s balance_callback() invocation and explodes.

Now, the forceidle balancer gets queued every time the idle task gets selected, set_next_task(), which is strictly too often. rt_mutex_setprio() also uses set_next_task() in the 'change' pattern:

    queued = task_on_rq_queued(p); /* p->on_rq == TASK_ON_RQ_QUEUED */
    running = task_current(rq, p); /* rq->curr == p */

    if (queued)
        dequeue_task(...);
    if (running)
        put_prev_task(...);

    /* change task properties */

    if (queued)
        enqueue_task(...);
    if (running)
        set_next_task(...);

However, rt_mutex_setprio() will explicitly not run this pattern on the idle task (since priority boosting the idle task is quite insane). Most other 'change' pattern users are pidhash based and would also not apply to idle.

Also, the change pattern doesn't contain a __balance_callback() invocation and hence we could have an out-of-band balance-callback, which *should* trigger the WARN in rq_pin_lock() (which guards against this exact anti-pattern).

So while none of that explains how this happens, it does indicate that having it in set_next_task() might not be the most robust option.

Instead, explicitly queue the forceidle balancer from pick_next_task() when it does indeed result in forceidle selection. Having it here ensures it can only be triggered under the __schedule() rq->lock instance, and hence must be run from that context.

This also happens to clean up the code a little, so win-win.

Fixes: d2dfa17b ("sched: Trivial forced-newidle balancer")
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: T.J. Alumbaugh <talumbau@chromium.org>
Link: https://lkml.kernel.org/r/20220330160535.GN8939@worktop.programming.kicks-ass.net
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Shaokun Zhang
mainline inclusion
from mainline-v5.16-rc1
commit d07b2eee
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d07b2eee4501c393cbf5bfcad36143310cfd72f9

--------------------------------------------------------------------------

Make cookie functions static as these are no longer invoked directly by other code.

No functional change intended.

Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210922085735.52812-1-zhangshaokun@hisilicon.com
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Li Zhijian
mainline inclusion
from mainline-v5.16-rc1
commit 1c36432b
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1c36432b278cecf1499f21fae19836e614954309

--------------------------------------------------------------------------

Previously, 'make -C sched run_tests' would block forever when something went wrong, because the *selftests framework* kept waiting for its child processes to exit.

    [root@iaas-rpma sched]# ./cs_prctl_test

     ## Create a thread/process/process group hiearchy
    Not a core sched system
    tid=74985, / tgid=74985 / pgid=74985: ffffffffffffffff
    Not a core sched system
        tid=74986, / tgid=74986 / pgid=74985: ffffffffffffffff
    Not a core sched system
            tid=74988, / tgid=74986 / pgid=74985: ffffffffffffffff
    Not a core sched system
            tid=74989, / tgid=74986 / pgid=74985: ffffffffffffffff
    Not a core sched system
            tid=74990, / tgid=74986 / pgid=74985: ffffffffffffffff
    Not a core sched system
        tid=74987, / tgid=74987 / pgid=74985: ffffffffffffffff
    Not a core sched system
            tid=74991, / tgid=74987 / pgid=74985: ffffffffffffffff
    Not a core sched system
            tid=74992, / tgid=74987 / pgid=74985: ffffffffffffffff
    Not a core sched system
            tid=74993, / tgid=74987 / pgid=74985: ffffffffffffffff

    Not a core sched system
    (268) FAILED: get_cs_cookie(0) == 0

     ## Set a cookie on entire process group
    -1 = prctl(62, 1, 0, 2, 0)
    core_sched create failed -- PGID: Invalid argument
    (cs_prctl_test.c:272) -
    [root@iaas-rpma sched]# ps
        PID TTY          TIME CMD
       4605 pts/2    00:00:00 bash
      74986 pts/2    00:00:00 cs_prctl_test
      74987 pts/2    00:00:00 cs_prctl_test
      74999 pts/2    00:00:00 ps

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Chris Hyser <chris.hyser@oracle.com>
Link: https://lore.kernel.org/r/20210902024333.75983-1-lizhijian@cn.fujitsu.com
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Eugene Syromiatnikov
mainline inclusion
from mainline-v5.16-rc1
commit 61bc346c
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=61bc346ce64a3864ac55f5d18bdc1572cda4fb18

--------------------------------------------------------------------------

Commit 7ac592aa ("sched: prctl() core-scheduling interface") made use of enum pid_type in prctl's arg4; this type and the associated enumeration definitions are not exposed to userspace. Christian has suggested providing additional macro definitions that convey the meaning of the type argument more in alignment with its actual usage, and this patch does exactly that.

Link: https://lore.kernel.org/r/20210825170613.GA3884@asgard.redhat.com
Suggested-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Complements: 7ac592aa ("sched: prctl() core-scheduling interface")
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
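
As a rough illustration of the scope macros this commit refers to, the sketch below supplies fallback definitions for userspace headers that predate them. The numeric values are the ones found in include/uapi/linux/prctl.h; core_sched_scope_name() is a hypothetical helper added only for illustration and is not part of the kernel or libc.

```c
#include <sys/prctl.h>

/*
 * Fallbacks for older userspace headers. The values mirror
 * include/uapi/linux/prctl.h and line up with the kernel's
 * PIDTYPE_PID, PIDTYPE_TGID and PIDTYPE_PGID.
 */
#ifndef PR_SCHED_CORE_SCOPE_THREAD
# define PR_SCHED_CORE_SCOPE_THREAD		0
# define PR_SCHED_CORE_SCOPE_THREAD_GROUP	1
# define PR_SCHED_CORE_SCOPE_PROCESS_GROUP	2
#endif

/* Hypothetical helper: a readable name for a scope value. */
static const char *core_sched_scope_name(unsigned long scope)
{
	switch (scope) {
	case PR_SCHED_CORE_SCOPE_THREAD:
		return "thread";
	case PR_SCHED_CORE_SCOPE_THREAD_GROUP:
		return "thread group";
	case PR_SCHED_CORE_SCOPE_PROCESS_GROUP:
		return "process group";
	default:
		return "unknown";
	}
}
```

With these names, arg4 of prctl(PR_SCHED_CORE, ...) can be spelled as a scope rather than as a raw number or a kernel-internal enum value.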
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.16-rc1
commit bc9ffef3
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bc9ffef31bf59819c9fc032178534ff9ed7c4981

--------------------------------------------------------------------------

Tao suggested a two-pass task selection to avoid the retry loop.

Not only does it avoid the retry loop, it results in *much* simpler code.

This also fixes an issue spotted by Josh Don where, for SMT3+, we can forget to update max on the first pass and get to do an extra round.

Suggested-by: Tao Zhou <tao.zhou@linux.dev>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Josh Don <joshdon@google.com>
Reviewed-by: Vineeth Pillai (Microsoft) <vineethrp@gmail.com>
Link: https://lkml.kernel.org/r/YSS9+k1teA9oPEKl@hirez.programming.kicks-ass.net
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.14
commit 3c474b32
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3c474b3239f12fe0b00d7e82481f36a1f31e79ab

--------------------------------------------------------------------------

Eugene tripped over the case where rq_lock(), as called in a for_each_possible_cpu() loop, came apart because rq->core hadn't been set up yet.

This is a somewhat unusual, but valid case.

Rework things such that rq->core is initialized to point at itself. IOW initialize each CPU as a single threaded Core. CPU online will then join the new CPU (thread) to an existing Core where needed.

For completeness' sake, have CPU offline fully undo the state so as to not presume the topology will match the next time it comes online.

Fixes: 9edeaea1 ("sched: Core-wide rq->lock")
Reported-by: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Josh Don <joshdon@google.com>
Tested-by: Eugene Syromiatnikov <esyr@redhat.com>
Link: https://lkml.kernel.org/r/YR473ZGeKqMs6kw+@hirez.programming.kicks-ass.net

Conflicts:
	kernel/sched/core.c
[Bugfix ed3cd45f("Merge tag 'v5.11' into sched/core, to pick up fixes & refresh the branch") is not applied.]

Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Fabio M. De Francesco
mainline inclusion
from mainline-v5.15-rc1
commit ce48ee81
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce48ee81a1930b2218bea23490adb6673c88bf70

--------------------------------------------------------------------------

Rephrase the "For MDS" section in core-scheduling.rst for the purpose of making it clearer what is meant by "kernel memory is still considered untrusted".

Suggested-by: Vineeth Pillai <Vineeth.Pillai@microsoft.com>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Reviewed-by: Joel Fernandes (Google) <joelaf@google.com>
Link: https://lore.kernel.org/r/20210721190250.26095-1-fmdefrancesco@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Ingo Molnar
mainline inclusion
from mainline-v5.14-rc1
commit d2343cb8
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d2343cb8d154fe20c4499711bb3a9af2095b2b4b

--------------------------------------------------------------------------

This option at minimum adds extra code to the scheduler - even if it's default unused - and most users wouldn't want it.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Joel Fernandes (Google)
mainline inclusion
from mainline-v5.14-rc1
commit 0159bb02
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0159bb020ca9a43b17aa9149f1199643c1d49426

--------------------------------------------------------------------------

Now that core scheduling is merged, update the documentation.

Co-developed-by: Chris Hyser <chris.hyser@oracle.com>
Signed-off-by: Chris Hyser <chris.hyser@oracle.com>
Co-developed-by: Josh Don <joshdon@google.com>
Signed-off-by: Josh Don <joshdon@google.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210603013136.370918-1-joel@joelfernandes.org
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.14-rc1
commit 7b419f47
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7b419f47facd286c6723daca6ad69ec355473f78

--------------------------------------------------------------------------

Hugh noted that the SCHED_CORE Kconfig option could do with a help text.

Requested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Hugh Dickins <hughd@google.com>
Link: https://lkml.kernel.org/r/YKyhtwhEgvtUDOyl@hirez.programming.kicks-ass.net
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Ingo Molnar
mainline inclusion
from mainline-v5.14-rc1
commit cc00c198
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cc00c1988801dc71f63bb7bad019e85046865095

--------------------------------------------------------------------------

A few more snuck in. Also capitalize 'CPU' while at it.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Arnaldo Carvalho de Melo
mainline inclusion
from mainline-v5.16-rc1
commit 49024204
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=49024204322cbfff892a28a67ad813cd41b6be81

--------------------------------------------------------------------------

To pick the changes in:

  61bc346c ("uapi/linux/prctl: provide macro definitions for the PR_SCHED_CORE type argument")

That don't result in any changes in tooling:

  $ tools/perf/trace/beauty/prctl_option.sh > before
  $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h
  $ tools/perf/trace/beauty/prctl_option.sh > after
  $ diff -u before after
  $

Just silences this perf tools build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h'
  diff -u tools/include/uapi/linux/prctl.h include/uapi/linux/prctl.h

Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Chris Hyser
mainline inclusion
from mainline-v5.14-rc1
commit 9f269900
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9f26990074931bbf797373e53104216059b300b1

--------------------------------------------------------------------------

Provides a selftest and examples of using the interface.

[peterz: updated to not use sched_debug]
Signed-off-by: Chris Hyser <chris.hyser@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210422123309.100860030@infradead.org
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Chris Hyser
mainline inclusion
from mainline-v5.14-rc1
commit 7ac592aa
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7ac592aa35a684ff1858fb9ec282886b9e3575ac

--------------------------------------------------------------------------

This patch provides support for setting and copying core scheduling 'task cookies' between threads (PID), processes (TGID), and process groups (PGID).

The value of core scheduling isn't that tasks don't share a core, 'nosmt' can do that. The value lies in exploiting all the sharing opportunities that exist to recover possible lost performance and that requires a degree of flexibility in the API.

From a security perspective (and there are others), the thread, process and process group distinction is an existent hierarchical categorization of tasks that reflects many of the security concerns about 'data sharing'. For example, protecting against cache-snooping by a thread that can just read the memory directly isn't all that useful.

With this in mind, subcommands to CREATE/SHARE (TO/FROM) provide a mechanism to create and share cookies. CREATE/SHARE_TO specify a target pid with enum pidtype used to specify the scope of the targeted tasks. For example, PIDTYPE_TGID will share the cookie with the process and all of its threads as typically desired in a security scenario.

API:

    prctl(PR_SCHED_CORE, PR_SCHED_CORE_GET, tgtpid, pidtype, &cookie)
    prctl(PR_SCHED_CORE, PR_SCHED_CORE_CREATE, tgtpid, pidtype, NULL)
    prctl(PR_SCHED_CORE, PR_SCHED_CORE_SHARE_TO, tgtpid, pidtype, NULL)
    prctl(PR_SCHED_CORE, PR_SCHED_CORE_SHARE_FROM, srcpid, pidtype, NULL)

where 'tgtpid/srcpid == 0' implies the current process and pidtype is kernel enum pid_type {PIDTYPE_PID, PIDTYPE_TGID, PIDTYPE_PGID, ...}.

For return values, EINVAL, ENOMEM are what they say. ESRCH means the tgtpid/srcpid was not found. EPERM indicates lack of PTRACE permission access to tgtpid/srcpid. ENODEV indicates your machine lacks SMT.

[peterz: complete rewrite]
Signed-off-by: Chris Hyser <chris.hyser@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210422123309.039845339@infradead.org
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
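
The prctl() calls listed in this commit message could be wrapped from userspace roughly as follows. This is a minimal sketch, not the selftest's code: the wrapper names and the CS_PIDTYPE_* constants are hypothetical (the kernel's enum pid_type is not exported to userspace, so its first three values are copied here), and the sketch assumes the cookie is read back as a 64-bit value for a single thread.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; values as in include/uapi/linux/prctl.h. */
#ifndef PR_SCHED_CORE
# define PR_SCHED_CORE			62
# define PR_SCHED_CORE_GET		0
# define PR_SCHED_CORE_CREATE		1
# define PR_SCHED_CORE_SHARE_TO		2
# define PR_SCHED_CORE_SHARE_FROM	3
#endif

/* Hypothetical userspace copies of PIDTYPE_PID/TGID/PGID. */
enum { CS_PIDTYPE_PID = 0, CS_PIDTYPE_TGID = 1, CS_PIDTYPE_PGID = 2 };

/* Create a new cookie for the calling process and all of its threads.
 * tgtpid == 0 means "current". Returns 0 or a negative errno. */
static int core_sched_create_for_process(void)
{
	if (prctl(PR_SCHED_CORE, PR_SCHED_CORE_CREATE, 0UL,
		  (unsigned long)CS_PIDTYPE_TGID, 0UL) == -1)
		return -errno;
	return 0;
}

/* Read the calling thread's cookie. Returns 0 or a negative errno. */
static int core_sched_get_cookie(uint64_t *cookie)
{
	if (prctl(PR_SCHED_CORE, PR_SCHED_CORE_GET, 0UL,
		  (unsigned long)CS_PIDTYPE_PID, (unsigned long)cookie) == -1)
		return -errno;
	return 0;
}

/* Map an errno (positive or negative) to the meaning given above. */
static const char *core_sched_explain(int err)
{
	switch (err < 0 ? -err : err) {
	case 0:      return "success";
	case EINVAL: return "invalid argument";
	case ENOMEM: return "out of memory";
	case ESRCH:  return "tgtpid/srcpid was not found";
	case EPERM:  return "no PTRACE permission access to tgtpid/srcpid";
	case ENODEV: return "machine lacks SMT";
	default:     return "unexpected error";
	}
}
```

A caller would typically invoke core_sched_create_for_process() once at startup and log core_sched_explain() of the result; on kernels built without CONFIG_SCHED_CORE the call is expected to fail rather than succeed.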
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.14-rc1
commit 85dd3f61
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=85dd3f61203c5cfa72b308ff327b5fbf3fc1ce5e

--------------------------------------------------------------------------

Note that sched_core_fork() is called from under tasklist_lock, and not from sched_fork() earlier. This avoids a few races later.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210422123308.980003687@infradead.org
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.14-rc1
commit 6e33cad0
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6e33cad0af49336952e5541464bd02f5b5fd433e

--------------------------------------------------------------------------

In order to not have to use pid_struct, create a new, smaller, structure to manage task cookies for core scheduling.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210422123308.919768100@infradead.org
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Aubrey Li
mainline inclusion
from mainline-v5.14-rc1
commit 97886d9d
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=97886d9dcd86820bdbc1fa73b455982809cbc8c2

--------------------------------------------------------------------------

 - Don't migrate if there is a cookie mismatch

     Load balance tries to move task from busiest CPU to the destination CPU. When core scheduling is enabled, if the task's cookie does not match with the destination CPU's core cookie, this task may be skipped by this CPU. This mitigates the forced idle time on the destination CPU.

 - Select cookie matched idle CPU

     In the fast path of task wakeup, select the first cookie matched idle CPU instead of the first idle CPU.

 - Find cookie matched idlest CPU

     In the slow path of task wakeup, find the idlest CPU whose core cookie matches with task's cookie.

Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210422123308.860083871@infradead.org
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.14-rc1
commit d2dfa17b
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d2dfa17bc7de67e99685c4d6557837bf801a102c

--------------------------------------------------------------------------

When a sibling is forced-idle to match the core-cookie; search for matching tasks to fill the core.

rcu_read_unlock() can incur an infrequent deadlock in sched_core_balance(). Fix this by using the RCU-sched flavor instead.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210422123308.800048269@infradead.org
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Joel Fernandes (Google)
mainline inclusion
from mainline-v5.14-rc1
commit c6047c2e
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c6047c2e3af68dae23ad884249e0d42ff28d2d1b

--------------------------------------------------------------------------

During force-idle, we end up doing cross-cpu comparison of vruntimes during pick_next_task. If we simply compare (vruntime-min_vruntime) across CPUs, and if the CPUs only have 1 task each, we will always end up comparing 0 with 0 and pick just one of the tasks all the time. This starves the task that was not picked. To fix this, take a snapshot of the min_vruntime when entering force idle and use it for comparison. This min_vruntime snapshot will only be used for cross-CPU vruntime comparison, and nothing else.

A note about the min_vruntime snapshot and force idling:

During selection:

  When we're not fi, we need to update snapshot.
  When we're fi and we were not fi, we must update snapshot.
  When we're fi and we were already fi, we must not update snapshot.

Which gives:

  fib     fi      update
  0       0       1
  0       1       1
  1       0       1
  1       1       0

Where:

  fi:  force-idled now
  fib: force-idled before

So the min_vruntime snapshot needs to be updated when: !(fib && fi).

Also, the cfs_prio_less() function needs to be aware of whether the core is in force idle or not, since it will use this information to know whether to advance a cfs_rq's min_vruntime_fi in the hierarchy. So pass this information along via pick_task() -> prio_less().

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210422123308.738542617@infradead.org
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
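
The snapshot rule in this commit message, !(fib && fi), can be written as a one-line predicate. The function name below is hypothetical and is not the kernel's; the sketch only encodes the rule as stated, where fib means the core was force-idled before and fi means it is force-idled now.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: should the min_vruntime snapshot be refreshed?
 * The snapshot stays frozen only while the core remains in force idle,
 * that is when it was force-idled before (fib) and still is (fi).
 */
static bool fi_snapshot_needs_update(bool fib, bool fi)
{
	return !(fib && fi);
}
```

Freezing the snapshot for the duration of a force-idle period is what gives the cross-CPU vruntime comparison a stable baseline.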
-
Committed by Joel Fernandes (Google)
mainline inclusion
from mainline-v5.14-rc1
commit 7afbba11
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7afbba119f0da09824d723f8081608ea1f74ff57

--------------------------------------------------------------------------

The rationale is as follows. In the core-wide pick logic, even if need_sync == false, we need to go look at other CPUs (non-local CPUs) to see if they could be running RT.

Say the RQs in a particular core look like this:

Let CFS1 and CFS2 be 2 tagged CFS tasks. Let RT1 be an untagged RT task.

    rq0             rq1
    CFS1 (tagged)   RT1 (no tag)
    CFS2 (tagged)

Say schedule() runs on rq0. Now, it will enter the above loop and pick_task(RT) will return NULL for 'p'. It will enter the above if() block and see that need_sync == false and will skip RT entirely.

The end result of the selection will be (say prio(CFS1) > prio(CFS2)):

    rq0     rq1
    CFS1    IDLE

When it should have selected:

    rq0     rq1
    IDLE    RT

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210422123308.678425748@infradead.org
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Vineeth Pillai
mainline inclusion
from mainline-v5.14-rc1
commit 8039e96f
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8039e96fcc1de30d5bcaf05da9ca2de46a800826

--------------------------------------------------------------------------

If there is only one long running local task and the sibling is forced idle, it might not get a chance to run until a schedule event happens on any cpu in the core.

So we check for this condition during a tick to see if a sibling is starved and then give it a chance to schedule.

Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210422123308.617407840@infradead.org
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.14-rc1
commit 539f6512
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=539f65125d20aacab54d02d77f10a839f45b09dc

--------------------------------------------------------------------------

Instead of only selecting a local task, select a task for all SMT siblings for every reschedule on the core (irrespective which logical CPU does the reschedule).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210422123308.557559654@infradead.org
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.14-rc1
commit 8a311c74
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8a311c740b53324ec584e0e3bb7077d56b123c28
--------------------------------------------------------------------------
Introduce task_struct::core_cookie as an opaque identifier for core scheduling. When enabled, core scheduling will only allow matching tasks to be on the core, where idle matches everything.

When task_struct::core_cookie is set (and core scheduling is enabled) these tasks are indexed in a second RB-tree, first on cookie value then on scheduling function, such that matching task selection always finds the most eligible match.

NOTE: *shudder* at the overhead...

NOTE: *sigh*, a 3rd copy of the scheduling function; the alternative is per class tracking of cookies and that just duplicates a lot of stuff for no raisin (the 2nd copy lives in the rt-mutex PI code).

[Joel: folded fixes]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210422123308.496975854@infradead.org
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
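A minimal sketch of the two rules stated above: cookies must match (with idle matching everything), and the secondary tree orders tasks by cookie first and by priority second. The `struct task` layout and the convention that a lower `prio` value means higher priority are assumptions for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified task descriptor. */
struct task {
    unsigned long core_cookie; /* opaque identifier */
    int           prio;        /* assumed: lower value = higher priority */
    bool          is_idle;     /* the idle task matches any cookie */
};

/* Two tasks may share a core only if their cookies match; idle matches everything. */
static bool cookie_match(const struct task *a, const struct task *b)
{
    assert(a && b);
    if (a->is_idle || b->is_idle)
        return true;
    return a->core_cookie == b->core_cookie;
}

/*
 * Ordering for the cookie-indexed tree: cookie value first, then priority,
 * so that the first node found for a cookie is its most eligible task.
 * Returns <0, 0 or >0 like a conventional cmp().
 */
static int core_tree_cmp(const struct task *a, const struct task *b)
{
    if (a->core_cookie != b->core_cookie)
        return a->core_cookie < b->core_cookie ? -1 : 1;
    if (a->prio != b->prio)
        return a->prio < b->prio ? -1 : 1;
    return 0;
}
```

With this ordering, all tasks sharing a cookie form one contiguous run in the tree, and the leftmost node of that run is the best candidate to pair with a sibling.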
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.14-rc1
commit 21f56ffe
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=21f56ffe4482e501b9e83737612493eeaac21f5a
--------------------------------------------------------------------------
Because sched_class::pick_next_task() also implies sched_class::set_next_task() (and possibly put_prev_task() and newidle_balance) it is not state invariant. This makes it unsuitable for remote task selection.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[Vineeth: folded fixes]
Signed-off-by: Vineeth Remanan Pillai <viremana@linux.microsoft.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210422123308.437092775@infradead.org
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.14-rc1
commit 875feb41
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=875feb41fd20f6bd6054c9e79a5bcd9da6d8d2b2
--------------------------------------------------------------------------
Stuff the meat of sched_core_put() into a work such that we can use sched_core_put() from atomic context.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210422123308.377455632@infradead.org
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.14-rc1
commit 9ef7e7e3
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9ef7e7e33bcdb57be1afb28884053c28b5f05240
--------------------------------------------------------------------------
rq_lockp() includes a static_branch(), which is asm-goto, which is asm volatile which defeats regular CSE. This means that:

    if (!static_branch(&foo))
        return simple;

    if (static_branch(&foo) && cond)
        return complex;

Doesn't fold and we get horrible code. Introduce __rq_lockp() without the static_branch() on.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210422123308.316696988@infradead.org
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.14-rc1
commit 9edeaea1
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9edeaea1bc452372718837ed2ba775811baf1ba1
--------------------------------------------------------------------------
Introduce the basic infrastructure to have a core wide rq->lock.

This relies on the rq->__lock order being in increasing CPU number (inside a core). It is also constrained to SMT8 per lockdep (and SMT256 per preempt_count).

Luckily SMT8 is the max supported SMT count for Linux (Mips, Sparc and Power are known to have this).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/YJUNfzSgptjX7tG6@hirez.programming.kicks-ass.net
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.14-rc1
commit d66f1b06
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d66f1b06b5b438cd20ba3664b8eef1f9c79e84bf
--------------------------------------------------------------------------
When switching on core-sched, CPUs need to agree which lock to use for their RQ.

The new rule will be that rq->core_enabled will be toggled while holding all rq->__locks that belong to a core. This means we need to double check the rq->core_enabled value after each lock acquire and retry if it changed.

This also has implications for those sites that take multiple RQ locks, they need to be careful that the second lock doesn't end up being the first lock.

Verify the lock pointer after acquiring the first lock, because if they're on the same core, holding any of the rq->__lock instances will pin the core state.

While there, change the rq->__lock order to CPU number, instead of rq address, this greatly simplifies the next patch.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/YJUNY0dmrJMD/BIm@hirez.programming.kicks-ass.net
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
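The locking rules above (re-check the lock pointer after each acquire, order by CPU number, and avoid taking the same lock twice) can be modelled in a few lines. This is a single-threaded illustration with a counter standing in for a spinlock; the struct layout and the function names `rq_lock()` and `double_rq_lock()` are assumptions of this sketch, and the retry loop never actually spins here because nothing toggles `core_enabled` concurrently.

```c
#include <assert.h>
#include <stdbool.h>

struct lock { int depth; };           /* stand-in for a spinlock */

struct rq {
    int         cpu;
    bool        core_enabled;         /* toggled only with all of the core's locks held */
    struct rq  *core;                 /* rq whose lock is shared by the whole core */
    struct lock lock;
};

/* Which lock currently protects this rq? */
static struct lock *rq_lockp(struct rq *rq)
{
    return rq->core_enabled ? &rq->core->lock : &rq->lock;
}

static void lock_acquire(struct lock *l) { assert(l->depth == 0); l->depth = 1; }
static void lock_release(struct lock *l) { assert(l->depth == 1); l->depth = 0; }

/* Acquire the rq's lock, then verify that it is still the right one. */
static struct lock *rq_lock(struct rq *rq)
{
    for (;;) {
        struct lock *l = rq_lockp(rq);
        lock_acquire(l);
        if (l == rq_lockp(rq))        /* core state is now pinned */
            return l;
        lock_release(l);              /* core_enabled changed under us: retry */
    }
}

/* Lock two runqueues in CPU order; returns how many distinct locks were taken. */
static int double_rq_lock(struct rq *a, struct rq *b)
{
    if (a->cpu > b->cpu) {            /* order by CPU number, not by address */
        struct rq *t = a; a = b; b = t;
    }
    struct lock *first = rq_lock(a);
    if (rq_lockp(b) == first)         /* same core: second lock is the first lock */
        return 1;
    rq_lock(b);
    return 2;
}
```

Checking `rq_lockp(b)` only after the first lock is held matters: once any lock of a core is held, that core's `core_enabled` cannot change, so the comparison is stable.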
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.14-rc1
commit 5cb9eaa3
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5cb9eaa3d274f75539077a28cf01e3563195fa53
--------------------------------------------------------------------------
In preparation of playing games with rq->lock, abstract the thing using an accessor.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210422123308.136465446@infradead.org

Conflicts:
  kernel/sched/core.c
    [Bugfix a7c81556("sched: Fix migrate_disable() vs rt/dl balancing") is not applied.
     Bugfix 565790d2("sched: Fix balance_callback()") is not applied.
     Bugfix ae792702("sched: Optimize finish_lock_switch()") is not applied.
     Bugfix 36c6e17b("sched/core: Print out straggler tasks in sched_cpu_dying()") is not applied.
     Feature 2558aacf("sched/hotplug: Ensure only per-cpu kthreads run during hotplug") is not applied.
     Feature f2469a1f("sched/core: Wait for tasks being pushed away on hotplug") is not applied.]
  kernel/sched/deadline.c
    [Bugfix a7c81556("sched: Fix migrate_disable() vs rt/dl balancing") is not applied.]
  kernel/sched/fair.c
    [Feature acf66d70("sched/fair: Provide can_migrate_task_llc")
     Feature 0826530d("sched/fair: Remove update of blocked load from newidle_balance") is not applied.
     Feature 6864cf01("sched/fair: Steal work from an overloaded CPU when CPU goes idle")]
  kernel/sched/rt.c
    [Bugfix a7c81556("sched: Fix migrate_disable() vs rt/dl balancing") is not applied.]
  kernel/sched/sched.h
    [Bugfix a7c81556("sched: Fix migrate_disable() vs rt/dl balancing") is not applied.]

Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.14-rc1
commit 39d371b7
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=39d371b7c0c299d489041884d005aacc4bba8c15
--------------------------------------------------------------------------
In preparation for playing games with rq->lock, add some rq_lock wrappers.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210422123308.075967879@infradead.org
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.14-rc1
commit 9099a147
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9099a14708ce1dfecb6002605594a0daa319b555
--------------------------------------------------------------------------
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Don Hiatt <dhiatt@digitalocean.com>
Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210422123308.015639083@infradead.org
Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Peter Zijlstra
mainline inclusion
from mainline-v5.12-rc1
commit 2d24dd57
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5OOWG
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2d24dd5798d04
--------------------------------------------------------------------------
I've always been bothered by the endless (fragile) boilerplate for rbtree, and I recently wrote some rbtree helpers for objtool and figured I should lift them into the kernel and use them more widely.

Provide:

partial-order; less() based:
 - rb_add(): add a new entry to the rbtree
 - rb_add_cached(): like rb_add(), but for a rb_root_cached

total-order; cmp() based:
 - rb_find(): find an entry in an rbtree
 - rb_find_add(): find an entry, and add if not found
 - rb_find_first(): find the first (leftmost) matching entry
 - rb_next_match(): continue from rb_find_first()
 - rb_for_each(): iterate a sub-tree using the previous two

Inlining and constant propagation should see the compiler inline the whole thing, including the various compare functions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>

Conflicts:
  tools/objtool/elf.c
    [Feature 3690914e("objtool: Extract elf_symbol_add()")]

Signed-off-by: Lin Shengwang <linshengwang1@huawei.com>
Reviewed-by: lihua <hucool.lihua@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
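To illustrate the shape of the less()-based and cmp()-based helpers listed above, here is a userspace sketch on a plain, unbalanced binary search tree. It is not the kernel's rbtree (there is no recolouring or rotation), and `tree_add()`, `tree_find()` and `struct item` are hypothetical names; only the callback-driven style is meant to carry over.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node { struct node *left, *right; };
struct root { struct node *node; };

/* less()-based insert: the caller supplies only a partial order. */
static void tree_add(struct node *n, struct root *root,
                     bool (*less)(const struct node *, const struct node *))
{
    struct node **link = &root->node;

    assert(n && root);
    n->left = n->right = NULL;
    while (*link)
        link = less(n, *link) ? &(*link)->left : &(*link)->right;
    *link = n;
}

/* cmp()-based lookup: the caller supplies a total order against a key. */
static struct node *tree_find(const void *key, const struct root *root,
                              int (*cmp)(const void *key, const struct node *))
{
    struct node *n = root->node;

    while (n) {
        int c = cmp(key, n);
        if (c < 0)
            n = n->left;
        else if (c > 0)
            n = n->right;
        else
            return n;
    }
    return NULL;
}

/* Example payload; the node is the first member so a cast recovers the item. */
struct item { struct node node; int key; };

static bool item_less(const struct node *a, const struct node *b)
{
    return ((const struct item *)a)->key < ((const struct item *)b)->key;
}

static int item_cmp(const void *key, const struct node *n)
{
    int k = *(const int *)key;
    int v = ((const struct item *)n)->key;
    return (k > v) - (k < v);
}
```

A caller writes only the two small comparison functions; the open-coded descent loop that each user previously had to duplicate lives in one place.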
-
- 29 Sep, 2022 6 commits
-
-
Committed by Keqian Zhu
mainline inclusion
from mainline-v5.14-rc1
commit 2aa53d68
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5R1MW
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2aa53d68cee6
------------------------------------------------------------------
The MMIO region of a device may be huge (GB level), so try to use block mapping in stage2 to speed up both map and unmap.

Compared to normal memory mapping, we should consider two more points when trying block mapping for an MMIO region:

1. For normal memory mapping, the PA (host physical address) and HVA have the same alignment within PUD_SIZE or PMD_SIZE when we use the HVA to request a hugepage, so we don't need to consider PA alignment when verifying block mapping. But for device memory mapping, the PA and HVA may have different alignment.

2. For normal memory mapping, we are sure the hugepage size properly fits into the vma, so we don't check whether the mapping size exceeds the boundary of the vma. But for device memory mapping, we should pay attention to this.

This adds get_vma_page_shift() to get the page shift for both normal memory and device MMIO regions, and checks these two points when selecting the block mapping size for an MMIO region.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Heng Zhang <zhangheng191@h-partners.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Link: https://lore.kernel.org/r/20210507110322.23348-3-zhukeqian1@huawei.com
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
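The two checks for MMIO regions can be sketched in userspace as follows. This is an illustration under stated assumptions, not the body of get_vma_page_shift(): `struct vma_model`, `block_fits()` and `pick_page_shift()` are hypothetical, the shift values assume 4 KiB pages, and the boundary test used here (the whole aligned block must lie inside the vma) is this sketch's own formulation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for a 4 KiB page configuration. */
#define PAGE_SHIFT 12
#define PMD_SHIFT  21   /* 2 MiB block */
#define PUD_SHIFT  30   /* 1 GiB block */

/* Hypothetical model of a device mapping: HVA range [start, end) backed by PA pa_base. */
struct vma_model {
    uint64_t start;
    uint64_t end;
    uint64_t pa_base;   /* physical address that corresponds to start */
};

/* Can a block of size 1 << shift be used for the page containing hva? */
static int block_fits(const struct vma_model *v, uint64_t hva, int shift)
{
    uint64_t size = 1ULL << shift;
    uint64_t mask = size - 1;
    uint64_t pa   = v->pa_base + (hva - v->start);
    uint64_t blk  = hva & ~mask;          /* start of the enclosing block */

    /* Point 1: PA and HVA must share the same offset inside the block. */
    if ((hva & mask) != (pa & mask))
        return 0;
    /* Point 2: the whole block must stay within the vma. */
    return blk >= v->start && blk + size <= v->end;
}

/* Pick the largest usable mapping size, falling back to a single page. */
static int pick_page_shift(const struct vma_model *v, uint64_t hva)
{
    assert(hva >= v->start && hva < v->end);
    if (block_fits(v, hva, PUD_SHIFT))
        return PUD_SHIFT;
    if (block_fits(v, hva, PMD_SHIFT))
        return PMD_SHIFT;
    return PAGE_SHIFT;
}
```

For example, a 4 MiB region whose HVA and PA are both 2 MiB aligned can use PMD blocks, while the same region with a PA shifted by one page falls back to page mappings because the offsets inside the block no longer agree.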
-
Committed by Keqian Zhu
mainline inclusion
from mainline-v5.14-rc1
commit fd6f17ba
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5R1MW
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fd6f17bade21
---------------------------------------------------------------------
The MMIO regions may be unmapped for many reasons and can be remapped by the stage2 fault path. Mapping MMIO regions at creation time is therefore only a minor optimization, and it makes the two mapping paths hard to keep in sync.

Remove the mapping code while keeping the useful sanity check.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Heng Zhang <zhangheng191@h-partners.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Link: https://lore.kernel.org/r/20210507110322.23348-2-zhukeqian1@huawei.com
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Baokun Li
hulk inclusion
category: bugfix
bugzilla: 187600, https://gitee.com/openeuler/kernel/issues/I5SV2U
CVE: NA
--------------------------------
If the starting position of the insert range falls in the hole between two ext4_extent_idx entries, the lblk of every ext4_extent under the previous ext4_extent_idx is less than start, so the "extent" variable is walked past the boundary and the following UAF is triggered:

==================================================================
BUG: KASAN: use-after-free in ext4_ext_shift_extents+0x257/0x790
Read of size 4 at addr ffff88819807a008 by task fallocate/8010
CPU: 3 PID: 8010 Comm: fallocate Tainted: G E 5.10.0+ #492
Call Trace:
 dump_stack+0x7d/0xa3
 print_address_description.constprop.0+0x1e/0x220
 kasan_report.cold+0x67/0x7f
 ext4_ext_shift_extents+0x257/0x790
 ext4_insert_range+0x5b6/0x700
 ext4_fallocate+0x39e/0x3d0
 vfs_fallocate+0x26f/0x470
 ksys_fallocate+0x3a/0x70
 __x64_sys_fallocate+0x4f/0x60
 do_syscall_64+0x33/0x40
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
==================================================================

For right shifts, we can divide them into the following situations:

1. When the first ee_block of ext4_extent_idx is greater than or equal to start, make right shifts directly from the first ee_block.
   1) If it is greater than start, we need to continue searching in the previous ext4_extent_idx.
   2) If it is equal to start, we can exit the loop (iterator=NULL).

2. When the first ee_block of ext4_extent_idx is less than start, then traverse from the last extent to find the first extent whose ee_block is less than start.
   1) If extent is still the last extent after traversal, it means that the last ee_block of ext4_extent_idx is less than start, that is, start is located in the hole between idx and (idx+1), so we can exit the loop directly (break) without right shifts.
   2) Otherwise, make right shifts at the corresponding position of the found extent, and then exit the loop (iterator=NULL).

Fixes: 331573fe ("ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate")
Cc: stable@vger.kernel.org
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Stephen Rothwell
mainline inclusion
from mainline-remotes/origin/next
commit 366317ea
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5RP8T
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/coresight/linux.git/commit/?id=366317eae983a0d96aeed78ad219b9c4ed2a719a
--------------------------------------------------------------------------
drivers/hwtracing/ptt/hisi_ptt.c:13:10: fatal error: linux/dma-iommu.h: No such file or directory
   13 | #include <linux/dma-iommu.h>
      |          ^~~~~~~~~~~~~~~~~~~

Caused by:
  commit ff0de066 ("hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device")
interacting with:
  commit f2042ed2 ("iommu/dma: Make header private")
from the iommu tree.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Yicong Yang <yangyicong@hisilicon.com>
[Fixed subject line and added changelog text]
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Wangming Shao <shaowangming@h-partners.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Yicong Yang
mainline inclusion
from mainline-remotes/origin/next
commit 366317ea
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5RP8T
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/coresight/linux.git/commit/?id=366317eae983a0d96aeed78ad219b9c4ed2a719a
--------------------------------------------------------------------------
Add maintainer for driver and documentation of HiSilicon PTT device.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220816114414.4092-6-yangyicong@huawei.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Wangming Shao <shaowangming@h-partners.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Yicong Yang
mainline inclusion
from mainline-remotes/origin/next
commit a7112b74
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I5RP8T
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/coresight/linux.git/commit/?id=a7112b747c324dda8937d4f47b14dc0af0b465d1
--------------------------------------------------------------------------
Document the introduction and usage of HiSilicon PTT device driver as well as the sysfs attributes description provided by the driver.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
[Fixed month and kernel version]
Link: https://lore.kernel.org/r/20220816114414.4092-5-yangyicong@huawei.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Wangming Shao <shaowangming@h-partners.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-