Commit 7eacf348 · Author: Peter Zijlstra · Committer: Yang Yingliang

cpu/hotplug, stop_machine: Fix stop_machine vs hotplug order

stable inclusion
from linux-4.19.106
commit b9dc4d61b5c2d8ea289087f57898426017431391

--------------------------------

[ Upstream commit 45178ac0 ]

Paul reported a very sporadic, rcutorture induced, workqueue failure.
When the planets align, the workqueue rescuer's self-migrate fails and
then triggers a WARN for running a work on the wrong CPU.

Tejun then figured that set_cpus_allowed_ptr()'s stop_one_cpu() call
could be ignored! When stopper->enabled is false, stop_machine will
insta complete the work, without actually doing the work. Worse, it
will not WARN about this (we really should fix this).
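
The silent-completion behaviour described above can be illustrated with a small user-space model. This is only a sketch, not the kernel implementation: `model_stopper`, `model_stop_one_cpu()`, `model_migrate()` and `model_try_migrate()` are hypothetical names, and the `-ENOENT` return for a disabled stopper is an assumption of this model. The point is that a caller which ignores the return value cannot tell that the callback never ran.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU stopper state (not the kernel struct). */
struct model_stopper {
	bool enabled;
};

typedef int (*model_stop_fn)(void *arg);

/*
 * Model of a "stop one CPU" request: the callback only runs when the
 * stopper is enabled. Otherwise the request completes immediately and
 * the callback is never invoked.
 */
static int model_stop_one_cpu(struct model_stopper *s, model_stop_fn fn, void *arg)
{
	if (!s->enabled)
		return -ENOENT;	/* assumed error code for this sketch */
	return fn(arg);
}

/* Callback standing in for the migration work; records that it ran. */
static int model_migrate(void *arg)
{
	int *migrated = arg;

	*migrated = 1;
	return 0;
}

/* Returns 1 if the migration callback actually ran, 0 otherwise. */
static int model_try_migrate(bool stopper_enabled)
{
	struct model_stopper s = { .enabled = stopper_enabled };
	int migrated = 0;

	/* The return value is deliberately ignored, as in the reported bug. */
	(void)model_stop_one_cpu(&s, model_migrate, &migrated);
	return migrated;
}
```

In this model, `model_try_migrate(false)` returns without having migrated anything and without any warning, which mirrors the rescuer ending up on the wrong CPU.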

It turns out there is a small window where a freshly online'ed CPU is
marked 'online' but doesn't yet have the stopper task running:

	BP				AP

	bringup_cpu()
	  __cpu_up(cpu, idle)	 -->	start_secondary()
					...
					cpu_startup_entry()
	  bringup_wait_for_ap()
	    wait_for_ap_thread() <--	  cpuhp_online_idle()
					  while (1)
					    do_idle()

					... available to run kthreads ...

	    stop_machine_unpark()
	      stopper->enabled = true;

Close this by moving the stop_machine_unpark() into
cpuhp_online_idle(), such that the stopper thread is ready before we
start the idle loop and schedule.

Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
Debugged-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Parent 736f8b47

@@ -493,8 +493,7 @@ static int bringup_wait_for_ap(unsigned int cpu)
 	if (WARN_ON_ONCE((!cpu_online(cpu))))
 		return -ECANCELED;
 
-	/* Unpark the stopper thread and the hotplug thread of the target cpu */
-	stop_machine_unpark(cpu);
+	/* Unpark the hotplug thread of the target cpu */
 	kthread_unpark(st->thread);
 
 	/*
@@ -1050,8 +1049,8 @@ void notify_cpu_starting(unsigned int cpu)
 
 /*
  * Called from the idle task. Wake up the controlling task which brings the
- * stopper and the hotplug thread of the upcoming CPU up and then delegates
- * the rest of the online bringup to the hotplug thread.
+ * hotplug thread of the upcoming CPU up and then delegates the rest of the
+ * online bringup to the hotplug thread.
  */
 void cpuhp_online_idle(enum cpuhp_state state)
 {
@@ -1061,6 +1060,12 @@ void cpuhp_online_idle(enum cpuhp_state state)
 	if (state != CPUHP_AP_ONLINE_IDLE)
 		return;
 
+	/*
+	 * Unpark the stopper thread before we start the idle loop (and start
+	 * scheduling); this ensures the stopper task is always available.
+	 */
+	stop_machine_unpark(smp_processor_id());
+
 	st->state = CPUHP_AP_ONLINE_IDLE;
 	complete_ap_thread(st, true);
 }
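
The effect of moving `stop_machine_unpark()` can be checked with a small user-space model of the two orderings. This is a sketch under stated assumptions, not kernel code: `model_cpu` and the `model_*` helpers are hypothetical, and the interleaving between BP and AP is reduced to a fixed sequence of steps. It only records whether the CPU ever becomes schedulable while its stopper is still disabled.

```c
#include <stdbool.h>

/* Hypothetical model of the freshly onlined CPU (AP). */
struct model_cpu {
	bool stopper_enabled;	/* stands in for stopper->enabled */
	bool schedulable;	/* AP has entered the idle loop and can run kthreads */
	bool window_seen;	/* AP was schedulable while the stopper was disabled */
};

/* Stands in for stop_machine_unpark() on the target CPU. */
static void model_unpark_stopper(struct model_cpu *c)
{
	c->stopper_enabled = true;
}

/* Stands in for the AP starting its idle loop after cpuhp_online_idle(). */
static void model_enter_idle_loop(struct model_cpu *c)
{
	c->schedulable = true;
	if (!c->stopper_enabled)
		c->window_seen = true;
}

/* Old order: the AP starts idling, and the BP unparks the stopper later. */
static bool model_old_order_has_window(void)
{
	struct model_cpu c = { 0 };

	model_enter_idle_loop(&c);	/* AP: idle loop, available to run kthreads */
	model_unpark_stopper(&c);	/* BP: in bringup_wait_for_ap() */
	return c.window_seen;
}

/* New order: the AP unparks its own stopper before it starts idling. */
static bool model_new_order_has_window(void)
{
	struct model_cpu c = { 0 };

	model_unpark_stopper(&c);	/* AP: in cpuhp_online_idle() */
	model_enter_idle_loop(&c);	/* AP: idle loop */
	return c.window_seen;
}
```

Under these assumptions, `model_old_order_has_window()` reports the window and `model_new_order_has_window()` does not, which is the property the patch relies on.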