1. 09 Oct, 2013 1 commit
  2. 27 Feb, 2013 1 commit
    • stop_machine: Mark per cpu stopper enabled early · 46c498c2
      Committed by Thomas Gleixner
      commit 14e568e7 (stop_machine: Use smpboot threads) introduced the
      following regression:
      
      Before this commit the stopper enabled bit was set in the online
      notifier.
      
      CPU0				CPU1
      cpu_up
      				cpu online
      hotplug_notifier(ONLINE)
        stopper(CPU1)->enabled = true;
      ...
      stop_machine()
      
      The conversion to smpboot threads moved the enablement to the wakeup
      path of the parked thread. The majority of users seem to have the
      following working order:
      
      CPU0				CPU1
      cpu_up
      				cpu online
      unpark_threads()
        wakeup(stopper[CPU1])
      ....
      				stopper thread runs
      				  stopper(CPU1)->enabled = true;
      stop_machine()
      
      But Konrad and Sander have observed:
      
      CPU0				CPU1
      cpu_up
      				cpu online
      unpark_threads()
        wakeup(stopper[CPU1])
      ....
      stop_machine()
      				stopper thread runs
      				  stopper(CPU1)->enabled = true;
      
      Now the stop machinery kicks CPU0 into the stop loop, where it gets
      stuck forever because the queue code saw stopper(CPU1)->enabled ==
      false, so CPU0 waits for CPU1 to enter stomp_machine, but the CPU1
      stopper work got discarded due to enabled == false.
      
      Add a pre_unpark function to the smpboot thread descriptor and call it
      before waking the thread.
      
      This fixes the problem at hand, but the stop_machine code should be
      more robust. The stopper->enabled flag smells fishy at best.
      
      Thanks to Konrad for going through a loop of debug patches and
      providing the information to decode this issue.
      Reported-and-tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Reported-and-tested-by: Sander Eikelenboom <linux@eikelenboom.it>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1302261843240.22263@ionos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
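The interleaving described in commit 46c498c2 can be replayed in a small userspace model. This is only a sketch under stated assumptions: `stopper_model`, `thread_desc_model`, `mark_enabled()` and `stop_right_after_unpark()` are hypothetical names invented for illustration, and the "thread" is simulated by an ordinary function call rather than a real kernel thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of one per-CPU stopper. */
struct stopper_model {
	bool enabled;	/* may stop work be queued on this CPU? */
	int queued;	/* number of accepted work items */
};

/* Hypothetical thread descriptor with an optional pre_unpark hook. */
struct thread_desc_model {
	void (*pre_unpark)(struct stopper_model *s);	/* runs before the wakeup */
	void (*thread_fn)(struct stopper_model *s);	/* runs later, in the thread */
};

/* Used either as pre_unpark (fixed ordering) or as thread_fn (racy ordering). */
void mark_enabled(struct stopper_model *s)
{
	s->enabled = true;
}

/* Queueing path: work aimed at a disabled stopper is silently dropped. */
static bool queue_stop_work(struct stopper_model *s)
{
	if (!s->enabled)
		return false;
	s->queued++;
	return true;
}

/*
 * Replays the bad interleaving: the stop request is queued after the
 * wakeup but before the stopper thread has run. Returns true if the
 * request was accepted.
 */
bool stop_right_after_unpark(const struct thread_desc_model *d)
{
	struct stopper_model s = { .enabled = false, .queued = 0 };
	bool accepted;

	if (d->pre_unpark)
		d->pre_unpark(&s);	/* the hook added by the fix */
	/* wakeup(stopper) would happen here */
	accepted = queue_stop_work(&s);	/* stop_machine() wins the race */
	if (d->thread_fn)
		d->thread_fn(&s);	/* the thread finally runs */

	assert(s.enabled);		/* both variants end up enabled */
	return accepted;
}
```

With a descriptor that only sets the flag in `thread_fn`, the request is dropped; with a descriptor that sets it in `pre_unpark`, the request is accepted regardless of when the thread runs.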
  3. 14 Feb, 2013 2 commits
  4. 01 Nov, 2011 1 commit
  5. 31 Oct, 2011 1 commit
  6. 26 Oct, 2011 1 commit
  7. 27 Jul, 2011 1 commit
  8. 28 Jun, 2011 4 commits
  9. 23 Mar, 2011 1 commit
  10. 27 Oct, 2010 2 commits
  11. 19 Oct, 2010 1 commit
    • sched: Create special class for stop/migrate work · 34f971f6
      Committed by Peter Zijlstra
      In order to separate the stop/migrate work thread from the SCHED_FIFO
      implementation, create a special class for it that is of higher priority than
      SCHED_FIFO itself.
      
      This currently solves a problem where cpu-hotplug consumes so much cpu-time
      that the SCHED_FIFO class gets throttled, but has the bandwidth replenishment
      timer pending on the now dead cpu.
      
      It is also required for when we add the planned deadline scheduling class above
      SCHED_FIFO, as the stop/migrate thread still needs to transcend those tasks.
      Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1285165776.2275.1022.camel@laptop>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
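The idea of a class that sits above SCHED_FIFO can be illustrated with a priority-ordered list of classes. The sketch below is a userspace model, not the scheduler's code: `sched_class_model`, `class_id_model` and `pick_next_class()` are hypothetical names, and the only property modelled is that a higher class with runnable work always wins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical class identifiers, listed from highest to lowest priority. */
enum class_id_model { CLASS_STOP, CLASS_RT, CLASS_FAIR, CLASS_IDLE, CLASS_NONE };

/* Hypothetical, simplified scheduling class. */
struct sched_class_model {
	enum class_id_model id;
	int nr_runnable;			/* runnable tasks in this class */
	const struct sched_class_model *next;	/* next lower-priority class */
};

/*
 * Walk the classes from highest to lowest priority and return the first
 * one that has something to run. A stop class placed at the head of the
 * list therefore preempts everything below it, including the RT class.
 */
enum class_id_model pick_next_class(const struct sched_class_model *highest)
{
	const struct sched_class_model *c;

	assert(highest != NULL);
	for (c = highest; c != NULL; c = c->next)
		if (c->nr_runnable > 0)
			return c->id;
	return CLASS_NONE;
}
```

Because the stop class is a separate list entry in this model, any throttling applied to the RT entry would not affect it, which is the separation the commit message asks for.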
  12. 10 Aug, 2010 1 commit
  13. 31 May, 2010 1 commit
    • sched: Make sure timers have migrated before killing the migration_thread · 54e88fad
      Committed by Amit K. Arora
      Problem: In a stress test where some heavy tests were running along with
      regular CPU offlining and onlining, a hang was observed. The system seems
      to be hung at a point where migration_call() tries to kill the
      migration_thread of the dying CPU, which just got moved to the current
      CPU. This migration thread does not get a chance to run (and die) since
      rt_throttled is set to 1 on current, and it doesn't get cleared as the
      hrtimer which is supposed to reset the rt bandwidth
      (sched_rt_period_timer) is tied to the CPU which we just marked dead!
      
      Solution: This patch pushes the killing of migration thread to
      "CPU_POST_DEAD" event. By then all the timers (including
      sched_rt_period_timer) should have got migrated (along with other
      callbacks).
      Signed-off-by: Amit Arora <aarora@in.ibm.com>
      Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <20100525132346.GA14986@amitarora.in.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  14. 18 May, 2010 1 commit
  15. 08 May, 2010 1 commit
    • cpu_stop: add dummy implementation for UP · bbf1bb3e
      Committed by Tejun Heo
      When !CONFIG_SMP, cpu_stop functions weren't defined at all which
      could lead to build failures if UP code uses cpu_stop facility.  Add
      dummy cpu_stop implementation for UP.  The waiting variants execute
      the work function directly with preempt disabled and
      stop_one_cpu_nowait() schedules a workqueue work.
      
      Makefile and ifdefs around stop_machine implementation are updated to
      accommodate the CONFIG_SMP && !CONFIG_STOP_MACHINE case.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Ingo Molnar <mingo@elte.hu>
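A waiting variant that "executes the work function directly with preempt disabled" might look roughly like the following userspace sketch. The names `stop_one_cpu_up_model()`, `preempt_disable_model()` and `report_preempt_depth()` are hypothetical stand-ins, and the preemption counter is a plain integer. The workqueue-based `stop_one_cpu_nowait()` path is not modelled here.

```c
#include <assert.h>
#include <errno.h>

typedef int (*cpu_stop_fn_model)(void *arg);

static int preempt_count_model;	/* stand-in for the preemption counter */

static void preempt_disable_model(void)
{
	preempt_count_model++;
}

static void preempt_enable_model(void)
{
	assert(preempt_count_model > 0);
	preempt_count_model--;
}

/* UP sketch of a waiting stop call: CPU 0 is the only CPU that exists. */
int stop_one_cpu_up_model(unsigned int cpu, cpu_stop_fn_model fn, void *arg)
{
	int ret = -ENOENT;	/* no such CPU on a uniprocessor build */

	preempt_disable_model();
	if (cpu == 0)
		ret = fn(arg);	/* run the callback directly, in place */
	preempt_enable_model();
	return ret;
}

/* Example callback: reports the preemption depth it observed. */
int report_preempt_depth(void *arg)
{
	*(int *)arg = preempt_count_model;
	return 0;
}
```

Calling `stop_one_cpu_up_model(0, report_preempt_depth, &depth)` leaves `depth` at 1 in this model, showing that the callback ran inside the preempt-disabled region; any other CPU number yields `-ENOENT`.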
  16. 07 May, 2010 3 commits
    • sched: replace migration_thread with cpu_stop · 969c7921
      Committed by Tejun Heo
      Currently migration_thread is serving three purposes - migration
      pusher, context to execute active_load_balance() and forced context
      switcher for expedited RCU synchronize_sched.  All three roles are
      hardcoded into migration_thread() and determining which job is
      scheduled is slightly messy.
      
      This patch kills migration_thread and replaces all three uses with
      cpu_stop.  The three different roles of migration_thread() are
      split into three separate cpu_stop callbacks -
      migration_cpu_stop(), active_load_balance_cpu_stop() and
      synchronize_sched_expedited_cpu_stop() - and each use case now simply
      asks cpu_stop to execute the callback as necessary.
      
      synchronize_sched_expedited() was implemented with private
      preallocated resources and custom multi-cpu queueing and waiting
      logic, both of which are provided by cpu_stop.
      synchronize_sched_expedited_count is made atomic and all other shared
      resources along with the mutex are dropped.
      
      synchronize_sched_expedited() also implemented a check to detect cases
      where not all the callbacks got executed on their assigned cpus and
      fall back to synchronize_sched().  If called with cpu hotplug blocked,
      cpu_stop already guarantees that and the condition cannot happen;
      otherwise, stop_machine() would break.  However, this patch preserves
      the paranoid check using a cpumask to record on which cpus the stopper
      ran so that it can serve as a bisection point if something actually
      goes wrong there.
      
      Because the internal execution state is no longer visible,
      rcu_expedited_torture_stats() is removed.
      
      This patch also renames cpu_stop threads from "stopper/%d" to
      "migration/%d".  The names of these threads ultimately don't matter
      and there's no reason to make unnecessary userland visible changes.
      
      With this patch applied, stop_machine() and sched now share the same
      resources.  stop_machine() is faster without wasting any resources and
      sched migration users are much cleaner.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: Josh Triplett <josh@freedesktop.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
    • stop_machine: reimplement using cpu_stop · 3fc1f1e2
      Committed by Tejun Heo
      Reimplement stop_machine using cpu_stop.  As cpu stoppers are
      guaranteed to be available for all online cpus,
      stop_machine_create/destroy() are no longer necessary and removed.
      
      With resource management and synchronization handled by cpu_stop, the
      new implementation is much simpler.  Asking the cpu_stop to execute
      the stop_cpu() state machine on all online cpus with cpu hotplug
      disabled is enough.
      
      stop_machine itself doesn't need to manage any global resources
      anymore, so all per-instance information is rolled into struct
      stop_machine_data and the mutex and all static data variables are
      removed.
      
      The previous implementation created and destroyed RT workqueues as
      necessary which made stop_machine() calls highly expensive on very
      large machines.  According to Dimitri Sivanich, preventing the dynamic
      creation/destruction makes booting more than twice as fast on very
      large machines.  cpu_stop resources are preallocated for all online
      cpus and should have the same effect.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
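The "stop_cpu() state machine" run on all online cpus can be pictured as a lock-step sequence in which every CPU acknowledges each state and the last acknowledgement advances to the next one. The sketch below is a single-threaded userspace simulation under that assumption; the state names, `sm_data` and `run_stop_state_machine()` are hypothetical, and running the callback on CPU 0 only is a simplification.

```c
#include <assert.h>

#define MAX_CPUS_MODEL 8

/* Hypothetical states, visited in order by every participating CPU. */
enum sm_state { SM_NONE, SM_PREPARE, SM_DISABLE_IRQ, SM_RUN, SM_EXIT };

/* Hypothetical per-call bookkeeping; nothing global is needed. */
struct sm_data {
	int num_threads;	/* CPUs taking part */
	int thread_ack;		/* acks still missing for the current state */
	enum sm_state state;
	int irqs_off;		/* CPUs that have "disabled interrupts" */
	int fn_calls;		/* how often the callback ran */
};

static void set_state(struct sm_data *d, enum sm_state next)
{
	d->thread_ack = d->num_threads;
	d->state = next;
}

/* The last CPU to acknowledge a state moves everyone to the next one. */
static void ack_state(struct sm_data *d)
{
	if (--d->thread_ack == 0)
		set_state(d, (enum sm_state)(d->state + 1));
}

/* Simulates the CPUs round-robin; returns how often the callback ran. */
int run_stop_state_machine(int ncpus)
{
	struct sm_data d = { .num_threads = ncpus };
	enum sm_state seen[MAX_CPUS_MODEL] = { SM_NONE };
	int cpu;

	assert(ncpus > 0 && ncpus <= MAX_CPUS_MODEL);
	set_state(&d, SM_PREPARE);

	while (d.state != SM_EXIT) {
		for (cpu = 0; cpu < ncpus && d.state != SM_EXIT; cpu++) {
			if (d.state == seen[cpu])
				continue;	/* already handled this state */
			seen[cpu] = d.state;
			if (d.state == SM_DISABLE_IRQ)
				d.irqs_off++;
			if (d.state == SM_RUN && cpu == 0) {
				/* every CPU is quiesced before fn runs */
				assert(d.irqs_off == ncpus);
				d.fn_calls++;
			}
			ack_state(&d);
		}
	}
	return d.fn_calls;
}
```

The assertion inside the `SM_RUN` branch encodes the property that matters: the callback only runs once every participating CPU has reached the interrupts-disabled state.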
    • cpu_stop: implement stop_cpu[s]() · 1142d810
      Committed by Tejun Heo
      Implement a simplistic per-cpu maximum priority cpu monopolization
      mechanism.  A non-sleeping callback can be scheduled to run on one or
      multiple cpus with maximum priority, monopolizing those cpus.  This is
      primarily to replace and unify RT workqueue usage in stop_machine and
      scheduler migration_thread which currently is serving multiple
      purposes.
      
      Four functions are provided - stop_one_cpu(), stop_one_cpu_nowait(),
      stop_cpus() and try_stop_cpus().
      
      This is to allow clean sharing of resources among stop_cpu and all the
      migration thread users.  One stopper thread per cpu is created which
      is currently named "stopper/CPU".  This will eventually replace the
      migration thread and take on its name.
      
      * This facility was originally named cpuhog and lived in separate
        files, but Peter Zijlstra nacked the name, so it was renamed to
        cpu_stop and moved into stop_machine.c.
      
      * Better reporting of preemption leak as per Peter's suggestion.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
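One way to picture the mechanism is a per-CPU queue of callbacks plus a shared completion record that the requester waits on. The following userspace sketch assumes exactly that shape; `stop_work_model`, `stop_done_model`, `queue_stop_work()` and `run_stopper()` are hypothetical names, and the stopper "thread" is an ordinary function that drains its queue when called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS_MODEL 4
#define MAX_WORKS_MODEL 8

typedef int (*cpu_stop_fn_model)(void *arg);

/* Hypothetical completion record shared by all works of one request. */
struct stop_done_model {
	int nr_todo;	/* works queued but not yet executed */
	int ret;	/* last non-zero callback result, if any */
};

struct stop_work_model {
	cpu_stop_fn_model fn;
	void *arg;
	struct stop_done_model *done;
};

/* Hypothetical per-CPU stopper: a small FIFO of pending works. */
struct stopper_model {
	struct stop_work_model works[MAX_WORKS_MODEL];
	size_t head, tail;
};

static struct stopper_model stoppers[NR_CPUS_MODEL];

/* Queue fn(arg) for one CPU; returns false if the CPU or queue is invalid/full. */
bool queue_stop_work(unsigned int cpu, cpu_stop_fn_model fn, void *arg,
		     struct stop_done_model *done)
{
	struct stopper_model *s;

	if (cpu >= NR_CPUS_MODEL)
		return false;
	s = &stoppers[cpu];
	if (s->tail - s->head == MAX_WORKS_MODEL)
		return false;
	s->works[s->tail++ % MAX_WORKS_MODEL] =
		(struct stop_work_model){ fn, arg, done };
	done->nr_todo++;
	return true;
}

/* Body of the stopper thread in this model: drain the CPU's queue in order. */
void run_stopper(unsigned int cpu)
{
	struct stopper_model *s;

	assert(cpu < NR_CPUS_MODEL);
	s = &stoppers[cpu];
	while (s->head != s->tail) {
		struct stop_work_model *w = &s->works[s->head++ % MAX_WORKS_MODEL];
		int ret = w->fn(w->arg);	/* non-sleeping callback */

		if (ret)
			w->done->ret = ret;
		w->done->nr_todo--;		/* requester waits for zero */
	}
}

/* Example callback: counts how often it was invoked. */
int count_call(void *arg)
{
	(*(int *)arg)++;
	return 0;
}
```

A multi-CPU request in this model queues one work per target CPU against the same `stop_done_model` and is complete once `nr_todo` drops back to zero.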
  17. 17 Feb, 2010 1 commit
  18. 30 Mar, 2009 1 commit
  19. 20 Feb, 2009 1 commit
  20. 05 Jan, 2009 1 commit
    • stop_machine: introduce stop_machine_create/destroy. · 9ea09af3
      Committed by Heiko Carstens
      Introduce stop_machine_create/destroy. With this interface subsystems
      that need a non-failing stop_machine environment can create the
      stop_machine machine threads before actually calling stop_machine.
      When the threads aren't needed anymore they can be killed with
      stop_machine_destroy again.
      
      When stop_machine gets called and the threads aren't present they
      will be created and destroyed automatically. This restores the old
      behaviour of stop_machine.
      
      This patch also converts cpu hotplug to the new interface since it
      is special: cpu_down calls __stop_machine instead of stop_machine.
      However the kstop threads will only be created when stop_machine
      gets called.
      
      Changing the code so that the threads would be created automatically
      on __stop_machine is currently not possible: when __stop_machine gets
      called we hold cpu_add_remove_lock, which is the same lock that
      create_rt_workqueue would take. So the workqueue needs to be created
      before the cpu hotplug code locks cpu_add_remove_lock.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
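The create/destroy pairing described in commit 9ea09af3 behaves like a reference count around the worker threads. The sketch below models only that counting in userspace; every name ending in `_model` is hypothetical, `stop_machine_locked_model()` stands in for the low-level `__stop_machine` entry point, and real locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>

typedef int (*stop_fn_model)(void *data);

static int refcount_model;	/* users that currently need the threads */
static bool threads_present;	/* do the worker threads exist? */
int thread_creations;		/* how often the threads were created */

/* Create the threads on first use; later callers only take a reference. */
int stop_machine_create_model(void)
{
	if (refcount_model++ == 0) {
		threads_present = true;
		thread_creations++;
	}
	return 0;
}

/* Drop a reference; the last user tears the threads down again. */
void stop_machine_destroy_model(void)
{
	assert(refcount_model > 0);
	if (--refcount_model == 0)
		threads_present = false;
}

/* Low-level entry: the caller must have created the threads beforehand. */
int stop_machine_locked_model(stop_fn_model fn, void *data)
{
	assert(threads_present);
	return fn(data);
}

/* Convenience entry: creates and destroys the threads automatically. */
int stop_machine_model(stop_fn_model fn, void *data)
{
	int ret = stop_machine_create_model();

	if (ret)
		return ret;
	ret = stop_machine_locked_model(fn, data);
	stop_machine_destroy_model();
	return ret;
}

/* Example callback: counts how often it was invoked. */
int count_call(void *data)
{
	(*(int *)data)++;
	return 0;
}
```

A hotplug-style caller in this model would call `stop_machine_create_model()` before taking its own lock, use `stop_machine_locked_model()` while holding it, and call `stop_machine_destroy_model()` afterwards, which mirrors the ordering constraint the commit message explains.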
  21. 01 Jan, 2009 1 commit
  22. 17 Nov, 2008 1 commit
  23. 26 Oct, 2008 1 commit
    • Revert "Call init_workqueues before pre smp initcalls." · 4403b406
      Committed by Linus Torvalds
      This reverts commit a802dd0e by moving
      the call to init_workqueues() back where it belongs - after SMP has been
      initialized.
      
      It also moves stop_machine_init() - which needs workqueues - to a later
      phase using a core_initcall() instead of early_initcall().  That should
      satisfy all ordering requirements, and was apparently the reason why
      init_workqueues() was moved to be too early.
      
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  24. 22 Oct, 2008 2 commits
  25. 12 Aug, 2008 1 commit
  26. 28 Jul, 2008 3 commits
  27. 26 Jul, 2008 1 commit
  28. 19 Jul, 2008 1 commit
    • cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr · 65c01184
      Committed by Mike Travis
        * This patch replaces the dangerous lvalue version of cpumask_of_cpu
          with new cpumask_of_cpu_ptr macros.  These are patterned after the
          node_to_cpumask_ptr macros.
      
          In general terms, if there is a cpumask_of_cpu_map[] then a pointer to
          the cpumask_of_cpu_map[cpu] entry is used.  The cpumask_of_cpu_map
          is provided when there is a large NR_CPUS count, reducing
          greatly the amount of code generated and stack space used for
          cpumask_of_cpu().  The pointer to the cpumask_t value is needed for
          calling set_cpus_allowed_ptr() to reduce the amount of stack space
          needed to pass the cpumask_t value.
      
          If there isn't a cpumask_of_cpu_map[], then a temporary variable is
          declared and filled in with value from cpumask_of_cpu(cpu) as well as
          a pointer variable pointing to this temporary variable.  Afterwards,
          the pointer is used to reference the cpumask value.  The compiler
          will optimize out the extra dereference through the pointer as well
          as the stack space used for the pointer, resulting in identical code.
      
          A good example of the orthogonal usages is in net/sunrpc/svc.c:
      
      	case SVC_POOL_PERCPU:
      	{
      		unsigned int cpu = m->pool_to[pidx];
      		cpumask_of_cpu_ptr(cpumask, cpu);
      
      		*oldmask = current->cpus_allowed;
      		set_cpus_allowed_ptr(current, cpumask);
      		return 1;
      	}
      	case SVC_POOL_PERNODE:
      	{
      		unsigned int node = m->pool_to[pidx];
      		node_to_cpumask_ptr(nodecpumask, node);
      
      		*oldmask = current->cpus_allowed;
      		set_cpus_allowed_ptr(current, nodecpumask);
      		return 1;
      	}
      Signed-off-by: Mike Travis <travis@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
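The two cases described in commit 65c01184 (a precomputed map versus a temporary on the stack) can be sketched with a pair of macros that both declare a pointer named by the caller. This is a userspace illustration only: `cpumask_model_t`, the `_map`/`_tmp` macro suffixes and the single-word mask are assumptions made here, not the kernel's definitions.

```c
#include <assert.h>

#define NR_CPUS_MODEL 32

/* Hypothetical single-word CPU mask. */
typedef struct { unsigned long bits; } cpumask_model_t;

/* Case 1: a precomputed table with one single-bit mask per CPU. */
static cpumask_model_t cpumask_of_cpu_map_model[NR_CPUS_MODEL];

void init_cpumask_map_model(void)
{
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		cpumask_of_cpu_map_model[cpu].bits = 1UL << cpu;
}

/* Case 2: no table, so the mask is built by value on demand. */
cpumask_model_t cpumask_of_cpu_model(unsigned int cpu)
{
	cpumask_model_t m;

	assert(cpu < NR_CPUS_MODEL);
	m.bits = 1UL << cpu;
	return m;
}

/* With a map: declare v as a pointer straight into the table. */
#define cpumask_of_cpu_ptr_map(v, cpu) \
	const cpumask_model_t *v = &cpumask_of_cpu_map_model[cpu]

/* Without a map: declare a temporary and a pointer to it. */
#define cpumask_of_cpu_ptr_tmp(v, cpu) \
	cpumask_model_t v##_storage = cpumask_of_cpu_model(cpu); \
	const cpumask_model_t *v = &v##_storage

/* Example consumer that only ever sees the pointer. */
unsigned long mask_bits(const cpumask_model_t *mask)
{
	return mask->bits;
}
```

Either macro can be dropped into the same call site, and code that takes the pointer (here `mask_bits()`) does not need to know which variant produced it.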
  29. 24 Jun, 2008 1 commit
    • sched: add new API sched_setscheduler_nocheck: add a flag to control access checks · 961ccddd
      Committed by Rusty Russell
      Hidehiro Kawai noticed that sched_setscheduler() can fail in
      stop_machine: it calls sched_setscheduler() from insmod, which can
      have CAP_SYS_MODULE without CAP_SYS_NICE.
      
      Two cases could have failed, so are changed to sched_setscheduler_nocheck:
        kernel/softirq.c:cpu_callback()
      	- CPU hotplug callback
        kernel/stop_machine.c:__stop_machine_run()
      	- Called from various places, including modprobe()
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linux-mm@kvack.org
      Cc: sugita <yumiko.sugita.yf@hitachi.com>
      Cc: Satoshi OSHIMA <satoshi.oshima.fk@hitachi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
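"A flag to control access checks" can be sketched as one internal helper with a boolean parameter and two thin wrappers around it. The userspace model below assumes that shape; `task_model`, `do_setscheduler_model()` and the `_model` wrappers are hypothetical, and the capability check is reduced to a single boolean on the calling task.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum policy_model { POLICY_NORMAL, POLICY_FIFO };

/* Hypothetical task: one capability bit plus its scheduling settings. */
struct task_model {
	bool cap_sys_nice;
	enum policy_model policy;
	int rt_priority;
};

/* Shared helper: 'check' decides whether the caller's capability matters. */
static int do_setscheduler_model(const struct task_model *caller,
				 struct task_model *p,
				 enum policy_model policy, int prio, bool check)
{
	assert(caller != NULL && p != NULL);
	if (check && policy == POLICY_FIFO && !caller->cap_sys_nice)
		return -EPERM;	/* caller may not raise to a realtime policy */
	p->policy = policy;
	p->rt_priority = prio;
	return 0;
}

/* Checked variant: for requests made on behalf of a user process. */
int sched_setscheduler_model(const struct task_model *caller,
			     struct task_model *p,
			     enum policy_model policy, int prio)
{
	return do_setscheduler_model(caller, p, policy, prio, true);
}

/* Unchecked variant: for in-kernel callers that must not fail on permissions. */
int sched_setscheduler_nocheck_model(const struct task_model *caller,
				     struct task_model *p,
				     enum policy_model policy, int prio)
{
	return do_setscheduler_model(caller, p, policy, prio, false);
}
```

In this model a caller without the capability (the insmod case from the commit message) gets `-EPERM` from the checked wrapper and success from the `_nocheck` wrapper.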
  30. 23 May, 2008 1 commit
    • stop_machine: make stop_machine_run more virtualization friendly · 3401a61e
      Committed by Christian Borntraeger
      On kvm I have seen some rare hangs in stop_machine when I used more guest
      cpus than host cpus; e.g. 32 guest cpus on 1 host cpu triggered the
      hang quite often. I could also reproduce the problem on a 4 way z/VM host with
      a 64 way guest.
      
      It turned out that the guest was consuming all available cpus mostly for
      spinning on scheduler locks like rq->lock. This is expected as the threads are
      calling yield all the time.
      The problem is that the host scheduling decisions, together with the guest
      scheduling decisions and unfair spinlocks, managed to create an
      interesting scenario similar to a live lock. (Sometimes the hang resolved
      itself after some minutes.)
      
      Changing stop_machine to yield the cpu to the hypervisor when yielding inside
      the guest fixed the problem for me. While I am not completely happy with this
      patch, I think it causes no harm and it really improves the situation for me.
      
      I used cpu_relax for yielding to the hypervisor, does that work on all
      architectures?
      
      p.s.: If you want to reproduce the problem, cpu hotplug and kprobes use
      stop_machine_run and both triggered the problem after some retries.
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      CC: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>