1. 28 6月, 2011 1 次提交
    • S
      x86, mtrr: lock stop machine during MTRR rendezvous sequence · 6d3321e8
      Suresh Siddha 提交于
      MTRR rendezvous sequence using stop_one_cpu_nowait() can potentially
      happen in parallel with another system wide rendezvous using
      stop_machine(). This can lead to deadlock (The order in which
      works are queued can be different on different cpu's. Some cpu's
      will be running the first rendezvous handler and others will be running
      the second rendezvous handler. Each set waiting for the other set to join
      for the system wide rendezvous, leading to a deadlock).
      
      MTRR rendezvous sequence is not implemented using stop_machine() as this
      gets called both from the process context aswell as the cpu online paths
      (where the cpu has not come online and the interrupts are disabled etc).
      stop_machine() works with only online cpus.
      
      For now, take the stop_machine mutex in the MTRR rendezvous sequence that
      gets called from an online cpu (here we are in the process context
      and can potentially sleep while taking the mutex). And the MTRR rendezvous
      that gets triggered during cpu online doesn't need to take this stop_machine
      lock (as the stop_machine() already ensures that there is no cpu hotplug
      going on in parallel by doing get_online_cpus())
      
          TBD: Pursue a cleaner solution of extending the stop_machine()
               infrastructure to handle the case where the calling cpu is
               still not online and use this for MTRR rendezvous sequence.
      
      fixes: https://bugzilla.novell.com/show_bug.cgi?id=672008Reported-by: NVadim Kotelnikov <vadimuzzz@inbox.ru>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Link: http://lkml.kernel.org/r/20110623182056.807230326@sbsiddha-MOBL3.sc.intel.com
      Cc: stable@kernel.org # 2.6.35+, backport a week or two after this gets more testing in mainline
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      6d3321e8
  2. 23 3月, 2011 1 次提交
  3. 27 10月, 2010 2 次提交
  4. 19 10月, 2010 1 次提交
    • P
      sched: Create special class for stop/migrate work · 34f971f6
      Peter Zijlstra 提交于
      In order to separate the stop/migrate work thread from the SCHED_FIFO
      implementation, create a special class for it that is of higher priority than
      SCHED_FIFO itself.
      
      This currently solves a problem where cpu-hotplug consumes so much cpu-time
      that the SCHED_FIFO class gets throttled, but has the bandwidth replenishment
      timer pending on the now dead cpu.
      
      It is also required for when we add the planned deadline scheduling class above
      SCHED_FIFO, as the stop/migrate thread still needs to transcent those tasks.
      Tested-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1285165776.2275.1022.camel@laptop>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      34f971f6
  5. 10 8月, 2010 1 次提交
  6. 31 5月, 2010 1 次提交
    • A
      sched: Make sure timers have migrated before killing the migration_thread · 54e88fad
      Amit K. Arora 提交于
      Problem: In a stress test where some heavy tests were running along with
      regular CPU offlining and onlining, a hang was observed. The system seems
      to be hung at a point where migration_call() tries to kill the
      migration_thread of the dying CPU, which just got moved to the current
      CPU. This migration thread does not get a chance to run (and die) since
      rt_throttled is set to 1 on current, and it doesn't get cleared as the
      hrtimer which is supposed to reset the rt bandwidth
      (sched_rt_period_timer) is tied to the CPU which we just marked dead!
      
      Solution: This patch pushes the killing of migration thread to
      "CPU_POST_DEAD" event. By then all the timers (including
      sched_rt_period_timer) should have got migrated (along with other
      callbacks).
      Signed-off-by: NAmit Arora <aarora@in.ibm.com>
      Signed-off-by: NGautham R Shenoy <ego@in.ibm.com>
      Acked-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <20100525132346.GA14986@amitarora.in.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      54e88fad
  7. 18 5月, 2010 1 次提交
  8. 08 5月, 2010 1 次提交
    • T
      cpu_stop: add dummy implementation for UP · bbf1bb3e
      Tejun Heo 提交于
      When !CONFIG_SMP, cpu_stop functions weren't defined at all which
      could lead to build failures if UP code uses cpu_stop facility.  Add
      dummy cpu_stop implementation for UP.  The waiting variants execute
      the work function directly with preempt disabled and
      stop_one_cpu_nowait() schedules a workqueue work.
      
      Makefile and ifdefs around stop_machine implementation are updated to
      accomodate CONFIG_SMP && !CONFIG_STOP_MACHINE case.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NIngo Molnar <mingo@elte.hu>
      bbf1bb3e
  9. 07 5月, 2010 3 次提交
    • T
      sched: replace migration_thread with cpu_stop · 969c7921
      Tejun Heo 提交于
      Currently migration_thread is serving three purposes - migration
      pusher, context to execute active_load_balance() and forced context
      switcher for expedited RCU synchronize_sched.  All three roles are
      hardcoded into migration_thread() and determining which job is
      scheduled is slightly messy.
      
      This patch kills migration_thread and replaces all three uses with
      cpu_stop.  The three different roles of migration_thread() are
      splitted into three separate cpu_stop callbacks -
      migration_cpu_stop(), active_load_balance_cpu_stop() and
      synchronize_sched_expedited_cpu_stop() - and each use case now simply
      asks cpu_stop to execute the callback as necessary.
      
      synchronize_sched_expedited() was implemented with private
      preallocated resources and custom multi-cpu queueing and waiting
      logic, both of which are provided by cpu_stop.
      synchronize_sched_expedited_count is made atomic and all other shared
      resources along with the mutex are dropped.
      
      synchronize_sched_expedited() also implemented a check to detect cases
      where not all the callback got executed on their assigned cpus and
      fall back to synchronize_sched().  If called with cpu hotplug blocked,
      cpu_stop already guarantees that and the condition cannot happen;
      otherwise, stop_machine() would break.  However, this patch preserves
      the paranoid check using a cpumask to record on which cpus the stopper
      ran so that it can serve as a bisection point if something actually
      goes wrong theree.
      
      Because the internal execution state is no longer visible,
      rcu_expedited_torture_stats() is removed.
      
      This patch also renames cpu_stop threads to from "stopper/%d" to
      "migration/%d".  The names of these threads ultimately don't matter
      and there's no reason to make unnecessary userland visible changes.
      
      With this patch applied, stop_machine() and sched now share the same
      resources.  stop_machine() is faster without wasting any resources and
      sched migration users are much cleaner.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: Josh Triplett <josh@freedesktop.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      969c7921
    • T
      stop_machine: reimplement using cpu_stop · 3fc1f1e2
      Tejun Heo 提交于
      Reimplement stop_machine using cpu_stop.  As cpu stoppers are
      guaranteed to be available for all online cpus,
      stop_machine_create/destroy() are no longer necessary and removed.
      
      With resource management and synchronization handled by cpu_stop, the
      new implementation is much simpler.  Asking the cpu_stop to execute
      the stop_cpu() state machine on all online cpus with cpu hotplug
      disabled is enough.
      
      stop_machine itself doesn't need to manage any global resources
      anymore, so all per-instance information is rolled into struct
      stop_machine_data and the mutex and all static data variables are
      removed.
      
      The previous implementation created and destroyed RT workqueues as
      necessary which made stop_machine() calls highly expensive on very
      large machines.  According to Dimitri Sivanich, preventing the dynamic
      creation/destruction makes booting faster more than twice on very
      large machines.  cpu_stop resources are preallocated for all online
      cpus and should have the same effect.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NRusty Russell <rusty@rustcorp.com.au>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      3fc1f1e2
    • T
      cpu_stop: implement stop_cpu[s]() · 1142d810
      Tejun Heo 提交于
      Implement a simplistic per-cpu maximum priority cpu monopolization
      mechanism.  A non-sleeping callback can be scheduled to run on one or
      multiple cpus with maximum priority monopolozing those cpus.  This is
      primarily to replace and unify RT workqueue usage in stop_machine and
      scheduler migration_thread which currently is serving multiple
      purposes.
      
      Four functions are provided - stop_one_cpu(), stop_one_cpu_nowait(),
      stop_cpus() and try_stop_cpus().
      
      This is to allow clean sharing of resources among stop_cpu and all the
      migration thread users.  One stopper thread per cpu is created which
      is currently named "stopper/CPU".  This will eventually replace the
      migration thread and take on its name.
      
      * This facility was originally named cpuhog and lived in separate
        files but Peter Zijlstra nacked the name and thus got renamed to
        cpu_stop and moved into stop_machine.c.
      
      * Better reporting of preemption leak as per Peter's suggestion.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      1142d810
  10. 17 2月, 2010 1 次提交
  11. 30 3月, 2009 1 次提交
  12. 20 2月, 2009 1 次提交
  13. 05 1月, 2009 1 次提交
    • H
      stop_machine: introduce stop_machine_create/destroy. · 9ea09af3
      Heiko Carstens 提交于
      Introduce stop_machine_create/destroy. With this interface subsystems
      that need a non-failing stop_machine environment can create the
      stop_machine machine threads before actually calling stop_machine.
      When the threads aren't needed anymore they can be killed with
      stop_machine_destroy again.
      
      When stop_machine gets called and the threads aren't present they
      will be created and destroyed automatically. This restores the old
      behaviour of stop_machine.
      
      This patch also converts cpu hotplug to the new interface since it
      is special: cpu_down calls __stop_machine instead of stop_machine.
      However the kstop threads will only be created when stop_machine
      gets called.
      
      Changing the code so that the threads would be created automatically
      on __stop_machine is currently not possible: when __stop_machine gets
      called we hold cpu_add_remove_lock, which is the same lock that
      create_rt_workqueue would take. So the workqueue needs to be created
      before the cpu hotplug code locks cpu_add_remove_lock.
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      9ea09af3
  14. 01 1月, 2009 1 次提交
  15. 17 11月, 2008 1 次提交
  16. 26 10月, 2008 1 次提交
    • L
      Revert "Call init_workqueues before pre smp initcalls." · 4403b406
      Linus Torvalds 提交于
      This reverts commit a802dd0e by moving
      the call to init_workqueues() back where it belongs - after SMP has been
      initialized.
      
      It also moves stop_machine_init() - which needs workqueues - to a later
      phase using a core_initcall() instead of early_initcall().  That should
      satisfy all ordering requirements, and was apparently the reason why
      init_workqueues() was moved to be too early.
      
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4403b406
  17. 22 10月, 2008 2 次提交
  18. 12 8月, 2008 1 次提交
  19. 28 7月, 2008 3 次提交
  20. 26 7月, 2008 1 次提交
  21. 19 7月, 2008 1 次提交
    • M
      cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr · 65c01184
      Mike Travis 提交于
        * This patch replaces the dangerous lvalue version of cpumask_of_cpu
          with new cpumask_of_cpu_ptr macros.  These are patterned after the
          node_to_cpumask_ptr macros.
      
          In general terms, if there is a cpumask_of_cpu_map[] then a pointer to
          the cpumask_of_cpu_map[cpu] entry is used.  The cpumask_of_cpu_map
          is provided when there is a large NR_CPUS count, reducing
          greatly the amount of code generated and stack space used for
          cpumask_of_cpu().  The pointer to the cpumask_t value is needed for
          calling set_cpus_allowed_ptr() to reduce the amount of stack space
          needed to pass the cpumask_t value.
      
          If there isn't a cpumask_of_cpu_map[], then a temporary variable is
          declared and filled in with value from cpumask_of_cpu(cpu) as well as
          a pointer variable pointing to this temporary variable.  Afterwards,
          the pointer is used to reference the cpumask value.  The compiler
          will optimize out the extra dereference through the pointer as well
          as the stack space used for the pointer, resulting in identical code.
      
          A good example of the orthogonal usages is in net/sunrpc/svc.c:
      
      	case SVC_POOL_PERCPU:
      	{
      		unsigned int cpu = m->pool_to[pidx];
      		cpumask_of_cpu_ptr(cpumask, cpu);
      
      		*oldmask = current->cpus_allowed;
      		set_cpus_allowed_ptr(current, cpumask);
      		return 1;
      	}
      	case SVC_POOL_PERNODE:
      	{
      		unsigned int node = m->pool_to[pidx];
      		node_to_cpumask_ptr(nodecpumask, node);
      
      		*oldmask = current->cpus_allowed;
      		set_cpus_allowed_ptr(current, nodecpumask);
      		return 1;
      	}
      Signed-off-by: NMike Travis <travis@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      65c01184
  22. 24 6月, 2008 1 次提交
    • R
      sched: add new API sched_setscheduler_nocheck: add a flag to control access checks · 961ccddd
      Rusty Russell 提交于
      Hidehiro Kawai noticed that sched_setscheduler() can fail in
      stop_machine: it calls sched_setscheduler() from insmod, which can
      have CAP_SYS_MODULE without CAP_SYS_NICE.
      
      Two cases could have failed, so are changed to sched_setscheduler_nocheck:
        kernel/softirq.c:cpu_callback()
      	- CPU hotplug callback
        kernel/stop_machine.c:__stop_machine_run()
      	- Called from various places, including modprobe()
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linux-mm@kvack.org
      Cc: sugita <yumiko.sugita.yf@hitachi.com>
      Cc: Satoshi OSHIMA <satoshi.oshima.fk@hitachi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      961ccddd
  23. 23 5月, 2008 1 次提交
    • C
      stop_machine: make stop_machine_run more virtualization friendly · 3401a61e
      Christian Borntraeger 提交于
      On kvm I have seen some rare hangs in stop_machine when I used more guest
      cpus than hosts cpus. e.g. 32 guest cpus on 1 host cpu triggered the
      hang quite often. I could also reproduce the problem on a 4 way z/VM host with
      a 64 way guest.
      
      It turned out that the guest was consuming all available cpus mostly for
      spinning on scheduler locks like rq->lock. This is expected as the threads are
      calling yield all the time.
      The problem is now, that the host scheduling decisings together with the guest
      scheduling decisions and spinlocks not being fair managed to create an
      interesting scenario similar to a live lock. (Sometimes the hang resolved
      itself after some minutes)
      
      Changing stop_machine to yield the cpu to the hypervisor when yielding inside
      the guest fixed the problem for me. While I am not completely happy with this
      patch, I think it causes no harm and it really improves the situation for me.
      
      I used cpu_relax for yielding to the hypervisor, does that work on all
      architectures?
      
      p.s.: If you want to reproduce the problem, cpu hotplug and kprobes use
      stop_machine_run and both triggered the problem after some retries.
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      CC: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      3401a61e
  24. 22 4月, 2008 1 次提交
  25. 20 4月, 2008 1 次提交
    • M
      generic: use new set_cpus_allowed_ptr function · f70316da
      Mike Travis 提交于
        * Use new set_cpus_allowed_ptr() function added by previous patch,
          which instead of passing the "newly allowed cpus" cpumask_t arg
          by value,  pass it by pointer:
      
          -int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
          +int set_cpus_allowed_ptr(struct task_struct *p, const cpumask_t *new_mask)
      
        * Modify CPU_MASK_ALL
      
      Depends on:
      	[sched-devel]: sched: add new set_cpus_allowed_ptr function
      Signed-off-by: NMike Travis <travis@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f70316da
  26. 19 4月, 2008 1 次提交
  27. 07 2月, 2008 1 次提交
  28. 26 1月, 2008 1 次提交
    • G
      cpu-hotplug: replace lock_cpu_hotplug() with get_online_cpus() · 86ef5c9a
      Gautham R Shenoy 提交于
      Replace all lock_cpu_hotplug/unlock_cpu_hotplug from the kernel and use
      get_online_cpus and put_online_cpus instead as it highlights the
      refcount semantics in these operations.
      
      The new API guarantees protection against the cpu-hotplug operation, but
      it doesn't guarantee serialized access to any of the local data
      structures. Hence the changes needs to be reviewed.
      
      In case of pseries_add_processor/pseries_remove_processor, use
      cpu_maps_update_begin()/cpu_maps_update_done() as we're modifying the
      cpu_present_map there.
      Signed-off-by: NGautham R Shenoy <ego@in.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      86ef5c9a
  29. 17 7月, 2007 1 次提交
  30. 11 5月, 2007 1 次提交
  31. 09 5月, 2007 1 次提交
    • P
      Use stop_machine_run in the Intel RNG driver · ee527cd3
      Prarit Bhargava 提交于
      Replace call_smp_function with stop_machine_run in the Intel RNG driver.
      
      CPU A has done read_lock(&lock)
      CPU B has done write_lock_irq(&lock) and is waiting for A to release the lock.
      
      A third CPU calls call_smp_function and issues the IPI.  CPU A takes CPU
      C's IPI.  CPU B is waiting with interrupts disabled and does not see the
      IPI.  CPU C is stuck waiting for CPU B to respond to the IPI.
      
      Deadlock.
      
      The solution is to use stop_machine_run instead of call_smp_function
      (call_smp_function should not be called in situations where the CPUs may be
      suspended).
      
      [haruo.tomita@toshiba.co.jp: fix a typo in mod_init()]
      [haruo.tomita@toshiba.co.jp: fix memory leak]
      Signed-off-by: NPrarit Bhargava <prarit@redhat.com>
      Cc: Jan Beulich <jbeulich@novell.com>
      Cc: "Tomita, Haruo" <haruo.tomita@toshiba.co.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ee527cd3
  32. 30 9月, 2006 1 次提交
  33. 28 8月, 2006 1 次提交
  34. 04 7月, 2006 1 次提交