1. 20 Mar, 2014 - 2 commits
    • CPU hotplug: Provide lockless versions of callback registration functions · 93ae4f97
      Authored by Srivatsa S. Bhat
      The following method of CPU hotplug callback registration is not safe
      due to the possibility of an ABBA deadlock involving the cpu_add_remove_lock
      and the cpu_hotplug.lock.
      
      	get_online_cpus();
      
      	for_each_online_cpu(cpu)
      		init_cpu(cpu);
      
      	register_cpu_notifier(&foobar_cpu_notifier);
      
      	put_online_cpus();
      
      The deadlock is shown below:
      
                CPU 0                                         CPU 1
                -----                                         -----
      
         Acquire cpu_hotplug.lock
         [via get_online_cpus()]
      
                                                    CPU online/offline operation
                                                    takes cpu_add_remove_lock
                                                    [via cpu_maps_update_begin()]
      
         Try to acquire
         cpu_add_remove_lock
         [via register_cpu_notifier()]
      
                                                    CPU online/offline operation
                                                    tries to acquire cpu_hotplug.lock
                                                    [via cpu_hotplug_begin()]
      
                                  *** DEADLOCK! ***
      
      The problem here is that callback registration takes the locks in one order
      whereas the CPU hotplug operations take the same locks in the opposite order.
      To avoid this issue and to provide a race-free method to register CPU hotplug
      callbacks (along with initialization of already online CPUs), introduce new
      variants of the callback registration APIs that simply register the callbacks
      without holding the cpu_add_remove_lock during the registration. That way,
      we can avoid the ABBA scenario. However, we will need to hold the
      cpu_add_remove_lock throughout the entire critical section, to protect updates
      to the callback/notifier chain.
      
      This can be achieved by writing the callback registration code as follows:
      
      	cpu_maps_update_begin(); [ or cpu_notifier_register_begin(); see below ]
      
      	for_each_online_cpu(cpu)
      		init_cpu(cpu);
      
      	/* This doesn't take the cpu_add_remove_lock */
      	__register_cpu_notifier(&foobar_cpu_notifier);
      
      	cpu_maps_update_done();  [ or cpu_notifier_register_done(); see below ]
      
      Note that we can't use get_online_cpus() here instead of cpu_maps_update_begin()
      because the cpu_hotplug.lock is dropped during the invocation of CPU_POST_DEAD
      notifiers, and hence get_online_cpus() cannot provide the necessary
      synchronization to protect the callback/notifier chains against concurrent
      reads and writes. On the other hand, since the cpu_add_remove_lock protects
      the entire hotplug operation (including CPU_POST_DEAD), we can use
      cpu_maps_update_begin/done() to guarantee proper synchronization.
      
      Also, since cpu_maps_update_begin/done() is like a super-set of
      get/put_online_cpus(), the former naturally protects the critical sections
      from concurrent hotplug operations.
      
      Since the names cpu_maps_update_begin/done() don't make much sense in CPU
      hotplug callback registration scenarios, we'll introduce new APIs named
      cpu_notifier_register_begin/done() and map them to cpu_maps_update_begin/done().
      
      In summary, introduce the lockless variants of un/register_cpu_notifier() and
      also export the cpu_notifier_register_begin/done() APIs for use by modules.
      This way, we provide a race-free way to register hotplug callbacks as well as
      perform initialization for the CPUs that are already online.
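
      As an illustration, a module's notifier callback could look roughly like
      the sketch below before being registered with the sequence shown above
      (a hedged sketch only; the foobar names and the per-CPU init helper are
      illustrative, not part of this patch):

      	static int foobar_cpu_callback(struct notifier_block *nfb,
      				       unsigned long action, void *hcpu)
      	{
      		unsigned int cpu = (unsigned long)hcpu;

      		switch (action & ~CPU_TASKS_FROZEN) {
      		case CPU_ONLINE:
      			foobar_init_cpu(cpu);	/* hypothetical per-CPU init */
      			break;
      		}
      		return NOTIFY_OK;
      	}

      	static struct notifier_block foobar_cpu_notifier = {
      		.notifier_call = foobar_cpu_callback,
      	};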
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Toshi Kani <toshi.kani@hp.com>
      Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • CPU hotplug: Add lockdep annotations to get/put_online_cpus() · a19423b9
      Authored by Gautham R. Shenoy
      Add lockdep annotations for get/put_online_cpus() and
      cpu_hotplug_begin()/cpu_hotplug_done().
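
      A minimal sketch of the kind of annotation this involves (names and
      placement are illustrative; the actual patch may structure it
      differently):

      	static struct lockdep_map cpu_hotplug_dep_map =
      		STATIC_LOCKDEP_MAP_INIT("cpu_hotplug.lock", &cpu_hotplug_dep_map);

      	void get_online_cpus(void)
      	{
      		might_sleep();
      		/* ... take cpu_hotplug.lock / bump the refcount as before ... */
      		lock_map_acquire_read(&cpu_hotplug_dep_map);	/* reader side */
      	}

      	void put_online_cpus(void)
      	{
      		lock_map_release(&cpu_hotplug_dep_map);
      		/* ... drop the refcount / release cpu_hotplug.lock as before ... */
      	}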
      
      Cc: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  2. 13 Nov, 2013 - 2 commits
  3. 16 Oct, 2013 - 1 commit
    • sched: Remove get_online_cpus() usage · 6acce3ef
      Authored by Peter Zijlstra
      Remove get_online_cpus() usage from the scheduler; there are four sites
      that use it:
      
       - sched_init_smp(); where it's completely superfluous since we're in
         'early' boot and there simply cannot be any hotplugging.
      
       - sched_getaffinity(); we already take a raw spinlock to protect the
         task cpus_allowed mask; this disables preemption and therefore
         also stabilizes cpu_online_mask as that's modified using
         stop_machine. However switch to active mask for symmetry with
         sched_setaffinity()/set_cpus_allowed_ptr(). We guarantee active
         mask stability by inserting sync_rcu/sched() into _cpu_down.
      
       - sched_setaffinity(); we don't appear to need get_online_cpus()
         either; there are two sites where hotplug appears relevant:
          * cpuset_cpus_allowed(); for the !cpuset case we use possible_mask,
            for the cpuset case we hold task_lock, which is a spinlock and
            thus for mainline disables preemption (might cause pain on RT).
          * set_cpus_allowed_ptr(); Holds all scheduler locks and thus has
            preemption properly disabled; also it already deals with hotplug
            races explicitly where it releases them.
      
       - migrate_swap(); we can make stop_two_cpus() do the heavy lifting for
         us with a little trickery. By adding a sync_sched/rcu() after the
         CPU_DOWN_PREPARE notifier we can provide preempt/rcu guarantees for
         cpu_active_mask. Use these to validate that both our cpus are active
         when queueing the stop work before we queue the stop_machine works
         for take_cpu_down().
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Link: http://lkml.kernel.org/r/20131011123820.GV3081@twins.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 13 Aug, 2013 - 1 commit
    • ACPI / processor: Acquire writer lock to update CPU maps · b9d10be7
      Authored by Toshi Kani
      CPU system maps are protected with reader/writer locks.  The reader
      lock, get_online_cpus(), assures that the maps are not updated while
      holding the lock.  The writer lock, cpu_hotplug_begin(), is used to
      update the cpu maps along with cpu_maps_update_begin().
      
      However, the ACPI processor handler updates the cpu maps without
      holding the writer lock.
      
      acpi_map_lsapic() is called from acpi_processor_hotadd_init() to
      update cpu_possible_mask and cpu_present_mask.  acpi_unmap_lsapic()
      is called from acpi_processor_remove() to update cpu_possible_mask.
      Currently, they are either unprotected or protected with the reader
      lock, which is not correct.
      
      For example, the get_online_cpus() below is supposed to assure that
      cpu_possible_mask is not changed while the code is iterating with
      for_each_possible_cpu().
      
              get_online_cpus();
              for_each_possible_cpu(cpu) {
      		:
              }
              put_online_cpus();
      
      However, this lock has no protection with CPU hotplug since the ACPI
      processor handler does not use the writer lock when it updates
      cpu_possible_mask.  The reader lock does not serialize the readers
      against one another.
      
      This patch protects them with the writer lock, cpu_hotplug_begin(),
      along with cpu_maps_update_begin(), which must be held before calling
      cpu_hotplug_begin().  It also protects arch_register_cpu() /
      arch_unregister_cpu(), which create / delete a sysfs cpu device
      interface.  For this purpose it makes cpu_hotplug_begin() and
      cpu_hotplug_done() global and exports them in cpu.h.
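
      In other words, the CPU map updates end up bracketed roughly like this
      (sketch only; the surrounding ACPI hot-add/remove code is omitted):

      	cpu_maps_update_begin();	/* must be held before cpu_hotplug_begin() */
      	cpu_hotplug_begin();		/* writer lock for the CPU system maps */

      	/* update cpu_possible_mask / cpu_present_mask, arch_register_cpu(), ... */

      	cpu_hotplug_done();
      	cpu_maps_update_done();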
      Signed-off-by: Toshi Kani <toshi.kani@hp.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  5. 15 Jul, 2013 - 1 commit
    • kernel: delete __cpuinit usage from all core kernel files · 0db0628d
      Authored by Paul Gortmaker
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  The fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      This removes all the uses of the __cpuinit macros from C files in
      the core kernel directories (kernel, init, lib, mm, and include)
      that don't really have a specific maintainer.
      
      [1] https://lkml.org/lkml/2013/5/20/589
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
  6. 13 Jun, 2013 - 1 commit
  7. 14 Feb, 2013 - 1 commit
  8. 28 Jan, 2013 - 1 commit
    • cputime: Use accessors to read task cputime stats · 6fac4829
      Authored by Frederic Weisbecker
      This is in preparation for the full dynticks feature. While
      remotely reading the cputime of a task running in a full
      dynticks CPU, we'll need to do some extra computation. This
      way we can account the time it spent tickless in userspace
      since its last cputime snapshot.
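
      With the accessors in place, readers fetch a task's cputime roughly as
      in the sketch below instead of touching the fields directly (types
      follow the cputime_t convention of this era):

      	cputime_t utime, stime;

      	task_cputime(p, &utime, &stime);	/* may do extra work for a tickless task */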
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
  9. 15 Nov, 2012 - 2 commits
  10. 09 Oct, 2012 - 1 commit
  11. 23 Aug, 2012 - 1 commit
    • x86/smp: Don't ever patch back to UP if we unplug cpus · 816afe4f
      Authored by Rusty Russell
      We still patch SMP instructions to UP variants if we boot with a
      single CPU, but not at any other time.  In particular, not if we
      unplug CPUs to return to a single cpu.
      
      Paul McKenney points out:
      
       mean offline overhead is 6251/48=130.2 milliseconds.
      
       If I remove the alternatives_smp_switch() from the offline
       path [...] the mean offline overhead is 550/42=13.1 milliseconds
      
      Basically, we're never going to get those 120ms back, and the
      code is pretty messy.
      
      We get rid of:
      
       1) The "smp-alt-once" boot option. It's actually "smp-alt-boot", the
          documentation is wrong. It's now the default.
      
       2) The skip_smp_alternatives flag used by suspend.
      
       3) arch_disable_nonboot_cpus_begin() and arch_disable_nonboot_cpus_end()
          which were only used to set this one flag.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Paul McKenney <paul.mckenney@us.ibm.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/87vcgwwive.fsf@rustcorp.com.au
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  12. 13 Aug, 2012 - 1 commit
  13. 01 Aug, 2012 - 1 commit
    • mm/hotplug: correctly setup fallback zonelists when creating new pgdat · 9adb62a5
      Authored by Jiang Liu
      When hotadd_new_pgdat() is called to create new pgdat for a new node, a
      fallback zonelist should be created for the new node.  There's code to try
      to achieve that in hotadd_new_pgdat() as below:
      
      	/*
      	 * The node we allocated has no zone fallback lists. For avoiding
      	 * to access not-initialized zonelist, build here.
      	 */
      	mutex_lock(&zonelists_mutex);
      	build_all_zonelists(pgdat, NULL);
      	mutex_unlock(&zonelists_mutex);
      
      But it doesn't work as expected.  When hotadd_new_pgdat() is called, the
      new node is still in offline state because node_set_online(nid) hasn't
      been called yet.  And build_all_zonelists() only builds zonelists for
      online nodes as:
      
              for_each_online_node(nid) {
                      pg_data_t *pgdat = NODE_DATA(nid);
      
                      build_zonelists(pgdat);
                      build_zonelist_cache(pgdat);
              }
      
      Though we hope to create zonelists for the new pgdat, it doesn't get
      built.  So add a new parameter "pgdat" to build_all_zonelists() so that
      it builds zonelists for the new pgdat too.
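
      One plausible shape of the fix (a sketch, not the exact diff): the
      rebuild path also covers the passed-in pgdat when its node is not yet
      online:

      	/* 'self' is the pgdat passed to build_all_zonelists() */
      	if (self && !node_online(self->node_id)) {
      		build_zonelists(self);
      		build_zonelist_cache(self);
      	}

      	for_each_online_node(nid) {
      		pg_data_t *pgdat = NODE_DATA(nid);

      		build_zonelists(pgdat);
      		build_zonelist_cache(pgdat);
      	}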
      Signed-off-by: Jiang Liu <liuj97@gmail.com>
      Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Keping Chen <chenkeping@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 01 Jun, 2012 - 2 commits
    • kernel/cpu.c: document clear_tasks_mm_cpumask() · e4cc2f87
      Authored by Anton Vorontsov
      Add more comments on clear_tasks_mm_cpumask(), plus a runtime check:
      the function is only suitable for offlined CPUs, and if called
      inappropriately, the kernel should scream aloud.
      
      [akpm@linux-foundation.org: tweak comment: s/walks up/walks/, use 80 cols]
      Suggested-by: Andrew Morton <akpm@linux-foundation.org>
      Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • cpu: introduce clear_tasks_mm_cpumask() helper · cb79295e
      Authored by Anton Vorontsov
      Many architectures clear tasks' mm_cpumask like this:
      
      	read_lock(&tasklist_lock);
      	for_each_process(p) {
      		if (p->mm)
      			cpumask_clear_cpu(cpu, mm_cpumask(p->mm));
      	}
      	read_unlock(&tasklist_lock);
      
      Depending on the context, the code above may have several problems,
      such as:
      
      1. Working with task->mm w/o getting mm or grabbing the task lock is
         dangerous as ->mm might disappear (exit_mm() assigns NULL under
         task_lock(), so tasklist lock is not enough).
      
      2. Checking for process->mm is not enough because process' main
         thread may exit or detach its mm via use_mm(), but other threads
         may still have a valid mm.
      
      This patch implements a small helper function that does things
      correctly, i.e.:
      
      1. We take the task's lock while we handle its mm (we can't use
         get_task_mm()/mmput() pair as mmput() might sleep);
      
      2. To catch the exited-main-thread case, we use find_lock_task_mm(),
         which walks all threads and returns an appropriate task
         (with the task lock held).
      
      Also, per Peter Zijlstra's idea, we no longer grab tasklist_lock in
      the new helper; instead we take the rcu read lock. We can do this
      because the function is called after the cpu is taken down and marked
      offline, so no new tasks will get this cpu set in their mm mask.
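
      Putting the above together, the helper looks roughly like this (a sketch
      of the idea rather than the exact code):

      	void clear_tasks_mm_cpumask(int cpu)
      	{
      		struct task_struct *p;

      		/* Only meaningful for a cpu that is already offline */
      		rcu_read_lock();
      		for_each_process(p) {
      			struct task_struct *t;

      			/* Find a live thread with a valid ->mm, task lock held */
      			t = find_lock_task_mm(p);
      			if (!t)
      				continue;
      			cpumask_clear_cpu(cpu, mm_cpumask(t->mm));
      			task_unlock(t);
      		}
      		rcu_read_unlock();
      	}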
      Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 04 May, 2012 - 1 commit
  16. 26 Apr, 2012 - 3 commits
    • smp: Provide generic idle thread allocation · 29d5e047
      Authored by Thomas Gleixner
      All SMP architectures have magic to fork the idle task and to store it
      for reuse when cpu hotplug is enabled. Provide a generic
      infrastructure for it.
      
      Create/reinit the idle thread for the cpu which is brought up in the
      generic code and hand the thread pointer to the architecture code via
      __cpu_up().
      
      Note, that fork_idle() is called via a workqueue, because this
      guarantees that the idle thread does not get a reference to a user
      space VM. This can happen when the boot process did not bring up all
      possible cpus and a later cpu_up() is initiated via the sysfs
      interface. In that case fork_idle() would be called in the context of
      the user space task and take a reference on the user space VM.
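
      The resulting generic bring-up sequence is roughly the sketch below
      (assuming a helper along the lines of idle_thread_get() in the new
      generic code):

      	/* in the generic _cpu_up() path */
      	struct task_struct *idle = idle_thread_get(cpu);	/* create or reinit idle */

      	if (IS_ERR(idle))
      		return PTR_ERR(idle);

      	ret = __cpu_up(cpu, idle);	/* arch code receives the idle thread pointer */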
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: x86@kernel.org
      Acked-by: Venkatesh Pallipadi <venki@google.com>
      Link: http://lkml.kernel.org/r/20120420124557.102478630@linutronix.de
    • smp: Add generic smpboot facility · 38498a67
      Authored by Thomas Gleixner
      Start a new file, which will hold SMP and CPU hotplug related generic
      infrastructure.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: x86@kernel.org
      Link: http://lkml.kernel.org/r/20120420124557.035417523@linutronix.de
    • smp: Add task_struct argument to __cpu_up() · 8239c25f
      Authored by Thomas Gleixner
      Preparatory patch to make the idle thread allocation for secondary
      cpus generic.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: x86@kernel.org
      Link: http://lkml.kernel.org/r/20120420124556.964170564@linutronix.de
  17. 15 Dec, 2011 - 1 commit
  18. 13 Dec, 2011 - 1 commit
  19. 24 Nov, 2011 - 1 commit
  20. 05 Nov, 2011 - 1 commit
    • PM / Sleep: Fix race between CPU hotplug and freezer · 79cfbdfa
      Authored by Srivatsa S. Bhat
      The CPU hotplug notifications sent out by the _cpu_up() and _cpu_down()
      functions depend on the value of the 'tasks_frozen' argument passed to them
      (which indicates whether tasks have been frozen or not).
      (Examples for such CPU hotplug notifications: CPU_ONLINE, CPU_ONLINE_FROZEN,
      CPU_DEAD, CPU_DEAD_FROZEN).
      
      Thus, it is essential that while the callbacks for those notifications are
      running, the state of the system with respect to the tasks being frozen or
      not remains unchanged, *throughout that duration*. Hence there is a need for
      synchronizing the CPU hotplug code with the freezer subsystem.
      
      Since the freezer is involved only in the Suspend/Hibernate call paths, this
      patch hooks the CPU hotplug code to the suspend/hibernate notifiers
      PM_[SUSPEND|HIBERNATE]_PREPARE and PM_POST_[SUSPEND|HIBERNATE] to prevent
      the race between CPU hotplug and freezer, thus ensuring that CPU hotplug
      notifications will always be run with the state of the system really being
      what the notifications indicate, _throughout_ their execution time.
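
      A hedged sketch of the shape of such a hook (the disable/enable helpers
      named here are placeholders, not necessarily what the patch uses):

      	static int cpu_hotplug_pm_callback(struct notifier_block *nb,
      					   unsigned long action, void *ptr)
      	{
      		switch (action) {
      		case PM_SUSPEND_PREPARE:
      		case PM_HIBERNATION_PREPARE:
      			disable_cpu_hotplug_for_freezer();	/* placeholder */
      			break;
      		case PM_POST_SUSPEND:
      		case PM_POST_HIBERNATION:
      			enable_cpu_hotplug_after_thaw();	/* placeholder */
      			break;
      		default:
      			return NOTIFY_DONE;
      		}
      		return NOTIFY_OK;
      	}

      	/* registered early in boot, e.g. pm_notifier(cpu_hotplug_pm_callback, 0); */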
      Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
  21. 31 Oct, 2011 - 1 commit
  22. 31 Mar, 2011 - 1 commit
  23. 23 Mar, 2011 - 1 commit
  24. 14 Dec, 2010 - 1 commit
  25. 23 Nov, 2010 - 2 commits
  26. 18 Nov, 2010 - 1 commit
    • sched: Simplify cpu-hot-unplug task migration · 48c5ccae
      Authored by Peter Zijlstra
      While discussing the need for sched_idle_next(), Oleg remarked that
      since try_to_wake_up() ensures sleeping tasks will end up running on a
      sane cpu, we can do away with migrate_live_tasks().
      
      If we then extend the existing hack of migrating current from
      CPU_DYING to migrating the full rq worth of tasks from CPU_DYING, the
      need for the sched_idle_next() abomination disappears as well, since
      idle will be the only possible thread left after the migration thread
      stops.
      
      This greatly simplifies the hot-unplug task migration path, as can be
      seen from the resulting code reduction (and about half the new lines
      are comments).
      Suggested-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1289851597.2109.547.camel@laptop>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  27. 09 Jun, 2010 - 1 commit
    • sched: adjust when cpu_active and cpuset configurations are updated during cpu on/offlining · 3a101d05
      Authored by Tejun Heo
      Currently, when a cpu goes down, cpu_active is cleared before
      CPU_DOWN_PREPARE starts and cpuset configuration is updated from a
      default priority cpu notifier.  When a cpu is coming up, it's set
      before CPU_ONLINE but cpuset configuration again is updated from the
      same cpu notifier.
      
      For cpu notifiers, this presents an inconsistent state.  Threads which
      a CPU_DOWN_PREPARE notifier expects to be bound to the CPU can be
      migrated to other cpus because the cpu is no longer active.
      
      Fix it by updating cpu_active in the highest priority cpu notifier and
      cpuset configuration in the second highest when a cpu is coming up.
      Down path is updated similarly.  This guarantees that all other cpu
      notifiers see consistent cpu_active and cpuset configuration.
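
      A sketch of the up-path ordering (the priority constant names are
      illustrative; the point is only "cpu_active first, cpuset second"):

      	static int sched_cpu_active(struct notifier_block *nfb,
      				    unsigned long action, void *hcpu)
      	{
      		switch (action & ~CPU_TASKS_FROZEN) {
      		case CPU_ONLINE:
      			set_cpu_active((long)hcpu, true);
      			return NOTIFY_OK;
      		default:
      			return NOTIFY_DONE;
      		}
      	}

      	/* in sched_init_smp(): cpu_active is flipped at the highest priority,
      	 * cpuset configuration just below it; cpuset_cpu_active() would simply
      	 * call cpuset_update_active_cpus(). */
      	hotcpu_notifier(sched_cpu_active, CPU_PRI_SCHED_ACTIVE);
      	hotcpu_notifier(cpuset_cpu_active, CPU_PRI_CPUSET_ACTIVE);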
      
      cpuset_track_online_cpus() notifier is converted to
      cpuset_update_active_cpus() which just updates the configuration and
      now called from cpuset_cpu_[in]active() notifiers registered from
      sched_init_smp().  If cpuset is disabled, cpuset_update_active_cpus()
      degenerates into partition_sched_domains() making separate notifier
      for !CONFIG_CPUSETS unnecessary.
      
      This problem is triggered by cmwq.  During CPU_DOWN_PREPARE, hotplug
      callback creates a kthread and kthread_bind()s it to the target cpu,
      and the thread is expected to run on that cpu.
      
      * Ingo's test discovered __cpuinit/exit markups were incorrect.
        Fixed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Menage <menage@google.com>
  28. 02 Jun, 2010 - 1 commit
  29. 31 May, 2010 - 1 commit
  30. 28 May, 2010 - 4 commits