1. 08 11月, 2014 1 次提交
    • S
      drivers: base: add cpu_device_create to support per-cpu devices · 3d52943b
      Sudeep Holla 提交于
      This patch adds a new function to create per-cpu devices.
      This helps in:
      1. reusing the device infrastructure to create any cpu related
         attributes and corresponding sysfs instead of creating and
         dealing with raw kobjects directly
      2. retaining the legacy path(/sys/devices/system/cpu/..) to support
         existing sysfs ABI
      3. avoiding to create links in the bus directory pointing to the
         device as there would be per-cpu instance of these devices with
         the same name since dev->bus is not populated to cpu_sysbus on
         purpose
      Signed-off-by: NSudeep Holla <sudeep.holla@arm.com>
      Tested-by: NStephen Boyd <sboyd@codeaurora.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: David Herrmann <dh.herrmann@gmail.com>
      Cc: Kay Sievers <kay@vrfy.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3d52943b
  2. 19 9月, 2014 1 次提交
    • P
      rcu: Eliminate deadlock between CPU hotplug and expedited grace periods · dd56af42
      Paul E. McKenney 提交于
      Currently, the expedited grace-period primitives do get_online_cpus().
      This greatly simplifies their implementation, but means that calls
      to them holding locks that are acquired by CPU-hotplug notifiers (to
      say nothing of calls to these primitives from CPU-hotplug notifiers)
      can deadlock.  But this is starting to become inconvenient, as can be
      seen here: https://lkml.org/lkml/2014/8/5/754.  The problem in this
      case is that some developers need to acquire a mutex from a CPU-hotplug
      notifier, but also need to hold it across a synchronize_rcu_expedited().
      As noted above, this currently results in deadlock.
      
      This commit avoids the deadlock and retains the simplicity by creating
      a try_get_online_cpus(), which returns false if the get_online_cpus()
      reference count could not immediately be incremented.  If a call to
      try_get_online_cpus() returns true, the expedited primitives operate as
      before.  If a call returns false, the expedited primitives fall back to
      normal grace-period operations.  This falling back of course results in
      increased grace-period latency, but only during times when CPU hotplug
      operations are actually in flight.  The effect should therefore be
      negligible during normal operation.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Tested-by: NLan Tianyu <tianyu.lan@intel.com>
      dd56af42
  3. 07 6月, 2014 1 次提交
  4. 20 3月, 2014 1 次提交
    • S
      CPU hotplug: Provide lockless versions of callback registration functions · 93ae4f97
      Srivatsa S. Bhat 提交于
      The following method of CPU hotplug callback registration is not safe
      due to the possibility of an ABBA deadlock involving the cpu_add_remove_lock
      and the cpu_hotplug.lock.
      
      	get_online_cpus();
      
      	for_each_online_cpu(cpu)
      		init_cpu(cpu);
      
      	register_cpu_notifier(&foobar_cpu_notifier);
      
      	put_online_cpus();
      
      The deadlock is shown below:
      
                CPU 0                                         CPU 1
                -----                                         -----
      
         Acquire cpu_hotplug.lock
         [via get_online_cpus()]
      
                                                    CPU online/offline operation
                                                    takes cpu_add_remove_lock
                                                    [via cpu_maps_update_begin()]
      
         Try to acquire
         cpu_add_remove_lock
         [via register_cpu_notifier()]
      
                                                    CPU online/offline operation
                                                    tries to acquire cpu_hotplug.lock
                                                    [via cpu_hotplug_begin()]
      
                                  *** DEADLOCK! ***
      
      The problem here is that callback registration takes the locks in one order
      whereas the CPU hotplug operations take the same locks in the opposite order.
      To avoid this issue and to provide a race-free method to register CPU hotplug
      callbacks (along with initialization of already online CPUs), introduce new
      variants of the callback registration APIs that simply register the callbacks
      without holding the cpu_add_remove_lock during the registration. That way,
      we can avoid the ABBA scenario. However, we will need to hold the
      cpu_add_remove_lock throughout the entire critical section, to protect updates
      to the callback/notifier chain.
      
      This can be achieved by writing the callback registration code as follows:
      
      	cpu_maps_update_begin(); [ or cpu_notifier_register_begin(); see below ]
      
      	for_each_online_cpu(cpu)
      		init_cpu(cpu);
      
      	/* This doesn't take the cpu_add_remove_lock */
      	__register_cpu_notifier(&foobar_cpu_notifier);
      
      	cpu_maps_update_done();  [ or cpu_notifier_register_done(); see below ]
      
      Note that we can't use get_online_cpus() here instead of cpu_maps_update_begin()
      because the cpu_hotplug.lock is dropped during the invocation of CPU_POST_DEAD
      notifiers, and hence get_online_cpus() cannot provide the necessary
      synchronization to protect the callback/notifier chains against concurrent
      reads and writes. On the other hand, since the cpu_add_remove_lock protects
      the entire hotplug operation (including CPU_POST_DEAD), we can use
      cpu_maps_update_begin/done() to guarantee proper synchronization.
      
      Also, since cpu_maps_update_begin/done() is like a super-set of
      get/put_online_cpus(), the former naturally protects the critical sections
      from concurrent hotplug operations.
      
      Since the names cpu_maps_update_begin/done() don't make much sense in CPU
      hotplug callback registration scenarios, we'll introduce new APIs named
      cpu_notifier_register_begin/done() and map them to cpu_maps_update_begin/done().
      
      In summary, introduce the lockless variants of un/register_cpu_notifier() and
      also export the cpu_notifier_register_begin/done() APIs for use by modules.
      This way, we provide a race-free way to register hotplug callbacks as well as
      perform initialization for the CPUs that are already online.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Acked-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NToshi Kani <toshi.kani@hp.com>
      Reviewed-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      93ae4f97
  5. 19 2月, 2014 1 次提交
  6. 16 10月, 2013 1 次提交
  7. 01 10月, 2013 1 次提交
    • T
      hotplug, powerpc, x86: Remove cpu_hotplug_driver_lock() · 6dedcca6
      Toshi Kani 提交于
      cpu_hotplug_driver_lock() serializes CPU online/offline operations
      when ARCH_CPU_PROBE_RELEASE is set.  This lock interface is no longer
      necessary with the following reason:
      
       - lock_device_hotplug() now protects CPU online/offline operations,
         including the probe & release interfaces enabled by
         ARCH_CPU_PROBE_RELEASE.  The use of cpu_hotplug_driver_lock() is
         redundant.
       - cpu_hotplug_driver_lock() is only valid when ARCH_CPU_PROBE_RELEASE
         is defined, which is misleading and is only enabled on powerpc.
      
      This patch removes the cpu_hotplug_driver_lock() interface.  As
      a result, ARCH_CPU_PROBE_RELEASE only enables / disables the cpu
      probe & release interface as intended.  There is no functional change
      in this patch.
      Signed-off-by: NToshi Kani <toshi.kani@hp.com>
      Reviewed-by: NNathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      6dedcca6
  8. 21 8月, 2013 1 次提交
    • S
      of: move of_get_cpu_node implementation to DT core library · 183912d3
      Sudeep KarkadaNagesha 提交于
      This patch moves the generalized implementation of of_get_cpu_node from
      PowerPC to DT core library, thereby adding support for retrieving cpu
      node for a given logical cpu index on any architecture.
      
      The CPU subsystem can now use this function to assign of_node in the
      cpu device while registering CPUs.
      
      It is recommended to use these helper function only in pre-SMP/early
      initialisation stages to retrieve CPU device node pointers in logical
      ordering. Once the cpu devices are registered, it can be retrieved easily
      from cpu device of_node which avoids unnecessary parsing and matching.
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Grant Likely <grant.likely@linaro.org>
      Acked-by: NRob Herring <rob.herring@calxeda.com>
      Signed-off-by: NSudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
      183912d3
  9. 13 8月, 2013 1 次提交
    • T
      ACPI / processor: Acquire writer lock to update CPU maps · b9d10be7
      Toshi Kani 提交于
      CPU system maps are protected with reader/writer locks.  The reader
      lock, get_online_cpus(), assures that the maps are not updated while
      holding the lock.  The writer lock, cpu_hotplug_begin(), is used to
      udpate the cpu maps along with cpu_maps_update_begin().
      
      However, the ACPI processor handler updates the cpu maps without
      holding the the writer lock.
      
      acpi_map_lsapic() is called from acpi_processor_hotadd_init() to
      update cpu_possible_mask and cpu_present_mask.  acpi_unmap_lsapic()
      is called from acpi_processor_remove() to update cpu_possible_mask.
      Currently, they are either unprotected or protected with the reader
      lock, which is not correct.
      
      For example, the get_online_cpus() below is supposed to assure that
      cpu_possible_mask is not changed while the code is iterating with
      for_each_possible_cpu().
      
              get_online_cpus();
              for_each_possible_cpu(cpu) {
      		:
              }
              put_online_cpus();
      
      However, this lock has no protection with CPU hotplug since the ACPI
      processor handler does not use the writer lock when it updates
      cpu_possible_mask.  The reader lock does not serialize within the
      readers.
      
      This patch protects them with the writer lock with cpu_hotplug_begin()
      along with cpu_maps_update_begin(), which must be held before calling
      cpu_hotplug_begin().  It also protects arch_register_cpu() /
      arch_unregister_cpu(), which creates / deletes a sysfs cpu device
      interface.  For this purpose it changes cpu_hotplug_begin() and
      cpu_hotplug_done() to global and exports them in cpu.h.
      Signed-off-by: NToshi Kani <toshi.kani@hp.com>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      b9d10be7
  10. 15 7月, 2013 1 次提交
    • P
      kernel: delete __cpuinit usage from all core kernel files · 0db0628d
      Paul Gortmaker 提交于
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      This removes all the uses of the __cpuinit macros from C files in
      the core kernel directories (kernel, init, lib, mm, and include)
      that don't really have a specific maintainer.
      
      [1] https://lkml.org/lkml/2013/5/20/589Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      0db0628d
  11. 13 6月, 2013 1 次提交
  12. 28 5月, 2013 1 次提交
  13. 08 4月, 2013 2 次提交
  14. 18 7月, 2012 1 次提交
    • T
      workqueue: perform cpu down operations from low priority cpu_notifier() · 65758202
      Tejun Heo 提交于
      Currently, all workqueue cpu hotplug operations run off
      CPU_PRI_WORKQUEUE which is higher than normal notifiers.  This is to
      ensure that workqueue is up and running while bringing up a CPU before
      other notifiers try to use workqueue on the CPU.
      
      Per-cpu workqueues are supposed to remain working and bound to the CPU
      for normal CPU_DOWN_PREPARE notifiers.  This holds mostly true even
      with workqueue offlining running with higher priority because
      workqueue CPU_DOWN_PREPARE only creates a bound trustee thread which
      runs the per-cpu workqueue without concurrency management without
      explicitly detaching the existing workers.
      
      However, if the trustee needs to create new workers, it creates
      unbound workers which may wander off to other CPUs while
      CPU_DOWN_PREPARE notifiers are in progress.  Furthermore, if the CPU
      down is cancelled, the per-CPU workqueue may end up with workers which
      aren't bound to the CPU.
      
      While reliably reproducible with a convoluted artificial test-case
      involving scheduling and flushing CPU burning work items from CPU down
      notifiers, this isn't very likely to happen in the wild, and, even
      when it happens, the effects are likely to be hidden by the following
      successful CPU down.
      
      Fix it by using different priorities for up and down notifiers - high
      priority for up operations and low priority for down operations.
      
      Workqueue cpu hotplug operations will soon go through further cleanup.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org
      Acked-by: N"Rafael J. Wysocki" <rjw@sisk.pl>
      65758202
  15. 01 6月, 2012 1 次提交
    • A
      cpu: introduce clear_tasks_mm_cpumask() helper · cb79295e
      Anton Vorontsov 提交于
      Many architectures clear tasks' mm_cpumask like this:
      
      	read_lock(&tasklist_lock);
      	for_each_process(p) {
      		if (p->mm)
      			cpumask_clear_cpu(cpu, mm_cpumask(p->mm));
      	}
      	read_unlock(&tasklist_lock);
      
      Depending on the context, the code above may have several problems,
      such as:
      
      1. Working with task->mm w/o getting mm or grabing the task lock is
         dangerous as ->mm might disappear (exit_mm() assigns NULL under
         task_lock(), so tasklist lock is not enough).
      
      2. Checking for process->mm is not enough because process' main
         thread may exit or detach its mm via use_mm(), but other threads
         may still have a valid mm.
      
      This patch implements a small helper function that does things
      correctly, i.e.:
      
      1. We take the task's lock while whe handle its mm (we can't use
         get_task_mm()/mmput() pair as mmput() might sleep);
      
      2. To catch exited main thread case, we use find_lock_task_mm(),
         which walks up all threads and returns an appropriate task
         (with task lock held).
      
      Also, Per Peter Zijlstra's idea, now we don't grab tasklist_lock in
      the new helper, instead we take the rcu read lock. We can do this
      because the function is called after the cpu is taken down and marked
      offline, so no new tasks will get this cpu set in their mm mask.
      Signed-off-by: NAnton Vorontsov <anton.vorontsov@linaro.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cb79295e
  16. 17 5月, 2012 1 次提交
    • P
      sched: Remove stale power aware scheduling remnants and dysfunctional knobs · 8e7fbcbc
      Peter Zijlstra 提交于
      It's been broken forever (i.e. it's not scheduling in a power
      aware fashion), as reported by Suresh and others sending
      patches, and nobody cares enough to fix it properly ...
      so remove it to make space free for something better.
      
      There's various problems with the code as it stands today, first
      and foremost the user interface which is bound to topology
      levels and has multiple values per level. This results in a
      state explosion which the administrator or distro needs to
      master and almost nobody does.
      
      Furthermore large configuration state spaces aren't good, it
      means the thing doesn't just work right because it's either
      under so many impossibe to meet constraints, or even if
      there's an achievable state workloads have to be aware of
      it precisely and can never meet it for dynamic workloads.
      
      So pushing this kind of decision to user-space was a bad idea
      even with a single knob - it's exponentially worse with knobs
      on every node of the topology.
      
      There is a proposal to replace the user interface with a single
      3 state knob:
      
       sched_balance_policy := { performance, power, auto }
      
      where 'auto' would be the preferred default which looks at things
      like Battery/AC mode and possible cpufreq state or whatever the hw
      exposes to show us power use expectations - but there's been no
      progress on it in the past many months.
      
      Aside from that, the actual implementation of the various knobs
      is known to be broken. There have been sporadic attempts at
      fixing things but these always stop short of reaching a mergable
      state.
      
      Therefore this wholesale removal with the hopes of spurring
      people who care to come forward once again and work on a
      coherent replacement.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1326104915.2442.53.camel@twinsSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8e7fbcbc
  17. 16 3月, 2012 1 次提交
    • P
      device.h: audit and cleanup users in main include dir · 313162d0
      Paul Gortmaker 提交于
      The <linux/device.h> header includes a lot of stuff, and
      it in turn gets a lot of use just for the basic "struct device"
      which appears so often.
      
      Clean up the users as follows:
      
      1) For those headers only needing "struct device" as a pointer
      in fcn args, replace the include with exactly that.
      
      2) For headers not really using anything from device.h, simply
      delete the include altogether.
      
      3) For headers relying on getting device.h implicitly before
      being included themselves, now explicitly include device.h
      
      4) For files in which doing #1 or #2 uncovers an implicit
      dependency on some other header, fix by explicitly adding
      the required header(s).
      
      Any C files that were implicitly relying on device.h to be
      present have already been dealt with in advance.
      
      Total removals from #1 and #2: 51.  Total additions coming
      from #3: 9.  Total other implicit dependencies from #4: 7.
      
      As of 3.3-rc1, there were 110, so a net removal of 42 gives
      about a 38% reduction in device.h presence in include/*
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      313162d0
  18. 27 1月, 2012 1 次提交
  19. 22 12月, 2011 1 次提交
    • K
      cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystem · 8a25a2fd
      Kay Sievers 提交于
      This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem
      and converts the devices to regular devices. The sysdev drivers are
      implemented as subsystem interfaces now.
      
      After all sysdev classes are ported to regular driver core entities, the
      sysdev implementation will be entirely removed from the kernel.
      
      Userspace relies on events and generic sysfs subsystem infrastructure
      from sysdev devices, which are made available with this conversion.
      
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Borislav Petkov <bp@amd64.org>
      Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Zhang Rui <rui.zhang@intel.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: NKay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      8a25a2fd
  20. 12 12月, 2011 1 次提交
    • J
      driver-core/cpu: Expose hotpluggability to the rest of the kernel · 2987557f
      Josh Triplett 提交于
      When architectures register CPUs, they indicate whether the CPU allows
      hotplugging; notably, x86 and ARM don't allow hotplugging CPU 0.
      Userspace can easily query the hotpluggability of a CPU via sysfs;
      however, the kernel has no convenient way of accessing that property in
      an architecture-independent way.  While the kernel can simply try it and
      see, some code needs to distinguish between "hotplug failed" and
      "hotplug has no hope of working on this CPU"; for example, rcutorture's
      CPU hotplug tests want to avoid drowning out real hotplug failures with
      expected failures.
      
      Expose this property via a new cpu_is_hotpluggable function, so that the
      rest of the kernel can access it in an architecture-independent way.
      Signed-off-by: NJosh Triplett <josh@joshtriplett.org>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      2987557f
  21. 05 11月, 2011 1 次提交
  22. 26 7月, 2011 1 次提交
    • A
      notifiers: cpu: move cpu notifiers into cpu.h · 80f1ff97
      Amerigo Wang 提交于
      We presently define all kinds of notifiers in notifier.h.  This is not
      necessary at all, since different subsystems use different notifiers, they
      are almost non-related with each other.
      
      This can also save much build time.  Suppose I add a new netdevice event,
      really I don't have to recompile all the source, just network related.
      Without this patch, all the source will be recompiled.
      
      I move the notify events near to their subsystem notifier registers, so
      that they can be found more easily.
      
      This patch:
      
      It is not necessary to share the same notifier.h.
      Signed-off-by: NWANG Cong <amwang@redhat.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      80f1ff97
  23. 11 11月, 2010 1 次提交
  24. 29 6月, 2010 1 次提交
    • T
      workqueue: reimplement CPU hotplugging support using trustee · db7bccf4
      Tejun Heo 提交于
      Reimplement CPU hotplugging support using trustee thread.  On CPU
      down, a trustee thread is created and each step of CPU down is
      executed by the trustee and workqueue_cpu_callback() simply drives and
      waits for trustee state transitions.
      
      CPU down operation no longer waits for works to be drained but trustee
      sticks around till all pending works have been completed.  If CPU is
      brought back up while works are still draining,
      workqueue_cpu_callback() tells trustee to step down and tell workers
      to rebind to the cpu.
      
      As it's difficult to tell whether cwqs are empty if it's freezing or
      frozen, trustee doesn't consider draining to be complete while a gcwq
      is freezing or frozen (tracked by new GCWQ_FREEZING flag).  Also,
      workers which get unbound from their cpu are marked with WORKER_ROGUE.
      
      Trustee based implementation doesn't bring any new feature at this
      point but it will be used to manage worker pool when dynamic shared
      worker pool is implemented.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      db7bccf4
  25. 09 6月, 2010 2 次提交
    • T
      sched: adjust when cpu_active and cpuset configurations are updated during cpu on/offlining · 3a101d05
      Tejun Heo 提交于
      Currently, when a cpu goes down, cpu_active is cleared before
      CPU_DOWN_PREPARE starts and cpuset configuration is updated from a
      default priority cpu notifier.  When a cpu is coming up, it's set
      before CPU_ONLINE but cpuset configuration again is updated from the
      same cpu notifier.
      
      For cpu notifiers, this presents an inconsistent state.  Threads which
      a CPU_DOWN_PREPARE notifier expects to be bound to the CPU can be
      migrated to other cpus because the cpu is no more inactive.
      
      Fix it by updating cpu_active in the highest priority cpu notifier and
      cpuset configuration in the second highest when a cpu is coming up.
      Down path is updated similarly.  This guarantees that all other cpu
      notifiers see consistent cpu_active and cpuset configuration.
      
      cpuset_track_online_cpus() notifier is converted to
      cpuset_update_active_cpus() which just updates the configuration and
      now called from cpuset_cpu_[in]active() notifiers registered from
      sched_init_smp().  If cpuset is disabled, cpuset_update_active_cpus()
      degenerates into partition_sched_domains() making separate notifier
      for !CONFIG_CPUSETS unnecessary.
      
      This problem is triggered by cmwq.  During CPU_DOWN_PREPARE, hotplug
      callback creates a kthread and kthread_bind()s it to the target cpu,
      and the thread is expected to run on that cpu.
      
      * Ingo's test discovered __cpuinit/exit markups were incorrect.
        Fixed.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Menage <menage@google.com>
      3a101d05
    • T
      sched: define and use CPU_PRI_* enums for cpu notifier priorities · 50a323b7
      Tejun Heo 提交于
      Instead of hardcoding priority 10 and 20 in sched and perf, collect
      them into CPU_PRI_* enums.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      50a323b7
  26. 09 12月, 2009 2 次提交
    • G
      powerpc/pseries: Serialize cpu hotplug operations during deactivate Vs deallocate · 51badebd
      Gautham R Shenoy 提交于
      Currently the cpu-allocation/deallocation process comprises of two steps:
      - Set the indicators and to update the device tree with DLPAR node
        information.
      
      - Online/offline the allocated/deallocated CPU.
      
      This is achieved by writing to the sysfs tunables "probe" during allocation
      and "release" during deallocation.
      
      At the sametime, the userspace can independently online/offline the CPUs of
      the system using the sysfs tunable "online".
      
      It is quite possible that when a userspace tool offlines a CPU
      for the purpose of deallocation and is in the process of updating the device
      tree, some other userspace tool could bring the CPU back online by writing to
      the "online" sysfs tunable thereby causing the deallocate process to fail.
      
      The solution to this is to serialize writes to the "probe/release" sysfs
      tunable with the writes to the "online" sysfs tunable.
      
      This patch employs a mutex to provide this serialization, which is a no-op on
      all architectures except PPC_PSERIES
      Signed-off-by: NGautham R Shenoy <ego@in.ibm.com>
      Acked-by: NVaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      51badebd
    • N
      sysfs/cpu: Add probe/release files · 12633e80
      Nathan Fontenot 提交于
      Version 3 of this patch is updated with documentation added to
      Documentation/ABI.  There are no changes to any of the C code from v2
      of the patch.
      
      In order to support kernel DLPAR of CPU resources we need to provide an
      interface to add (probe) and remove (release) the resource from the system.
      This patch Creates new generic probe and release sysfs files to facilitate
      cpu probe/release.  The probe/release interface provides for allowing each
      arch to supply their own routines for implementing the backend of adding
      and removing cpus to/from the system.
      
      This also creates the powerpc specific stubs to handle the arch callouts
      from writes to the sysfs files.
      
      The creation and use of these files is regulated by the
      CONFIG_ARCH_CPU_PROBE_RELEASE option so that only architectures that need the
      capability will have the files created.
      Signed-off-by: NNathan Fontenot <nfont@austin.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      12633e80
  27. 16 8月, 2009 1 次提交
  28. 23 6月, 2009 1 次提交
  29. 03 4月, 2009 1 次提交
  30. 09 9月, 2008 1 次提交
    • M
      kernel/cpu.c: create a CPU_STARTING cpu_chain notifier · e545a614
      Manfred Spraul 提交于
      Right now, there is no notifier that is called on a new cpu, before the new
      cpu begins processing interrupts/softirqs.
      Various kernel function would need that notification, e.g. kvm works around
      by calling smp_call_function_single(), rcu polls cpu_online_map.
      
      The patch adds a CPU_STARTING notification. It also adds a helper function
      that sends the message to all cpu_chain handlers.
      
      Tested on x86-64.
      All other archs are untested. Especially on sparc, I'm not sure if I got
      it right.
      Signed-off-by: NManfred Spraul <manfred@colorfullife.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e545a614
  31. 26 7月, 2008 1 次提交
    • O
      workqueues: make get_online_cpus() useable for work->func() · 3da1c84c
      Oleg Nesterov 提交于
      workqueue_cpu_callback(CPU_DEAD) flushes cwq->thread under
      cpu_maps_update_begin().  This means that the multithreaded workqueues
      can't use get_online_cpus() due to the possible deadlock, very bad and
      very old problem.
      
      Introduce the new state, CPU_POST_DEAD, which is called after
      cpu_hotplug_done() but before cpu_maps_update_done().
      
      Change workqueue_cpu_callback() to use CPU_POST_DEAD instead of CPU_DEAD.
      This means that create/destroy functions can't rely on get_online_cpus()
      any longer and should take cpu_add_remove_lock instead.
      
      [akpm@linux-foundation.org: fix CONFIG_SMP=n]
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Acked-by: NGautham R Shenoy <ego@in.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Paul Jackson <pj@sgi.com>
      Cc: Paul Menage <menage@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Vegard Nossum <vegard.nossum@gmail.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3da1c84c
  32. 29 4月, 2008 1 次提交
    • S
      cpu: fix section mismatch warnings in hotcpu_register · f718e318
      Sam Ravnborg 提交于
      Fix following warnings:
      WARNING: vmlinux.o(.data+0x5020): Section mismatch in reference from the variable cpu_vsyscall_notifier_nb.12876 to the function .cpuinit.text:cpu_vsyscall_notifier()
      WARNING: vmlinux.o(.data+0x9ce0): Section mismatch in reference from the variable profile_cpu_callback_nb.17654 to the function .devinit.text:profile_cpu_callback()
      WARNING: vmlinux.o(.data+0xd380): Section mismatch in reference from the variable workqueue_cpu_callback_nb.15004 to the function .devinit.text:workqueue_cpu_callback()
      WARNING: vmlinux.o(.data+0x11d00): Section mismatch in reference from the variable relay_hotcpu_callback_nb.19626 to the function .cpuinit.text:relay_hotcpu_callback()
      WARNING: vmlinux.o(.data+0x12970): Section mismatch in reference from the variable cpu_callback_nb.24694 to the function .devinit.text:cpu_callback()
      WARNING: vmlinux.o(.data+0x3fee0): Section mismatch in reference from the variable percpu_counter_hotcpu_callback_nb.10903 to the function .cpuinit.text:percpu_counter_hotcpu_callback()
      WARNING: vmlinux.o(.data+0x74ce0): Section mismatch in reference from the variable topology_cpu_callback_nb.12506 to the function .cpuinit.text:topology_cpu_callback()
      
      Functions used as argument are by definition only used in HOTPLUG_CPU
      situations so thay are annotated __cpuinit.  Annotate the static variable used
      by hotcpu_register with __cpuinitdata to match this definition.
      Signed-off-by: NSam Ravnborg <sam@ravnborg.org>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f718e318
  33. 19 4月, 2008 1 次提交
  34. 26 1月, 2008 3 次提交
  35. 07 1月, 2008 1 次提交
    • I
      CPU hotplug: fix cpu_is_offline() on !CONFIG_HOTPLUG_CPU · a263898f
      Ingo Molnar 提交于
      make randconfig bootup testing found that the cpufreq code
      crashes on bootup, if the powernow-k8 driver is enabled and
      if maxcpus=1 passed on the boot line to a !CONFIG_HOTPLUG_CPU
      kernel.
      
      First lockdep found out that there's an inconsistent unlock
      sequence:
      
       =====================================
       [ BUG: bad unlock balance detected! ]
       -------------------------------------
       swapper/1 is trying to release lock (&per_cpu(cpu_policy_rwsem, cpu)) at:
       [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
       but there are no more locks to release!
      
      Call Trace:
       [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
       [<ffffffff80251c29>] print_unlock_inbalance_bug+0x104/0x12c
       [<ffffffff80252f3a>] mark_held_locks+0x56/0x94
       [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
       [<ffffffff807008b6>] cpufreq_add_dev+0x2a8/0x5c4
       ...
      
      then shortly afterwards the cpufreq code crashed on an assert:
      
       ------------[ cut here ]------------
       kernel BUG at drivers/cpufreq/cpufreq.c:1068!
       invalid opcode: 0000 [1] SMP
       [...]
       Call Trace:
        [<ffffffff805145d6>] sysdev_driver_unregister+0x5b/0x91
        [<ffffffff806ff520>] cpufreq_register_driver+0x15d/0x1a2
        [<ffffffff80cc0596>] powernowk8_init+0x86/0x94
       [...]
       ---[ end trace 1e9219be2b4431de ]---
      
      the bug was caused by maxcpus=1 bootup, which brought up the
      secondary core as !cpu_online() but !cpu_is_offline() either,
      which on on !CONFIG_HOTPLUG_CPU is always 0 (include/linux/cpu.h):
      
        /* CPUs don't go offline once they're online w/o CONFIG_HOTPLUG_CPU */
        static inline int cpu_is_offline(int cpu) { return 0; }
      
      but the cpufreq code uses cpu_online() and cpu_is_offline() in
      a mixed way - the low-level drivers use cpu_online(), while
      the cpufreq core uses cpu_is_offline(). This opened up the
      possibility to add the non-initialized sysdev device of the
      secondary core:
      
       cpufreq-core: trying to register driver powernow-k8
       cpufreq-core: adding CPU 0
       powernow-k8: BIOS error - no PSB or ACPI _PSS objects
       cpufreq-core: initialization failed
       cpufreq-core: adding CPU 1
       cpufreq-core: initialization failed
      
      which then blew up. The fix is to make cpu_is_offline() always
      the negation of cpu_online(). With that fix applied the kernel
      boots up fine without crashing:
      
       Calling initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94()
       powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ processors (1 cpu cores) (version 2.20.00)
       powernow-k8: BIOS error - no PSB or ACPI _PSS objects
       initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94() returned -19.
       initcall 0xffffffff80cc0510 ran for 19 msecs: powernowk8_init+0x0/0x94()
       Calling initcall 0xffffffff80cc328f: init_lapic_nmi_sysfs+0x0/0x39()
      
      We could fix this by making CPU enumeration aware of max_cpus, but that
      would be more fragile IMO, and the cpu_online(cpu) != cpu_is_offline(cpu)
      possibility was quite confusing and a continuous source of bugs too.
      
      Most distributions have kernels with CPU hotplug enabled, so this bug
      remained hidden for a long time.
      
      Bug forensics:
      
      The broken cpu_is_offline() API variant was introduced via:
      
       commit a59d2e4e6977e7b94e003c96a41f07e96cddc340
       Author: Rusty Russell <rusty@rustcorp.com.au>
       Date:   Mon Mar 8 06:06:03 2004 -0800
      
           [PATCH] minor cleanups for hotplug CPUs
      
      ( this predates linux-2.6.git, this commit is available from Thomas's
        historic git tree. )
      
      Then 1.5 years later the cpufreq code made use of it:
      
       commit c32b6b8e
       Author: Ashok Raj <ashok.raj@intel.com>
       Date:   Sun Oct 30 14:59:54 2005 -0800
      
           [PATCH] create and destroy cpufreq sysfs entries based on cpu notifiers
      
       +       if (cpu_is_offline(cpu))
       +               return 0;
      
      which is a correct use of the subtly broken new API. v2.6.15 then
      shipped with this bug included.
      
      then it took two more years for random-kernel qa to hit it.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a263898f