1. 26 Sep, 2012 (15 commits)
    • x86: Use the new schedule_user API on userspace preemption · 0430499c
      Committed by Frederic Weisbecker
      This way we can exit the RCU extended quiescent state before
      we schedule a new task from irq/exception exit.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Exit RCU extended QS on user preemption · 20ab65e3
      Committed by Frederic Weisbecker
      When an exception or irq is about to resume userspace and
      the task needs to be rescheduled, the arch low-level code
      calls schedule() directly.

      If we call it, it is because the TIF_NEED_RESCHED flag is set:

      - It can be set after arbitrary local calls to set_need_resched()
        (RCU, drm, ...)

      - A wake up happened and the CPU needs preemption. This can
        happen in several ways:

          * Remotely: the remote waking CPU has set TIF_NEED_RESCHED and sent
            the wakee an IPI to schedule the new task.
          * Remotely enqueued: the remote waking CPU sends an IPI to the target
            and the wake up is made by the target.
          * Locally: waking CPU == wakee CPU and the wakeup is done locally.
            set_need_resched() is called without IPI.

      In the case of local and remotely enqueued wake ups, the tick can
      be restarted when we enqueue the new task and RCU can exit the
      extended quiescent state at the same time. Then by the time we reach
      the irq exit path and call schedule(), we are not in RCU user mode.

      But if we call schedule() only because something called set_need_resched(),
      RCU may still be in user mode when we reach schedule().

      Also if a wake up is done remotely, the CPU might see the TIF_NEED_RESCHED
      flag and call schedule() while the IPI that restarts the tick and exits
      RCU user mode has not yet happened.
      
      We need to manually protect against these corner cases.
      
      Create a new API schedule_user() that calls schedule() inside
      rcu_user_exit()-rcu_user_enter() in order to protect it. Archs
      will need to rely on it now to implement user preemption safely.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
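The wrapper described in the commit above can be pictured with a small user-space sketch. The function names follow the commit message, but every body here is a stub: the `in_user_eqs` flag, the call counter, and the demo function are assumptions made for illustration and are not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for RCU's per-CPU "in userspace" state. */
static bool in_user_eqs = true;
static int schedule_calls;

/* Stubs named after the kernel APIs; the bodies are illustrative only. */
static void rcu_user_exit(void)  { in_user_eqs = false; }
static void rcu_user_enter(void) { in_user_eqs = true; }

static void schedule(void)
{
	/* The scheduler may use RCU, so RCU must be watching this CPU. */
	assert(!in_user_eqs);
	schedule_calls++;
}

/* Wrap schedule() so it never runs inside the extended quiescent state. */
static void schedule_user(void)
{
	rcu_user_exit();
	schedule();
	rcu_user_enter();
}

/* Returns true if RCU is back in user mode after one user preemption. */
static bool demo_user_preemption(void)
{
	in_user_eqs = true;	/* an irq interrupted userspace */
	schedule_user();
	return in_user_eqs && schedule_calls > 0;
}
```

Calling `demo_user_preemption()` exercises the whole path: the assertion inside the stub `schedule()` would fire if the wrapper forgot to leave the quiescent state first.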
    • rcu: Exit RCU extended QS on kernel preemption after irq/exception · 90a340ed
      Committed by Frederic Weisbecker
      When an exception or an irq exits, and we are going to resume into
      interrupted kernel code, the low level architecture code calls
      preempt_schedule_irq() if there is a need to reschedule.
      
      If the interrupt/exception occurred between a call to rcu_user_enter()
      (from syscall exit, exception exit, do_notify_resume exit, ...) and
      a real resume to userspace (iret, ...), preempt_schedule_irq() can be
      called while RCU thinks we are in userspace. But preempt_schedule_irq()
      is going to run kernel code, possibly including RCU read-side critical
      sections. We must exit the userspace extended quiescent state before
      we call it.
      
      To solve this, just call rcu_user_exit() in the beginning of
      preempt_schedule_irq().
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • x86: Exception hooks for userspace RCU extended QS · 6ba3c97a
      Committed by Frederic Weisbecker
      Add the necessary hooks to x86 exceptions for userspace
      RCU extended quiescent state support.

      This includes traps, page faults, debug exceptions, etc.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • x86: Unspaghettize do_general_protection() · ef3f6288
      Committed by Frederic Weisbecker
      There is some unnatural label-based layout in this function.
      Convert the unnecessary gotos to readable conditional blocks.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
    • x86: Syscall hooks for userspace RCU extended QS · bf5a3c13
      Committed by Frederic Weisbecker
      Add syscall slow path hooks to notify syscall entry
      and exit on CPUs that want to support userspace RCU
      extended quiescent state.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Switch task's syscall hooks on context switch · 04e7e951
      Committed by Frederic Weisbecker
      Clear the syscall hook of a task when it is scheduled out so that if
      the task migrates, it doesn't run the syscall slow path on a CPU
      that might not need it.

      Also set the syscall hook on the next task if needed.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Ignore userspace extended quiescent state by default · 1e1a689f
      Committed by Frederic Weisbecker
      By default we don't want to enter the RCU extended quiescent
      state while in userspace because doing so adds overhead
      (e.g. use of the syscall slow path). Keep it off by default, ready to
      be switched on when a feature such as adaptive tickless needs it.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Allow rcu_user_enter()/exit() to nest · c5d900bf
      Committed by Frederic Weisbecker
      Allow calls to rcu_user_enter() even if we are already
      in userspace (as seen by RCU) and allow calls to rcu_user_exit()
      even if we are already in the kernel.

      This makes the APIs easier to call from architecture code.
      Exception entries, for example, won't need to know whether they come
      from userspace before calling rcu_user_exit().
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
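One way to read the nesting behaviour described in the commit above is that redundant calls become no-ops. The sketch below models that with a single boolean; the `in_user` flag, the `transitions` counter, and the demo function are assumptions for illustration, and the real kernel code may track this state differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU flag: does RCU believe this CPU is in userspace? */
static bool in_user;
static int transitions;	/* counts real state changes, for the demo only */

static void rcu_user_enter(void)
{
	if (in_user)		/* already in userspace as seen by RCU */
		return;
	in_user = true;
	transitions++;
}

static void rcu_user_exit(void)
{
	if (!in_user)		/* already in the kernel as seen by RCU */
		return;
	in_user = false;
	transitions++;
}

/* Mixes redundant and real calls; returns how many took effect. */
static int demo_nested_calls(void)
{
	in_user = false;
	transitions = 0;
	rcu_user_exit();	/* exception taken in kernel mode: no-op */
	rcu_user_enter();	/* real transition */
	rcu_user_enter();	/* redundant call: no-op */
	rcu_user_exit();	/* real transition */
	return transitions;
}
```

With this shape, an exception entry path can call `rcu_user_exit()` unconditionally, which is the flexibility the commit message is after.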
    • rcu: Settle config for userspace extended quiescent state · 2b1d5024
      Committed by Frederic Weisbecker
      Create a new config option under the RCU menu that puts
      CPUs under RCU extended quiescent state (as in dynticks
      idle mode) when they run in userspace. This requires
      some contribution from architectures to hook into kernel
      and userspace boundaries.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Make RCU_FAST_NO_HZ handle adaptive ticks · 9a0c6fef
      Committed by Paul E. McKenney
      The current implementation of RCU_FAST_NO_HZ tries reasonably hard to rid
      the current CPU of RCU callbacks.  This is appropriate when the CPU is
      entering idle, where it doesn't have much useful to do anyway, but is most
      definitely not what you want when transitioning to user-mode execution.
      This commit therefore detects the adaptive-tick case, and refrains from
      burning CPU time getting rid of RCU callbacks in that case.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: New rcu_user_enter_after_irq() and rcu_user_exit_after_irq() APIs · 19dd1591
      Committed by Frederic Weisbecker
      In some cases, it is necessary to enter or exit userspace-RCU-idle mode
      from an interrupt handler, for example, if some other CPU sends this
      CPU a resched IPI.  In this case, the current CPU would enter the IPI
      handler in userspace-RCU-idle mode, but would need to exit the IPI handler
      after having exited that mode.
      
      To allow this to work, this commit adds two new APIs to TREE_RCU:
      
      - rcu_user_enter_after_irq(). This must be called from an interrupt between
      rcu_irq_enter() and rcu_irq_exit().  After the irq calls rcu_irq_exit(),
      the irq handler will return into an RCU extended quiescent state.
      In theory, this interrupt is never a nested interrupt, but in practice
      it might interrupt softirq, which looks to RCU like a nested interrupt.
      
      - rcu_user_exit_after_irq(). This must be called from a non-nesting
      interrupt, interrupting an RCU extended quiescent state, also
      between rcu_irq_enter() and rcu_irq_exit(). After the irq calls
      rcu_irq_exit(), the irq handler will return in an RCU non-quiescent
      state.
      
      [ Combined with "Allow calls to rcu_exit_user_irq from nesting irqs." ]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: New rcu_user_enter() and rcu_user_exit() APIs · adf5091e
      Committed by Frederic Weisbecker
      RCU currently insists that only idle tasks can enter RCU idle mode, which
      prohibits an adaptive tickless kernel (AKA nohz cpusets), which in turn
      would mean that usermode execution would always take scheduling-clock
      interrupts, even when there is only one task runnable on the CPU in
      question.
      
      This commit therefore adds rcu_user_enter() and rcu_user_exit(), which
      allow non-idle tasks to enter RCU idle mode.  These are quite similar
      to rcu_idle_enter() and rcu_idle_exit(), respectively, except that they
      omit the idle-task checks.
      
      [ Updated to use "user" flag rather than separate check functions. ]
      
      [ paulmck: Updated to drop exports of new functions based on Josh's patch
        getting rid of the need for them. ]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • Merge remote-tracking branch 'tip/core/rcu' into next.2012.09.25b · 593d1006
      Committed by Paul E. McKenney
      Resolved conflict in kernel/sched/core.c using Peter Zijlstra's
      approach from https://lkml.org/lkml/2012/9/5/585.
    • Merge remote-tracking branch 'tip/smp/hotplug' into next.2012.09.25b · 5217192b
      Committed by Paul E. McKenney
      The conflicts between kernel/rcutree.h and kernel/rcutree_plugin.h
      were due to adjacent insertions and deletions, which were resolved
      by simply accepting the changes on both branches.
  2. 25 Sep, 2012 (2 commits)
    • Merge tag 'v3.6-rc7' into core/rcu · 9b20aa63
      Committed by Ingo Molnar
      Merge Linux 3.6-rc7, to pick up fixes and to resolve a conflict in an
      upcoming pull.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Merge branches 'bigrt.2012.09.23a', 'doctorture.2012.09.23a',... · bda4ec9f
      Committed by Paul E. McKenney
      Merge branches 'bigrt.2012.09.23a', 'doctorture.2012.09.23a', 'fixes.2012.09.23a', 'hotplug.2012.09.23a' and 'idlechop.2012.09.23a' into HEAD
      
      bigrt.2012.09.23a contains additional commits to reduce scheduling latency
      	from RCU on huge systems (many hundreds or thousands of CPUs).
      
      doctorture.2012.09.23a contains documentation changes and rcutorture fixes.
      
      fixes.2012.09.23a contains miscellaneous fixes.
      
      hotplug.2012.09.23a contains CPU-hotplug-related changes.
      
      idlechop.2012.09.23a fixes architectures for which RCU no longer considered
      	the idle loop to be a quiescent state due to earlier
      	adaptive-dynticks changes.  Affected architectures are alpha,
      	cris, frv, h8300, m32r, m68k, mn10300, parisc, score, xtensa,
      	and ia64.
  3. 24 Sep, 2012 (9 commits)
    • Linux 3.6-rc7 · 979570e0
      Committed by Linus Torvalds
    • Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild · 56bae802
      Committed by Linus Torvalds
      Pull kbuild fixes from Michal Marek:
       "There are two more kbuild fixes for 3.6.
      
        One fixes a race between x86's archscripts target and the rule
        (re)building scripts/basic/fixdep.  The second is a fix for the
        previous attempt at fixing make firmware_install with make 3.82.
        This new solution should work with any version of GNU make"
      
      * 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
        x86/kbuild: archscripts depends on scripts_basic
        firmware: fix directory creation rule matching with make 3.80
    • Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging · 0737c8d7
      Committed by Linus Torvalds
      Pull hwmon subsystem fixes from Jean Delvare.
      
      * 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
        hwmon: (fam15h_power) Tweak runavg_range on resume
        hwmon: (coretemp) Use get_online_cpus to avoid races involving CPU hotplug
        hwmon: (via-cputemp) Use get_online_cpus to avoid races involving CPU hotplug
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 0bf7a705
      Committed by Linus Torvalds
      Pull SCSI fixes from James Bottomley:
       "This is a set of four essential fixes: two oops related (bnx2i,
        virtio-scsi), one data corruption related (hpsa) and one failure to
        boot due to interrupt routing issues (mpt2sas).
      
        Signed-off-by: James Bottomley <JBottomley@Parallels.com>"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        [SCSI] hpsa: fix handling of protocol error
        [SCSI] mpt2sas: Fix for issue - Unable to boot from the drive connected to HBA
        [SCSI] bnx2i: Fixed NULL ptr deference for 1G bnx2 Linux iSCSI offload
        [SCSI] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list
    • edac_mc: edac_mc_free() cannot assume mem_ctl_info is registered in sysfs. · faa2ad09
      Committed by Shaun Ruffell
      Fix potential NULL pointer dereference in edac_unregister_sysfs() on
      system boot introduced in 3.6-rc1.
      
      Since commit 7a623c03 ("edac: rewrite the sysfs code to use struct
      device") edac_mc_alloc() no longer initializes embedded kobjects in
      struct mem_ctl_info.  Therefore edac_mc_free() can no longer simply
      decrement a kobject reference count to free the allocated memory unless
      the memory controller driver module had also called edac_mc_add_mc().
      
      Now edac_mc_free() will check if the newly embedded struct device has
      been registered with sysfs before using either the standard device
      release functions or freeing the data structures itself with logic
      pulled out of the error path of edac_mc_alloc().
      
      The BUG this patch resolves for me:
      
        BUG: unable to handle kernel NULL pointer dereference at   (null)
        EIP is at __wake_up_common+0x1a/0x6a
        Process modprobe (pid: 933, ti=f3dc6000 task=f3db9520 task.ti=f3dc6000)
        Call Trace:
          complete_all+0x3f/0x50
          device_pm_remove+0x23/0xa2
          device_del+0x34/0x142
          edac_unregister_sysfs+0x3b/0x5c [edac_core]
          edac_mc_free+0x29/0x2f [edac_core]
          e7xxx_probe1+0x268/0x311 [e7xxx_edac]
          e7xxx_init_one+0x56/0x61 [e7xxx_edac]
          local_pci_probe+0x13/0x15
        ...
      
      Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
      Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
      Signed-off-by: Shaun Ruffell <sruffell@digium.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • edac_mc: fix messy kfree calls in the error path · ef6e7816
      Committed by Fengguang Wu
      coccinelle warns about:
      
      + drivers/edac/edac_mc.c:429:9-23: ERROR: reference preceded by free on line 429
      
         421         if (mci->csrows) {
       > 422                 for (chn = 0; chn < tot_channels; chn++) {
         423                         csr = mci->csrows[chn];
         424                         if (csr) {
       > 425                                 for (chn = 0; chn < tot_channels; chn++)
         426                                          kfree(csr->channels[chn]);
         427                                  kfree(csr);
         428                          }
       > 429                          kfree(mci->csrows[i]);
         430                  }
         431                  kfree(mci->csrows);
         432          }
      
      and that code block seems to mess things up in several ways (double free,
      memory leak, out-of-bounds reads, etc.):

      L422: The iterator "chn" and bound "tot_channels" are wrong. They should be
            "row" and "tot_csrows" respectively. This means either a memory leak or
            out-of-bounds reads (which, if they do not trigger an immediate page
            fault, will lead to kfree() on random addresses).

      L425: The inner loop reuses the same iterator "chn" as the outer loop,
            which could end the outer loop prematurely and hence leak memory.

      L429: The array index 'i' in mci->csrows[i] is a leftover value from
            previous loops and does not change in the current loop. This
            means either an out-of-bounds read and possibly kfree(random number), or
            the same mci->csrows[i] being freed again and again, and possibly a
            double free with the kfree(csr) in L427.

      L426/L427: a kfree(csr->channels) is needed in between to avoid leaking the memory.
      
      The buggy code was introduced by commit de3910eb ("edac: change the mem
      allocation scheme to make Documentation/kobject.txt happy") in the 3.6-rc1
      merge window. Fix it by freeing up resources in this order:
      
        free csrows[i]->channels[j]
        free csrows[i]->channels
        free csrows[i]
        free csrows
      
      CC: Mauro Carvalho Chehab <mchehab@redhat.com>
      CC: Shaun Ruffell <sruffell@digium.com>
      Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
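The free order listed in the commit above (channels[j], then channels, then csrows[i], then csrows) can be tried out in user space. The sketch below is not the EDAC code: the struct layouts are simplified stand-ins, `free()` replaces `kfree()`, and `counted_free()` and `demo_free_order()` are hypothetical helpers added only to make the order observable.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified, hypothetical stand-ins for the EDAC structures. */
struct channel { int id; };
struct csrow   { struct channel **channels; };
struct mci     { struct csrow **csrows; int tot_csrows; int tot_channels; };

static int frees;	/* counts non-NULL frees, for the demo only */

static void counted_free(void *p)
{
	if (p)
		frees++;
	free(p);
}

/* Free innermost objects first, each exactly once, with separate iterators. */
static void free_csrows(struct mci *mci)
{
	int row, chn;

	if (!mci->csrows)
		return;
	for (row = 0; row < mci->tot_csrows; row++) {
		struct csrow *csr = mci->csrows[row];

		if (!csr)
			continue;
		if (csr->channels) {
			for (chn = 0; chn < mci->tot_channels; chn++)
				counted_free(csr->channels[chn]);
			counted_free(csr->channels);
		}
		counted_free(csr);
	}
	counted_free(mci->csrows);
	mci->csrows = NULL;
}

/* Allocate rows x chans objects, free them, and return the number of frees. */
static int demo_free_order(int rows, int chans)
{
	struct mci mci = { .tot_csrows = rows, .tot_channels = chans };
	int r, c;

	frees = 0;
	mci.csrows = calloc(rows, sizeof(*mci.csrows));
	assert(mci.csrows);
	for (r = 0; r < rows; r++) {
		mci.csrows[r] = calloc(1, sizeof(struct csrow));
		assert(mci.csrows[r]);
		mci.csrows[r]->channels = calloc(chans, sizeof(struct channel *));
		assert(mci.csrows[r]->channels);
		for (c = 0; c < chans; c++) {
			mci.csrows[r]->channels[c] = calloc(1, sizeof(struct channel));
			assert(mci.csrows[r]->channels[c]);
		}
	}
	free_csrows(&mci);
	return frees;
}
```

For 2 rows and 3 channels the count is 2 * (3 + 1 + 1) + 1 = 11 frees, one per allocation, which is the property the buggy loop violated.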
    • hwmon: (fam15h_power) Tweak runavg_range on resume · 5f0ecb90
      Committed by Andreas Herrmann
      The quirk introduced with commit
      00250ec9 (hwmon: fam15h_power: fix
      bogus values with current BIOSes) is not only required during driver
      load but also when the system resumes from suspend. The BIOS might set the
      previously recommended (but unsuitable) initialization value for the
      running average range register during resume.
      Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
      Tested-by: Andreas Hartmann <andihartmann@01019freenet.de>
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
      Cc: stable@vger.kernel.org # 3.0+
    • hwmon: (coretemp) Use get_online_cpus to avoid races involving CPU hotplug · 641f1456
      Committed by Silas Boyd-Wickizer
      coretemp_init loops with for_each_online_cpu, adding platform_devices
      and sysfs interfaces, then calls register_hotcpu_notifier.  There is a
      race if a CPU is offlined or onlined after the loop, but before
      register_hotcpu_notifier.  The race might result in the absence of a
      platform_device+sysfs interface for an online CPU, or the presence of
      a platform_device+sysfs interface for an offline CPU.  A similar race
      occurs during coretemp_exit, after the module calls
      unregister_hotcpu_notifier, but before it unregisters all devices, a
      CPU might offline and a device for an offline CPU will exist for a
      short while.
      
      This fix surrounds for_each_online_cpu and register_hotcpu_notifier
      with get_online_cpus+put_online_cpus; and surrounds
      unregister_hotcpu_notifier and device unregistering with
      get_online_cpus+put_online_cpus.
      
      Build tested.
      Signed-off-by: Silas Boyd-Wickizer <sbw@mit.edu>
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
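The locking pattern in the coretemp fix above can be modelled in a few lines of single-threaded C. This is only a sketch: `get_online_cpus()` and `put_online_cpus()` are reduced to a reader count, a hotplug request is refused instead of blocking as it would in the kernel, and the arrays, `try_cpu_offline()`, `driver_init()`, and `demo_consistent()` are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

static bool cpu_online[NR_CPUS] = { true, true, true, false };
static bool device_present[NR_CPUS];
static bool notifier_registered;
static int hotplug_readers;	/* models the get_online_cpus() hold count */

static void get_online_cpus(void) { hotplug_readers++; }
static void put_online_cpus(void) { hotplug_readers--; }

/* A hotplug request proceeds only when nobody holds the CPU set stable. */
static bool try_cpu_offline(int cpu)
{
	if (hotplug_readers > 0)
		return false;	/* the real kernel would block here, not fail */
	cpu_online[cpu] = false;
	if (notifier_registered)
		device_present[cpu] = false;	/* notifier removes the device */
	return true;
}

static void driver_init(void)
{
	int cpu;
	bool raced;

	get_online_cpus();
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			device_present[cpu] = true;
	/* This is the window where an unprotected offline used to slip in. */
	raced = try_cpu_offline(1);
	assert(!raced);
	notifier_registered = true;
	put_online_cpus();
}

/* Returns true if devices match online CPUs after init plus one offline. */
static bool demo_consistent(void)
{
	int cpu;

	driver_init();
	(void)try_cpu_offline(1);	/* now seen by the notifier */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (device_present[cpu] != cpu_online[cpu])
			return false;
	return true;
}
```

Because the loop and the notifier registration sit inside one held section, no CPU can change state between them, so every later hotplug event reaches the notifier.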
    • hwmon: (via-cputemp) Use get_online_cpus to avoid races involving CPU hotplug · 1ec3ddfd
      Committed by Silas Boyd-Wickizer
      via_cputemp_init loops with for_each_online_cpu, adding
      platform_devices, then calls register_hotcpu_notifier.  If a CPU is
      offlined between the loop and register_hotcpu_notifier, then later
      onlined, via_cputemp_device_add will attempt to add platform devices
      with the same ID.  A similar race occurs during via_cputemp_exit,
      after the module calls unregister_hotcpu_notifier, a CPU might offline
      and a device will exist for a CPU that is offline.
      
      This fix surrounds for_each_online_cpu and register_hotcpu_notifier
      with get_online_cpus+put_online_cpus; and surrounds
      unregister_hotcpu_notifier and device unregistering with
      get_online_cpus+put_online_cpus.
      
      Build tested.
      Signed-off-by: Silas Boyd-Wickizer <sbw@mit.edu>
      Acked-by: Harald Welte <laforge@gnumonks.org>
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
  4. 23 Sep, 2012 (14 commits)
    • ia64: Add missing RCU idle APIs on idle loop · 93482f4e
      Committed by Paul E. McKenney
      Traditionally, the entire idle task served as an RCU quiescent state.
      But when RCU read side critical sections started appearing within the
      idle loop, this traditional strategy became untenable.  The fix was to
      create new RCU APIs named rcu_idle_enter() and rcu_idle_exit(), which
      must be called by each architecture's idle loop so that RCU can tell
      when it is safe to ignore a given idle CPU.
      
      Unfortunately, this fix was never applied to ia64, a shortcoming remedied
      by this commit.
      
      Reported-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Tony Luck <tony.luck@intel.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • xtensa: Add missing RCU idle APIs on idle loop · 11ad47a0
      Committed by Frederic Weisbecker
      In the old times, the whole idle task was considered
      an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even in the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      in the idle loop that doesn't make use of rcu read side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the xtensa idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      11ad47a0
    • F
      score: Add missing RCU idle APIs on idle loop · 0ee23fda
      Frederic Weisbecker committed
      In the old days, the whole idle task was considered
      an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even to the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      of the idle loop that doesn't make use of RCU read-side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the score idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Chen Liqin <liqin.chen@sunplusct.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      0ee23fda
    • F
      parisc: Add missing RCU idle APIs on idle loop · fbe75218
      Frederic Weisbecker committed
      In the old days, the whole idle task was considered
      an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even to the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      of the idle loop that doesn't make use of RCU read-side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the parisc idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Parisc <linux-parisc@vger.kernel.org>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      fbe75218
    • F
      mn10300: Add missing RCU idle APIs on idle loop · 5b0753a9
      Frederic Weisbecker committed
      In the old days, the whole idle task was considered
      an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even to the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      of the idle loop that doesn't make use of RCU read-side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the mn10300 idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      5b0753a9
    • F
      m68k: Add missing RCU idle APIs on idle loop · 5b57ba37
      Frederic Weisbecker committed
      In the old days, the whole idle task was considered
      an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even to the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      of the idle loop that doesn't make use of RCU read-side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the m68k idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: m68k <linux-m68k@lists.linux-m68k.org>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      5b57ba37
    • F
      m32r: Add missing RCU idle APIs on idle loop · 48ae077c
      Frederic Weisbecker committed
      In the old days, the whole idle task was considered
      an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even to the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      of the idle loop that doesn't make use of RCU read-side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the m32r idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      48ae077c
    • F
      h8300: Add missing RCU idle APIs on idle loop · b2fe1430
      Frederic Weisbecker committed
      In the old days, the whole idle task was considered
      an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even to the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      of the idle loop that doesn't make use of RCU read-side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the h8300 idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      b2fe1430
    • F
      frv: Add missing RCU idle APIs on idle loop · 41d8fe5b
      Frederic Weisbecker committed
      In the old days, the whole idle task was considered
      an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even to the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      of the idle loop that doesn't make use of RCU read-side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the Frv idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: <stable@vger.kernel.org> # 3.3+
      Acked-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      41d8fe5b
    • F
      cris: Add missing RCU idle APIs on idle loop · c633f9e7
      Frederic Weisbecker committed
      In the old days, the whole idle task was considered
      an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even to the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      of the idle loop that doesn't make use of RCU read-side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the Cris idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Cris <linux-cris-kernel@axis.com>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      c633f9e7
    • F
      alpha: Add missing RCU idle APIs on idle loop · 4c94cada
      Frederic Weisbecker committed
      In the old days, the whole idle task was considered
      an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even to the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      of the idle loop that doesn't make use of RCU read-side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the Alpha idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Tested-by: Michael Cree <mcree@orcon.net.nz>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: alpha <linux-alpha@vger.kernel.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: <stable@vger.kernel.org> # 3.3+
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      4c94cada
    • F
      alpha: Fix preemption handling in idle loop · 6a6c0272
      Frederic Weisbecker committed
      cpu_idle() is called on the boot CPU by the init code with
      preemption disabled. But the cpu_idle() function in alpha
      doesn't handle this when it calls schedule() directly.
      
      Fix it by converting it into schedule_preempt_disabled().
      
      Also disable preemption before calling cpu_idle() from
      secondary CPU entry code to stay consistent with this
      state.
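      The difference between the two calls can be sketched with a toy
      preemption counter. This is a simplified userspace model, not the
      alpha patch: the functions reuse kernel names only as stand-ins,
      the "scheduling with preemption disabled" check is reduced to a
      single counter test, and atomic_sched_bugs is a hypothetical name.

```c
#include <assert.h>

/* Toy preemption counter: 0 means preemption is enabled. */
static int preempt_count;
/* Counts calls to schedule() made while preemption was disabled. */
static int atomic_sched_bugs;

static void preempt_disable(void)
{
    preempt_count++;
}

static void preempt_enable_no_resched(void)
{
    assert(preempt_count > 0);  /* must pair with preempt_disable() */
    preempt_count--;
}

/* Toy schedule(): only records whether the caller broke the rule. */
static void schedule(void)
{
    if (preempt_count != 0)
        atomic_sched_bugs++;
}

/*
 * For callers that run with preemption disabled, such as the idle
 * loop: re-enable preemption, schedule, then disable it again so the
 * caller's state is unchanged on return.
 */
static void schedule_preempt_disabled(void)
{
    preempt_enable_no_resched();
    schedule();
    preempt_disable();
}
```

      In this model an idle loop entered with the counter at 1 can call
      schedule_preempt_disabled() repeatedly and the counter stays at 1,
      whereas a direct schedule() call is recorded as a bug.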
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Tested-by: Michael Cree <mcree@orcon.net.nz>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: alpha <linux-alpha@vger.kernel.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      6a6c0272
    • S
      Use get_online_cpus to avoid races involving CPU hotplug · 429227bb
      Silas Boyd-Wickizer committed
      If arch/x86/kernel/cpuid.c is a module, a CPU might go offline or come
      online between the for_each_online_cpu() loop and the call to
      register_hotcpu_notifier in cpuid_init or the call to
      unregister_hotcpu_notifier in cpuid_exit.  The potential races can
      lead to leaks/duplicates, attempts to destroy non-existent devices, or
      random pointer dereferences.
      
      For example, in cpuid_exit if:
      
              for_each_online_cpu(cpu)
                      cpuid_device_destroy(cpu);
              class_destroy(cpuid_class);
              __unregister_chrdev(CPUID_MAJOR, 0, NR_CPUS, "cpu/cpuid");
              <----- CPU onlines
              unregister_hotcpu_notifier(&cpuid_class_cpu_notifier);
      
      the hotcpu notifier will attempt to create a device for the
      cpuid_class, which the module already destroyed.
      
      This fix surrounds for_each_online_cpu and register_hotcpu_notifier or
      unregister_hotcpu_notifier with get_online_cpus+put_online_cpus.
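      The exit-path race and the fix can be modelled in userspace as
      follows. This is an illustrative sketch, not the cpuid.c change:
      the hotplug lock is reduced to a reader count, cpu_up() refuses to
      run while a reader exists (the real kernel would make the hotplug
      operation wait instead), and race_window, exit_racy(), exit_fixed()
      and the other helpers are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

static bool cpu_is_online[NR_CPUS];
static bool device_exists[NR_CPUS];
static bool notifier_registered;
static int  hotplug_readers;        /* held get_online_cpus() references */
static void (*race_window)(void);   /* hook run between loop and unregister */

static void get_online_cpus(void) { hotplug_readers++; }

static void put_online_cpus(void)
{
    assert(hotplug_readers > 0);
    hotplug_readers--;
}

/* Toy hotplug: refused while any reader holds the online set stable. */
static bool cpu_up(int cpu)
{
    if (hotplug_readers > 0)
        return false;
    cpu_is_online[cpu] = true;
    if (notifier_registered)
        device_exists[cpu] = true;  /* notifier creates the device */
    return true;
}

static void destroy_devices_and_unregister(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu_is_online[cpu])
            device_exists[cpu] = false;
    if (race_window)
        race_window();              /* the "CPU onlines" point */
    notifier_registered = false;
}

static void exit_racy(void) { destroy_devices_and_unregister(); }

static void exit_fixed(void)
{
    get_online_cpus();
    destroy_devices_and_unregister();
    put_online_cpus();
}

static void online_cpu2(void) { (void)cpu_up(2); }

/* CPUs 0 and 1 online with devices, notifier registered. */
static void reset(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        cpu_is_online[cpu] = cpu < 2;
        device_exists[cpu] = cpu < 2;
    }
    notifier_registered = true;
    hotplug_readers = 0;
    race_window = online_cpu2;
}

static int leaked_devices(void)
{
    int n = 0;
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        n += device_exists[cpu];
    return n;
}
```

      With the hook bringing CPU 2 online inside the window, exit_racy()
      leaves one device behind in this model, while exit_fixed() leaves
      none because the hotplug attempt is held off until after
      put_online_cpus().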
      
      Tested on a VM.
      Signed-off-by: Silas Boyd-Wickizer <sbw@mit.edu>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      429227bb
    • S
      Use get_online_cpus to avoid races involving CPU hotplug · a2db672a
      Silas Boyd-Wickizer committed
      If arch/x86/kernel/msr.c is a module, a CPU might go offline or come
      online between the for_each_online_cpu(i) loop and the call to
      register_hotcpu_notifier in msr_init or the call to
      unregister_hotcpu_notifier in msr_exit. The potential races can lead
      to leaks/duplicates, attempts to destroy non-existent devices, or
      random pointer dereferences.
      
      For example, in msr_init if:
      
              for_each_online_cpu(i) {
                      err = msr_device_create(i);
                      if (err != 0)
                              goto out_class;
              }
              <----- CPU offlines
              register_hotcpu_notifier(&msr_class_cpu_notifier);
      
      and the CPU never onlines before msr_exit, then the module will never
      call msr_device_destroy for the associated CPU.
      
      This fix surrounds for_each_online_cpu and register_hotcpu_notifier or
      unregister_hotcpu_notifier with get_online_cpus+put_online_cpus.
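      The init-path leak described above can be reproduced in a small
      userspace model. This is a sketch under stated assumptions, not the
      msr.c change: the hotplug lock is a plain reader count, cpu_down()
      is refused rather than delayed while a reader exists, and
      race_window, init_racy(), init_fixed() and toy_exit() are
      hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

static bool cpu_is_online[NR_CPUS];
static bool device_exists[NR_CPUS];
static bool notifier_registered;
static int  hotplug_readers;        /* held get_online_cpus() references */
static void (*race_window)(void);   /* hook run between loop and register */

static void get_online_cpus(void) { hotplug_readers++; }

static void put_online_cpus(void)
{
    assert(hotplug_readers > 0);
    hotplug_readers--;
}

/* Toy hotplug: refused while any reader holds the online set stable. */
static bool cpu_down(int cpu)
{
    if (hotplug_readers > 0)
        return false;
    cpu_is_online[cpu] = false;
    if (notifier_registered)
        device_exists[cpu] = false; /* notifier destroys the device */
    return true;
}

static void create_devices_and_register(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu_is_online[cpu])
            device_exists[cpu] = true;
    if (race_window)
        race_window();              /* the "CPU offlines" point */
    notifier_registered = true;
}

static void init_racy(void) { create_devices_and_register(); }

static void init_fixed(void)
{
    get_online_cpus();
    create_devices_and_register();
    put_online_cpus();
}

/* Exit path: destroys devices only for CPUs that are online now. */
static void toy_exit(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu_is_online[cpu])
            device_exists[cpu] = false;
    notifier_registered = false;
}

static void offline_cpu1(void) { (void)cpu_down(1); }

/* CPUs 0 and 1 online, no devices yet, notifier not registered. */
static void reset(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        cpu_is_online[cpu] = cpu < 2;
        device_exists[cpu] = false;
    }
    notifier_registered = false;
    hotplug_readers = 0;
    race_window = offline_cpu1;
}

static int leaked_devices(void)
{
    int n = 0;
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        n += device_exists[cpu];
    return n;
}
```

      In the racy variant CPU 1 goes offline before the notifier exists,
      so nothing destroys its device and the exit loop skips it. In the
      fixed variant the offline attempt cannot happen inside the window,
      so the exit loop still sees CPU 1 and cleans up.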
      
      Tested on a VM.
      Signed-off-by: Silas Boyd-Wickizer <sbw@mit.edu>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      a2db672a