1. 27 3月, 2015 1 次提交
  2. 16 2月, 2015 1 次提交
    • R
      PM / sleep: Make it possible to quiesce timers during suspend-to-idle · 124cf911
      Rafael J. Wysocki 提交于
      The efficiency of suspend-to-idle depends on being able to keep CPUs
      in the deepest available idle states for as much time as possible.
      Ideally, they should only be brought out of idle by system wakeup
      interrupts.
      
      However, timer interrupts occurring periodically prevent that from
      happening and it is not practical to chase all of the "misbehaving"
      timers in a whack-a-mole fashion.  A much more effective approach is
      to suspend the local ticks for all CPUs and the entire timekeeping
      along the lines of what is done during full suspend, which also
      helps to keep suspend-to-idle and full suspend reasonably similar.
      
      The idea is to suspend the local tick on each CPU executing
      cpuidle_enter_freeze() and to make the last of them suspend the
      entire timekeeping.  That should prevent timer interrupts from
      triggering until an IO interrupt wakes up one of the CPUs.  It
      needs to be done with interrupts disabled on all of the CPUs,
      though, because otherwise the suspended clocksource might be
      accessed by an interrupt handler which might lead to fatal
      consequences.
      
      Unfortunately, the existing ->enter callbacks provided by cpuidle
      drivers generally cannot be used for implementing that, because some
      of them re-enable interrupts temporarily and some idle entry methods
      cause interrupts to be re-enabled automatically on exit.  Also some
      of these callbacks manipulate local clock event devices of the CPUs
      which really shouldn't be done after suspending their ticks.
      
      To overcome that difficulty, introduce a new cpuidle state callback,
      ->enter_freeze, that will be guaranteed (1) to keep interrupts
      disabled all the time (and return with interrupts disabled) and (2)
      not to touch the CPU timer devices.  Modify cpuidle_enter_freeze() to
      look for the deepest available idle state with ->enter_freeze present
      and to make the CPU execute that callback with suspended tick (and the
      last of the online CPUs to execute it with suspended timekeeping).
      Suggested-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      124cf911
  3. 14 9月, 2014 1 次提交
    • F
      nohz: Move nohz full init call to tick init · a80e49e2
      Frederic Weisbecker 提交于
      This way we unbloat a bit main.c and more importantly we initialize
      nohz full after init_IRQ(). This dependency will be needed in further
      patches because nohz full needs irq work to raise its own IRQ.
      Information about the support for this ability on ARM64 is obtained on
      init_IRQ() which initialize the pointer to __smp_call_function.
      
      Since tick_init() is called right after init_IRQ(), this is a good place
      to call tick_nohz_init() and prepare for that dependency.
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      a80e49e2
  4. 27 8月, 2014 1 次提交
  5. 16 4月, 2014 1 次提交
  6. 26 3月, 2014 2 次提交
  7. 24 12月, 2013 1 次提交
    • J
      tick/timekeeping: Call update_wall_time outside the jiffies lock · 47a1b796
      John Stultz 提交于
      Since the xtime lock was split into the timekeeping lock and
      the jiffies lock, we no longer need to call update_wall_time()
      while holding the jiffies lock.
      
      Thus, this patch splits update_wall_time() out from do_timer().
      
      This allows us to get away from calling clock_was_set_delayed()
      in update_wall_time() and instead use the standard clock_was_set()
      call that previously would deadlock, as it causes the jiffies lock
      to be acquired.
      
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      47a1b796
  8. 19 11月, 2013 1 次提交
  9. 02 7月, 2013 1 次提交
    • T
      tick: Sanitize broadcast control logic · 07bd1172
      Thomas Gleixner 提交于
      The recent implementation of a generic dummy timer resulted in a
      different registration order of per cpu local timers which made the
      broadcast control logic go belly up.
      
      If the dummy timer is the first clock event device which is registered
      for a CPU, then it is installed, the broadcast timer is initialized
      and the CPU is marked as broadcast target.
      
      If a real clock event device is installed after that, we can fail to
      take the CPU out of the broadcast mask. In the worst case we end up
      with two periodic timer events firing for the same CPU. One from the
      per cpu hardware device and one from the broadcast.
      
      Now the problem is that we have no way to distinguish whether the
      system is in a state which makes broadcasting necessary or the
      broadcast bit was set due to the nonfunctional dummy timer
      installment.
      
      To solve this we need to keep track of the system state seperately and
      provide a more detailed decision logic whether we keep the CPU in
      broadcast mode or not.
      
      The old decision logic only clears the broadcast mode, if the newly
      installed clock event device is not affected by power states.
      
      The new logic clears the broadcast mode if one of the following is
      true:
      
        - The new device is not affected by power states.
      
        - The system is not in a power state affected mode
      
        - The system has switched to oneshot mode. The oneshot broadcast is
          controlled from the deep idle state. The CPU is not in idle at
          this point, so it's safe to remove it from the mask.
      
      If we clear the broadcast bit for the CPU when a new device is
      installed, we also shutdown the broadcast device when this was the
      last CPU in the broadcast mask.
      
      If the broadcast bit is kept, then we leave the new device in shutdown
      state and rely on the broadcast to deliver the timer interrupts via
      the broadcast ipis.
      Reported-and-tested-by: NStehle Vincent-B46079 <B46079@freescale.com>
      Reviewed-by: NStephen Boyd <sboyd@codeaurora.org>
      Cc: John Stultz <john.stultz@linaro.org>,
      Cc: Mark Rutland <mark.rutland@arm.com>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1307012153060.4013@ionos.tec.linutronix.de
      Cc: stable@vger.kernel.org
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      07bd1172
  10. 25 6月, 2013 1 次提交
    • S
      clockevents: Prefer CPU local devices over global devices · 70e5975d
      Stephen Boyd 提交于
      On an SMP system with only one global clockevent and a dummy
      clockevent per CPU we run into problems. We want the dummy
      clockevents to be registered as the per CPU tick devices, but
      we can only achieve that if we register the dummy clockevents
      before the global clockevent or if we artificially inflate the
      rating of the dummy clockevents to be higher than the rating
      of the global clockevent. Failure to do so leads to boot
      hangs when the dummy timers are registered on all other CPUs
      besides the CPU that accepted the global clockevent as its tick
      device and there is no broadcast timer to poke the dummy
      devices.
      
      If we're registering multiple clockevents and one clockevent is
      global and the other is local to a particular CPU we should
      choose to use the local clockevent regardless of the rating of
      the device. This way, if the clockevent is a dummy it will take
      the tick device duty as long as there isn't a higher rated tick
      device and any global clockevent will be bumped out into
      broadcast mode, fixing the problem described above.
      Reported-and-tested-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NStephen Boyd <sboyd@codeaurora.org>
      Tested-by: soren.brinkmann@xilinx.com
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: John Stultz <john.stultz@linaro.org>
      Link: http://lkml.kernel.org/r/20130613183950.GA32061@codeaurora.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      70e5975d
  11. 16 5月, 2013 6 次提交
  12. 25 4月, 2013 1 次提交
    • T
      clockevents: Set dummy handler on CPU_DEAD shutdown · 6f7a05d7
      Thomas Gleixner 提交于
      Vitaliy reported that a per cpu HPET timer interrupt crashes the
      system during hibernation. What happens is that the per cpu HPET timer
      gets shut down when the nonboot cpus are stopped. When the nonboot
      cpus are onlined again the HPET code sets up the MSI interrupt which
      fires before the clock event device is registered. The event handler
      is still set to hrtimer_interrupt, which then crashes the machine due
      to highres mode not being active.
      
      See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=700333
      
      There is no real good way to avoid that in the HPET code. The HPET
      code alrady has a mechanism to detect spurious interrupts when event
      handler == NULL for a similar reason.
      
      We can handle that in the clockevent/tick layer and replace the
      previous functional handler with a dummy handler like we do in
      tick_setup_new_device().
      
      The original clockevents code did this in clockevents_exchange_device(),
      but that got removed by commit 7c1e7689 (clockevents: prevent
      clockevent event_handler ending up handler_noop) which forgot to fix
      it up in tick_shutdown(). Same issue with the broadcast device.
      Reported-by: NVitaliy Fillipov <vitalif@yourcmc.ru>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: stable@vger.kernel.org
      Cc: 700333@bugs.debian.org
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      6f7a05d7
  13. 16 4月, 2013 1 次提交
    • F
      nohz: Switch from "extended nohz" to "full nohz" based naming · c5bfece2
      Frederic Weisbecker 提交于
      "Extended nohz" was used as a naming base for the full dynticks
      API and Kconfig symbols. It reflects the fact the system tries
      to stop the tick in more places than just idle.
      
      But that "extended" name is a bit opaque and vague. Rename it to
      "full" makes it clearer what the system tries to do under this
      config: try to shutdown the tick anytime it can. The various
      constraints that prevent that to happen shouldn't be considered
      as fundamental properties of this feature but rather technical
      issues that may be solved in the future.
      Reported-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kevin Hilman <khilman@linaro.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      c5bfece2
  14. 21 3月, 2013 1 次提交
    • F
      nohz: Assign timekeeping duty to a CPU outside the full dynticks range · a382bf93
      Frederic Weisbecker 提交于
      This way the full nohz CPUs can safely run with the tick
      stopped with a guarantee that somebody else is taking
      care of the jiffies and GTOD progression.
      
      Once the duty is attributed to a CPU, it won't change. Also that
      CPU can't enter into dyntick idle mode or be hot unplugged.
      
      This may later be improved from a power consumption POV. At
      least we should be able to share the duty amongst all CPUs
      outside the full dynticks range. Then the duty could even be
      shared with full dynticks CPUs when those can't stop their
      tick for any reason.
      
      But let's start with that very simple approach first.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kevin Hilman <khilman@linaro.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      [fix have_nohz_full_mask offcase]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      a382bf93
  15. 07 3月, 2013 1 次提交
  16. 14 11月, 2012 1 次提交
  17. 08 9月, 2011 1 次提交
  18. 26 2月, 2011 1 次提交
    • T
      clockevents: Prevent oneshot mode when broadcast device is periodic · 3a142a06
      Thomas Gleixner 提交于
      When the per cpu timer is marked CLOCK_EVT_FEAT_C3STOP, then we only
      can switch into oneshot mode, when the backup broadcast device
      supports oneshot mode as well. Otherwise we would try to switch the
      broadcast device into an unsupported mode unconditionally. This went
      unnoticed so far as the current available broadcast devices support
      oneshot mode. Seth unearthed this problem while debugging and working
      around an hpet related BIOS wreckage.
      
      Add the necessary check to tick_is_oneshot_available().
      Reported-and-tested-by: NSeth Forshee <seth.forshee@canonical.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <alpine.LFD.2.00.1102252231200.2701@localhost6.localdomain6>
      Cc: stable@kernel.org # .21 ->
      3a142a06
  19. 01 2月, 2011 1 次提交
  20. 17 12月, 2010 1 次提交
  21. 15 12月, 2009 2 次提交
  22. 02 5月, 2009 1 次提交
    • J
      clockevents: prevent endless loop in tick_handle_periodic() · 74a03b69
      john stultz 提交于
      tick_handle_periodic() can lock up hard when a one shot clock event
      device is used in combination with jiffies clocksource.
      
      Avoid an endless loop issue by requiring that a highres valid
      clocksource be installed before we call tick_periodic() in a loop when
      using ONESHOT mode. The result is we will only increment jiffies once
      per interrupt until a continuous hardware clocksource is available.
      
      Without this, we can run into a endless loop, where each cycle through
      the loop, jiffies is updated which increments time by tick_period or
      more (due to clock steering), which can cause the event programming to
      think the next event was before the newly incremented time and fail
      causing tick_periodic() to be called again and the whole process loops
      forever.
      
      [ Impact: prevent hard lock up ]
      Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: stable@kernel.org
      74a03b69
  23. 31 1月, 2009 1 次提交
    • S
      hrtimers: allow the hot-unplugging of all cpus · 94df7de0
      Sebastien Dugue 提交于
      Impact: fix CPU hotplug hang on Power6 testbox
      
      On architectures that support offlining all cpus (at least powerpc/pseries),
      hot-unpluging the tick_do_timer_cpu can result in a system hang.
      
      This comes from the fact that if the cpu going down happens to be the
      cpu doing the tick, then as the tick_do_timer_cpu handover happens after the
      cpu is dead (via the CPU_DEAD notification), we're left without ticks,
      jiffies are frozen and any task relying on timers (msleep, ...) is stuck.
      That's particularly the case for the cpu looping in __cpu_die() waiting
      for the dying cpu to be dead.
      
      This patch addresses this by having the tick_do_timer_cpu handover happen
      earlier during the CPU_DYING notification. For this, a new clockevent
      notification type is introduced (CLOCK_EVT_NOTIFY_CPU_DYING) which is triggered
      in hrtimer_cpu_notify().
      Signed-off-by: NSebastien Dugue <sebastien.dugue@bull.net>
      Cc: <stable@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      94df7de0
  24. 01 1月, 2009 1 次提交
    • R
      cpumask: convert kernel time functions · 6b954823
      Rusty Russell 提交于
      Impact: Use new APIs
      
      Convert kernel/time functions to use struct cpumask *.
      
      Note the ugly bitmap declarations in tick-broadcast.c.  These should
      be cpumask_var_t, but there was no obvious initialization function to
      put the alloc_cpumask_var() calls in.  This was safe.
      
      (Eventually 'struct cpumask' will be undefined for CONFIG_CPUMASK_OFFSTACK,
      so we use a bitmap here to show we really mean it).
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NMike Travis <travis@sgi.com>
      6b954823
  25. 30 12月, 2008 1 次提交
    • S
      hrtimers: allow the hot-unplugging of all cpus · 5762ba18
      Sebastien Dugue 提交于
      Impact: fix CPU hotplug hang on Power6 testbox
      
      On architectures that support offlining all cpus (at least powerpc/pseries),
      hot-unpluging the tick_do_timer_cpu can result in a system hang.
      
      This comes from the fact that if the cpu going down happens to be the
      cpu doing the tick, then as the tick_do_timer_cpu handover happens after the
      cpu is dead (via the CPU_DEAD notification), we're left without ticks,
      jiffies are frozen and any task relying on timers (msleep, ...) is stuck.
      That's particularly the case for the cpu looping in __cpu_die() waiting
      for the dying cpu to be dead.
      
      This patch addresses this by having the tick_do_timer_cpu handover happen
      earlier during the CPU_DYING notification. For this, a new clockevent
      notification type is introduced (CLOCK_EVT_NOTIFY_CPU_DYING) which is triggered
      in hrtimer_cpu_notify().
      Signed-off-by: NSebastien Dugue <sebastien.dugue@bull.net>
      Cc: <stable@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      5762ba18
  26. 13 12月, 2008 2 次提交
  27. 23 9月, 2008 2 次提交
    • T
      clockevents: prevent mode mismatch on cpu online · 27ce4cb4
      Thomas Gleixner 提交于
      Impact: timer hang on CPU online observed on AMD C1E systems
      
      When a CPU is brought online then the broadcast machinery can
      be in the one shot state already. Check this and setup the timer 
      device of the new CPU in one shot mode so the broadcast code
      can pick up the next_event value correctly.
      
      Another AMD C1E oddity, as we switch to broadcast immediately and
      not after the full bring up via the ACPI cpu idle code.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      27ce4cb4
    • T
      clockevents: prevent cpu online to interfere with nohz · 6441402b
      Thomas Gleixner 提交于
      Impact: rare hang which can be triggered on CPU online.
      
      tick_do_timer_cpu keeps track of the CPU which updates jiffies
      via do_timer. The value -1 is used to signal, that currently no
      CPU is doing this. There are two cases, where the variable can 
      have this state:
      
       boot:
          necessary for systems where the boot cpu id can be != 0
      
       nohz long idle sleep:
          When the CPU which did the jiffies update last goes into
          a long idle sleep it drops the update jiffies duty so
          another CPU which is not idle can pick it up and keep
          jiffies going.
      
      Using the same value for both situations is wrong, as the CPU online
      code can see the -1 state when the timer of the newly onlined CPU is
      setup. The setup for a newly onlined CPU goes through periodic mode
      and can pick up the do_timer duty without being aware of the nohz /
      highres mode of the already running system.
      
      Use two separate states and make them constants to avoid magic
      numbers confusion. 
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      6441402b
  28. 17 9月, 2008 1 次提交
    • T
      clockevents: make device shutdown robust · 2344abbc
      Thomas Gleixner 提交于
      The device shut down does not cleanup the next_event variable of the
      clock event device. So when the device is reactivated the possible
      stale next_event value can prevent the device to be reprogrammed as it
      claims to wait on a event already.
      
      This is the root cause of the resurfacing suspend/resume problem,
      where systems need key press to come back to life.
      
      Fix this by setting next_event to KTIME_MAX when the device is shut
      down. Use a separate function for shutdown which takes care of that
      and only keep the direct set mode call in the broadcast code, where we
      can not touch the next_event value.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      2344abbc
  29. 05 9月, 2008 1 次提交
    • V
      clockevents: prevent clockevent event_handler ending up handler_noop · 7c1e7689
      Venkatesh Pallipadi 提交于
      There is a ordering related problem with clockevents code, due to which
      clockevents_register_device() called after tickless/highres switch
      will not work. The new clockevent ends up with clockevents_handle_noop as
      event handler, resulting in no timer activity.
      
      The problematic path seems to be
      
      * old device already has hrtimer_interrupt as the event_handler
      * new clockevent device registers with a higher rating
      * tick_check_new_device() is called
        * clockevents_exchange_device() gets called
          * old->event_handler is set to clockevents_handle_noop
        * tick_setup_device() is called for the new device
          * which sets new->event_handler using the old->event_handler which is noop.
      
      Change the ordering so that new device inherits the proper handler.
      
      This does not have any issue in normal case as most likely all the clockevent
      devices are setup before the highres switch. But, can potentially be affecting
      some corner case where HPET force detect happens after the highres switch.
      This was a problem with HPET in MSI mode code that we have been experimenting
      with.
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NShaohua Li <shaohua.li@intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7c1e7689
  30. 26 7月, 2008 1 次提交
  31. 19 7月, 2008 1 次提交