1. 31 12月, 2011 1 次提交
  2. 19 12月, 2011 1 次提交
  3. 12 12月, 2011 4 次提交
    • F
      nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu() · 1268fbc7
      Frederic Weisbecker 提交于
      Those two APIs were provided to optimize the calls of
      tick_nohz_idle_enter() and rcu_idle_enter() into a single
      irq disabled section. This way no interrupt happening in-between would
      needlessly process any RCU job.
      
      Now we are talking about an optimization for which benefits
      have yet to be measured. Let's start simple and completely decouple
      idle rcu and dyntick idle logics to simplify.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      1268fbc7
    • F
      nohz: Allow rcu extended quiescent state handling seperately from tick stop · 2bbb6817
      Frederic Weisbecker 提交于
      It is assumed that rcu won't be used once we switch to tickless
      mode and until we restart the tick. However this is not always
      true, as in x86-64 where we dereference the idle notifiers after
      the tick is stopped.
      
      To prepare for fixing this, add two new APIs:
      tick_nohz_idle_enter_norcu() and tick_nohz_idle_exit_norcu().
      
      If no use of RCU is made in the idle loop between
      tick_nohz_enter_idle() and tick_nohz_exit_idle() calls, the arch
      must instead call the new *_norcu() version such that the arch doesn't
      need to call rcu_idle_enter() and rcu_idle_exit().
      
      Otherwise the arch must call tick_nohz_enter_idle() and
      tick_nohz_exit_idle() and also call explicitly:
      
      - rcu_idle_enter() after its last use of RCU before the CPU is put
      to sleep.
      - rcu_idle_exit() before the first use of RCU after the CPU is woken
      up.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: David Miller <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      2bbb6817
    • F
      nohz: Separate out irq exit and idle loop dyntick logic · 280f0677
      Frederic Weisbecker 提交于
      The tick_nohz_stop_sched_tick() function, which tries to delay
      the next timer tick as long as possible, can be called from two
      places:
      
      - From the idle loop to start the dytick idle mode
      - From interrupt exit if we have interrupted the dyntick
      idle mode, so that we reprogram the next tick event in
      case the irq changed some internal state that requires this
      action.
      
      There are only few minor differences between both that
      are handled by that function, driven by the ts->inidle
      cpu variable and the inidle parameter. The whole guarantees
      that we only update the dyntick mode on irq exit if we actually
      interrupted the dyntick idle mode, and that we enter in RCU extended
      quiescent state from idle loop entry only.
      
      Split this function into:
      
      - tick_nohz_idle_enter(), which sets ts->inidle to 1, enters
      dynticks idle mode unconditionally if it can, and enters into RCU
      extended quiescent state.
      
      - tick_nohz_irq_exit() which only updates the dynticks idle mode
      when ts->inidle is set (ie: if tick_nohz_idle_enter() has been called).
      
      To maintain symmetry, tick_nohz_restart_sched_tick() has been renamed
      into tick_nohz_idle_exit().
      
      This simplifies the code and micro-optimize the irq exit path (no need
      for local_irq_save there). This also prepares for the split between
      dynticks and rcu extended quiescent state logics. We'll need this split to
      further fix illegal uses of RCU in extended quiescent states in the idle
      loop.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: David Miller <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      280f0677
    • P
      rcu: Track idleness independent of idle tasks · 9b2e4f18
      Paul E. McKenney 提交于
      Earlier versions of RCU used the scheduling-clock tick to detect idleness
      by checking for the idle task, but handled idleness differently for
      CONFIG_NO_HZ=y.  But there are now a number of uses of RCU read-side
      critical sections in the idle task, for example, for tracing.  A more
      fine-grained detection of idleness is therefore required.
      
      This commit presses the old dyntick-idle code into full-time service,
      so that rcu_idle_enter(), previously known as rcu_enter_nohz(), is
      always invoked at the beginning of an idle loop iteration.  Similarly,
      rcu_idle_exit(), previously known as rcu_exit_nohz(), is always invoked
      at the end of an idle-loop iteration.  This allows the idle task to
      use RCU everywhere except between consecutive rcu_idle_enter() and
      rcu_idle_exit() calls, in turn allowing architecture maintainers to
      specify exactly where in the idle loop that RCU may be used.
      
      Because some of the userspace upcall uses can result in what looks
      to RCU like half of an interrupt, it is not possible to expect that
      the irq_enter() and irq_exit() hooks will give exact counts.  This
      patch therefore expands the ->dynticks_nesting counter to 64 bits
      and uses two separate bitfields to count process/idle transitions
      and interrupt entry/exit transitions.  It is presumed that userspace
      upcalls do not happen in the idle loop or from usermode execution
      (though usermode might do a system call that results in an upcall).
      The counter is hard-reset on each process/idle transition, which
      avoids the interrupt entry/exit error from accumulating.  Overflow
      is avoided by the 64-bitness of the ->dyntick_nesting counter.
      
      This commit also adds warnings if a non-idle task asks RCU to enter
      idle state (and these checks will need some adjustment before applying
      Frederic's OS-jitter patches (http://lkml.org/lkml/2011/10/7/246).
      In addition, validation of ->dynticks and ->dynticks_nesting is added.
      Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      9b2e4f18
  4. 06 12月, 2011 1 次提交
  5. 02 12月, 2011 3 次提交
  6. 18 11月, 2011 1 次提交
  7. 11 11月, 2011 1 次提交
    • J
      clocksource: Avoid selecting mult values that might overflow when adjusted · d65670a7
      John Stultz 提交于
      For some frequencies, the clocks_calc_mult_shift() function will
      unfortunately select mult values very close to 0xffffffff.  This
      has the potential to overflow when NTP adjusts the clock, adding
      to the mult value.
      
      This patch adds a clocksource.maxadj value, which provides
      an approximation of an 11% adjustment(NTP limits adjustments to
      500ppm and the tick adjustment is limited to 10%), which could
      be made to the clocksource.mult value. This is then used to both
      check that the current mult value won't overflow/underflow, as
      well as warning us if the timekeeping_adjust() code pushes over
      that 11% boundary.
      
      v2: Fix max_adjustment calculation, and improve WARN_ONCE
      messages.
      
      v3: Don't warn before maxadj has actually been set
      
      CC: Yong Zhang <yong.zhang0@gmail.com>
      CC: David Daney <ddaney.cavm@gmail.com>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Chen Jie <chenj@lemote.com>
      CC: zhangfx <zhangfx@lemote.com>
      CC: stable@kernel.org
      Reported-by: NChen Jie <chenj@lemote.com>
      Reported-by: Nzhangfx <zhangfx@lemote.com>
      Tested-by: NYong Zhang <yong.zhang0@gmail.com>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      d65670a7
  8. 01 11月, 2011 1 次提交
  9. 28 10月, 2011 1 次提交
  10. 29 9月, 2011 1 次提交
  11. 14 9月, 2011 1 次提交
  12. 13 9月, 2011 2 次提交
  13. 08 9月, 2011 5 次提交
  14. 11 8月, 2011 9 次提交
  15. 10 8月, 2011 2 次提交
  16. 21 7月, 2011 1 次提交
    • J
      time: Fix stupid KERN_WARN compile issue · cbaa5152
      John Stultz 提交于
      Terribly embarassing. Don't know how I committed this, but its
      KERN_WARNING not KERN_WARN.
      
      This fixes the following compile error:
      kernel/time/timekeeping.c: In function ‘__timekeeping_inject_sleeptime’:
      kernel/time/timekeeping.c:608: error: ‘KERN_WARN’ undeclared (first use in this function)
      kernel/time/timekeeping.c:608: error: (Each undeclared identifier is reported only once
      kernel/time/timekeeping.c:608: error: for each function it appears in.)
      kernel/time/timekeeping.c:608: error: expected ‘)’ before string constant
      make[2]: *** [kernel/time/timekeeping.o] Error 1
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      cbaa5152
  17. 22 6月, 2011 4 次提交
    • J
      time: Avoid accumulating time drift in suspend/resume · cb33217b
      John Stultz 提交于
      Because the read_persistent_clock interface is usually backed by
      only a second granular interface, each time we read from the persistent
      clock for suspend/resume, we introduce a half second (on average) of error.
      
      In order to avoid this error accumulating as the system is suspended
      over and over, this patch measures the time delta between the persistent
      clock and the system CLOCK_REALTIME.
      
      If the delta is less then 2 seconds from the last suspend, we compensate
      by using the previous time delta (keeping it close). If it is larger
      then 2 seconds, we assume the clock was set or has been changed, so we
      do no correction and update the delta.
      
      Note: If NTP is running, ths could seem to "fight" with the NTP corrected
      time, where as if the system time was off by 1 second, and NTP slewed the
      value in, a suspend/resume cycle could undo this correction, by trying to
      restore the previous offset from the persistent clock.  However, without
      this patch, since each read could cause almost a full second worth of
      error, its possible to get almost 2 seconds of error just from the
      suspend/resume cycle alone, so this about equal to any offset added by
      the compensation.
      
      Further on systems that suspend/resume frequently, this should keep time
      closer then NTP could compensate for if the errors were allowed to
      accumulate.
      
      Credits to Arve Hjønnevåg for suggesting this solution.
      
      CC: Arve Hjønnevåg <arve@android.com>
      CC: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      cb33217b
    • J
      time: Catch invalid timespec sleep values in __timekeeping_inject_sleeptime · cb5de2f8
      John Stultz 提交于
      Arve suggested making sure we catch possible negative sleep time
      intervals that could be passed into timekeeping_inject_sleeptime.
      
      CC: Arve Hjønnevåg <arve@android.com>
      CC: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      cb5de2f8
    • J
      alarmtimers: Return -ENOTSUPP if no RTC device is present · 1c6b39ad
      John Stultz 提交于
      Toralf Förster and Richard Weinberger noted that if there is
      no RTC device, the alarm timers core prints out an annoying
      "ALARM timers will not wake from suspend" message.
      
      This warning has been removed in a previous patch, however
      the issue still remains:  The original idea was to support
      alarm timers even if there was no rtc device, as long as the
      system didn't go into suspend.
      
      However, after further consideration, communicating to the application
      that alarmtimers are not fully functional seems like the better
      solution.
      
      So this patch makes it so we return -ENOTSUPP to any posix _ALARM
      clockid calls if there is no backing RTC device on the system.
      
      Further this changes the behavior where when there is no rtc device
      we will check for one on clock_getres, clock_gettime, timer_create,
      and timer_nsleep instead of on suspend.
      
      CC: Toralf Förster <toralf.foerster@gmx.de>
      CC: Richard Weinberger <richard@nod.at
      CC: Peter Zijlstra <peterz@infradead.org>
      CC: Thomas Gleixner <tglx@linutronix.de>
      Reported-by: NToralf Förster <toralf.foerster@gmx.de>
      Reported by: Richard Weinberger <richard@nod.at>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      1c6b39ad
    • J
      alarmtimers: Handle late rtc module loading · c008ba58
      John Stultz 提交于
      The alarmtimers code currently picks a rtc device to use at
      late init time. However, if your rtc driver is loaded as a module,
      it may be registered after the alarmtimers late init code, leaving
      the alarmtimers nonfunctional.
      
      This patch moves the the rtcdevice selection to when we actually try
      to use it, allowing us to make use of rtc modules that may have been
      loaded at any point since bootup.
      
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Meelis Roos <mroos@ut.ee>
      Reported-by: NMeelis Roos <mroos@ut.ee>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      c008ba58
  18. 17 6月, 2011 1 次提交
    • T
      clocksource: Make watchdog robust vs. interruption · b5199515
      Thomas Gleixner 提交于
      The clocksource watchdog code is interruptible and it has been
      observed that this can trigger false positives which disable the TSC.
      
      The reason is that an interrupt storm or a long running interrupt
      handler between the read of the watchdog source and the read of the
      TSC brings the two far enough apart that the delta is larger than the
      unstable treshold. Move both reads into a short interrupt disabled
      region to avoid that.
      Reported-and-tested-by: NVernon Mauery <vernux@us.ibm.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: stable@kernel.org
      b5199515