1. 11 11月, 2011 1 次提交
    • J
      clocksource: Avoid selecting mult values that might overflow when adjusted · d65670a7
      John Stultz 提交于
      For some frequencies, the clocks_calc_mult_shift() function will
      unfortunately select mult values very close to 0xffffffff.  This
      has the potential to overflow when NTP adjusts the clock, adding
      to the mult value.
      
      This patch adds a clocksource.maxadj value, which provides
      an approximation of an 11% adjustment(NTP limits adjustments to
      500ppm and the tick adjustment is limited to 10%), which could
      be made to the clocksource.mult value. This is then used to both
      check that the current mult value won't overflow/underflow, as
      well as warning us if the timekeeping_adjust() code pushes over
      that 11% boundary.
      
      v2: Fix max_adjustment calculation, and improve WARN_ONCE
      messages.
      
      v3: Don't warn before maxadj has actually been set
      
      CC: Yong Zhang <yong.zhang0@gmail.com>
      CC: David Daney <ddaney.cavm@gmail.com>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Chen Jie <chenj@lemote.com>
      CC: zhangfx <zhangfx@lemote.com>
      CC: stable@kernel.org
      Reported-by: NChen Jie <chenj@lemote.com>
      Reported-by: Nzhangfx <zhangfx@lemote.com>
      Tested-by: NYong Zhang <yong.zhang0@gmail.com>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      d65670a7
  2. 28 10月, 2011 1 次提交
  3. 21 7月, 2011 1 次提交
    • J
      time: Fix stupid KERN_WARN compile issue · cbaa5152
      John Stultz 提交于
      Terribly embarassing. Don't know how I committed this, but its
      KERN_WARNING not KERN_WARN.
      
      This fixes the following compile error:
      kernel/time/timekeeping.c: In function ‘__timekeeping_inject_sleeptime’:
      kernel/time/timekeeping.c:608: error: ‘KERN_WARN’ undeclared (first use in this function)
      kernel/time/timekeeping.c:608: error: (Each undeclared identifier is reported only once
      kernel/time/timekeeping.c:608: error: for each function it appears in.)
      kernel/time/timekeeping.c:608: error: expected ‘)’ before string constant
      make[2]: *** [kernel/time/timekeeping.o] Error 1
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      cbaa5152
  4. 22 6月, 2011 2 次提交
    • J
      time: Avoid accumulating time drift in suspend/resume · cb33217b
      John Stultz 提交于
      Because the read_persistent_clock interface is usually backed by
      only a second granular interface, each time we read from the persistent
      clock for suspend/resume, we introduce a half second (on average) of error.
      
      In order to avoid this error accumulating as the system is suspended
      over and over, this patch measures the time delta between the persistent
      clock and the system CLOCK_REALTIME.
      
      If the delta is less then 2 seconds from the last suspend, we compensate
      by using the previous time delta (keeping it close). If it is larger
      then 2 seconds, we assume the clock was set or has been changed, so we
      do no correction and update the delta.
      
      Note: If NTP is running, ths could seem to "fight" with the NTP corrected
      time, where as if the system time was off by 1 second, and NTP slewed the
      value in, a suspend/resume cycle could undo this correction, by trying to
      restore the previous offset from the persistent clock.  However, without
      this patch, since each read could cause almost a full second worth of
      error, its possible to get almost 2 seconds of error just from the
      suspend/resume cycle alone, so this about equal to any offset added by
      the compensation.
      
      Further on systems that suspend/resume frequently, this should keep time
      closer then NTP could compensate for if the errors were allowed to
      accumulate.
      
      Credits to Arve Hjønnevåg for suggesting this solution.
      
      CC: Arve Hjønnevåg <arve@android.com>
      CC: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      cb33217b
    • J
      time: Catch invalid timespec sleep values in __timekeeping_inject_sleeptime · cb5de2f8
      John Stultz 提交于
      Arve suggested making sure we catch possible negative sleep time
      intervals that could be passed into timekeeping_inject_sleeptime.
      
      CC: Arve Hjønnevåg <arve@android.com>
      CC: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      cb5de2f8
  5. 03 5月, 2011 2 次提交
  6. 27 4月, 2011 1 次提交
    • J
      time: Add timekeeping_inject_sleeptime · 304529b1
      John Stultz 提交于
      Some platforms cannot implement read_persistent_clock, as
      their RTC devices are only accessible when interrupts are enabled.
      This keeps them from being used by the timekeeping code on resume
      to measure the time in suspend.
      
      The RTC layer tries to work around this, by calling do_settimeofday
      on resume after irqs are reenabled to set the time properly. However,
      this only corrects CLOCK_REALTIME, and does not properly adjust
      the sleep time value. This causes btime in /proc/stat to be incorrect
      as well as making the new CLOCK_BOTTTIME inaccurate.
      
      This patch resolves the issue by introducing a new timekeeping hook
      to allow the RTC layer to inject the sleep time on resume.
      
      The code also checks to make sure that read_persistent_clock is
      nonfunctional before setting the sleep time, so that should the RTC's
      HCTOSYS option be configured in on a system that does support
      read_persistent_clock we will not increase the total_sleep_time twice.
      
      CC: Arve Hjønnevåg <arve@android.com>
      CC: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      304529b1
  7. 24 3月, 2011 1 次提交
  8. 22 2月, 2011 2 次提交
  9. 02 2月, 2011 2 次提交
  10. 31 1月, 2011 4 次提交
  11. 14 1月, 2011 1 次提交
  12. 12 1月, 2011 1 次提交
  13. 21 10月, 2010 1 次提交
  14. 14 8月, 2010 1 次提交
    • J
      time: Workaround gcc loop optimization that causes 64bit div errors · c7dcf87a
      John Stultz 提交于
      Early 4.3 versions of gcc apparently aggressively optimize the raw
      time accumulation loop, replacing it with a divide.
      
      On 32bit systems, this causes the following link errors:
      	undefined reference to `__umoddi3'
      	undefined reference to `__udivdi3'
      
      The gcc issue has been fixed in 4.4 and greater.
      
      This patch replaces the accumulation loop with a do_div, as suggested
      by Linus.
      Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
      CC: Jason Wessel <jason.wessel@windriver.com>
      CC: Larry Finger <Larry.Finger@lwfinger.net>
      CC: Ingo Molnar <mingo@elte.hu>
      CC: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c7dcf87a
  15. 13 8月, 2010 1 次提交
    • J
      timekeeping: Fix overflow in rawtime tv_nsec on 32 bit archs · deda2e81
      Jason Wessel 提交于
      The tv_nsec is a long and when added to the shifted interval it can wrap
      and become negative which later causes looping problems in the
      getrawmonotonic().  The edge case occurs when the system has slept for
      a short period of time of ~2 seconds.
      
      A trace printk of the values in this patch illustrate the problem:
      
      ftrace time stamp: log
      43.716079: logarithmic_accumulation: raw: 3d0913 tv_nsec d687faa
      43.718513: logarithmic_accumulation: raw: 3d0913 tv_nsec da588bd
      43.722161: logarithmic_accumulation: raw: 3d0913 tv_nsec de291d0
      46.349925: logarithmic_accumulation: raw: 7a122600 tv_nsec e1f9ae3b
      46.349930: logarithmic_accumulation: raw: 1e848980 tv_nsec 8831c0e3
      
      The kernel starts looping at 46.349925 in the getrawmonotonic() due to
      the negative value from adding the raw value to tv_nsec.
      
      A simple solution is to accumulate into a u64, and then normalize it
      to a timespec_t.
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
       [ Reworked variable names and simplified some of the code. - John ]
      Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      deda2e81
  16. 27 7月, 2010 5 次提交
  17. 13 4月, 2010 1 次提交
    • J
      time: Remove xtime_cache · 6a867a39
      John Stultz 提交于
      With the earlier logarithmic time accumulation patch, xtime will now
      always be within one "tick" of the current time, instead of possibly
      half a second off.
      
      This removes the need for the xtime_cache value, which always stored the
      time at the last interrupt, so this patch cleans that up removing the
      xtime_cache related code.
      
      This patch also addresses an issue with an earlier version of this change,
      where xtime_cache was normalizing xtime, which could in some cases be
      not valid (ie: tv_nsec == NSEC_PER_SEC). This is fixed by handling
      the edge case in update_wall_time().
      Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
      Cc: Petr Titěra <P.Titera@century.cz>
      LKML-Reference: <1270589451-30773-1-git-send-email-johnstul@us.ibm.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      6a867a39
  18. 23 3月, 2010 1 次提交
    • J
      time: Fix accumulation bug triggered by long delay. · 830ec045
      John Stultz 提交于
      The logarithmic accumulation done in the timekeeping has some overflow
      protection that limits the max shift value. That means it will take
      more then shift loops to accumulate all of the cycles. This causes
      the shift decrement to underflow, which causes the loop to never exit.
      
      The simplest fix would be simply to do a:
      	if (shift)
      		shift--;
      
      However that is not optimal, as we know the cycle offset is larger
      then the interval << shift, the above would make shift drop to zero,
      then we would be spinning for quite awhile accumulating at interval
      chunks at a time.
      
      Instead, this patch only decreases shift if the offset is smaller
      then cycle_interval << shift.  This makes sure we accumulate using
      the largest chunks possible without overflowing tick_length, and limits
      the number of iterations through the loop.
      
      This issue was found and reported by Sonic Zhang, who also tested the fix.
      Many thanks your explanation and testing!
      Reported-by: NSonic Zhang <sonic.adi@gmail.com>
      Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
      Tested-by: NSonic Zhang <sonic.adi@gmail.com>
      LKML-Reference: <1268948850-5225-1-git-send-email-johnstul@us.ibm.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      830ec045
  19. 10 2月, 2010 1 次提交
  20. 05 2月, 2010 1 次提交
    • M
      clocksource: add suspend callback · c54a42b1
      Magnus Damm 提交于
      Add a clocksource suspend callback.  This callback can be used by the
      clocksource driver to shutdown and perform any kind of late suspend
      activities even though the clocksource driver itself is a non-sysdev
      driver.
      
      One example where this is useful is to fix the sh_cmt.c platform driver
      that today suspends using the platform bus and shuts down the clocksource
      too early.
      
      With this callback in place the sh_cmt driver will suspend using the
      clocksource and clockevent hooks and leave the platform device pm
      callbacks unused.
      Signed-off-by: NMagnus Damm <damm@opensource.se>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: john stultz <johnstul@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      c54a42b1
  21. 23 12月, 2009 1 次提交
    • L
      Revert "time: Remove xtime_cache" · 83f57a11
      Linus Torvalds 提交于
      This reverts commit 7bc7d637, as
      requested by John Stultz. Quoting John:
      
       "Petr Titěra reported an issue where he saw odd atime regressions with
        2.6.33 where there were a full second worth of nanoseconds in the
        nanoseconds field.
      
        He also reviewed the time code and narrowed down the problem: unhandled
        overflow of the nanosecond field caused by rounding up the
        sub-nanosecond accumulated time.
      
        Details:
      
         * At the end of update_wall_time(), we currently round up the
        sub-nanosecond portion of accumulated time when storing it into xtime.
        This was added to avoid time inconsistencies caused when the
        sub-nanosecond portion was truncated when storing into xtime.
        Unfortunately we don't handle the possible second overflow caused by
        that rounding.
      
         * Previously the xtime_cache code hid this overflow by normalizing the
        xtime value when storing into the xtime_cache.
      
         * We could try to handle the second overflow after the rounding up, but
        since this affects the timekeeping's internal state, this would further
        complicate the next accumulation cycle, causing small errors in ntp
        steering. As much as I'd like to get rid of it, the xtime_cache code is
        known to work.
      
         * The correct fix is really to include the sub-nanosecond portion in the
        timekeeping accessor function, so we don't need to round up at during
        accumulation. This would greatly simplify the accumulation code.
        Unfortunately, we can't do this safely until the last three
        non-GENERIC_TIME arches (sparc32, arm, cris) are converted  (those
        patches are in -mm) and we kill off the spots where arches set xtime
        directly. This is all 2.6.34 material, so I think reverting the
        xtime_cache change is the best approach for now.
      
        Many thanks to Petr for both reporting and finding the issue!"
      Reported-by: NPetr Titěra <P.Titera@century.cz>
      Requested-by: Njohn stultz <johnstul@us.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      83f57a11
  22. 17 11月, 2009 1 次提交
    • L
      timekeeping: Fix clock_gettime vsyscall time warp · 0696b711
      Lin Ming 提交于
      Since commit 0a544198 "timekeeping: Move NTP adjusted clock multiplier
      to struct timekeeper" the clock multiplier of vsyscall is updated with
      the unmodified clock multiplier of the clock source and not with the
      NTP adjusted multiplier of the timekeeper.
      
      This causes user space observerable time warps:
      new CLOCK-warp maximum: 120 nsecs,  00000025c337c537 -> 00000025c337c4bf
      
      Add a new argument "mult" to update_vsyscall() and hand in the
      timekeeping internal NTP adjusted multiplier.
      Signed-off-by: NLin Ming <ming.m.lin@intel.com>
      Cc: "Zhang Yanmin" <yanmin_zhang@linux.intel.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Tony Luck <tony.luck@intel.com>
      LKML-Reference: <1258436990.17765.83.camel@minggr.sh.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      0696b711
  23. 14 11月, 2009 1 次提交
    • J
      nohz: Prevent clocksource wrapping during idle · 98962465
      Jon Hunter 提交于
      The dynamic tick allows the kernel to sleep for periods longer than a
      single tick, but it does not limit the sleep time currently. In the
      worst case the kernel could sleep longer than the wrap around time of
      the time keeping clock source which would result in losing track of
      time.
      
      Prevent this by limiting it to the safe maximum sleep time of the
      current time keeping clock source. The value is calculated when the
      clock source is registered.
      
      [ tglx: simplified the code a bit and massaged the commit msg ]
      Signed-off-by: NJon Hunter <jon-hunter@ti.com>
      Cc: John Stultz <johnstul@us.ibm.com>
      LKML-Reference: <1250617512-23567-2-git-send-email-jon-hunter@ti.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      98962465
  24. 12 10月, 2009 1 次提交
  25. 05 10月, 2009 2 次提交
    • J
      time: Remove xtime_cache · 7bc7d637
      john stultz 提交于
      With the prior logarithmic time accumulation patch, xtime will now
      always be within one "tick" of the current time, instead of
      possibly half a second off.
      
      This removes the need for the xtime_cache value, which always
      stored the time at the last interrupt, so this patch cleans that up
      removing the xtime_cache related code.
      
      This is a bit simpler, but still could use some wider testing.
      Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NJohn Kacur <jkacur@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <1254525855.7741.95.camel@localhost.localdomain>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7bc7d637
    • J
      time: Implement logarithmic time accumulation · a092ff0f
      john stultz 提交于
      Accumulating one tick at a time works well unless we're using NOHZ.
      Then it can be an issue, since we may have to run through the loop
      a few thousand times, which can increase timer interrupt caused
      latency.
      
      The current solution was to accumulate in half-second intervals
      with NOHZ. This kept the number of loops down, however it did
      slightly change how we make NTP adjustments. While not an issue
      with NTPd users, as NTPd makes adjustments over a longer period of
      time, other adjtimex() users have noticed the half-second
      granularity with which we can apply frequency changes to the clock.
      
      For instance, if a application tries to apply a 100ppm frequency
      correction for 20ms to correct a 2us offset, with NOHZ they either
      get no correction, or a 50us correction.
      
      Now, there will always be some granularity error for applying
      frequency corrections. However with users sensitive to this error
      have seen a 50-500x increase with NOHZ compared to running without
      NOHZ.
      
      So I figured I'd try another approach then just simply increasing
      the interval. My approach is to consume the time interval
      logarithmically. This reduces the number of times through the loop
      needed keeping latency down, while still preserving the original
      granularity error for adjtimex() changes.
      
      Further, this change allows us to remove the xtime_cache code
      (patch to follow), as xtime is always within one tick of the
      current time, instead of the half-second updates it saw before.
      
      An earlier version of this patch has been shipping to x86 users in
      the RedHat MRG releases for awhile without issue, but I've reworked
      this version to be even more careful about avoiding possible
      overflows if the shift value gets too large.
      Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NJohn Kacur <jkacur@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <1254525473.7741.88.camel@localhost.localdomain>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a092ff0f
  26. 25 8月, 2009 1 次提交
  27. 22 8月, 2009 1 次提交
    • J
      time: Introduce CLOCK_REALTIME_COARSE · da15cfda
      john stultz 提交于
      After talking with some application writers who want very fast, but not
      fine-grained timestamps, I decided to try to implement new clock_ids
      to clock_gettime(): CLOCK_REALTIME_COARSE and CLOCK_MONOTONIC_COARSE
      which returns the time at the last tick. This is very fast as we don't
      have to access any hardware (which can be very painful if you're using
      something like the acpi_pm clocksource), and we can even use the vdso
      clock_gettime() method to avoid the syscall. The only trade off is you
      only get low-res tick grained time resolution.
      
      This isn't a new idea, I know Ingo has a patch in the -rt tree that made
      the vsyscall gettimeofday() return coarse grained time when the
      vsyscall64 sysctrl was set to 2. However this affects all applications
      on a system.
      
      With this method, applications can choose the proper speed/granularity
      trade-off for themselves.
      Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: nikolag@ca.ibm.com
      Cc: Darren Hart <dvhltc@us.ibm.com>
      Cc: arjan@infradead.org
      Cc: jonathan@jonmasters.org
      LKML-Reference: <1250734414.6897.5.camel@localhost.localdomain>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      da15cfda
  28. 15 8月, 2009 1 次提交