1. 03 Mar, 2016 1 commit
    • C
      time: Add cycles to nanoseconds translation · 6bd58f09
      Committed by Christopher S. Hall
      The timekeeping code does not currently provide a way to translate
      externally provided clocksource cycles to system time. The cycle count
      is always provided by the result of the clocksource read() method internal
      to the timekeeping code. The added function timekeeping_cycles_to_ns()
      calculates a nanosecond value from a cycle count that can be added to
      the tk_read_base.base value, yielding the current system time. This allows
      clocksource cycle values external to the timekeeping code to provide a
      cycle count that can be transformed to system time.
      
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: kevin.b.stanton@intel.com
      Cc: kevin.j.clarke@intel.com
      Cc: hpa@zytor.com
      Cc: jeffrey.t.kirsher@intel.com
      Cc: netdev@vger.kernel.org
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      6bd58f09
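The translation described in the commit above can be illustrated in user space. The following is a minimal sketch, not the kernel implementation: the struct `read_base_sketch` and both function names are hypothetical, and the fields only loosely follow the mult/shift scheme that clocksources use to scale cycles to nanoseconds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the read-base state; not the kernel's struct. */
struct read_base_sketch {
    uint64_t cycle_last;  /* counter value at the last timekeeping update */
    uint64_t mask;        /* counter width, e.g. 0xFFFFFFFF for 32 bits */
    uint32_t mult;        /* cycles-to-ns multiplier, scaled by 2^shift */
    uint32_t shift;       /* scaling exponent for mult */
    uint64_t base_ns;     /* system time (ns) at cycle_last */
};

/* Translate an externally supplied cycle count into ns since the last update. */
static uint64_t sketch_cycles_to_ns(const struct read_base_sketch *rb,
                                    uint64_t cycles)
{
    assert(rb->shift < 64);
    /* Masking keeps the delta correct across a counter wraparound. */
    uint64_t delta = (cycles - rb->cycle_last) & rb->mask;
    return (delta * rb->mult) >> rb->shift;
}

/* Add the translated value to the base to obtain the system time. */
static uint64_t sketch_system_time_ns(const struct read_base_sketch *rb,
                                      uint64_t cycles)
{
    return rb->base_ns + sketch_cycles_to_ns(rb, cycles);
}
```

With `mult = 2048` and `shift = 10` (2 ns per cycle), a counter that has advanced 500 cycles past `cycle_last` translates to 1000 ns, which is then added to the base value.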
  2. 27 Feb, 2016 1 commit
  3. 27 Jan, 2016 3 commits
  4. 16 Jan, 2016 1 commit
  5. 29 Dec, 2015 1 commit
  6. 19 Dec, 2015 1 commit
  7. 17 Dec, 2015 4 commits
    • J
      timekeeping: Cap adjustments so they don't exceed the maxadj value · ec02b076
      Committed by John Stultz
      It has occasionally been noted that users have seen
      confusing warnings like:
      
          Adjusting tsc more than 11% (5941981 vs 7759439)
      
      We try to limit the maximum total adjustment to 11% (10% tick
      adjustment + 0.5% frequency adjustment). But this is done by
      bounding the requested adjustment values, and the internal
      steering that is done by tracking the error from what was
      requested and what was applied, does not have any such limits.
      
      This is usually not problematic, but in some cases has a risk
      that an adjustment could cause the clocksource mult value to
      overflow, so it's an indication things are outside of what is
      expected.
      
      It turns out most of the reports of this 11% warning are on systems
      using chrony, which utilizes the adjtimex() ADJ_TICK interface
      (which allows a +-10% adjustment). The original rationale for
      ADJ_TICK is unclear to me, but my assumption is that it was originally
      added to allow broken systems to get a big constant correction at boot
      (see the adjtimex userspace package for an example), which would allow
      the system to work w/ ntpd's 0.5% adjustment limit.
      
      Chrony uses ADJ_TICK to make very aggressive short term corrections
      (usually right at startup), which push us close enough to the max
      bound that a few late ticks can cause the internal steering to push
      past the max adjust value (tripping the warning).
      
      Thus this patch adds some extra logic to enforce the max adjustment
      cap in the internal steering.
      
      Note: This has the potential to slow corrections when the ADJ_TICK
      value is furthest away from the default value. So it would be good to
      get some testing from folks using chrony, to make sure we don't
      cause any troubles there.
      
      Cc: Miroslav Lichvar <mlichvar@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Tested-by: Miroslav Lichvar <mlichvar@redhat.com>
      Reported-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      ec02b076
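The idea of enforcing the cap inside the steering, rather than only on the requested values, can be sketched as a clamp on the multiplier. This is an illustration under assumptions, not the patch itself: the function names are hypothetical and the 11% bound is taken from the warning text quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 11% of the clocksource's base multiplier. */
static uint32_t sketch_maxadj(uint32_t base_mult)
{
    return (uint32_t)(((uint64_t)base_mult * 11) / 100);
}

/*
 * Return the part of `adj` that can be applied to `cur_mult` without
 * leaving the window [base_mult - maxadj, base_mult + maxadj].
 */
static int32_t sketch_cap_adjustment(uint32_t base_mult, uint32_t cur_mult,
                                     int32_t adj)
{
    assert(base_mult > 0);
    uint32_t maxadj = sketch_maxadj(base_mult);
    int64_t lo = (int64_t)base_mult - maxadj;
    int64_t hi = (int64_t)base_mult + maxadj;
    int64_t want = (int64_t)cur_mult + adj;

    if (want > hi)
        want = hi;
    if (want < lo)
        want = lo;
    return (int32_t)(want - cur_mult);
}
```

For a base multiplier of 1000 the window is 890 to 1110, so a steering request of +50 from a current value of 1100 is reduced to +10. This also shows why corrections may slow down near the ADJ_TICK extremes: the remaining headroom there is small.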
    • D
      ntp: Fix second_overflow's input parameter type to be 64bits · c7963487
      Committed by DengChao
      The function "second_overflow" uses "unsigned long"
      as its input parameter type which will overflow after
      year 2106 on 32bit systems.
      
      Thus this patch replaces it with time64_t type.
      
      While the 64-bit division is expensive, "next_ntp_leap_sec"
      has been calculated already, so we can just re-use it in the
      TIME_INS/DEL cases, allowing one expensive division per
      leapsecond instead of re-doing the division once a second after
      the leap flag has been set.
      Signed-off-by: DengChao <chao.deng@linaro.org>
      [jstultz: Tweaked commit message]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      c7963487
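The 2106 figure follows from the width of a 32-bit unsigned count of seconds since 1970. A small sketch of the arithmetic (all names are illustrative, and the year length is the average Gregorian year, which is an approximation):

```c
#include <assert.h>
#include <stdint.h>

/* Average Gregorian year in seconds (365.2425 days); an approximation. */
#define SKETCH_SECS_PER_YEAR 31556952LL

/* What a 32-bit unsigned parameter keeps of a 64-bit seconds value. */
static uint32_t sketch_truncate_secs(int64_t secs)
{
    assert(secs >= 0);
    return (uint32_t)secs;  /* conversion to unsigned is modulo 2^32 */
}

/* Approximate calendar year in which a 32-bit unsigned seconds count wraps. */
static int sketch_wrap_year(void)
{
    int64_t range = (int64_t)UINT32_MAX + 1;  /* 2^32 seconds */
    return 1970 + (int)(range / SKETCH_SECS_PER_YEAR);
}
```

`sketch_wrap_year()` evaluates to 2106, and a value five seconds past the 2^32 boundary truncates to 5, which is the kind of wrapped input a 64-bit parameter type avoids.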
    • D
      ntp: Change time_reftime to time64_t and utilize 64bit __ktime_get_real_seconds · 0af86465
      Committed by DengChao
      The type of the static variable "time_reftime" and the call to
      get_seconds in ntp are both not y2038 safe.
      
      So change the type of time_reftime to time64_t and replace
      get_seconds with __ktime_get_real_seconds.
      
      The local variable "secs" in ntp_update_offset represents the
      seconds between now and the last ntp adjustment. It seems impossible
      that this interval will exceed 68 years, so keep its type as
      "long".
      Reviewed-by: John Stultz <john.stultz@linaro.org>
      Signed-off-by: DengChao <chao.deng@linaro.org>
      [jstultz: Tweaked commit message]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      0af86465
    • D
      timekeeping: Provide internal function __ktime_get_real_seconds · dee36654
      Committed by DengChao
      In order to fix Y2038 issues in the ntp code we will need to replace
      get_seconds() with ktime_get_real_seconds(), but as the ntp code uses
      the timekeeping lock which is also used by ktime_get_real_seconds(),
      we need a version without locking.
      Add a new function __ktime_get_real_seconds() in timekeeping to
      do this.
      Reviewed-by: John Stultz <john.stultz@linaro.org>
      Signed-off-by: DengChao <chao.deng@linaro.org>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      dee36654
  8. 11 Dec, 2015 2 commits
    • J
      time: Verify time values in adjtimex ADJ_SETOFFSET to avoid overflow · 37cf4dc3
      Committed by John Stultz
      For adjtimex()'s ADJ_SETOFFSET, make sure the tv_usec value is
      sane. We might multiply it later, which can cause an overflow
      and undefined behavior.
      
      This patch introduces new helper functions to simplify the
      checking code and adds comments to clarify
      
      Originally this patch was by Sasha Levin, but I've basically
      rewritten it, so he should get credit for finding the issue
      and I should get the blame for any mistakes made since.
      
      Also, credit to Richard Cochran for the phrasing used in the
      comment for what is considered valid here.
      
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      37cf4dc3
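A sanity check of the kind described above might look like the following. This is a hedged sketch, not the helper the patch adds: the struct and function names are hypothetical, and the `nano` flag is an assumption modelling adjtimex()'s ADJ_NANO mode, in which the tv_usec field carries nanoseconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_USEC_PER_SEC 1000000LL
#define SKETCH_NSEC_PER_SEC 1000000000LL

/* Hypothetical user-space stand-in for the offset passed to adjtimex(). */
struct sketch_timeval {
    int64_t tv_sec;   /* may be positive or negative, so it is not checked */
    int64_t tv_usec;  /* sub-second part; nanoseconds when `nano` is set */
};

/* Accept only a sub-second field that lies within one second. */
static bool sketch_offset_valid(const struct sketch_timeval *tv, bool nano)
{
    assert(tv != NULL);
    int64_t limit = nano ? SKETCH_NSEC_PER_SEC : SKETCH_USEC_PER_SEC;
    return tv->tv_usec >= 0 && tv->tv_usec < limit;
}
```

Rejecting out-of-range values up front means a later multiplication of the sub-second field by a fixed factor stays within a known bound.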
    • S
      ntp: Verify offset doesn't overflow in ntp_update_offset · 52d189f1
      Committed by Sasha Levin
      We need to make sure that the offset is valid before manipulating it,
      otherwise it might overflow on the multiplication.
      
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      [jstultz: Reworked one of the checks so it makes more sense]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      52d189f1
  9. 09 Dec, 2015 1 commit
    • T
      watchdog: introduce touch_softlockup_watchdog_sched() · 03e0d461
      Committed by Tejun Heo
      touch_softlockup_watchdog() is used to tell the watchdog that a
      scheduler stall is expected.  One group of usage is from paths where
      the task may not be able to yield for a long time, such as performing
      slow PIO to a finicky device and coming out of suspend.  The other is
      to account for the scheduler and timer going idle.
      
      For scheduler softlockup detection, there's no reason to distinguish
      the two cases; however, a workqueue lockup detector is planned and it
      can use the same signals from the former group while the latter would
      spuriously prevent detection.  This patch introduces a new function
      touch_softlockup_watchdog_sched() and converts the latter group to call
      it instead.  For now, it just calls touch_softlockup_watchdog() and
      there's no functional difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      03e0d461
  10. 08 Dec, 2015 2 commits
    • S
      clocksource: Add CPU info to clocksource watchdog reporting · 390dd67c
      Committed by Seiichi Ikarashi
      The clocksource watchdog reporting was improved by 0b046b21.
      I want to add the CPU on which the watchdog detects a
      deviation, because that is needed to identify the trouble spot
      if the clocksource is TSC.
      Signed-off-by: Seiichi Ikarashi <s.ikarashi@jp.fujitsu.com>
      [jstultz: Tweaked commit message]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      390dd67c
    • D
      time: Avoid signed overflow in timekeeping_get_ns() · 35a4933a
      Committed by David Gibson
      1e75fa8b "time: Condense timekeeper.xtime into xtime_sec" replaced a call to
      clocksource_cyc2ns() from timekeeping_get_ns() with an open-coded version
      of the same logic to avoid keeping a semi-redundant struct timespec
      in struct timekeeper.
      
      However, the commit also introduced a subtle semantic change - where
      clocksource_cyc2ns() uses purely unsigned math, the new version introduces
      a signed temporary, meaning that if (delta * tk->mult) has a 63-bit
      overflow the following shift will still give a negative result.  The
      choice of 'maxsec' in __clocksource_updatefreq_scale() means this will
      generally happen if there's a ~10 minute pause in examining the
      clocksource.
      
      This can be triggered on a powerpc KVM guest by stopping it from qemu for
      a bit over 10 minutes.  After resuming, time has jumped backwards several
      minutes, causing numerous problems (jiffies does not advance, msleep()s can
      be extended by minutes...).  It doesn't happen on x86 KVM guests, because
      the guest TSC is effectively frozen while the guest is stopped, which is
      not the case for the powerpc timebase.
      
      Obviously an unsigned (64 bit) overflow will only take twice as long as a
      signed, 63-bit overflow.  I don't know the time code well enough to know
      if that will still cause incorrect calculations, or if a 64-bit overflow
      is avoided elsewhere.
      
      Still, an incorrect forwards clock adjustment will cause less trouble than
      time going backwards.  So, this patch removes the potential for
      intermediate signed overflow.
      
      Cc: stable@vger.kernel.org  (3.7+)
      Suggested-by: Laurent Vivier <lvivier@redhat.com>
      Tested-by: Laurent Vivier <lvivier@redhat.com>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      35a4933a
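The difference between the signed temporary and purely unsigned math can be reproduced outside the kernel. The sketch below uses hypothetical function names and assumes a common compiler such as GCC or Clang, where converting an out-of-range value to `int64_t` wraps and right-shifting a negative value is an arithmetic shift (both are implementation-defined in C).

```c
#include <assert.h>
#include <stdint.h>

/* Models the problematic pattern: the product lands in a signed temporary. */
static int64_t sketch_ns_signed_temp(uint64_t delta, uint32_t mult,
                                     uint32_t shift)
{
    assert(shift < 64);
    /* Multiply unsigned to avoid undefined behaviour in this demo, then
     * reinterpret the bits as signed, as a signed temporary would hold them. */
    int64_t nsec = (int64_t)(delta * mult);
    return nsec >> shift;  /* stays negative once bit 63 is set */
}

/* Models the fixed pattern: unsigned all the way through. */
static uint64_t sketch_ns_unsigned(uint64_t delta, uint32_t mult,
                                   uint32_t shift)
{
    assert(shift < 64);
    uint64_t nsec = delta * mult;
    return nsec >> shift;
}
```

With `delta = 2^40`, `mult = 2^23` and `shift = 23`, the product is exactly 2^63. The signed variant yields a negative nanosecond count (time appears to go backwards), while the unsigned variant yields 2^40 as expected.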
  11. 04 Dec, 2015 2 commits
  12. 26 Nov, 2015 1 commit
  13. 23 Nov, 2015 1 commit
  14. 10 Nov, 2015 1 commit
    • A
      remove abs64() · 79211c8e
      Committed by Andrew Morton
      Switch everything to the new and more capable implementation of abs().
      Mainly to give the new abs() a bit of a workout.
      
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      79211c8e
  15. 05 Nov, 2015 1 commit
    • T
      timers: Use proper base migration in add_timer_on() · 22b886dd
      Committed by Tejun Heo
      Regardless of the previous CPU a timer was on, add_timer_on()
      currently simply sets timer->flags to the new CPU.  As the caller must
      be seeing the timer as idle, this is locally fine, but the timer
      leaving the old base while unlocked can lead to race conditions as
      follows.
      
      Let's say timer was on cpu 0.
      
        cpu 0					cpu 1
        -----------------------------------------------------------------------------
        del_timer(timer) succeeds
      					del_timer(timer)
      					  lock_timer_base(timer) locks cpu_0_base
        add_timer_on(timer, 1)
          spin_lock(&cpu_1_base->lock)
          timer->flags set to cpu_1_base
          operates on @timer			  operates on @timer
      
      This triggered with mod_delayed_work_on(), which contains an
      "if (del_timer()) add_timer_on()" sequence, eventually leading to the
      following oops.
      
        BUG: unable to handle kernel NULL pointer dereference at           (null)
        IP: [<ffffffff810ca6e9>] detach_if_pending+0x69/0x1a0
        ...
        Workqueue: wqthrash wqthrash_workfunc [wqthrash]
        task: ffff8800172ca680 ti: ffff8800172d0000 task.ti: ffff8800172d0000
        RIP: 0010:[<ffffffff810ca6e9>]  [<ffffffff810ca6e9>] detach_if_pending+0x69/0x1a0
        ...
        Call Trace:
         [<ffffffff810cb0b4>] del_timer+0x44/0x60
         [<ffffffff8106e836>] try_to_grab_pending+0xb6/0x160
         [<ffffffff8106e913>] mod_delayed_work_on+0x33/0x80
         [<ffffffffa0000081>] wqthrash_workfunc+0x61/0x90 [wqthrash]
         [<ffffffff8106dba8>] process_one_work+0x1e8/0x650
         [<ffffffff8106e05e>] worker_thread+0x4e/0x450
         [<ffffffff810746af>] kthread+0xef/0x110
         [<ffffffff8185980f>] ret_from_fork+0x3f/0x70
      
      Fix it by updating add_timer_on() to perform proper migration as
      __mod_timer() does.
      Reported-and-tested-by: Jeff Layton <jlayton@poochiereds.net>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Chris Worley <chris.worley@primarydata.com>
      Cc: bfields@fieldses.org
      Cc: Michael Skralivetsky <michael.skralivetsky@primarydata.com>
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: Shaohua Li <shli@fb.com>
      Cc: Jeff Layton <jlayton@poochiereds.net>
      Cc: kernel-team@fb.com
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20151029103113.2f893924@tlielax.poochiereds.net
      Link: http://lkml.kernel.org/r/20151104171533.GI5749@mtj.duckdns.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      22b886dd
  16. 26 Oct, 2015 1 commit
  17. 16 Oct, 2015 1 commit
  18. 15 Oct, 2015 4 commits
  19. 12 Oct, 2015 2 commits
  20. 03 Oct, 2015 1 commit
  21. 02 Oct, 2015 3 commits
  22. 22 Sep, 2015 2 commits
  23. 14 Sep, 2015 1 commit
  24. 13 Sep, 2015 1 commit
    • J
      time: Fix timekeeping_freqadjust()'s incorrect use of abs() instead of abs64() · 2619d7e9
      Committed by John Stultz
      The internal clocksteering done for fine-grained error
      correction uses a logarithmic approximation, so any time
      adjtimex() adjusts the clock steering, timekeeping_freqadjust()
      quickly approximates the correct clock frequency over a series
      of ticks.
      
      Unfortunately, the logic in timekeeping_freqadjust(), introduced
      in commit:
      
        dc491596 ("timekeeping: Rework frequency adjustments to work better w/ nohz")
      
      used the abs() function with a s64 error value to calculate the
      size of the approximated adjustment to be made.
      
      Per include/linux/kernel.h:
      
        "abs() should not be used for 64-bit types (s64, u64, long long) - use abs64()".
      
      Thus on 32-bit platforms, this made the clocksteering
      take a quite dampened random walk trying to converge on the
      proper frequency, which caused the adjustments to be made much
      slower than intended (most easily observed when large
      adjustments are made).
      
      This patch fixes the issue by using abs64() instead.
      Reported-by: Nuno Gonçalves <nunojpg@gmail.com>
      Tested-by: Nuno Goncalves <nunojpg@gmail.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Cc: <stable@vger.kernel.org> # v3.17+
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miroslav Lichvar <mlichvar@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1441840051-20244-1-git-send-email-john.stultz@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2619d7e9
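Why a 32-bit absolute-value helper misbehaves on a 64-bit error value can be shown with a small model. This is an illustration only: the function names are hypothetical, and the truncating variant merely imitates an abs() that narrows its argument to 32 bits before taking the magnitude (the narrowing conversion is implementation-defined in C and wraps on common compilers).

```c
#include <assert.h>
#include <stdint.h>

/* Imitates an abs() that narrows a 64-bit value to 32 bits first. */
static int64_t sketch_abs_truncating(int64_t x)
{
    int32_t narrowed = (int32_t)x;  /* upper 32 bits are lost here */
    return narrowed < 0 ? -(int64_t)narrowed : narrowed;
}

/* Full-width absolute value, as a 64-bit-aware helper would compute it. */
static int64_t sketch_abs64(int64_t x)
{
    assert(x != INT64_MIN);  /* negating INT64_MIN would overflow */
    return x < 0 ? -x : x;
}
```

For an error of -4294967301 (that is, -(2^32 + 5)), the truncating variant reports a magnitude of 5 while the full-width variant reports 4294967301. A steering loop that sizes its correction from the truncated magnitude would pick adjustments unrelated to the real error, consistent with the slow convergence described above.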
  25. 02 Sep, 2015 1 commit