1. 10 10月, 2017 1 次提交
  2. 09 9月, 2017 1 次提交
  3. 01 9月, 2017 1 次提交
  4. 26 8月, 2017 1 次提交
    • J
      time: Fix ktime_get_raw() incorrect base accumulation · 0bcdc098
      John Stultz 提交于
      In comqit fc6eead7 ("time: Clean up CLOCK_MONOTONIC_RAW time
      handling"), the following code got mistakenly added to the update of the
      raw timekeeper:
      
       /* Update the monotonic raw base */
       seconds = tk->raw_sec;
       nsec = (u32)(tk->tkr_raw.xtime_nsec >> tk->tkr_raw.shift);
       tk->tkr_raw.base = ns_to_ktime(seconds * NSEC_PER_SEC + nsec);
      
      Which adds the raw_sec value and the shifted down raw xtime_nsec to the
      base value.
      
      But the read function adds the shifted down tk->tkr_raw.xtime_nsec value
      another time, The result of this is that ktime_get_raw() users (which are
      all internal users) see the raw time move faster then it should (the rate
      at which can vary with the current size of tkr_raw.xtime_nsec), which has
      resulted in at least problems with graphics rendering performance.
      
      The change tried to match the monotonic base update logic:
      
       seconds = (u64)(tk->xtime_sec + tk->wall_to_monotonic.tv_sec);
       nsec = (u32) tk->wall_to_monotonic.tv_nsec;
       tk->tkr_mono.base = ns_to_ktime(seconds * NSEC_PER_SEC + nsec);
      
      Which adds the wall_to_monotonic.tv_nsec value, but not the
      tk->tkr_mono.xtime_nsec value to the base.
      
      To fix this, simplify the tkr_raw.base accumulation to only accumulate the
      raw_sec portion, and do not include the tkr_raw.xtime_nsec portion, which
      will be added at read time.
      
      Fixes: fc6eead7 ("time: Clean up CLOCK_MONOTONIC_RAW time handling")
      Reported-and-tested-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Kevin Brodsky <kevin.brodsky@arm.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Stephen Boyd <stephen.boyd@linaro.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Miroslav Lichvar <mlichvar@redhat.com>
      Cc: Daniel Mentz <danielmentz@google.com>
      Link: http://lkml.kernel.org/r/1503701824-1645-1-git-send-email-john.stultz@linaro.org
      0bcdc098
  5. 24 8月, 2017 1 次提交
    • N
      timers: Fix excessive granularity of new timers after a nohz idle · 2fe59f50
      Nicholas Piggin 提交于
      When a timer base is idle, it is forwarded when a new timer is added
      to ensure that granularity does not become excessive. When not idle,
      the timer tick is expected to increment the base.
      
      However there are several problems:
      
      - If an existing timer is modified, the base is forwarded only after
        the index is calculated.
      
      - The base is not forwarded by add_timer_on.
      
      - There is a window after a timer is restarted from a nohz idle, after
        it is marked not-idle and before the timer tick on this CPU, where a
        timer may be added but the ancient base does not get forwarded.
      
      These result in excessive granularity (a 1 jiffy timeout can blow out
      to 100s of jiffies), which cause the rcu lockup detector to trigger,
      among other things.
      
      Fix this by keeping track of whether the timer base has been idle
      since it was last run or forwarded, and if so then forward it before
      adding a new timer.
      
      There is still a case where mod_timer optimises the case of a pending
      timer mod with the same expiry time, where the timer can see excessive
      granularity relative to the new, shorter interval. A comment is added,
      but it's not changed because it is an important fastpath for
      networking.
      
      This has been tested and found to fix the RCU softlockup messages.
      
      Testing was also done with tracing to measure requested versus
      achieved wakeup latencies for all non-deferrable timers in an idle
      system (with no lockup watchdogs running). Wakeup latency relative to
      absolute latency is calculated (note this suffers from round-up skew
      at low absolute times) and analysed:
      
                   max     avg      std
      upstream   506.0    1.20     4.68
      patched      2.0    1.08     0.15
      
      The bug was noticed due to the lockup detector Kconfig changes
      dropping it out of people's .configs and resulting in larger base
      clk skew When the lockup detectors are enabled, no CPU can go idle for
      longer than 4 seconds, which limits the granularity errors.
      Sub-optimal timer behaviour is observable on a smaller scale in that
      case:
      
      	     max     avg      std
      upstream     9.0    1.05     0.19
      patched      2.0    1.04     0.11
      
      Fixes: Fixes: a683f390 ("timers: Forward the wheel clock whenever possible")
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NJonathan Cameron <Jonathan.Cameron@huawei.com>
      Tested-by: NDavid Miller <davem@davemloft.net>
      Cc: dzickus@redhat.com
      Cc: sfr@canb.auug.org.au
      Cc: mpe@ellerman.id.au
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: linuxarm@huawei.com
      Cc: abdhalee@linux.vnet.ibm.com
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: akpm@linux-foundation.org
      Cc: paulmck@linux.vnet.ibm.com
      Cc: torvalds@linux-foundation.org
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20170822084348.21436-1-npiggin@gmail.com
      2fe59f50
  6. 18 8月, 2017 3 次提交
  7. 01 8月, 2017 1 次提交
  8. 23 7月, 2017 1 次提交
    • R
      PM / timekeeping: Print debug messages when requested · cb08e035
      Rafael J. Wysocki 提交于
      The messages printed by tk_debug_account_sleep_time() are basically
      useful for system sleep debugging, so print them only when the other
      debug messages from the core suspend/hibernate code are enabled.
      
      While at it, make it clear that the messages from
      tk_debug_account_sleep_time() are about timekeeping suspend
      duration, because in general timekeeping may be suspeded and
      resumed for multiple times during one system suspend-resume cycle.
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      cb08e035
  9. 30 6月, 2017 3 次提交
  10. 29 6月, 2017 1 次提交
  11. 26 6月, 2017 3 次提交
  12. 22 6月, 2017 2 次提交
  13. 21 6月, 2017 5 次提交
  14. 20 6月, 2017 2 次提交
    • J
      time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting · 3d88d56c
      John Stultz 提交于
      Due to how the MONOTONIC_RAW accumulation logic was handled,
      there is the potential for a 1ns discontinuity when we do
      accumulations. This small discontinuity has for the most part
      gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW
      in their vDSO clock_gettime implementation, we've seen failures
      with the inconsistency-check test in kselftest.
      
      This patch addresses the issue by using the same sub-ns
      accumulation handling that CLOCK_MONOTONIC uses, which avoids
      the issue for in-kernel users.
      
      Since the ARM64 vDSO implementation has its own clock_gettime
      calculation logic, this patch reduces the frequency of errors,
      but failures are still seen. The ARM64 vDSO will need to be
      updated to include the sub-nanosecond xtime_nsec values in its
      calculation for this issue to be completely fixed.
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      Tested-by: NDaniel Mentz <danielmentz@google.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Kevin Brodsky <kevin.brodsky@arm.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Stephen Boyd <stephen.boyd@linaro.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: "stable #4 . 8+" <stable@vger.kernel.org>
      Cc: Miroslav Lichvar <mlichvar@redhat.com>
      Link: http://lkml.kernel.org/r/1496965462-20003-3-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      3d88d56c
    • J
      time: Fix clock->read(clock) race around clocksource changes · ceea5e37
      John Stultz 提交于
      In tests, which excercise switching of clocksources, a NULL
      pointer dereference can be observed on AMR64 platforms in the
      clocksource read() function:
      
      u64 clocksource_mmio_readl_down(struct clocksource *c)
      {
      	return ~(u64)readl_relaxed(to_mmio_clksrc(c)->reg) & c->mask;
      }
      
      This is called from the core timekeeping code via:
      
      	cycle_now = tkr->read(tkr->clock);
      
      tkr->read is the cached tkr->clock->read() function pointer.
      When the clocksource is changed then tkr->clock and tkr->read
      are updated sequentially. The code above results in a sequential
      load operation of tkr->read and tkr->clock as well.
      
      If the store to tkr->clock hits between the loads of tkr->read
      and tkr->clock, then the old read() function is called with the
      new clock pointer. As a consequence the read() function
      dereferences a different data structure and the resulting 'reg'
      pointer can point anywhere including NULL.
      
      This problem was introduced when the timekeeping code was
      switched over to use struct tk_read_base. Before that, it was
      theoretically possible as well when the compiler decided to
      reload clock in the code sequence:
      
           now = tk->clock->read(tk->clock);
      
      Add a helper function which avoids the issue by reading
      tk_read_base->clock once into a local variable clk and then issue
      the read function via clk->read(clk). This guarantees that the
      read() function always gets the proper clocksource pointer handed
      in.
      
      Since there is now no use for the tkr.read pointer, this patch
      also removes it, and to address stopping the fast timekeeper
      during suspend/resume, it introduces a dummy clocksource to use
      rather then just a dummy read function.
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Stephen Boyd <stephen.boyd@linaro.org>
      Cc: stable <stable@vger.kernel.org>
      Cc: Miroslav Lichvar <mlichvar@redhat.com>
      Cc: Daniel Mentz <danielmentz@google.com>
      Link: http://lkml.kernel.org/r/1496965462-20003-2-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      ceea5e37
  15. 14 6月, 2017 14 次提交