1. 24 Dec, 2013 4 commits
    • tick/timekeeping: Call update_wall_time outside the jiffies lock · 47a1b796
      John Stultz committed
      Since the xtime lock was split into the timekeeping lock and
      the jiffies lock, we no longer need to call update_wall_time()
      while holding the jiffies lock.
      
      Thus, this patch splits update_wall_time() out from do_timer().
      
      This allows us to get away from calling clock_was_set_delayed()
      in update_wall_time() and instead use the standard clock_was_set()
      call that previously would deadlock, as it causes the jiffies lock
      to be acquired.
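
A minimal user-space sketch of the resulting call order, assuming a simplified tick path. Only `do_timer()` and `update_wall_time()` are names from the commit message; the lock flag, the counters, and the `tick_do_update()` wrapper are hypothetical stand-ins, not the kernel's code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's jiffies lock and counters. */
static bool jiffies_lock_held;
static unsigned long jiffies_64_model;
static int wall_time_updates;
static int wall_time_updates_under_lock;

static void jiffies_lock_model(void)   { jiffies_lock_held = true; }
static void jiffies_unlock_model(void) { jiffies_lock_held = false; }

/* After the split, do_timer() only advances jiffies. */
static void do_timer(unsigned long ticks)
{
    jiffies_64_model += ticks;
}

/* Records whether it was entered with the jiffies lock still held. */
static void update_wall_time(void)
{
    wall_time_updates++;
    if (jiffies_lock_held)
        wall_time_updates_under_lock++;
}

/* Hypothetical tick path: the wall-time update runs after the unlock. */
static void tick_do_update(unsigned long ticks)
{
    jiffies_lock_model();
    do_timer(ticks);
    jiffies_unlock_model();
    update_wall_time();
}
```

Because `update_wall_time()` is no longer nested inside the jiffies lock, anything it calls is free to take that lock itself.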
      
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
    • timekeeping: Avoid possible deadlock from clock_was_set_delayed · 6fdda9a9
      John Stultz committed
      As part of normal operations, the hrtimer subsystem frequently calls
      into the timekeeping code, creating a locking order of
        hrtimer locks -> timekeeping locks
      
      clock_was_set_delayed() was supposed to allow us to avoid deadlocks
      between the timekeeping and hrtimer subsystems, so that we could
      notify the hrtimer subsystem that the time had changed while holding
      the timekeeping locks. This was done by scheduling delayed work
      that would run later once we were out of the timekeeping code.
      
      But unfortunately the lock chains are complex enough that in
      scheduling delayed work, we end up eventually trying to grab
      an hrtimer lock.
      
      Sasha Levin noticed this in testing when the new seqlock lockdep
      enablement triggered the following (somewhat abbreviated) message:
      
      [  251.100221] ======================================================
      [  251.100221] [ INFO: possible circular locking dependency detected ]
      [  251.100221] 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty #4053 Not tainted
      [  251.101967] -------------------------------------------------------
      [  251.101967] kworker/10:1/4506 is trying to acquire lock:
      [  251.101967]  (timekeeper_seq){----..}, at: [<ffffffff81160e96>] retrigger_next_event+0x56/0x70
      [  251.101967]
      [  251.101967] but task is already holding lock:
      [  251.101967]  (hrtimer_bases.lock#11){-.-...}, at: [<ffffffff81160e7c>] retrigger_next_event+0x3c/0x70
      [  251.101967]
      [  251.101967] which lock already depends on the new lock.
      [  251.101967]
      [  251.101967]
      [  251.101967] the existing dependency chain (in reverse order) is:
      [  251.101967]
      -> #5 (hrtimer_bases.lock#11){-.-...}:
      [snipped]
      -> #4 (&rt_b->rt_runtime_lock){-.-...}:
      [snipped]
      -> #3 (&rq->lock){-.-.-.}:
      [snipped]
      -> #2 (&p->pi_lock){-.-.-.}:
      [snipped]
      -> #1 (&(&pool->lock)->rlock){-.-...}:
      [  251.101967]        [<ffffffff81194803>] validate_chain+0x6c3/0x7b0
      [  251.101967]        [<ffffffff81194d9d>] __lock_acquire+0x4ad/0x580
      [  251.101967]        [<ffffffff81194ff2>] lock_acquire+0x182/0x1d0
      [  251.101967]        [<ffffffff84398500>] _raw_spin_lock+0x40/0x80
      [  251.101967]        [<ffffffff81153e69>] __queue_work+0x1a9/0x3f0
      [  251.101967]        [<ffffffff81154168>] queue_work_on+0x98/0x120
      [  251.101967]        [<ffffffff81161351>] clock_was_set_delayed+0x21/0x30
      [  251.101967]        [<ffffffff811c4bd1>] do_adjtimex+0x111/0x160
      [  251.101967]        [<ffffffff811e2711>] compat_sys_adjtimex+0x41/0x70
      [  251.101967]        [<ffffffff843a4b49>] ia32_sysret+0x0/0x5
      [  251.101967]
      -> #0 (timekeeper_seq){----..}:
      [snipped]
      [  251.101967] other info that might help us debug this:
      [  251.101967]
      [  251.101967] Chain exists of:
        timekeeper_seq --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock#11
      
      [  251.101967]  Possible unsafe locking scenario:
      [  251.101967]
      [  251.101967]        CPU0                    CPU1
      [  251.101967]        ----                    ----
      [  251.101967]   lock(hrtimer_bases.lock#11);
      [  251.101967]                                lock(&rt_b->rt_runtime_lock);
      [  251.101967]                                lock(hrtimer_bases.lock#11);
      [  251.101967]   lock(timekeeper_seq);
      [  251.101967]
      [  251.101967]  *** DEADLOCK ***
      [  251.101967]
      [  251.101967] 3 locks held by kworker/10:1/4506:
      [  251.101967]  #0:  (events){.+.+.+}, at: [<ffffffff81154960>] process_one_work+0x200/0x530
      [  251.101967]  #1:  (hrtimer_work){+.+...}, at: [<ffffffff81154960>] process_one_work+0x200/0x530
      [  251.101967]  #2:  (hrtimer_bases.lock#11){-.-...}, at: [<ffffffff81160e7c>] retrigger_next_event+0x3c/0x70
      [  251.101967]
      [  251.101967] stack backtrace:
      [  251.101967] CPU: 10 PID: 4506 Comm: kworker/10:1 Not tainted 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty #4053
      [  251.101967] Workqueue: events clock_was_set_work
      
      So the best solution is to avoid calling clock_was_set_delayed() while
      holding the timekeeping lock, and instead use a flag variable to
      decide if we should call clock_was_set() once we've released the locks.
      
      This works for the case here, where do_adjtimex() was the deadlock
      trigger point. Unfortunately, in update_wall_time() we still hold
      the jiffies lock, which would deadlock with the IPI triggered by
      clock_was_set(), preventing us from calling it even after we drop the
      timekeeping lock. So instead call clock_was_set_delayed() at that point.
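
The flag-variable approach can be sketched in user space as follows. This is an illustration only: the lock is modelled as a boolean, and every name ending in `_model` or `_locked` is hypothetical rather than taken from the kernel source.

```c
#include <stdbool.h>

static bool timekeeping_lock_held;   /* hypothetical model of the lock state */
static int notifications_sent;
static int notifications_under_lock;

/* Stand-in for clock_was_set(): must not run with the lock held. */
static void clock_was_set_model(void)
{
    notifications_sent++;
    if (timekeeping_lock_held)
        notifications_under_lock++;
}

/* Hypothetical adjustment: returns true if the clock was stepped. */
static bool adjust_time_locked(long offset_ns, long *clock_ns)
{
    *clock_ns += offset_ns;
    return offset_ns != 0;
}

static void do_adjust_model(long offset_ns, long *clock_ns)
{
    bool clock_set;

    timekeeping_lock_held = true;
    /* Inside the critical section we only record that a notification is due. */
    clock_set = adjust_time_locked(offset_ns, clock_ns);
    timekeeping_lock_held = false;

    /* The notification itself happens after the lock is dropped. */
    if (clock_set)
        clock_was_set_model();
}
```

The critical section stays free of calls that might take other locks; the only state carried out of it is a boolean.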
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: stable <stable@vger.kernel.org> #3.10+
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Tested-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
    • timekeeping: Fix potential lost pv notification of time change · 5258d3f2
      John Stultz committed
      In 780427f0 (Indicate that clock was set in the pvclock
      gtod notifier), logic was added to pass a CLOCK_WAS_SET
      notification to the pvclock notifier chain.
      
      While that patch added an action flag returned from
      accumulate_nsecs_to_secs(), it only uses the returned value
      in one location, and not in the logarithmic accumulation.
      
      This means if a leap second triggered during the logarithmic
      accumulation (which is most likely where it would happen),
      the notification that the clock was set would not make it to
      the pv notifiers.
      
      This patch extends logarithmic_accumulation() to pass down
      that action flag so proper notification will occur.
      
      This patch also renames the variable action -> clock_set
      per Ingo's suggestion.
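
A rough sketch of why the flag has to travel through the accumulation loop. The struct, the flag value, and all `_model` names are assumptions made for illustration; the leap-second handling is reduced to "repeat one second" and is not the kernel's NTP logic.

```c
#include <stdint.h>

#define TK_CLOCK_WAS_SET_MODEL 0x1u          /* hypothetical flag value */
#define NSEC_PER_SEC_MODEL 1000000000ull

struct tk_accum_model {
    uint64_t nsec;         /* nanoseconds not yet folded into seconds */
    uint64_t sec;
    uint64_t leap_at_sec;  /* second at which a leap fires, 0 = none */
};

/* Folds whole seconds; reports whether a leap second stepped the clock. */
static unsigned int accumulate_nsecs_to_secs_model(struct tk_accum_model *tk)
{
    unsigned int clock_set = 0;

    while (tk->nsec >= NSEC_PER_SEC_MODEL) {
        tk->nsec -= NSEC_PER_SEC_MODEL;
        tk->sec++;
        if (tk->sec == tk->leap_at_sec) {
            tk->sec--;               /* inserted leap second */
            tk->leap_at_sec = 0;     /* disarm */
            clock_set |= TK_CLOCK_WAS_SET_MODEL;
        }
    }
    return clock_set;
}

/* One accumulation pass: the result is OR-ed into *clock_set, not dropped. */
static void logarithmic_accumulation_model(struct tk_accum_model *tk,
                                           uint64_t chunk_ns,
                                           unsigned int *clock_set)
{
    tk->nsec += chunk_ns;
    *clock_set |= accumulate_nsecs_to_secs_model(tk);
}

/* Consumes offset_ns in power-of-two multiples of interval_ns (> 0).
 * Any remainder smaller than interval_ns is left unaccumulated. */
static unsigned int update_model(struct tk_accum_model *tk,
                                 uint64_t offset_ns, uint64_t interval_ns)
{
    unsigned int clock_set = 0;
    int shift = 0;

    while (shift < 62 && (offset_ns >> (shift + 1)) >= interval_ns)
        shift++;

    while (offset_ns >= interval_ns) {
        uint64_t chunk = interval_ns << shift;

        if (chunk <= offset_ns) {
            offset_ns -= chunk;
            logarithmic_accumulation_model(tk, chunk, &clock_set);
        }
        if (shift > 0)
            shift--;
    }
    return clock_set;   /* the caller can now notify listeners */
}
```

If `logarithmic_accumulation_model()` discarded the return value of the inner helper, a leap second hit during the loop would never reach the caller, which is the lost notification described above.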
      
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: <xen-devel@lists.xen.org>
      Cc: stable <stable@vger.kernel.org> #3.11+
      Signed-off-by: John Stultz <john.stultz@linaro.org>
    • timekeeping: Fix lost updates to tai adjustment · f55c0760
      John Stultz committed
      Since 48cdc135 (Implement a shadow timekeeper), we have to
      call timekeeping_update() after any adjustment to the timekeeping
      structure in order to make sure that any adjustments to the structure
      persist.
      
      Unfortunately, the updates to the tai offset via adjtimex do not
      trigger this update, causing adjustments to the tai offset to be
      made and then over-written by the previous value at the next
      update_wall_time() call.
      
      This patch resolves the issue by calling timekeeping_update()
      right after setting the tai offset.
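
The overwrite can be reproduced with a two-copy model. This is a simplified sketch under the assumption that the tick path works on a shadow copy and then publishes it; the struct and all `_model`, `_buggy`, and `_fixed` names are hypothetical.

```c
struct shadow_tk_model {
    long xtime_sec;
    int tai_offset;
};

static struct shadow_tk_model timekeeper_model;         /* live copy */
static struct shadow_tk_model shadow_timekeeper_model;  /* working copy */

/* Stand-in for timekeeping_update(): mirror the live copy into the shadow. */
static void timekeeping_update_model(void)
{
    shadow_timekeeper_model = timekeeper_model;
}

/* Tick path: advance the shadow copy, then publish it as the live copy. */
static void update_wall_time_model(void)
{
    shadow_timekeeper_model.xtime_sec++;
    timekeeper_model = shadow_timekeeper_model;
}

/* Buggy variant: the change exists only in the live copy. */
static void set_tai_offset_buggy(int tai)
{
    timekeeper_model.tai_offset = tai;
}

/* Fixed variant: propagate the change right after making it. */
static void set_tai_offset_fixed(int tai)
{
    timekeeper_model.tai_offset = tai;
    timekeeping_update_model();
}
```

With the buggy variant, the next tick republishes the stale shadow value; with the fixed variant, the new offset survives the tick.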
      
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: stable <stable@vger.kernel.org> #3.10+
      Signed-off-by: John Stultz <john.stultz@linaro.org>
  2. 19 Oct, 2013 1 commit
  3. 18 Oct, 2013 1 commit
  4. 12 Sep, 2013 1 commit
    • timekeeping: Fix HRTICK related deadlock from ntp lock changes · 7bd36014
      John Stultz committed
      Gerlando Falauto reported that when HRTICK is enabled, it is
      possible to trigger system deadlocks. These were hard to
      reproduce, as HRTICK has been broken in the past, but seemed
      to be connected to the timekeeping_seq lock.
      
      Since seqlock/seqcount's aren't supported w/ lockdep, I added
      some extra spinlock based locking and triggered the following
      lockdep output:
      
      [   15.849182] ntpd/4062 is trying to acquire lock:
      [   15.849765]  (&(&pool->lock)->rlock){..-...}, at: [<ffffffff810aa9b5>] __queue_work+0x145/0x480
      [   15.850051]
      [   15.850051] but task is already holding lock:
      [   15.850051]  (timekeeper_lock){-.-.-.}, at: [<ffffffff810df6df>] do_adjtimex+0x7f/0x100
      
      <snip>
      
      [   15.850051] Chain exists of: &(&pool->lock)->rlock --> &p->pi_lock --> timekeeper_lock
      [   15.850051]  Possible unsafe locking scenario:
      [   15.850051]
      [   15.850051]        CPU0                    CPU1
      [   15.850051]        ----                    ----
      [   15.850051]   lock(timekeeper_lock);
      [   15.850051]                                lock(&p->pi_lock);
      [   15.850051]                                lock(timekeeper_lock);
      [   15.850051]   lock(&(&pool->lock)->rlock);
      [   15.850051]
      [   15.850051]  *** DEADLOCK ***
      
      The deadlock was introduced by 06c017fd ("timekeeping:
      Hold timekeepering locks in do_adjtimex and hardpps") in 3.10
      
      This patch avoids this deadlock, by moving the call to
      schedule_delayed_work() outside of the timekeeper lock
      critical section.
      
      Reported-by: Gerlando Falauto <gerlando.falauto@keymile.com>
      Tested-by: Lin Ming <minggr@gmail.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: stable <stable@vger.kernel.org> #3.11, 3.10
      Link: http://lkml.kernel.org/r/1378943457-27314-1-git-send-email-john.stultz@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  5. 29 Jun, 2013 2 commits
  6. 30 May, 2013 1 commit
    • power: Add option to log time spent in suspend · 5c83545f
      Colin Cross committed
      Below is a patch from the Android kernel that maintains a histogram of
      suspend times. Please review and provide feedback.
      
      Statistics on the time spent in suspend are kept in
      /sys/kernel/debug/sleep_time.
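
One way such a histogram could be kept is with power-of-two buckets indexed by the sleep length in seconds. The commit message does not describe the bucket layout, so the binning below is an assumption, and all names are hypothetical.

```c
#include <stdint.h>

#define SLEEP_BINS 32

static unsigned int sleep_time_bin_model[SLEEP_BINS];

/* Position of the highest set bit, counting from 1; returns 0 for x == 0. */
static int fls_model(uint32_t x)
{
    int bit = 0;

    while (x) {
        bit++;
        x >>= 1;
    }
    return bit;
}

/* Bin 0 counts sleeps under 1 s; bin n counts [2^(n-1), 2^n) seconds.
 * Anything too long for the table lands in the last bin. */
static void account_sleep_time_model(uint64_t seconds)
{
    int bin;

    if (seconds > UINT32_MAX)
        bin = SLEEP_BINS - 1;
    else
        bin = fls_model((uint32_t)seconds);
    if (bin >= SLEEP_BINS)
        bin = SLEEP_BINS - 1;

    sleep_time_bin_model[bin]++;
}
```

A fixed table of logarithmic buckets keeps the accounting cost constant per resume while still covering sleeps from under a second to years.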
      
      Cc: Android Kernel Team <kernel-team@android.com>
      Cc: Colin Cross <ccross@android.com>
      Cc: Todd Poynor <toddpoynor@google.com>
      Cc: San Mehat <san@google.com>
      Cc: Benoit Goby <benoit@android.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Colin Cross <ccross@android.com>
      Signed-off-by: Todd Poynor <toddpoynor@google.com>
      [zoran.markovic@linaro.org: Re-formatted suspend time table to better
      fit expected values. Moved accounting of suspend time into timekeeping
      core. Removed CONFIG_SUSPEND_TIME flag and made the feature conditional
      on CONFIG_DEBUG_FS. Changed the file name to sleep_time to better fit
      terminology in timekeeping core. Changed seq_printf to seq_puts. Tweaked
      commit message]
      Signed-off-by: Zoran Markovic <zoran.markovic@linaro.org>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
  7. 29 May, 2013 1 commit
    • timekeeping: Correct run-time detection of persistent_clock. · 0d6bd995
      Zoran Markovic committed
      Since commit 31ade306, timekeeping_init() checks for the presence of a
      persistent clock by attempting to read a non-zero time value. This is
      an issue on platforms where persistent_clock is implemented as a
      free-running counter (instead of an RTC) starting from zero on each
      boot and running during suspend. Examples are some ARM platforms
      (e.g. PandaBoard).
      
      An attempt to read such a clock during timekeeping_init() may return a zero
      value and falsely declare the persistent clock as missing. Additionally, in
      the above case suspend times may be accounted twice (once from
      timekeeping_resume() and once from rtc_resume()), resulting in a gradual
      drift of system time.
      
      This patch does a run-time correction of the issue by doing the same check
      during timekeeping_suspend().
      
      A better long-term solution would be to return an error when trying to read
      a non-existent clock and zero when trying to read an uninitialized clock, but
      that would require changing all persistent_clock implementations.
      
      This patch addresses the immediate breakage, for now.
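
A minimal sketch of the run-time correction, assuming the existence flag is simply re-evaluated at suspend with the same non-zero test used at init. The counter, the struct, and all `_model` names are hypothetical.

```c
#include <stdbool.h>

struct ts_model {
    long tv_sec;
    long tv_nsec;
};

static bool persistent_clock_exist_model;
static struct ts_model counter_model;   /* hypothetical free-running counter */

static void read_persistent_clock_model(struct ts_model *ts)
{
    *ts = counter_model;
}

/* A zero reading is treated as "no persistent clock seen yet". */
static void check_persistent_clock_model(void)
{
    struct ts_model now;

    read_persistent_clock_model(&now);
    if (now.tv_sec || now.tv_nsec)
        persistent_clock_exist_model = true;
}

/* The same check runs at init and again at suspend. */
static void timekeeping_init_model(void)    { check_persistent_clock_model(); }
static void timekeeping_suspend_model(void) { check_persistent_clock_model(); }
```

A counter that reads zero at boot is missed by the init-time check but picked up at the first suspend, once it has advanced.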
      
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Feng Tang <feng.tang@intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Zoran Markovic <zoran.markovic@linaro.org>
      [jstultz: Tweaked commit message and subject]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
  8. 16 May, 2013 2 commits
  9. 23 Apr, 2013 1 commit
  10. 11 Apr, 2013 1 commit
  11. 05 Apr, 2013 10 commits
  12. 26 Mar, 2013 1 commit
  13. 23 Mar, 2013 7 commits
  14. 16 Mar, 2013 1 commit
    • timekeeping: utilize the suspend-nonstop clocksource to count suspended time · e445cf1c
      Feng Tang committed
      There are some new processors whose TSC clocksource won't stop during
      suspend. Currently, after the system resumes, the kernel uses the
      persistent clock or RTC to compensate for the sleep time, but with these
      nonstop clocksources, we could skip the special compensation from external
      sources, and just use the current clocksource for time recounting.
      
      This can solve some time drift bugs caused by some not-so-accurate or
      error-prone RTC devices.
      
      The current way to count suspended time is to first try the persistent
      clock, and then try the RTC if the persistent clock can't be used. This
      patch changes the order to:
      	suspend-nonstop clocksource -> persistent clock -> RTC
      
      When counting the sleep time with nonstop clocksource, use an accurate way
      suggested by Jason Gunthorpe to cover very large delta cycles.
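
The concern with very large cycle deltas is that `cycles * mult` can overflow 64 bits before the shift is applied. A hedged sketch of one chunked conversion that avoids this is shown below; the function name and parameters are hypothetical, and it is not claimed to match the patch line for line.

```c
#include <stdint.h>

/* Converts a cycle delta to nanoseconds as (cycles * mult) >> shift,
 * splitting the delta into chunks whose product with mult fits in 64 bits.
 * Assumes mult != 0 and shift < 64. Each chunk is truncated separately,
 * so the result can differ slightly from an exact wide multiplication. */
static uint64_t cycles_to_ns_model(uint64_t cycles, uint32_t mult, uint32_t shift)
{
    uint64_t max = UINT64_MAX / mult;   /* largest delta whose product fits */
    uint64_t nsec = 0;

    if (cycles > max) {
        uint64_t num = cycles / max;    /* number of full-size chunks */

        nsec = ((max * mult) >> shift) * num;
        cycles -= num * max;
    }
    nsec += (cycles * mult) >> shift;
    return nsec;
}
```

For short sleeps the function takes the single-multiply path; only deltas beyond `max` pay for the extra division.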
      
      Signed-off-by: Feng Tang <feng.tang@intel.com>
      [jstultz: Small optimization, avoiding re-reading the clocksource]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
  15. 16 Jan, 2013 2 commits
  16. 25 Dec, 2012 1 commit
    • time: convert arch_gettimeoffset to a pointer · 7b1f6207
      Stephen Warren committed
      Currently, whenever CONFIG_ARCH_USES_GETTIMEOFFSET is enabled, each
      arch core provides a single implementation of arch_gettimeoffset(). In
      many cases, different sub-architectures, different machines, or
      different timer providers exist, and so the arch ends up implementing
      arch_gettimeoffset() as a call-through-pointer anyway. Examples are
      ARM, Cris, M68K, and it's arguable that the remaining architectures,
      M32R and Blackfin, should be doing this anyway.
      
      Modify arch_gettimeoffset so that it itself is a function pointer, which
      the arch initializes. This will allow later changes to move the
      initialization of this function into individual machine support or timer
      drivers. This is particularly useful for code in drivers/clocksource
      which should rely on an arch-independent mechanism to register their
      implementation of arch_gettimeoffset().
      
      This patch also converts the Cris architecture to set arch_gettimeoffset
      directly to the final implementation in time_init(), because Cris already
      had separate time_init() functions per sub-architecture. M68K and ARM
      are converted to set arch_gettimeoffset to the final implementation in
      later patches, because they already have function pointers in place for
      this purpose.
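
The conversion can be pictured with a small sketch. Only the name `arch_gettimeoffset` comes from the commit message; the default implementation, the driver functions, and the caller are hypothetical, and the offset unit (nanoseconds) is an assumption.

```c
#include <stdint.h>

/* Hypothetical default used until a machine or driver installs its own. */
static uint32_t default_arch_gettimeoffset(void)
{
    return 0;
}

/* The hook is itself a function pointer rather than a fixed function. */
static uint32_t (*arch_gettimeoffset)(void) = default_arch_gettimeoffset;

/* Hypothetical timer driver: reports a fixed offset for illustration. */
static uint32_t my_timer_gettimeoffset(void)
{
    return 1500;
}

/* Driver init registers its implementation without arch-specific glue. */
static void my_timer_init(void)
{
    arch_gettimeoffset = my_timer_gettimeoffset;
}

/* Hypothetical caller: adds the offset (assumed ns) to a base time. */
static uint64_t gettime_ns_model(uint64_t base_ns)
{
    return base_ns + arch_gettimeoffset();
}
```

Assigning the pointer from driver init code is what lets code under drivers/clocksource register an implementation without a per-arch wrapper.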
      
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
      Acked-by: John Stultz <johnstul@us.ibm.com>
      Signed-off-by: Stephen Warren <swarren@nvidia.com>
  17. 28 Nov, 2012 1 commit
  18. 14 Nov, 2012 1 commit
  19. 10 Oct, 2012 1 commit