- 16 2月, 2015 2 次提交
-
-
由 Rafael J. Wysocki 提交于
The efficiency of suspend-to-idle depends on being able to keep CPUs in the deepest available idle states for as much time as possible. Ideally, they should only be brought out of idle by system wakeup interrupts. However, timer interrupts occurring periodically prevent that from happening and it is not practical to chase all of the "misbehaving" timers in a whack-a-mole fashion. A much more effective approach is to suspend the local ticks for all CPUs and the entire timekeeping along the lines of what is done during full suspend, which also helps to keep suspend-to-idle and full suspend reasonably similar. The idea is to suspend the local tick on each CPU executing cpuidle_enter_freeze() and to make the last of them suspend the entire timekeeping. That should prevent timer interrupts from triggering until an IO interrupt wakes up one of the CPUs. It needs to be done with interrupts disabled on all of the CPUs, though, because otherwise the suspended clocksource might be accessed by an interrupt handler which might lead to fatal consequences. Unfortunately, the existing ->enter callbacks provided by cpuidle drivers generally cannot be used for implementing that, because some of them re-enable interrupts temporarily and some idle entry methods cause interrupts to be re-enabled automatically on exit. Also some of these callbacks manipulate local clock event devices of the CPUs which really shouldn't be done after suspending their ticks. To overcome that difficulty, introduce a new cpuidle state callback, ->enter_freeze, that will be guaranteed (1) to keep interrupts disabled all the time (and return with interrupts disabled) and (2) not to touch the CPU timer devices. Modify cpuidle_enter_freeze() to look for the deepest available idle state with ->enter_freeze present and to make the CPU execute that callback with suspended tick (and the last of the online CPUs to execute it with suspended timekeeping). Suggested-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
-
由 Rafael J. Wysocki 提交于
Theoretically, ktime_get_mono_fast_ns() may be executed after timekeeping has been suspended (or before it is resumed) which in turn may lead to undefined behavior, for example, when the clocksource read from timekeeping_get_ns() called by it is not accessible at that time. Prevent that from happening by setting up a dummy readout base for the fast timekeeper during timekeeping_suspend() such that it will always return the same number of cycles. After the last timekeeping_update() in timekeeping_suspend() the clocksource is read and the result is stored as cycles_at_suspend. The readout base from the current timekeeper is copied onto the dummy and the ->read pointer of the dummy is set to a routine unconditionally returning cycles_at_suspend. Next, the dummy is passed to update_fast_timekeeper(). Then, ktime_get_mono_fast_ns() will work until the subsequent timekeeping_resume() and the proper readout base for the fast timekeeper will be restored by the timekeeping_update() called right after clearing timekeeping_suspended. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: NJohn Stultz <john.stultz@linaro.org> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
-
- 14 2月, 2015 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Modify update_fast_timekeeper() to take a struct tk_read_base pointer as its argument (instead of a struct timekeeper pointer) and update its kerneldoc comment to reflect that. That will allow a struct tk_read_base that is not part of a struct timekeeper to be passed to it in the next patch. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NJohn Stultz <john.stultz@linaro.org>
-
- 24 1月, 2015 1 次提交
-
-
由 John Stultz 提交于
Adds a timespec64 based getboottime64() implementation that can be used as we convert internal users of getboottime away from using timespecs. Cc: pang.xunlei <pang.xunlei@linaro.org> Cc: Arnd Bergmann <arnd.bergmann@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
- 25 11月, 2014 1 次提交
-
-
由 John Stultz 提交于
In commit 6067dc5a ("time: Avoid possible NTP adjustment mult overflow") a new check was added to watch for adjustments that could cause a mult overflow. Unfortunately the check compares a signed with unsigned value and ignored the case where the adjustment was negative, which causes spurious warn-ons on some systems (and seems like it would result in problematic time adjustments there as well, due to the early return). Thus this patch adds a check to make sure the adjustment is positive before we check for an overflow, and resovles the issue in my testing. Reported-by: NFengguang Wu <fengguang.wu@intel.com> Debugged-by: Npang.xunlei <pang.xunlei@linaro.org> Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/1416890145-30048-1-git-send-email-john.stultz@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 22 11月, 2014 7 次提交
-
-
由 John Stultz 提交于
Fix up a few comments that weren't updated when the functions were converted to use timespec64 structures. Acked-by: NArnd Bergmann <arnd.bergmann@linaro.org> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 John Stultz 提交于
Adds a timespec64 based get_monotonic_coarse64() implementation that can be used as we convert internal users of get_monotonic_coarse away from using timespecs. Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 John Stultz 提交于
Adds a timespec64 based getrawmonotonic64() implementation that can be used as we convert internal users of getrawmonotonic away from using timespecs. Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 pang.xunlei 提交于
As part of addressing "y2038 problem" for in-kernel uses, this patch adds timekeeping_inject_sleeptime64() using timespec64. After this patch, timekeeping_inject_sleeptime() is deprecated and all its call sites will be fixed using the new interface, after that it can be removed. NOTE: timekeeping_inject_sleeptime() is safe actually, but we want to eliminate timespec eventually, so comes this patch. Signed-off-by: Npang.xunlei <pang.xunlei@linaro.org> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 pang.xunlei 提交于
The kernel uses 32-bit signed value(time_t) for seconds elapsed 1970-01-01:00:00:00, thus it will overflow at 2038-01-19 03:14:08 on 32-bit systems. This is widely known as the y2038 problem. As part of addressing "y2038 problem" for in-kernel uses, this patch adds safe do_settimeofday64() using timespec64. After this patch, do_settimeofday() is deprecated and all its call sites will be fixed using do_settimeofday64(), after that it can be removed. Signed-off-by: Npang.xunlei <pang.xunlei@linaro.org> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 pang.xunlei 提交于
The clocksource mult-adjustment threshold is [mult-maxadj, mult+maxadj], timekeeping_adjust() only deals with the upper threshold, but misses the lower threshold. This patch adds the lower threshold judging condition. Signed-off-by: Npang.xunlei <pang.xunlei@linaro.org> [jstultz: Minor fix for > 80 char line] Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 pang.xunlei 提交于
Ideally, __clocksource_updatefreq_scale, selects the largest shift value possible for a clocksource. This results in the mult memember of struct clocksource being particularly large, although not so large that NTP would adjust the clock to cause it to overflow. That said, nothing actually prohibits an overflow from occuring, its just that it "shouldn't" occur. So while very unlikely, and so far never observed, the value of (cs->mult+cs->maxadj) may have a chance to reach very near 0xFFFFFFFF, so there is a possibility it may overflow when doing NTP positive adjustment See the following detail: When NTP slewes the clock, kernel goes through update_wall_time()->...->timekeeping_apply_adjustment(): tk->tkr.mult += mult_adj; Since there is no guard against it, its possible tk->tkr.mult may overflow during this operation. This patch avoids any possible mult overflow by judging the overflow case before adding mult_adj to mult, also adds the WARNING message when capturing such case. Signed-off-by: Npang.xunlei <pang.xunlei@linaro.org> [jstultz: Reworded commit message] Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
- 29 10月, 2014 2 次提交
-
-
由 Heena Sirwani 提交于
ktime_get_real_seconds() is the replacement function for get_seconds() returning the seconds portion of CLOCK_REALTIME in a time64_t. For 64bit the function is equivivalent to get_seconds(), but for 32bit it protects the readout with the timekeeper sequence count. This is required because 32-bit machines cannot access 64-bit tk->xtime_sec variable atomically. [tglx: Massaged changelog and added docbook comment ] Signed-off-by: NHeena Sirwani <heenasirwani@gmail.com> Reviewed-by: NArnd Bergman <arnd@arndb.de> Cc: John Stultz <john.stultz@linaro.org> Cc: opw-kernel@googlegroups.com Link: http://lkml.kernel.org/r/7adcfaa8962b8ad58785d9a2456c3f77d93c0ffb.1414578445.git.heenasirwani@gmail.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Heena Sirwani 提交于
This is the counterpart to get_seconds() based on CLOCK_MONOTONIC. The use case for this interface are kernel internal coarse grained timestamps which do neither require the nanoseconds fraction of current time nor the CLOCK_REALTIME properties. Such timestamps can currently only retrieved by calling ktime_get_ts64() and using the tv_sec field of the returned timespec64. That's inefficient as it involves the read of the clocksource, math operations and must be protected by the timekeeper sequence counter. To avoid the sequence counter protection we restrict the return value to unsigned 32bit on 32bit machines. This covers ~136 years of uptime and therefor an overflow is not expected to hit anytime soon. To avoid math in the function we calculate the current seconds portion of CLOCK_MONOTONIC when the timekeeper gets updated in tk_update_ktime_data() similar to the CLOCK_REALTIME counterpart xtime_sec. [ tglx: Massaged changelog, simplified and commented the update function, added docbook comment ] Signed-off-by: NHeena Sirwani <heenasirwani@gmail.com> Reviewed-by: NArnd Bergman <arnd@arndb.de> Cc: John Stultz <john.stultz@linaro.org> Cc: opw-kernel@googlegroups.com Link: http://lkml.kernel.org/r/da0b63f4bdf3478909f92becb35861197da3a905.1414578445.git.heenasirwani@gmail.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 06 9月, 2014 1 次提交
-
-
由 Thomas Gleixner 提交于
The update_walltime() code works on the shadow timekeeper to make the seqcount protected region as short as possible. But that update to the shadow timekeeper does not update all timekeeper fields because it's sufficient to do that once before it becomes life. One of these fields is tkr.base_mono. That stays stale in the shadow timekeeper unless an operation happens which copies the real timekeeper to the shadow. The update function is called after the update calls to vsyscall and pvclock. While not correct, it did not cause any problems because none of the invoked update functions used base_mono. commit cbcf2dd3 (x86: kvm: Make kvm_get_time_and_clockread() nanoseconds based) changed that in the kvm pvclock update function, so the stale mono_base value got used and caused kvm-clock to malfunction. Put the update where it belongs and fix the issue. Reported-by: NChris J Arges <chris.j.arges@canonical.com> Reported-by: NPaolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1409050000570.3333@nanosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 15 8月, 2014 1 次提交
-
-
由 John Stultz 提交于
Benjamin Herrenschmidt pointed out that I further missed modifying update_vsyscall after the wall_to_mono value was changed to a timespec64. This causes issues on powerpc32, which expects a 32bit timespec. This patch fixes the problem by properly converting from a timespec64 to a timespec before passing the value on to the arch-specific vsyscall logic. [ Thomas is currently on vacation, but reviewed it and wanted me to send this fix on to you directly. ] Cc: LKML <linux-kernel@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reported-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 24 7月, 2014 24 次提交
-
-
由 John Stultz 提交于
By caching the ntp_tick_length() when we correct the frequency error, and then using that cached value to accumulate error, we avoid large initial errors when the tick length is changed. This makes convergence happen much faster in the simulator, since the initial error doesn't have to be slowly whittled away. This initially seems like an accounting error, but Miroslav pointed out that ntp_tick_length() can change mid-tick, so when we apply it in the error accumulation, we are applying any recent change to the entire tick. This approach chooses to apply changes in the ntp_tick_length() only to the next tick, which allows us to calculate the freq correction before using the new tick length, which avoids accummulating error. Credit to Miroslav for pointing this out and providing the original patch this functionality has been pulled out from, along with the rational. Cc: Miroslav Lichvar <mlichvar@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Reported-by: NMiroslav Lichvar <mlichvar@redhat.com> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 John Stultz 提交于
The existing timekeeping_adjust logic has always been complicated to understand. Further, since it was developed prior to NOHZ becoming common, its not surprising it performs poorly when NOHZ is enabled. Since Miroslav pointed out the problematic nature of the existing code in the NOHZ case, I've tried to refactor the code to perform better. The problem with the previous approach was that it tried to adjust for the total cumulative error using a scaled dampening factor. This resulted in large errors to be corrected slowly, while small errors were corrected quickly. With NOHZ the timekeeping code doesn't know how far out the next tick will be, so this results in bad over-correction to small errors, and insufficient correction to large errors. Inspired by Miroslav's patch, I've refactored the code to try to address the correction in two steps. 1) Check the future freq error for the next tick, and if the frequency error is large, try to make sure we correct it so it doesn't cause much accumulated error. 2) Then make a small single unit adjustment to correct any cumulative error that has collected over time. This method performs fairly well in the simulator Miroslav created. Major credit to Miroslav for pointing out the issue, providing the original patch to resolve this, a simulator for testing, as well as helping debug and resolve issues in my implementation so that it performed closer to his original implementation. Cc: Miroslav Lichvar <mlichvar@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Reported-by: NMiroslav Lichvar <mlichvar@redhat.com> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 John Stultz 提交于
In the GENERIC_TIME_VSYSCALL_OLD update_vsyscall implementation, we take the tk_xtime() value, which returns a timespec64, and store it in a timespec. This luckily is ok, since the only architectures that use GENERIC_TIME_VSYSCALL_OLD are ia64 and ppc64, which are both 64 bit systems where timespec64 is the same as a timespec. Even so, for cleanliness reasons, use the conversion function to assign the proper type. Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
Tracers want a correlated time between the kernel instrumentation and user space. We really do not want to export sched_clock() to user space, so we need to provide something sensible for this. Using separate data structures with an non blocking sequence count based update mechanism allows us to do that. The data structure required for the readout has a sequence counter and two copies of the timekeeping data. On the update side: smp_wmb(); tkf->seq++; smp_wmb(); update(tkf->base[0], tk); smp_wmb(); tkf->seq++; smp_wmb(); update(tkf->base[1], tk); On the reader side: do { seq = tkf->seq; smp_rmb(); idx = seq & 0x01; now = now(tkf->base[idx]); smp_rmb(); } while (seq != tkf->seq) So if a NMI hits the update of base[0] it will use base[1] which is still consistent, but this timestamp is not guaranteed to be monotonic across an update. The timestamp is calculated by: now = base_mono + clock_delta * slope So if the update lowers the slope, readers who are forced to the not yet updated second array are still using the old steeper slope. tmono ^ | o n | o n | u | o |o |12345678---> reader order o = old slope u = update n = new slope So reader 6 will observe time going backwards versus reader 5. While other CPUs are likely to be able observe that, the only way for a CPU local observation is when an NMI hits in the middle of the update. Timestamps taken from that NMI context might be ahead of the following timestamps. Callers need to be aware of that and deal with it. V2: Got rid of clock monotonic raw and reorganized the data structures. Folded in the barrier fix from Mathieu. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
All the function needs is in the tk_read_base struct. No functional change for the current code, just a preparatory patch for the NMI safe accessor to clock monotonic which will use struct tk_read_base as well. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
The members of the new struct are the required ones for the new NMI safe accessor to clcok monotonic. In order to reuse the existing timekeeping code and to make the update of the fast NMI safe timekeepers a simple memcpy use the struct for the timekeeper as well and convert all users. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
Access to time requires to touch two cachelines at minimum 1) The timekeeper data structure 2) The clocksource data structure The access to the clocksource data structure can be avoided as almost all clocksource implementations ignore the argument to the read callback, which is a pointer to the clocksource. But the core needs to touch it to access the members @read and @mask. So we are better off by copying the @read function pointer and the @mask from the clocksource to the core data structure itself. For the most used ktime_get() access all required data including the @read and @mask copies fits together with the sequence counter into a single 64 byte cacheline. For the other time access functions we touch in the current code three cache lines in the worst case. But with the clocksource data copies we can reduce that to two adjacent cachelines, which is more efficient than disjunct cache lines. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
cycle_last was added to the clocksource to support the TSC validation. We moved that to the core code, so we can get rid of the extra copy. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
We want to move the TSC sanity check into core code to make NMI safe accessors to clock monotonic[_raw] possible. For this we need to sanity check the delta calculation. Create a helper function and convert all sites to use it. [ Build fix from jstultz ] Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
Provide a ktime_t based interface for raw monotonic time. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
timekeeping_clocktai() is not used in fast pathes, so the extra timespec conversion is not problematic. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
No more users. Remove it Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
Subtracting plain nsec values and converting to timespec is simpler than the whole timespec math. Not really fastpath code, so the division is not an issue. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
get_monotonic_boottime() is not used in fast pathes, so the extra timespec conversion is not problematic. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
No more users. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
No more users. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
ktime based conversion function to map a monotonic time stamp to a different CLOCK. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
No need to juggle with timespecs. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
No need to juggle with timespecs. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
Speed up the readout. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
Provide a helper function which lets us implement ktime_t based interfaces for real, boot and tai clocks. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Thomas Gleixner 提交于
Speed up ktime_get() by using ktime_t based data. Text size shrinks by 64 bytes on x8664. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-