- 25 9月, 2012 4 次提交
-
-
由 John Stultz 提交于
CLOCK_TICK_RATE is used to accurately caclulate exactly how a tick will be at a given HZ. This is useful, because while we'd expect NSEC_PER_SEC/HZ, the underlying hardware will have some granularity limit, so we won't be able to have exactly HZ ticks per second. This slight error can cause timekeeping quality problems when using the jiffies or other jiffies driven clocksources. Thus we currently use compile time CLOCK_TICK_RATE value to generate SHIFTED_HZ and NSEC_PER_JIFFIES, which we then use to adjust the jiffies clocksource to correct this error. Unfortunately though, since CLOCK_TICK_RATE is a compile time value, and the jiffies clocksource is registered very early during boot, there are a number of cases where there are different possible hardware timers that have different tick rates. This causes problems in cases like ARM where there are numerous different types of hardware, each having their own compile-time CLOCK_TICK_RATE, making it hard to accurately support different hardware with a single kernel. For the most part, this doesn't matter all that much, as not too many systems actually utilize the jiffies or jiffies driven clocksource. Usually there are other highres clocksources who's granularity error is negligable. Even so, we have some complicated calcualtions that we do everywhere to handle these edge cases. This patch removes the compile time SHIFTED_HZ value, and introduces a register_refined_jiffies() function. This results in the default jiffies clock as being assumed a perfect HZ freq, and allows archtectures that care about jiffies accuracy to call register_refined_jiffies() with the tick rate, specified dynamically at boot. This allows us, where necessary, to not have a compile time CLOCK_TICK_RATE constant, simplifies the jiffies code, and still provides a way to have an accurate jiffies clock. NOTE: Since this patch does not add register_refinied_jiffies() calls for every arch, it may cause time quality regressions in some cases. Its likely these will not be noticable, but if they are an issue, adding the following to the end of setup_arch() should resolve the regression: register_refinied_jiffies(CLOCK_TICK_RATE) Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 John Stultz 提交于
Now that alarmtimer_remove has been simplified, change its name to _dequeue to better match its paired _enqueue function. Cc: Arve Hjønnevåg <arve@android.com> Cc: Colin Cross <ccross@android.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 John Stultz 提交于
Arve Hjønnevåg reported numerous crashes from the "BUG_ON(timer->state != HRTIMER_STATE_CALLBACK)" check in __run_hrtimer after it called alarmtimer_fired. It ends up the alarmtimer code was not properly handling possible failures of hrtimer_try_to_cancel, and because these faulres occur when the underlying base hrtimer is being run, this limits the ability to properly handle modifications to any alarmtimers on that base. Because much of the logic duplicates the hrtimer logic, it seems that we might as well have a per-alarmtimer hrtimer, and avoid the extra complextity of trying to multiplex many alarmtimers off of one hrtimer. Thus this patch moves the hrtimer to the alarm structure and simplifies the management logic. Changelog: v2: * Includes a fix for double alarm_start calls found by Arve Cc: Arve Hjønnevåg <arve@android.com> Cc: Colin Cross <ccross@android.com> Cc: Thomas Gleixner <tglx@linutronix.de> Reported-by: NArve Hjønnevåg <arve@android.com> Tested-by: NArve Hjønnevåg <arve@android.com> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Todd Poynor 提交于
alarmtimer suspend return -EBUSY if the next alarm will fire in less than 2 seconds. This allows one RTC seconds tick to occur subsequent to this check before the alarm wakeup time is set, ensuring the wakeup time is still in the future (assuming the RTC does not tick one more second prior to setting the alarm). If suspend is rejected due to an imminent alarm, hold a wakeup source for 2 seconds to process the alarm prior to reattempting suspend. If setting the alarm incurs an -ETIME for an alarm set in the past, or any other problem setting the alarm, abort suspend and hold a wakelock for 1 second while the alarm is allowed to be serviced or other hopefully transient conditions preventing the alarm clear up. Signed-off-by: NTodd Poynor <toddpoynor@google.com> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
- 13 9月, 2012 1 次提交
-
-
由 John Stultz 提交于
Daniel Lezcano reported seeing multi-second stalls from keyboard input on his T61 laptop when NOHZ and CPU_IDLE were enabled on a 32bit kernel. He bisected the problem down to commit 1e75fa8b ("time: Condense timekeeper.xtime into xtime_sec"). After reproducing this issue, I narrowed the problem down to the fact that timekeeping_get_ns() returns a 64bit nsec value that hasn't been accumulated. In some cases this value was being then stored in timespec.tv_nsec (which is a long). On 32bit systems, with idle times larger then 4 seconds (or less, depending on the value of xtime_nsec), the returned nsec value would overflow 32bits. This limited kept time from increasing, causing timers to not expire. The fix is to make sure we don't directly store the result of timekeeping_get_ns() into a tv_nsec field, instead using a 64bit nsec value which can then be added into the timespec via timespec_add_ns(). Reported-and-bisected-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Tested-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Acked-by: NPrarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Link: http://lkml.kernel.org/r/1347405963-35715-1-git-send-email-john.stultz@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 02 9月, 2012 1 次提交
-
-
由 John Stultz 提交于
Andreas Bombe reported that the added ktime_t overflow checking added to timespec_valid in commit 4e8b1452 ("time: Improve sanity checking of timekeeping inputs") was causing problems with X.org because it caused timeouts larger then KTIME_T to be invalid. Previously, these large timeouts would be clamped to KTIME_MAX and would never expire, which is valid. This patch splits the ktime_t overflow checking into a new timespec_valid_strict function, and converts the timekeeping codes internal checking to use this more strict function. Reported-and-tested-by: NAndreas Bombe <aeb@debian.org> Cc: Zhouping Liu <zliu@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 8月, 2012 4 次提交
-
-
由 John Stultz 提交于
If update_wall_time() is called and the current offset isn't large enough to accumulate, avoid re-calling timekeeping_adjust which may change the clock freq and can cause 1ns inconsistencies with CLOCK_REALTIME_COARSE/CLOCK_MONOTONIC_COARSE. Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1345595449-34965-5-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 John Stultz 提交于
Andreas Schwab noticed that the 1 << tk->shift could overflow if the shift value was greater than 30, since 1 would be a 32bit long on 32bit architectures. This issue was introduced by 1e75fa8b (time: Condense timekeeper.xtime into xtime_sec) Use 1ULL instead to ensure we don't overflow on the shift. Reported-by: NAndreas Schwab <schwab@linux-m68k.org> Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/1345595449-34965-4-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Andreas Schwab 提交于
arch_gettimeoffset returns a u32 value which when shifted by tk->shift can overflow. This issue was introduced with 1e75fa8b (time: Condense timekeeper.xtime into xtime_sec) Cast it to u64 first. Signed-off-by: NAndreas Schwab <schwab@linux-m68k.org> Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/1345595449-34965-3-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 John Stultz 提交于
Andreas noticed problems with resume on specific hardware after commit 1e75fa8b (time: Condense timekeeper.xtime into xtime_sec) combined with commit b44d50dc (time: Fix casting issue in tk_set_xtime and tk_xtime_add) After some digging I realized we aren't normalizing the timekeeper after the add. Add the missing normalize call. Reported-by: NAndreas Schwab <schwab@linux-m68k.org> Tested-by: NAndreas Schwab <schwab@linux-m68k.org> Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/1345595449-34965-2-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 15 8月, 2012 1 次提交
-
-
由 John Stultz 提交于
Unexpected behavior could occur if the time is set to a value large enough to overflow a 64bit ktime_t (which is something larger then the year 2262). Also unexpected behavior could occur if large negative offsets are injected via adjtimex. So this patch improves the sanity check timekeeping inputs by improving the timespec_valid() check, and then makes better use of timespec_valid() to make sure we don't set the time to an invalid negative value or one that overflows ktime_t. Note: This does not protect from setting the time close to overflowing ktime_t and then letting natural accumulation cause the overflow. Reported-by: NCAI Qian <caiqian@redhat.com> Reported-by: NSasha Levin <levinsasha928@gmail.com> Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Zhouping Liu <zliu@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1344454580-17031-1-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 05 8月, 2012 1 次提交
-
-
由 Ingo Molnar 提交于
Tetsuo Handa reported that sporadically the system clock starts counting up too quickly which is enough to confuse the hangcheck timer to print a bogus stall warning. Commit 2a8c0883 "time: Move xtime_nsec adjustment underflow handling timekeeping_adjust" overlooked this exit path: } else return; which should really be a proper exit sequence, fixing the bug as a side effect. Also make the flow more readable by properly balancing curly braces. Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote: Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote: Signed-off-by: NIngo Molnar <mingo@kernel.org> Cc: john.stultz@linaro.org Cc: a.p.zijlstra@chello.nl Cc: richardcochran@gmail.com Cc: prarit@redhat.com Link: http://lkml.kernel.org/r/20120804192114.GA28347@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 31 7月, 2012 4 次提交
-
-
由 John Stultz 提交于
Ingo noted that the numerous timekeeper.value references made the timekeeping code ugly and caused many long lines that had to be broken up. He recommended replacing timekeeper.value references with tk->value. This patch provides a local tk value for all top level time functions and sets it to &timekeeper. Then all timekeeper access is done via a tk pointer. Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Cc: Prarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1343414893-45779-6-git-send-email-john.stultz@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 John Stultz 提交于
For performance reasons, we maintain ktime_t based duplicates of wall_to_monotonic (offs_real) and total_sleep_time (offs_boot). Since large problems could occur (such as the resume regression on 3.5-rc7, or the leapsecond hrtimer issue) if these value pairs were to be inconsistently updated, this patch this cleans up how we modify these value pairs to ensure we are always consistent. As a side-effect this is also more efficient as we only caulculate the duplicate values when they are changed, rather then every update_wall_time call. This also provides WARN_ONs to detect if future changes break the invariants. Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1343414893-45779-5-git-send-email-john.stultz@linaro.org [ Cleaned up minor style issues. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 John Stultz 提交于
Ingo noted inconsistent newline usage between functions. This patch cleans those up. Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Cc: Prarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1343414893-45779-4-git-send-email-john.stultz@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 John Stultz 提交于
Ingo noted that ACTHZ is a confusing name, and requested it be renamed, so this patch renames ACTHZ to SHIFTED_HZ to better describe it. Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Cc: Prarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1343414893-45779-3-git-send-email-john.stultz@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 24 7月, 2012 1 次提交
-
-
由 John Stultz 提交于
commit 1e75fa8b (time: Condense timekeeper.xtime into xtime_sec) introduced helper functions which apply a timespec to the core internal timekeeper data. The internal storage type is u64. The timespec tv_nsec value must be shifted before set or added to the internal value. tv_nsec is a long, which is 32bit on a 32bit system, so without casting tv_nsec to u64 we lose the bits which are shifted over the 32bit boundary. Add the proper typecasts. Reported-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Acked-by: NPrarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1343074957-16541-1-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 17 7月, 2012 1 次提交
-
-
由 Thomas Gleixner 提交于
The leap second rework unearthed another issue of inconsistent data. On timekeeping_resume() the timekeeper data is updated, but nothing calls timekeeping_update(), so now the update code in the timer interrupt sees stale values. This has been the case before those changes, but then the timer interrupt was using stale data as well so this went unnoticed for quite some time. Add the missing update call, so all the data is consistent everywhere. Reported-by: NAndreas Schwab <schwab@linux-m68k.org> Reported-and-tested-by: N"Rafael J. Wysocki" <rjw@sisk.pl> Reported-and-tested-by: NMartin Steigerwald <Martin@lichtvoll.de> Cc: LKML <linux-kernel@vger.kernel.org> Cc: Linux PM list <linux-pm@vger.kernel.org> Cc: John Stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>, Cc: Prarit Bhargava <prarit@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <johnstul@us.ibm.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 7月, 2012 8 次提交
-
-
由 John Stultz 提交于
As part of cleaning up the timekeeping code, this patch converts a number of internal functions to takei a timekeeper ptr as an argument, so that the internal functions don't access the global timekeeper structure directly. This allows for further optimizations to reduce lock hold time later. This patch has been updated to include more consistent usage of the timekeeper value, by making sure it is always passed as a argument to non top-level functions. Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Reviewed-by: NIngo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1342156917-25092-9-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 John Stultz 提交于
When we make adjustments speeding up the clock, its possible for xtime_nsec to underflow. We already handle this properly, but we do so from update_wall_time() instead of the more logical timekeeping_adjust(), where the possible underflow actually occurs. Thus, move the correction logic to the timekeeping_adjust, which is the function that causes the issue. Making update_wall_time() more readable. Signed-off-by: NJohn Stultz <johnstul@us.ibm.com> Reviewed-by: NIngo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1342156917-25092-8-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 John Stultz 提交于
Since we call arch_gettimeoffset() in all the accessor functions, move arch_gettimeoffset() calls into timekeeping_get_ns() and timekeeping_get_ns_raw() to simplify the code. This also makes the code easier to maintain as we don't have to worry about forgetting the arch_gettimeoffset() as has happened in the past. Signed-off-by: NJohn Stultz <johnstul@us.ibm.com> Reviewed-by: NIngo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1342156917-25092-7-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 John Stultz 提交于
We do the exact same logic moving nsecs to secs in the timekeeper in multiple places, so condense this into a single function. Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Reviewed-by: NIngo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1342156917-25092-6-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 John Stultz 提交于
The timekeeper struct has a xtime_nsec, which keeps the sub-nanosecond remainder. This ends up being somewhat duplicative of the timekeeper.xtime.tv_nsec value, and we have to do extra work to keep them apart, copying the full nsec portion out and back in over and over. This patch simplifies some of the logic by taking the timekeeper xtime value and splitting it into timekeeper.xtime_sec and reuses the timekeeper.xtime_nsec for the sub-second portion (stored in higher res shifted nanoseconds). This simplifies some of the accumulation logic. And will allow for more accurate timekeeping once the vsyscall code is updated to use the shifted nanosecond remainder. Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Reviewed-by: NIngo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1342156917-25092-5-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 John Stultz 提交于
Ingo noted that using a u32 instead of int for shift values would be better to make sure the compiler doesn't unnecessarily use complex signed arithmetic. Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Reviewed-by: NIngo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1342156917-25092-4-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 John Stultz 提交于
Ingo noted a number of places where there is inconsistent use of whitespace. This patch tries to address the main culprits. Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Reviewed-by: NIngo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Link: http://lkml.kernel.org/r/1342156917-25092-3-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 John Stultz 提交于
In commit 6b43ae8a, I introduced a bug that kept the STA_INS or STA_DEL bit from being cleared from time_status via adjtimex() without forcing STA_PLL first. Usually once the STA_INS is set, it isn't cleared until the leap second is applied, so its unlikely this affected anyone. However during testing I noticed it took some effort to cancel a leap second once STA_INS was set. Signed-off-by: NJohn Stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> CC: stable@vger.kernel.org # 3.4 Link: http://lkml.kernel.org/r/1342156917-25092-2-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 12 7月, 2012 3 次提交
-
-
由 Thomas Gleixner 提交于
To finally fix the infamous leap second issue and other race windows caused by functions which change the offsets between the various time bases (CLOCK_MONOTONIC, CLOCK_REALTIME and CLOCK_BOOTTIME) we need a function which atomically gets the current monotonic time and updates the offsets of CLOCK_REALTIME and CLOCK_BOOTTIME with minimalistic overhead. The previous patch which provides ktime_t offsets allows us to make this function almost as cheap as ktime_get() which is going to be replaced in hrtimer_interrupt(). Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NIngo Molnar <mingo@kernel.org> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NPrarit Bhargava <prarit@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: NJohn Stultz <johnstul@us.ibm.com> Link: http://lkml.kernel.org/r/1341960205-56738-7-git-send-email-johnstul@us.ibm.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
We need to update the hrtimer clock offsets from the hrtimer interrupt context. To avoid conversions from timespec to ktime_t maintain a ktime_t based representation of those offsets in the timekeeper. This puts the conversion overhead into the code which updates the underlying offsets and provides fast accessible values in the hrtimer interrupt. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NJohn Stultz <johnstul@us.ibm.com> Reviewed-by: NIngo Molnar <mingo@kernel.org> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NPrarit Bhargava <prarit@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1341960205-56738-4-git-send-email-johnstul@us.ibm.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 John Stultz 提交于
The timekeeping code misses an update of the hrtimer subsystem after a leap second happened. Due to that timers based on CLOCK_REALTIME are either expiring a second early or late depending on whether a leap second has been inserted or deleted until an operation is initiated which causes that update. Unless the update happens by some other means this discrepancy between the timekeeping and the hrtimer data stays forever and timers are expired either early or late. The reported immediate workaround - $ data -s "`date`" - is causing a call to clock_was_set() which updates the hrtimer data structures. See: http://www.sheeri.com/content/mysql-and-leap-second-high-cpu-and-fix Add the missing clock_was_set() call to update_wall_time() in case of a leap second event. The actual update is deferred to softirq context as the necessary smp function call cannot be invoked from hard interrupt context. Signed-off-by: NJohn Stultz <johnstul@us.ibm.com> Reported-by: NJan Engelhardt <jengelh@inai.de> Reviewed-by: NIngo Molnar <mingo@kernel.org> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NPrarit Bhargava <prarit@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1341960205-56738-3-git-send-email-johnstul@us.ibm.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 06 7月, 2012 1 次提交
-
-
由 Peter Zijlstra 提交于
Thanks to Charles Wang for spotting the defects in the current code: - If we go idle during the sample window -- after sampling, we get a negative bias because we can negate our own sample. - If we wake up during the sample window we get a positive bias because we push the sample to a known active period. So rewrite the entire nohz load-avg muck once again, now adding copious documentation to the code. Reported-and-tested-by: NDoug Smythies <dsmythies@telus.net> Reported-and-tested-by: NCharles Wang <muming.wq@gmail.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/1340373782.18025.74.camel@twins [ minor edits ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 03 7月, 2012 1 次提交
-
-
由 Paul E. McKenney 提交于
If the nohz= boot parameter disables nohz, then RCU_FAST_NO_HZ needs to also disable itself. This commit therefore checks for tick_nohz_enabled being zero, disabling rcu_prepare_for_idle() if so. This commit assumes that tick_nohz_enabled can change at runtime: If this is not the case, then a simpler approach suffices. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 12 6月, 2012 5 次提交
-
-
由 Frederic Weisbecker 提交于
The next idle expiry time record and idle sleeps tracking are statistics that only concern idle. Since we want the nohz APIs to become usable further idle context, let's pull up the handling of these statistics to the callers in idle. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de>
-
由 Frederic Weisbecker 提交于
Since we want to prepare for making the nohz API to work further the idle case, we need to pull ts->idle_calls incrementation up to the callers in idle. To perform this, we split tick_nohz_stop_sched_tick() in two parts: a first one that checks if we can really stop the tick for idle, and another that actually stops it. Then from the callers in idle, we check if we can stop the tick and only then we increment idle_calls and finally relay to the nohz API that won't care about these details anymore. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de>
-
由 Frederic Weisbecker 提交于
Now that idle and nohz logics are going to be independant each others, ts->idle_tick becomes too much a biased name to describe the field that saves the last scheduled tick on top of which we re-calculate the next tick to schedule when the timer is restarted. We want to reuse this even to stop the tick outside idle cases. So let's rename it to some more generic name: ts->last_tick. This changes a bit the timer list stat export so we need to increase its version. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de>
-
由 Frederic Weisbecker 提交于
When the timer tick fires, it accounts the new jiffy as either part of system, user or idle time. This is how we record the cputime statistics. But when the tick is stopped from the idle task, we still need to record the number of jiffies spent tickless until we restart the tick and fall back to traditional tick-based cputime accounting. To do this, we take a snapshot of jiffies when the tick is stopped and compute the difference against the new value of jiffies when the tick is restarted. Then we account this whole difference to the idle cputime. However we are preparing to be able to stop the tick from other places than idle. So this idle time accounting needs to be performed from the callers of nohz APIs, not from the nohz APIs themselves because we now want them to be agnostic against places that stop/restart tick. Therefore, we pull the tickless idle time accounting out of generic nohz helpers up to idle entry/exit callers. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de>
-
由 Frederic Weisbecker 提交于
As we plan to be able to stop the tick outside the idle task, we need to prepare for separating nohz logic from idle. As a start, this pulls the idle sleeping time accounting out of the tick stop/restart API to the callers on idle entry/exit. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de>
-
- 07 6月, 2012 1 次提交
-
-
由 Paul E. McKenney 提交于
When a CPU is entering dyntick-idle mode, tick_nohz_stop_sched_tick() calls rcu_needs_cpu() see if RCU needs that CPU, and, if not, computes the next wakeup time based on the timer wheels. Only later, when actually entering the idle loop, rcu_prepare_for_idle() will be invoked. In some cases, rcu_prepare_for_idle() will post timers to wake the CPU back up. But all for naught: The next wakeup time for the CPU has already been computed, and posting a timer afterwards does not force that wakeup time to be recomputed. This means that rcu_prepare_for_idle()'s have no effect. This is not a problem on a busy system because something else will wake up the CPU soon enough. However, on lightly loaded systems, the CPU might stay asleep for a considerable length of time. If that CPU has a callback that the rest of the system is waiting on, the system might run very slowly or (in theory) even hang. This commit avoids this problem by having rcu_needs_cpu() give tick_nohz_stop_sched_tick() an estimate of when RCU will need the CPU to wake back up, which tick_nohz_stop_sched_tick() takes into account when programming the CPU's wakeup time. An alternative approach is for rcu_prepare_for_idle() to use hrtimers instead of normal timers, but timers are much more efficient than are hrtimers for frequently and repeatedly posting and cancelling a given timer, which is exactly what RCU_FAST_NO_HZ does. Reported-by: NPascal Chapperon <pascal.chapperon@wanadoo.fr> Reported-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Tested-by: NPascal Chapperon <pascal.chapperon@wanadoo.fr>
-
- 05 6月, 2012 1 次提交
-
-
由 John Stultz 提交于
Commit 6b43ae8a (ntp: Fix leap-second hrtimer livelock) broke the leapsecond update of CLOCK_MONOTONIC. The missing leapsecond update to wall_to_monotonic causes discontinuities in CLOCK_MONOTONIC. Adjust wall_to_monotonic when NTP inserted a leapsecond. Reported-by: NRichard Cochran <richardcochran@gmail.com> Signed-off-by: NJohn Stultz <john.stultz@linaro.org> Tested-by: NRichard Cochran <richardcochran@gmail.com> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/1338400497-12420-1-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 30 5月, 2012 1 次提交
-
-
由 Peter Zijlstra 提交于
Follow up on commit 556061b0 ("sched/nohz: Fix rq->cpu_load[] calculations") since while that fixed the busy case it regressed the mostly idle case. Add a callback from the nohz exit to also age the rq->cpu_load[] array. This closes the hole where either there was no nohz load balance pass during the nohz, or there was a 'significant' amount of idle time between the last nohz balance and the nohz exit. So we'll update unconditionally from the tick to not insert any accidental 0 load periods while busy, and we try and catch up from nohz idle balance and nohz exit. Both these are still prone to missing a jiffy, but that has always been the case. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: pjt@google.com Cc: Venkatesh Pallipadi <venki@google.com> Link: http://lkml.kernel.org/n/tip-kt0trz0apodbf84ucjfdbr1a@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 25 5月, 2012 1 次提交
-
-
由 Thomas Gleixner 提交于
commit 5307c955 (tick: Add tick skew boot option) broke the !CONFIG_HIGH_RES_TIMERS build. Move the boot option parsing into the CONFIG_HIGH_RES_TIMERS section. Reported-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Mike Galbraith <mgalbraith@suse.de>
-