提交 · f2a5a0854efc62abe7f69e9947842cb135837f9a · openanolis / cloud-kernel

15 7月, 2012 6 次提交

time: Move arch_gettimeoffset() usage into timekeeping_get_ns() · f2a5a085

由 John Stultz 提交于 7月 13, 2012

Since we call arch_gettimeoffset() in all the accessor
functions, move arch_gettimeoffset() calls into
timekeeping_get_ns() and timekeeping_get_ns_raw() to simplify
the code.

This also makes the code easier to maintain as we don't have to
worry about forgetting the arch_gettimeoffset() as has happened
in the past.
Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
Reviewed-by: NIngo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1342156917-25092-7-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

f2a5a085

time: Refactor accumulation of nsecs to secs · 1f4f9487

由 John Stultz 提交于 7月 13, 2012

We do the exact same logic moving nsecs to secs in the
timekeeper in multiple places, so condense this into a
single function.
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
Reviewed-by: NIngo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1342156917-25092-6-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

1f4f9487

time: Condense timekeeper.xtime into xtime_sec · 1e75fa8b

由 John Stultz 提交于 7月 13, 2012

The timekeeper struct has a xtime_nsec, which keeps the
sub-nanosecond remainder.  This ends up being somewhat
duplicative of the timekeeper.xtime.tv_nsec value, and we
have to do extra work to keep them apart, copying the full
nsec portion out and back in over and over.

This patch simplifies some of the logic by taking the timekeeper
xtime value and splitting it into timekeeper.xtime_sec and
reuses the timekeeper.xtime_nsec for the sub-second portion
(stored in higher res shifted nanoseconds).

This simplifies some of the accumulation logic. And will
allow for more accurate timekeeping once the vsyscall code
is updated to use the shifted nanosecond remainder.
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
Reviewed-by: NIngo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1342156917-25092-5-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

1e75fa8b

time: Explicitly use u32 instead of int for shift values · fee84c43

由 John Stultz 提交于 7月 13, 2012

Ingo noted that using a u32 instead of int for shift values
would be better to make sure the compiler doesn't unnecessarily
use complex signed arithmetic.
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
Reviewed-by: NIngo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1342156917-25092-4-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

fee84c43

time: Whitespace cleanups per Ingo%27s requests · 42e71e81

由 John Stultz 提交于 7月 13, 2012

Ingo noted a number of places where there is inconsistent
use of whitespace. This patch tries to address the main
culprits.
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
Reviewed-by: NIngo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1342156917-25092-3-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

42e71e81

ntp: Fix STA_INS/DEL clearing bug · 6b1859db

由 John Stultz 提交于 7月 13, 2012

In commit 6b43ae8a, I
introduced a bug that kept the STA_INS or STA_DEL bit
from being cleared from time_status via adjtimex()
without forcing STA_PLL first.

Usually once the STA_INS is set, it isn't cleared
until the leap second is applied, so its unlikely this
affected anyone. However during testing I noticed it
took some effort to cancel a leap second once STA_INS
was set.
Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
CC: stable@vger.kernel.org # 3.4
Link: http://lkml.kernel.org/r/1342156917-25092-2-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

6b1859db

12 7月, 2012 3 次提交

timekeeping: Provide hrtimer update function · f6c06abf

由 Thomas Gleixner 提交于 7月 10, 2012

To finally fix the infamous leap second issue and other race windows
caused by functions which change the offsets between the various time
bases (CLOCK_MONOTONIC, CLOCK_REALTIME and CLOCK_BOOTTIME) we need a
function which atomically gets the current monotonic time and updates
the offsets of CLOCK_REALTIME and CLOCK_BOOTTIME with minimalistic
overhead. The previous patch which provides ktime_t offsets allows us
to make this function almost as cheap as ktime_get() which is going to
be replaced in hrtimer_interrupt().
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: NPrarit Bhargava <prarit@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/1341960205-56738-7-git-send-email-johnstul@us.ibm.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

f6c06abf

timekeeping: Maintain ktime_t based offsets for hrtimers · 5b9fe759

由 Thomas Gleixner 提交于 7月 10, 2012

We need to update the hrtimer clock offsets from the hrtimer interrupt
context. To avoid conversions from timespec to ktime_t maintain a
ktime_t based representation of those offsets in the timekeeper. This
puts the conversion overhead into the code which updates the
underlying offsets and provides fast accessible values in the hrtimer
interrupt.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
Reviewed-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: NPrarit Bhargava <prarit@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1341960205-56738-4-git-send-email-johnstul@us.ibm.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

5b9fe759

timekeeping: Fix leapsecond triggered load spike issue · 4873fa07

由 John Stultz 提交于 7月 10, 2012

The timekeeping code misses an update of the hrtimer subsystem after a
leap second happened. Due to that timers based on CLOCK_REALTIME are
either expiring a second early or late depending on whether a leap
second has been inserted or deleted until an operation is initiated
which causes that update. Unless the update happens by some other
means this discrepancy between the timekeeping and the hrtimer data
stays forever and timers are expired either early or late.

The reported immediate workaround - $ data -s "`date`" - is causing a
call to clock_was_set() which updates the hrtimer data structures.
See: http://www.sheeri.com/content/mysql-and-leap-second-high-cpu-and-fix

Add the missing clock_was_set() call to update_wall_time() in case of
a leap second event. The actual update is deferred to softirq context
as the necessary smp function call cannot be invoked from hard
interrupt context.
Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
Reported-by: NJan Engelhardt <jengelh@inai.de>
Reviewed-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: NPrarit Bhargava <prarit@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1341960205-56738-3-git-send-email-johnstul@us.ibm.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

4873fa07

06 7月, 2012 1 次提交

sched/nohz: Rewrite and fix load-avg computation -- again · 5167e8d5

由 Peter Zijlstra 提交于 6月 22, 2012

Thanks to Charles Wang for spotting the defects in the current code:

 - If we go idle during the sample window -- after sampling, we get a
   negative bias because we can negate our own sample.

 - If we wake up during the sample window we get a positive bias
   because we push the sample to a known active period.

So rewrite the entire nohz load-avg muck once again, now adding
copious documentation to the code.
Reported-and-tested-by: NDoug Smythies <dsmythies@telus.net>
Reported-and-tested-by: NCharles Wang <muming.wq@gmail.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1340373782.18025.74.camel@twins
[ minor edits ]
Signed-off-by: NIngo Molnar <mingo@kernel.org>

5167e8d5

12 6月, 2012 5 次提交

nohz: Move next idle expiry time record into idle logic area · 84bf1bcc

由 Frederic Weisbecker 提交于 8月 01, 2011

The next idle expiry time record and idle sleeps tracking are
statistics that only concern idle.

Since we want the nohz APIs to become usable further idle
context, let's pull up the handling of these statistics to the
callers in idle.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>

84bf1bcc

nohz: Move ts->idle_calls incrementation into strict idle logic · 5b39939a

由 Frederic Weisbecker 提交于 8月 01, 2011

Since we want to prepare for making the nohz API to work further
the idle case, we need to pull ts->idle_calls incrementation up to
the callers in idle.

To perform this, we split tick_nohz_stop_sched_tick() in two parts:
a first one that checks if we can really stop the tick for idle,
and another that actually stops it. Then from the callers in idle,
we check if we can stop the tick and only then we increment idle_calls
and finally relay to the nohz API that won't care about these details
anymore.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>

5b39939a

nohz: Rename ts->idle_tick to ts->last_tick · f5d411c9

由 Frederic Weisbecker 提交于 7月 31, 2011

Now that idle and nohz logics are going to be independant each others,
ts->idle_tick becomes too much a biased name to describe the field that
saves the last scheduled tick on top of which we re-calculate the next
tick to schedule when the timer is restarted.

We want to reuse this even to stop the tick outside idle cases. So let's
rename it to some more generic name: ts->last_tick.

This changes a bit the timer list stat export so we need to increase its
version.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>

f5d411c9

nohz: Make nohz API agnostic against idle ticks cputime accounting · 2ac0d98f

由 Frederic Weisbecker 提交于 7月 28, 2011

When the timer tick fires, it accounts the new jiffy as either part
of system, user or idle time. This is how we record the cputime
statistics.

But when the tick is stopped from the idle task, we still need
to record the number of jiffies spent tickless until we restart
the tick and fall back to traditional tick-based cputime accounting.

To do this, we take a snapshot of jiffies when the tick is stopped
and compute the difference against the new value of jiffies when
the tick is restarted. Then we account this whole difference to
the idle cputime.

However we are preparing to be able to stop the tick from other places
than idle. So this idle time accounting needs to be performed from
the callers of nohz APIs, not from the nohz APIs themselves because
we now want them to be agnostic against places that stop/restart tick.

Therefore, we pull the tickless idle time accounting out of generic
nohz helpers up to idle entry/exit callers.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>

2ac0d98f

nohz: Separate idle sleeping time accounting from nohz logic · 19f5f736

由 Frederic Weisbecker 提交于 7月 27, 2011

As we plan to be able to stop the tick outside the idle task, we
need to prepare for separating nohz logic from idle. As a start,
this pulls the idle sleeping time accounting out of the tick
stop/restart API to the callers on idle entry/exit.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>

19f5f736

07 6月, 2012 1 次提交

rcu: Precompute RCU_FAST_NO_HZ timer offsets · aa9b1630

由 Paul E. McKenney 提交于 5月 10, 2012

When a CPU is entering dyntick-idle mode, tick_nohz_stop_sched_tick()
calls rcu_needs_cpu() see if RCU needs that CPU, and, if not, computes the
next wakeup time based on the timer wheels. Only later, when actually
entering the idle loop, rcu_prepare_for_idle() will be invoked. In some
cases, rcu_prepare_for_idle() will post timers to wake the CPU back up.
But all for naught: The next wakeup time for the CPU has already been
computed, and posting a timer afterwards does not force that wakeup
time to be recomputed. This means that rcu_prepare_for_idle()'s have
no effect.

This is not a problem on a busy system because something else will wake
up the CPU soon enough. However, on lightly loaded systems, the CPU
might stay asleep for a considerable length of time. If that CPU has
a callback that the rest of the system is waiting on, the system might
run very slowly or (in theory) even hang.

This commit avoids this problem by having rcu_needs_cpu() give
tick_nohz_stop_sched_tick() an estimate of when RCU will need the CPU
to wake back up, which tick_nohz_stop_sched_tick() takes into account
when programming the CPU's wakeup time. An alternative approach is
for rcu_prepare_for_idle() to use hrtimers instead of normal timers,
but timers are much more efficient than are hrtimers for frequently
and repeatedly posting and cancelling a given timer, which is exactly
what RCU_FAST_NO_HZ does.
Reported-by: NPascal Chapperon <pascal.chapperon@wanadoo.fr>
Reported-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: NPascal Chapperon <pascal.chapperon@wanadoo.fr>

aa9b1630

05 6月, 2012 1 次提交

timekeeping: Fix CLOCK_MONOTONIC inconsistency during leapsecond · fad0c66c

由 John Stultz 提交于 5月 30, 2012

Commit 6b43ae8a (ntp: Fix leap-second hrtimer livelock) broke the
leapsecond update of CLOCK_MONOTONIC. The missing leapsecond update to
wall_to_monotonic causes discontinuities in CLOCK_MONOTONIC.

Adjust wall_to_monotonic when NTP inserted a leapsecond.
Reported-by: NRichard Cochran <richardcochran@gmail.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
Tested-by: NRichard Cochran <richardcochran@gmail.com>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1338400497-12420-1-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

fad0c66c

30 5月, 2012 1 次提交

sched/nohz: Fix rq->cpu_load calculations some more · 5aaa0b7a

由 Peter Zijlstra 提交于 5月 17, 2012

Follow up on commit 556061b0 ("sched/nohz: Fix rq->cpu_load[]
calculations") since while that fixed the busy case it regressed the
mostly idle case.

Add a callback from the nohz exit to also age the rq->cpu_load[]
array. This closes the hole where either there was no nohz load
balance pass during the nohz, or there was a 'significant' amount of
idle time between the last nohz balance and the nohz exit.

So we'll update unconditionally from the tick to not insert any
accidental 0 load periods while busy, and we try and catch up from
nohz idle balance and nohz exit. Both these are still prone to missing
a jiffy, but that has always been the case.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: pjt@google.com
Cc: Venkatesh Pallipadi <venki@google.com>
Link: http://lkml.kernel.org/n/tip-kt0trz0apodbf84ucjfdbr1a@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

5aaa0b7a

25 5月, 2012 3 次提交

tick: Move skew_tick option into the HIGH_RES_TIMER section · 62cf20b3

由 Thomas Gleixner 提交于 5月 25, 2012

commit 5307c955 (tick: Add tick skew boot option) broke the
!CONFIG_HIGH_RES_TIMERS build.

Move the boot option parsing into the CONFIG_HIGH_RES_TIMERS section.
Reported-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Mike Galbraith <mgalbraith@suse.de>

62cf20b3

clockevents: Make clockevents_config() a global symbol · e5400321

由 Magnus Damm 提交于 5月 09, 2012

Make clockevents_config() into a global symbol to allow it to be used
by compiled-in clockevent drivers. This is needed by drivers that want
to update the timer frequency after registration time.
Signed-off-by: NMagnus Damm <damm@opensource.se>
Tested-by: NSimon Horman <horms@verge.net.au>
Cc: arnd@arndb.de
Cc: johnstul@us.ibm.com
Cc: rjw@sisk.pl
Cc: lethal@linux-sh.org
Cc: gregkh@linuxfoundation.org
Cc: olof@lixom.net
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20120509143934.27521.46553.sendpatchset@w520Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

e5400321

tick: Add tick skew boot option · 5307c955

由 Mike Galbraith 提交于 5月 08, 2012

Let the user decide whether power consumption or jitter is the
more important consideration for their machines.

Quoting removal commit af5ab277:

"Historically, Linux has tried to make the regular timer tick on the
 various CPUs not happen at the same time, to avoid contention on
 xtime_lock.
    
 Nowadays, with the tickless kernel, this contention no longer happens
 since time keeping and updating are done differently. In addition,
 this skew is actually hurting power consumption in a measurable way on
 many-core systems."

Problems:

- Contrary to the above, systems do encounter contention on both
  xtime_lock and RCU structure locks when the tick is synchronized.
  
- Moderate sized RT systems suffer intolerable jitter due to the tick
  being synchronized.

- SGI reports the same for their large systems.

- Fully utilized systems reap no power saving benefit from skew removal,
  but do suffer from resulting induced lock contention.

- 0209f649 rcu: limit rcu_node leaf-level fanout
  This patch was born to combat lock contention which testing showed
  to have been _induced by_ skew removal.  Skew the tick, contention
  disappeared virtually completely.
Signed-off-by: NMike Galbraith <mgalbraith@suse.de>
Link: http://lkml.kernel.org/r/1336472458.21924.78.camel@marge.simpson.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

5307c955

22 5月, 2012 4 次提交

timekeeping: Fix a few minor newline issues. · d239f49d

由 Richard Cochran 提交于 4月 27, 2012

Fix a few minor newline issues.
Signed-off-by: NRichard Cochran <richardcochran@gmail.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

d239f49d

ntp: Fix a stale comment and a few stray newlines. · cd5398be

由 Richard Cochran 提交于 4月 27, 2012

Signed-off-by: NRichard Cochran <richardcochran@gmail.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

cd5398be

ntp: Correct TAI offset during leap second · dd48d708

由 Richard Cochran 提交于 4月 26, 2012

When repeating a UTC time value during a leap second (when the UTC
time should be 23:59:60), the TAI timescale should not stop. The kernel
NTP code increments the TAI offset one second too late. This patch fixes
the issue by incrementing the offset during the leap second itself.
Signed-off-by: NRichard Cochran <richardcochran@gmail.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

dd48d708

timers: Fixup the Kconfig consolidation fallout · 764e0da1

由 Thomas Gleixner 提交于 5月 21, 2012

Sigh, I missed to check which architecture Kconfig files actually
include the core Kconfig file. There are a few which did not. So we
broke them.

Instead of adding the includes to those, we are better off to move the
include to init/Kconfig like we did already with irqs and others.

This does not change anything for the architectures using the old
style periodic timer mode. It just solves the build wreckage there.

For those architectures which use the clock events infrastructure it
moves the include of the core Kconfig file to "General setup" which is
a way more logical place than having it at random locations specified
by the architecture specific Kconfigs.
Reported-by: NIngo Molnar <mingo@kernel.org>
Cc: Anna-Maria Gleixner <anna-maria@glx-um.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

764e0da1

21 5月, 2012 1 次提交

timers: Provide generic Kconfig switches · b5e498ad

由 Thomas Gleixner 提交于 5月 18, 2012

We really don't want all the arch code defining stuff
over and over.

[ anna-maria: Added missing GENERIC_CMOS_UPDATE switch ]
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NAnna-Maria Gleixner <anna-maria@glx-um.de>
Cc: Paul Mundt <lethal@linux-sh.org>
Link: http://lkml.kernel.org/r/1337529587.3208.2.camel@dionysosAcked-by: NSam Ravnborg <sam@ravnborg.org>

b5e498ad

21 4月, 2012 1 次提交

alarmtimer: Provide accessor to alarmtimer rtc device · 57c498fa

由 John Stultz 提交于 4月 20, 2012

The Android alarm interface provides a settime call that sets both
the alarmtimer RTC device and CLOCK_REALTIME to the same value.

Since there may be multiple rtc devices, provide a hook to access the
one the alarmtimer infrastructure is using.

CC: Colin Cross <ccross@android.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Android Kernel Team <kernel-team@android.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

57c498fa

20 4月, 2012 2 次提交

tick: Fix the spurious broadcast timer ticks after resume · a6371f80

由 Suresh Siddha 提交于 4月 18, 2012

During resume, tick_resume_broadcast() programs the broadcast timer in
oneshot mode unconditionally. On the platforms where broadcast timer
is not really required, this will generate spurious broadcast timer
ticks upon resume. For example, on the always running apic timer
platforms with HPET, I see spurious hpet tick once every ~5minutes
(which is the 32-bit hpet counter wraparound time).

Similar to boot time, during resume make the oneshot mode setting of
the broadcast clock event device conditional on the state of active
broadcast users.
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Tested-by: NSantosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: svenjoac@gmx.de
Cc: torvalds@linux-foundation.org
Cc: rjw@sisk.pl
Link: http://lkml.kernel.org/r/1334802459.28674.209.camel@sbsiddha-desk.sc.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

a6371f80

tick: Ensure that the broadcast device is initialized · b9a6a235

由 Thomas Gleixner 提交于 4月 18, 2012

Santosh found another trap when we avoid to initialize the broadcast
device in the switch_to_oneshot code. The broadcast device might be
still in SHUTDOWN state when we actually need to use it. That
obviously breaks, as set_next_event() is called on a shutdown
device. This did not break on x86, but Suresh analyzed it:

From the review, most likely on Sven's system we are force enabling
the hpet using the pci quirk's method very late. And in this case,
hpet_clockevent (which will be global_clock_event) handler can be
null, specifically as this platform might not be using deeper c-states
and using the reliable APIC timer.

Prior to commit 'fa4da365', that handler will be set to
'tick_handle_oneshot_broadcast' when we switch the broadcast timer to
oneshot mode, even though we don't use it. Post commit
'fa4da365', we stopped switching the broadcast mode to oneshot
as this is not really needed and his platform's global_clock_event's
handler will remain null. While on my SNB laptop, same is set to
'clockevents_handle_noop' because hpet gets enabled very early. (noop
handler on my platform set when the early enabled hpet timer gets
replaced by the lapic timer).

But the commit 'fa4da365' tracked the broadcast timer mode in
the SW as oneshot, even though it didn't touch the HW timer. During
resume however, tick_resume_broadcast() saw the SW broadcast mode as
oneshot and actually programmed the broadcast device also into oneshot
mode. So this triggered the null pointer de-reference after the hpet
wraps around and depending on what the hpet counter is set to. On the
normal platforms where hpet gets enabled early we should be seeing a
spurious interrupt (in my SNB laptop I see one spurious interrupt
after around 5 minutes ;) which is 32-bit hpet counter wraparound
time), but that's a separate issue.

Enforce the mode setting when trying to set an event.
Reported-and-tested-by: NSantosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Cc: torvalds@linux-foundation.org
Cc: svenjoac@gmx.de
Cc: rjw@sisk.pl
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1204181723350.2542@ionos

b9a6a235

18 4月, 2012 1 次提交

tick: Fix oneshot broadcast setup really · b435092f

由 Thomas Gleixner 提交于 4月 18, 2012

Sven Joachim reported, that suspend/resume on rc3 trips over a NULL
pointer dereference. Linus spotted the clockevent handler being NULL.

commit fa4da365(clockevents: tTack broadcast device mode change in
tick_broadcast_switch_to_oneshot()) tried to fix a problem with the
broadcast device setup, which was introduced in commit 77b0d60c(
clockevents: Leave the broadcast device in shutdown mode when not
needed).

The initial commit avoided to set up the broadcast device when no
broadcast request bits were set, but that left the broadcast device
disfunctional. In consequence deep idle states which need the
broadcast device were not woken up.

commit fa4da365 tried to fix that by initializing the state of the
broadcast facility, but that missed the fact, that nothing initializes
the event handler and some other state of the underlying clock event
device.

The fix is to revert both commits and make only the mode setting of
the clock event device conditional on the state of active broadcast
users. 

That initializes everything except the low level device mode, but this
happens when the broadcast functionality is invoked by deep idle.
Reported-and-tested-by: NSven Joachim <svenjoac@gmx.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1204181205540.2542@ionos

b435092f

10 4月, 2012 1 次提交

clockevents: tTack broadcast device mode change in tick_broadcast_switch_to_oneshot() · fa4da365

由 Suresh Siddha 提交于 4月 09, 2012

In the commit 77b0d60c,
"clockevents: Leave the broadcast device in shutdown mode when not needed",
we were bailing out too quickly in tick_broadcast_switch_to_oneshot(),
with out tracking the broadcast device mode change to 'TICKDEV_MODE_ONESHOT'.

This breaks the platforms which need broadcast device oneshot services during
deep idle states. tick_broadcast_oneshot_control() thinks that it is
in periodic mode and fails to take proper decisions based on the
CLOCK_EVT_NOTIFY_BROADCAST_[ENTER, EXIT] notifications during deep
idle entry/exit.

Fix this by tracking the broadcast device mode as 'TICKDEV_MODE_ONESHOT',
before leaving the broadcast HW device in shutdown mode if there are no active
requests for the moment.
Reported-and-tested-by: NSantosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Cc: johnstul@us.ibm.com
Link: http://lkml.kernel.org/r/1334011304.12400.81.camel@sbsiddha-desk.sc.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

fa4da365

06 4月, 2012 1 次提交

nohz: Fix stale jiffies update in tick_nohz_restart() · 6f103929

由 Neal Cardwell 提交于 3月 27, 2012

Fix tick_nohz_restart() to not use a stale ktime_t "now" value when
calling tick_do_update_jiffies64(now).

If we reach this point in the loop it means that we crossed a tick
boundary since we grabbed the "now" timestamp, so at this point "now"
refers to a time in the old jiffy, so using the old value for "now" is
incorrect, and is likely to give us a stale jiffies value.

In particular, the first time through the loop the
tick_do_update_jiffies64(now) call is always a no-op, since the
caller, tick_nohz_restart_sched_tick(), will have already called
tick_do_update_jiffies64(now) with that "now" value.

Note that tick_nohz_stop_sched_tick() already uses the correct
approach: when we notice we cross a jiffy boundary, grab a new
timestamp with ktime_get(), and *then* update jiffies.
Signed-off-by: NNeal Cardwell <ncardwell@google.com>
Cc: Ben Segall <bsegall@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1332875377-23014-1-git-send-email-ncardwell@google.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

6f103929

31 3月, 2012 1 次提交

tick: Document TICK_ONESHOT config option · 3872c48b

由 Thomas Gleixner 提交于 3月 31, 2012

This option has been selected from arch code as it was assumed that
it's necessary to support oneshot mode clockevent devices. But it's
just a core internal helper to compile tick-oneshot.c if NOHZ or
HIG_RES_TIMERS are selected.
Reported-by: NRussell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

3872c48b

24 3月, 2012 5 次提交

alarmtimer: Don't call rtc_timer_init() when CONFIG_RTC_CLASS=n · c5e14e76

由 Thomas Gleixner 提交于 3月 24, 2012

rtc_timer_init() is not available when CONFIG_RTC_CLASS=n. Provide a
proper wrapper in the RTC section of alarmtimer.c
Reported-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>

c5e14e76

kernel-time: fix s/then/than/ spelling errors · 88b28adf

由 Jim Cromie 提交于 3月 14, 2012

Use than for comparisons, like more than.

CC: John Stultz <john.stultz@linaro.org>
Signed-off-by: NJim Cromie <jim.cromie@gmail.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

88b28adf

time: remove no_sync_cmos_clock · 335dd858

由 Cesar Eduardo Barros 提交于 2月 11, 2012

Commit 9863c90f (x86, vmware: Remove
deprecated VMI kernel support) removed the only place which set
no_sync_cmos_clock. Since that commit, this variable is never set.
Signed-off-by: NCesar Eduardo Barros <cesarb@cesarb.net>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

335dd858

time: Avoid scary backtraces when warning of > 11% adj · e919cfd4

由 John Stultz 提交于 3月 22, 2012

Folks have been getting a number of warnings about time
adjustments > 11%. The WARN_ON leaves a big useless backtrace
so this patch removes it for a printk_once().

I'm still working to narrow down the cause of the > 11% adjustment.
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

e919cfd4

alarmtimer: Make sure we initialize the rtctimer · ad30dfa9

由 John Stultz 提交于 3月 23, 2012

jonghwan Choi reported seeing warnings with the alarmtimer
code at suspend/resume time, and pointed out that the
rtctimer isn't being properly initialized.

This patch corrects this issue.
Reported-by: Njonghwan Choi <jhbird.choi@gmail.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

ad30dfa9

23 3月, 2012 1 次提交

ntp: Fix leap-second hrtimer livelock · 6b43ae8a

由 John Stultz 提交于 3月 15, 2012

Since commit 7dffa3c6 the ntp
subsystem has used an hrtimer for triggering the leapsecond
adjustment. However, this can cause a potential livelock.

Thomas diagnosed this as the following pattern:
CPU 0                                                    CPU 1
do_adjtimex()
  spin_lock_irq(&ntp_lock);
    process_adjtimex_modes();				 timer_interrupt()
      process_adj_status();                                do_timer()
        ntp_start_leap_timer();                             write_lock(&xtime_lock);
          hrtimer_start();                                  update_wall_time();
             hrtimer_reprogram();                            ntp_tick_length()
               tick_program_event()                            spin_lock(&ntp_lock);
                 clockevents_program_event()
		   ktime_get()
                     seq = req_seqbegin(xtime_lock);

This patch tries to avoid the problem by reverting back to not using
an hrtimer to inject leapseconds, and instead we handle the leapsecond
processing in the second_overflow() function.

The downside to this change is that on systems that support highres
timers, the leap second processing will occur on a HZ tick boundary,
(ie: ~1-10ms, depending on HZ)  after the leap second instead of
possibly sooner (~34us in my tests w/ x86_64 lapic).

This patch applies on top of tip/timers/core.

CC: Sasha Levin <levinsasha928@gmail.com>
CC: Thomas Gleixner <tglx@linutronix.de>
Reported-by: NSasha Levin <levinsasha928@gmail.com>
Diagnoised-by: NThomas Gleixner <tglx@linutronix.de>
Tested-by: NSasha Levin <levinsasha928@gmail.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

6b43ae8a

16 3月, 2012 1 次提交

time: Fix change_clocksource locking · f695cf94

由 John Stultz 提交于 3月 14, 2012

change_clocksource() fails to grab locks or call timekeeping_update(),
which leaves a race window for time inconsistencies.

This adds proper locking and a call to timekeeping_update() to fix this.

CC: Andy Lutomirski <luto@amacapital.net>
CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

f695cf94

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功