提交 · f7ea0fd639c2c48d3c61b6eec75362be290c6874 · openeuler / raspberrypi-kernel

14 5月, 2013 1 次提交

tick: Don't invoke tick_nohz_stop_sched_tick() if the cpu is offline · f7ea0fd6

由 Thomas Gleixner 提交于 5月 13, 2013

commit 5b39939a (nohz: Move ts->idle_calls incrementation into strict
idle logic) moved code out of tick_nohz_stop_sched_tick() and missed
to bail out when the cpu is offline. That's causing subsequent
failures as an offline CPU is supposed to die and not to fiddle with
nohz magic.

Return false in can_stop_idle_tick() if the cpu is offline.
Reported-and-tested-by: NJiri Kosina <jkosina@suse.cz>
Reported-and-tested-by: NPrarit Bhargava <prarit@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1305132138160.2863@ionosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

f7ea0fd6

12 5月, 2013 1 次提交

tick: Cleanup NOHZ per cpu data on cpu down · 4b0c0f29

由 Thomas Gleixner 提交于 5月 03, 2013

Prarit reported a crash on CPU offline/online. The reason is that on
CPU down the NOHZ related per cpu data of the dead cpu is not cleaned
up. If at cpu online an interrupt happens before the per cpu tick
device is registered the irq_enter() check potentially sees stale data
and dereferences a NULL pointer.

Cleanup the data after the cpu is dead.
Reported-by: NPrarit Bhargava <prarit@redhat.com>
Cc: stable@vger.kernel.org
Cc: Mike Galbraith <bitbucket@online.de>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1305031451561.2886@ionosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

4b0c0f29

05 5月, 2013 1 次提交

tick: Use zalloc_cpumask_var for allocating offstack cpumasks · fbd44a60

由 Thomas Gleixner 提交于 5月 03, 2013

commit b352bc1c (tick: Convert broadcast cpu bitmaps to
cpumask_var_t) broke CONFIG_CPUMASK_OFFSTACK in a very subtle way.

Instead of allocating the cpumasks with zalloc_cpumask_var it uses
alloc_cpumask_var, so we can get random data there, which of course
confuses the logic completely and causes random failures.
Reported-and-tested-by: NDave Jones <davej@redhat.com>
Reported-and-tested-by: NYinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1305032015060.2990@ionosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

fbd44a60

25 4月, 2013 1 次提交

clockevents: Set dummy handler on CPU_DEAD shutdown · 6f7a05d7

由 Thomas Gleixner 提交于 4月 25, 2013

Vitaliy reported that a per cpu HPET timer interrupt crashes the
system during hibernation. What happens is that the per cpu HPET timer
gets shut down when the nonboot cpus are stopped. When the nonboot
cpus are onlined again the HPET code sets up the MSI interrupt which
fires before the clock event device is registered. The event handler
is still set to hrtimer_interrupt, which then crashes the machine due
to highres mode not being active.

See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=700333

There is no real good way to avoid that in the HPET code. The HPET
code alrady has a mechanism to detect spurious interrupts when event
handler == NULL for a similar reason.

We can handle that in the clockevent/tick layer and replace the
previous functional handler with a dummy handler like we do in
tick_setup_new_device().

The original clockevents code did this in clockevents_exchange_device(),
but that got removed by commit 7c1e7689 (clockevents: prevent
clockevent event_handler ending up handler_noop) which forgot to fix
it up in tick_shutdown(). Same issue with the broadcast device.
Reported-by: NVitaliy Fillipov <vitalif@yourcmc.ru>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: stable@vger.kernel.org
Cc: 700333@bugs.debian.org
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

6f7a05d7

23 4月, 2013 1 次提交

timekeeping: Update tk->cycle_last in resume · 77c675ba

由 Thomas Gleixner 提交于 4月 22, 2013

commit 7ec98e15 (timekeeping: Delay update of clock->cycle_last)
forgot to update tk->cycle_last in the resume path. This results in a
stale value versus clock->cycle_last and prevents resume in the worst
case.
Reported-by: NJiri Slaby <jslaby@suse.cz>
Reported-and-tested-by: NBorislav Petkov <bp@alien8.de>
Acked-by: NJohn Stultz <john.stultz@linaro.org>
Cc: Linux-pm mailing list <linux-pm@lists.linux-foundation.org>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304211648150.21884@ionosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

77c675ba

18 4月, 2013 3 次提交

clockevents: Switch into oneshot mode even if broadcast registered late · c038c1c4

由 Stephen Boyd 提交于 4月 17, 2013

tick_oneshot_notify() is used to notify a particular CPU to try
to switch into oneshot mode after a oneshot capable tick device
is registered and tick_clock_notify() is used to notify all CPUs
to try to switch into oneshot mode after a high res clocksource
is registered. There is one caveat; if the tick devices suffer
from FEAT_C3_STOP we don't try to switch into oneshot mode unless
we have a oneshot capable broadcast device already registered.

If the broadcast device is registered after the tick devices that
have FEAT_C3_STOP we'll never try to switch into oneshot mode
again, causing us to be stuck in periodic mode forever. Avoid
this scenario by calling tick_clock_notify() after we register
the broadcast device so that we try to switch into oneshot mode
on all CPUs one more time.

[ tglx: Adopted to timers/core and added a comment ]
Signed-off-by: NStephen Boyd <sboyd@codeaurora.org>
Link: http://lkml.kernel.org/r/1366219566-29783-1-git-send-email-sboyd@codeaurora.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

c038c1c4

timer_list: Convert timer list to be a proper seq_file · b3956a89

由 Nathan Zimmer 提交于 3月 26, 2013

When running with 4096 cores attemping to read /proc/timer_list will fail
with an ENOMEM condition.  On a sufficantly large systems the total amount
of data is more then 4mb, so it won't fit into a single buffer.  The
failure can also occur on smaller systems when memory fragmentation is
high as reported by Dave Jones.

Convert /proc/timer_list to a proper seq_file with its own iterator.  This
is a little more complex given that we have to make two passes with two
separate headers.

sysrq_timer_list_show also needed to be updated to reflect the fact that
now timer_list_show only does one cpu at at time.
Signed-off-by: NNathan Zimmer <nzimmer@sgi.com>
Reported-by: NDave Jones <davej@redhat.com>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Link: http://lkml.kernel.org/r/1364345790-14577-3-git-send-email-nzimmer@sgi.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

b3956a89

timer_list: Split timer_list_show_tickdevices · 60cf7ea8

由 Nathan Zimmer 提交于 3月 26, 2013

Split timer_list_show_tickdevices() into the header printout and pull
the rest up to timer_list_show. This is a preparatory patch for
converting timer_list to a proper seqfile with its own iterator
Signed-off-by: NNathan Zimmer <nzimmer@sgi.com>
Reported-by: NDave Jones <davej@redhat.com>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Link: http://lkml.kernel.org/r/1364345790-14577-2-git-send-email-nzimmer@sgi.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

60cf7ea8

11 4月, 2013 1 次提交

timekeeping: Make sure to notify hrtimers when TAI offset changes · 4e8f8b34

由 John Stultz 提交于 4月 10, 2013

Now that we have CLOCK_TAI timers, make sure we notify hrtimer
code when TAI offset is changed.
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/1365622909-953-1-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

4e8f8b34

05 4月, 2013 12 次提交

timekeeping: Shorten seq_count region · ca4523cd

由 Thomas Gleixner 提交于 2月 21, 2013

Shorten the seqcount write hold region to the actual update of the
timekeeper and the related data (e.g vsyscall).

On a contemporary x86 system this reduces the maximum latencies on
Preempt-RT from 8us to 4us on the non-timekeeping cores.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

ca4523cd

timekeeping: Implement a shadow timekeeper · 48cdc135

由 Thomas Gleixner 提交于 2月 21, 2013

Use the shadow timekeeper to do the update_wall_time() adjustments and
then copy it over to the real timekeeper.

Keep the shadow timekeeper in sync when updating stuff outside of
update_wall_time().

This allows us to limit the timekeeper_seq hold time to the update of
the real timekeeper and the vsyscall data in the next patch.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

48cdc135

timekeeping: Delay update of clock->cycle_last · 7ec98e15

由 Thomas Gleixner 提交于 2月 21, 2013

For calculating the new timekeeper values store the new cycle_last
value in the timekeeper and update the clock->cycle_last just when we
actually update the new values.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

7ec98e15

timekeeping: Store cycle_last value in timekeeper struct as well · 14a3b6ab

由 Thomas Gleixner 提交于 2月 21, 2013

For implementing a shadow timekeeper and a split calculation/update
region we need to store the cycle_last value in the timekeeper and
update the value in the clocksource struct only in the update region.

Add the extra storage to the timekeeper.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

14a3b6ab

ntp: Remove ntp_lock, using the timekeeping locks to protect ntp state · a076b214

由 John Stultz 提交于 3月 22, 2013

In order to properly handle the NTP state in future changes to the
timekeeping lock management, this patch moves the management of
all of the ntp state under the timekeeping locks.

This allows us to remove the ntp_lock.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

a076b214

timekeeping: Simplify tai updating from do_adjtimex · 0b5154fb

由 John Stultz 提交于 3月 22, 2013

Since we are taking the timekeeping locks, just go ahead
and update any tai change directly, rather then dropping
the lock and calling a function that will just take it again.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

0b5154fb

timekeeping: Hold timekeepering locks in do_adjtimex and hardpps · 06c017fd

由 John Stultz 提交于 3月 22, 2013

In moving the NTP state to be protected by the timekeeping locks,
be sure to acquire the timekeeping locks prior to calling
ntp functions.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

06c017fd

timekeeping: Move ADJ_SETOFFSET to top level do_adjtimex() · cef90377

由 John Stultz 提交于 3月 22, 2013

Since ADJ_SETOFFSET adjusts the timekeeping state, process
it as part of the top level do_adjtimex() function in
timekeeping.c.

This avoids deadlocks that could occur once we change the
ntp locking rules.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

cef90377

ntp: Rework do_adjtimex to take timespec and tai arguments · 87ace39b

由 John Stultz 提交于 3月 22, 2013

In order to change the locking rules, we need to provide
the timespec and tai values rather then having the ntp
logic acquire these values itself.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

87ace39b

ntp: Move timex validation to timekeeping do_adjtimex call. · e4085693

由 John Stultz 提交于 3月 22, 2013

Move logic that does not need the ntp state to be done
in the timekeeping do_adjtimex() call.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

e4085693

ntp: Move do_adjtimex() and hardpps() functions to timekeeping.c · aa6f9c59

由 John Stultz 提交于 3月 22, 2013

In preparation for changing the ntp locking rules, move
do_adjtimex and hardpps accessor functions to timekeeping.c,
but keep the code logic in ntp.c.

This patch also introduces a ntp_internal.h file so timekeeping
specific interfaces of ntp.c can be more limitedly shared with
timekeeping.c.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

aa6f9c59

ntp: Split out timex validation from do_adjtimex · ad460967

由 John Stultz 提交于 3月 22, 2013

Split out the timex validation done in do_adjtimex into a separate
function. This will help simplify logic in following patches.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

ad460967

26 3月, 2013 1 次提交

timekeeping: __timekeeping_set_tai_offset can be static · dd5d70e8

由 Fengguang Wu 提交于 3月 25, 2013

Yet again, the kbuild test robot saves the day, noting
I left out defining __timekeeping_set_tai_offset as
static. It even sent me this patch.
Reported-by: NFengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

dd5d70e8

25 3月, 2013 1 次提交

tick: Change log level of NOHZ: local_softirq_pending message · cfea7d7e

由 Rado Vrbovsky 提交于 2月 08, 2013

The "NOHZ: local_softirq_pending" message is a largely informational
message.  This makes extra work for customers that have a policy of
investigating all kernel log messages logged at <= KERN_ERR log level.
This patch sets the message to a different log level.

[ tglx: Use pr_warn() ]
Signed-off-by: NRado Vrbovsky <rvrbovsk@redhat.com>
Cc: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/r/2037057938.893524.1360345050772.JavaMail.root@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

cfea7d7e

23 3月, 2013 7 次提交

timekeeping: Split timekeeper_lock into lock and seqcount · 9a7a71b1

由 Thomas Gleixner 提交于 2月 21, 2013

We want to shorten the seqcount write hold time. So split the seqlock
into a lock and a seqcount.

Open code the seqwrite_lock in the places which matter and drop the
sequence counter update where it's pointless.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
[jstultz: Merge fixups from CLOCK_TAI collisions]
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

9a7a71b1

timekeeping: Move lock out of timekeeper struct · 7e40672d

由 Thomas Gleixner 提交于 2月 21, 2013

Make the lock a separate entity. Preparatory patch for shadow
timekeeper structure.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
[Merged with CLOCK_TAI changes]
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

7e40672d

timekeeping: Make jiffies_lock internal · eb93e4d9

由 Thomas Gleixner 提交于 2月 21, 2013

Nothing outside of the timekeeping core needs that lock.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

eb93e4d9

timekeeping: Calc stuff once · 23a9537a

由 Thomas Gleixner 提交于 2月 21, 2013

Calculate the cycle interval shifted value once. No functional change,
just makes the code more readable.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

23a9537a

hrtimer: Add hrtimer support for CLOCK_TAI · 90adda98

由 John Stultz 提交于 1月 21, 2013

Add hrtimer support for CLOCK_TAI, as well as posix timer interfaces.
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

90adda98

timekeeping: Add CLOCK_TAI clockid · 1ff3c967

由 John Stultz 提交于 5月 03, 2012

This add a CLOCK_TAI clockid and the needed accessors.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

1ff3c967

timekeeping: Move TAI managment into timekeeping core from ntp · cc244dda

由 John Stultz 提交于 5月 03, 2012

Currently NTP manages the TAI offset. Since there's plans for a
CLOCK_TAI clockid, push the TAI management into the timekeeping
core.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

cc244dda

16 3月, 2013 1 次提交

timekeeping: utilize the suspend-nonstop clocksource to count suspended time · e445cf1c

由 Feng Tang 提交于 3月 12, 2013

There are some new processors whose TSC clocksource won't stop during
suspend. Currently, after system resumes, kernel will use persistent
clock or RTC to compensate the sleep time, but with these nonstop
clocksources, we could skip the special compensation from external
sources, and just use current clocksource for time recounting.

This can solve some time drift bugs caused by some not-so-accurate or
error-prone RTC devices.

The current way to count suspended time is first try to use the persistent
clock, and then try the RTC if persistent clock can't be used. This
patch will change the trying order to:
	suspend-nonstop clocksource -> persistent clock -> RTC

When counting the sleep time with nonstop clocksource, use an accurate way
suggested by Jason Gunthorpe to cover very large delta cycles.
Signed-off-by: NFeng Tang <feng.tang@intel.com>
[jstultz: Small optimization, avoiding re-reading the clocksource]
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

e445cf1c

13 3月, 2013 3 次提交

tick: Provide a check for a forced broadcast pending · eaa907c5

由 Thomas Gleixner 提交于 3月 06, 2013

On the CPU which gets woken along with the target CPU of the broadcast
the following happens:

  deep_idle()
			<-- spurious wakeup
  broadcast_exit()
    set forced bit
  
  enable interrupts
    
			<-- Nothing happens

  disable interrupts

  broadcast_enter()
			<-- Here we observe the forced bit is set
  deep_idle()

Now after that the target CPU of the broadcast runs the broadcast
handler and finds the other CPU in both the broadcast and the forced
mask, sends the IPI and stuff gets back to normal.

So it's not actually harmful, just more evidence for the theory, that
hardware designers have access to very special drug supplies.

Now there is no point in going back to deep idle just to wake up again
right away via an IPI. Provide a check which allows the idle code to
avoid the deep idle transition.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: LAK <linux-arm-kernel@lists.infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Arjan van de Veen <arjan@infradead.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: NSantosh Shilimkar <santosh.shilimkar@ti.com>
Cc: Jason Liu <liu.h.jason@gmail.com>
Link: http://lkml.kernel.org/r/20130306111537.565418308@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

eaa907c5

tick: Handle broadcast wakeup of multiple cpus · 989dcb64

由 Thomas Gleixner 提交于 3月 06, 2013

Some brilliant hardware implementations wake multiple cores when the
broadcast timer fires. This leads to the following interesting
problem:

CPU0				CPU1
wakeup from idle		wakeup from idle

leave broadcast mode		leave broadcast mode
 restart per cpu timer		 restart per cpu timer
 	     	 		go back to idle
handle broadcast
 (empty mask)			
				enter broadcast mode
				programm broadcast device
enter broadcast mode
programm broadcast device

So what happens is that due to the forced reprogramming of the cpu
local timer, we need to set a event in the future. Now if we manage to
go back to idle before the timer fires, we switch off the timer and
arm the broadcast device with an already expired time (covered by
forced mode). So in the worst case we repeat the above ping pong
forever.
					
Unfortunately we have no information about what caused the wakeup, but
we can check current time against the expiry time of the local cpu. If
the local event is already in the past, we know that the broadcast
timer is about to fire and send an IPI. So we mark ourself as an IPI
target even if we left broadcast mode and avoid the reprogramming of
the local cpu timer.

This still leaves the possibility that a CPU which is not handling the
broadcast interrupt is going to reach idle again before the IPI
arrives. This can't be solved in the core code and will be handled in
follow up patches.
Reported-by: NJason Liu <liu.h.jason@gmail.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: LAK <linux-arm-kernel@lists.infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Arjan van de Veen <arjan@infradead.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: NSantosh Shilimkar <santosh.shilimkar@ti.com>
Link: http://lkml.kernel.org/r/20130306111537.492045206@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

989dcb64

tick: Avoid programming the local cpu timer if broadcast pending · 26517f3e

由 Thomas Gleixner 提交于 3月 06, 2013

If the local cpu timer stops in deep idle, we arm the broadcast device
and get woken by an IPI. Now when we return from deep idle we reenable
the local cpu timer unconditionally before handling the IPI. But
that's a pointless exercise: the timer is already expired and the IPI
is on the way. And it's an expensive exercise as we use the forced
reprogramming mode so that we do not lose a timer event. This forced
reprogramming will loop at least once in the retry.

To avoid this reprogramming, we mark the cpu in a pending bit mask
before we send the IPI. Now when the IPI target cpu wakes up, it will
see the pending bit set and skip the reprogramming. The reprogramming
of the cpu local timer will happen in the IPI handler which runs the
cpu local timer interrupt function.
Reported-by: NJason Liu <liu.h.jason@gmail.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: LAK <linux-arm-kernel@lists.infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Arjan van de Veen <arjan@infradead.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: NSantosh Shilimkar <santosh.shilimkar@ti.com>
Link: http://lkml.kernel.org/r/20130306111537.431082074@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

26517f3e

08 3月, 2013 1 次提交

clockevents: Don't allow dummy broadcast timers · a7dc19b8

由 Mark Rutland 提交于 3月 07, 2013

Currently tick_check_broadcast_device doesn't reject clock_event_devices
with CLOCK_EVT_FEAT_DUMMY, and may select them in preference to real
hardware if they have a higher rating value. In this situation, the
dummy timer is responsible for broadcasting to itself, and the core
clockevents code may attempt to call non-existent callbacks for
programming the dummy, eventually leading to a panic.

This patch makes tick_check_broadcast_device always reject dummy timers,
preventing this problem.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Jon Medhurst (Tixy) <tixy@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

a7dc19b8

07 3月, 2013 3 次提交

tick: Dynamically set broadcast irq affinity · d2348fb6

由 Daniel Lezcano 提交于 3月 02, 2013

When a cpu goes to a deep idle state where its local timer is
shutdown, it notifies the time frame work to use the broadcast timer
instead.  Unfortunately, the broadcast device could wake up any CPU,
including an idle one which is not concerned by the wake up at all. So
in the worst case an idle CPU will wake up to send an IPI to the CPU
whose timer expired.

Provide an opt-in feature CLOCK_EVT_FEAT_DYNIRQ which tells the core
that is should set the interrupt affinity of the broadcast interrupt
to the cpu which has the earliest expiry time. This avoids unnecessary
spurious wakeups and IPIs.

[ tglx: Adopted to cpumask rework, silenced an uninitialized warning,
  massaged changelog ]
Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Cc: viresh.kumar@linaro.org
Cc: jacob.jun.pan@linux.intel.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: santosh.shilimkar@ti.com
Cc: linaro-kernel@lists.linaro.org
Cc: patches@linaro.org
Cc: rickard.andersson@stericsson.com
Cc: vincent.guittot@linaro.org
Cc: linus.walleij@stericsson.com
Cc: john.stultz@linaro.org
Link: http://lkml.kernel.org/r/1362219013-18173-3-git-send-email-daniel.lezcano@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

d2348fb6

tick: Pass broadcast device to tick_broadcast_set_event() · f9ae39d0

由 Daniel Lezcano 提交于 3月 02, 2013

Pass the broadcast timer to tick_broadcast_set_event() instead of
reevaluating tick_broadcast_device.evtdev.

[ tglx: Massaged changelog ]
Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org>
Cc: viresh.kumar@linaro.org
Cc: jacob.jun.pan@linux.intel.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: santosh.shilimkar@ti.com
Cc: linaro-kernel@lists.linaro.org
Cc: patches@linaro.org
Cc: rickard.andersson@stericsson.com
Cc: vincent.guittot@linaro.org
Cc: linus.walleij@stericsson.com
Cc: john.stultz@linaro.org
Link: http://lkml.kernel.org/r/1362219013-18173-2-git-send-email-daniel.lezcano@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

f9ae39d0

tick: Convert broadcast cpu bitmaps to cpumask_var_t · b352bc1c

由 Thomas Gleixner 提交于 3月 05, 2013

Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20130306111537.366394000@linutronix.de
Cc: Rusty Russell <rusty@rustcorp.com.au>

b352bc1c

22 2月, 2013 2 次提交

Revert "nohz: Make tick_nohz_irq_exit() irq safe" · af7bdbaf

由 Thomas Gleixner 提交于 2月 21, 2013

This reverts commit 351429b2e62b6545bb10c756686393f29ba268a1. The
extra local_irq_save() is not longer needed as the call site now
always calls with interrupts disabled.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linuxfoundation.org>

af7bdbaf

nohz: Make tick_nohz_irq_exit() irq safe · e5ab012c

由 Frederic Weisbecker 提交于 2月 20, 2013

As it stands, irq_exit() may or may not be called with
irqs disabled, depending on __ARCH_IRQ_EXIT_IRQS_DISABLED
that the arch can define.

It makes tick_nohz_irq_exit() unsafe. For example two
interrupts can race in tick_nohz_stop_sched_tick(): the inner
most one computes the expiring time on top of the timer list,
then it's interrupted right before reprogramming the
clock. The new interrupt enqueues a new timer list timer,
it reprogram the clock to take it into account and it exits.
The CPUs resumes the inner most interrupt and performs the clock
reprogramming without considering the new timer list timer.

This regression has been introduced by:
     280f0677
     ("nohz: Separate out irq exit and idle loop dyntick logic")

Let's fix it right now with the appropriate protections.

A saner long term solution will be to remove
__ARCH_IRQ_EXIT_IRQS_DISABLED and mandate that irq_exit() is called
with interrupts disabled.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linuxfoundation.org>
Cc: <stable@vger.kernel.org> #v3.2+
Link: http://lkml.kernel.org/r/1361373336-11337-1-git-send-email-fweisbec@gmail.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

e5ab012c