- 01 7月, 2010 1 次提交
-
-
由 Peter Zijlstra 提交于
Commit 0224cf4c (sched: Intoduce get_cpu_iowait_time_us()) broke things by not making sure preemption was indeed disabled by the callers of nr_iowait_cpu() which took the iowait value of the current cpu. This resulted in a heap of preempt warnings. Cure this by making nr_iowait_cpu() take a cpu number and fix up the callers to pass in the right number. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Maxim Levitsky <maximlevitsky@gmail.com> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Jiri Slaby <jslaby@suse.cz> Cc: linux-pm@lists.linux-foundation.org LKML-Reference: <1277968037.1868.120.camel@laptop> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 18 6月, 2010 1 次提交
-
-
由 Peter Zijlstra 提交于
Chris Wedgwood reports that 39c0cbe2 (sched: Rate-limit nohz) causes a serial console regression, unresponsiveness, and indeed it does. The reason is that the nohz code is skipped even when the tick was already stopped before the nohz_ratelimit(cpu) condition changed. Move the nohz_ratelimit() check to the other conditions which prevent long idle sleeps. Reported-by: NChris Wedgwood <cw@f00f.org> Tested-by: NBrian Bloniarz <bmb@athenacr.com> Signed-off-by: NMike Galbraith <efault@gmx.de> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg KH <gregkh@suse.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Jef Driesen <jefdriesen@telenet.be> LKML-Reference: <1276790557.27822.516.camel@twins> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 10 5月, 2010 7 次提交
-
-
由 John Stultz 提交于
How to pick good mult/shift pairs has always been difficult to describe to folks writing clocksource drivers, since it requires careful tradeoffs in adjustment accuracy vs overflow limits. Now, with the clocks_calc_mult_shift function, its much easier. However, not many clocksources have converted to using that function, and there is still the issue of the max interval length assumption being made by each clocksource driver independently. So this patch simplifies the registration process by having clocksources be registered with a hz/khz value and the registration function taking care of setting mult/shift. This should take most of the confusion out of writing a clocksource driver. Additionally it also keeps the shift size tradeoff (more accuracy vs longer possible nohz times) centralized so the timekeeping core can keep track of the assumptions being made. [ tglx: Coding style and comments fixed ] Signed-off-by: NJohn Stultz <johnstul@us.ibm.com> LKML-Reference: <1273280858-30143-1-git-send-email-johnstul@us.ibm.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Arjan van de Ven 提交于
For the ondemand cpufreq governor, it is desired that the iowait time is microaccounted in a similar way as idle time is. This patch introduces the infrastructure to account and expose this information via the get_cpu_iowait_time_us() function. [akpm@linux-foundation.org: fix CONFIG_NO_HZ=n build] Signed-off-by: NArjan van de Ven <arjan@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NRik van Riel <riel@redhat.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082523.284feab6@infradead.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Arjan van de Ven 提交于
Now that the only user of ts->idle_lastupdate is update_ts_time_stats(), the entire field can be eliminated. In update_ts_time_stats(), idle_lastupdate is first set to "now", and a few lines later, the only user is an if() statement that assigns a variable either to "now" or to ts->idle_lastupdate, which has the value of "now" at that point. Signed-off-by: NArjan van de Ven <arjan@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NRik van Riel <riel@redhat.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082439.2fab0b4f@infradead.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Arjan van de Ven 提交于
This patch folds the updating of the last_update_time into the update_ts_time_stats() function, and updates the callers. This allows for further cleanups that are done in the next patch. Signed-off-by: NArjan van de Ven <arjan@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NRik van Riel <riel@redhat.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082403.60072967@infradead.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Arjan van de Ven 提交于
Right now, get_cpu_idle_time_us() only reports the idle statistics upto the point the CPU entered last idle; not what is valid right now. This patch adds an update of the idle statistics to get_cpu_idle_time_us(), so that calling this function always returns statistics that are accurate at the point of the call. This includes resetting the start of the idle time for accounting purposes to avoid double accounting. Signed-off-by: NArjan van de Ven <arjan@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NRik van Riel <riel@redhat.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082323.2d2f1945@infradead.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Arjan van de Ven 提交于
Currently, two places update the idle statistics (and more to come later in this series). This patch creates a helper function for updating these statistics. Signed-off-by: NArjan van de Ven <arjan@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NRik van Riel <riel@redhat.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082245.163e67ed@infradead.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Arjan van de Ven 提交于
The exported function get_cpu_idle_time_us() has no comment describing it; add a kerneldoc comment Signed-off-by: NArjan van de Ven <arjan@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NRik van Riel <riel@redhat.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082208.7cb721f0@infradead.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 13 4月, 2010 1 次提交
-
-
由 John Stultz 提交于
With the earlier logarithmic time accumulation patch, xtime will now always be within one "tick" of the current time, instead of possibly half a second off. This removes the need for the xtime_cache value, which always stored the time at the last interrupt, so this patch cleans that up removing the xtime_cache related code. This patch also addresses an issue with an earlier version of this change, where xtime_cache was normalizing xtime, which could in some cases be not valid (ie: tv_nsec == NSEC_PER_SEC). This is fixed by handling the edge case in update_wall_time(). Signed-off-by: NJohn Stultz <johnstul@us.ibm.com> Cc: Petr Titěra <P.Titera@century.cz> LKML-Reference: <1270589451-30773-1-git-send-email-johnstul@us.ibm.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 30 3月, 2010 1 次提交
-
-
由 Tejun Heo 提交于
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: NTejun Heo <tj@kernel.org> Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
-
- 24 3月, 2010 1 次提交
-
-
由 John Stultz 提交于
Now that no arches are accessing time_adjust directly, make it static. Signed-off-by: NJohn Stultz <johnstul@us.ibm.com> LKML-Reference: <1268968769-19209-1-git-send-email-johnstul@us.ibm.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 23 3月, 2010 1 次提交
-
-
由 John Stultz 提交于
The logarithmic accumulation done in the timekeeping has some overflow protection that limits the max shift value. That means it will take more then shift loops to accumulate all of the cycles. This causes the shift decrement to underflow, which causes the loop to never exit. The simplest fix would be simply to do a: if (shift) shift--; However that is not optimal, as we know the cycle offset is larger then the interval << shift, the above would make shift drop to zero, then we would be spinning for quite awhile accumulating at interval chunks at a time. Instead, this patch only decreases shift if the offset is smaller then cycle_interval << shift. This makes sure we accumulate using the largest chunks possible without overflowing tick_length, and limits the number of iterations through the loop. This issue was found and reported by Sonic Zhang, who also tested the fix. Many thanks your explanation and testing! Reported-by: NSonic Zhang <sonic.adi@gmail.com> Signed-off-by: NJohn Stultz <johnstul@us.ibm.com> Tested-by: NSonic Zhang <sonic.adi@gmail.com> LKML-Reference: <1268948850-5225-1-git-send-email-johnstul@us.ibm.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 13 3月, 2010 1 次提交
-
-
由 Thomas Gleixner 提交于
The current logic which handles clock events programming failures can increase min_delta_ns unlimited and even can cause overflows. Sanitize it by: - prevent zero increase when min_delta_ns == 1 - limiting min_delta_ns to a jiffie - bail out if the jiffie limit is hit - add retries stats for /proc/timer_list so we can gather data Reported-by: NUwe Kleine-Koenig <u.kleine-koenig@pengutronix.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 12 3月, 2010 1 次提交
-
-
由 Mike Galbraith 提交于
Entering nohz code on every micro-idle is costing ~10% throughput for netperf TCP_RR when scheduling cross-cpu. Rate limiting entry fixes this, but raises ticks a bit. On my Q6600, an idle box goes from ~85 interrupts/sec to 128. The higher the context switch rate, the more nohz entry costs. With this patch and some cycle recovery patches in my tree, max cross cpu context switch rate is improved by ~16%, a large portion of which of which is this ratelimiting. Signed-off-by: NMike Galbraith <efault@gmx.de> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1268301003.6785.28.camel@marge.simson.net> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 02 3月, 2010 1 次提交
-
-
由 john stultz 提交于
Aaro Koskinen reported an issue in kernel.org bugzilla #15366, where on non-GENERIC_TIME systems, accessing /sys/devices/system/clocksource/clocksource0/current_clocksource results in an oops. It seems the timekeeper/clocksource rework missed initializing the curr_clocksource value in the !GENERIC_TIME case. Thanks to Aaro for reporting and diagnosing the issue as well as testing the fix! Reported-by: NAaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: NJohn Stultz <johnstul@us.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: stable@kernel.org LKML-Reference: <1267475683.4216.61.camel@localhost.localdomain> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 10 2月, 2010 1 次提交
-
-
由 Jason Wang 提交于
Export getboottime and monotonic_to_bootbased in order to let them could be used by following patch. Cc: stable@kernel.org Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 05 2月, 2010 2 次提交
-
-
由 Magnus Damm 提交于
Add a clocksource suspend callback. This callback can be used by the clocksource driver to shutdown and perform any kind of late suspend activities even though the clocksource driver itself is a non-sysdev driver. One example where this is useful is to fix the sh_cmt.c platform driver that today suspends using the platform bus and shuts down the clocksource too early. With this callback in place the sh_cmt driver will suspend using the clocksource and clockevent hooks and leave the platform device pm callbacks unused. Signed-off-by: NMagnus Damm <damm@opensource.se> Cc: Paul Mundt <lethal@linux-sh.org> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Magnus Damm 提交于
Pass the clocksource as an argument to the clocksource resume callback. Needed so we can point out which CMT channel the sh_cmt.c driver shall resume. Signed-off-by: NMagnus Damm <damm@opensource.se> Cc: john stultz <johnstul@us.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 29 1月, 2010 2 次提交
-
-
由 John Stultz 提交于
ntp.c doesn't need to access timekeeping internals directly, so change xtime references to use the get_seconds() timekeeping interface. Signed-off-by: NJohn Stultz <johnstul@us.ibm.com> Cc: richard@rsk.demon.co.uk LKML-Reference: <1264738844-21935-1-git-send-email-johnstul@us.ibm.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 john stultz 提交于
Make time_esterror and time_maxerror static as no one uses them outside of ntp.c Signed-off-by: NJohn Stultz <johnstul@us.ibm.com> Cc: richard@rsk.demon.co.uk LKML-Reference: <1264719761.3437.47.camel@localhost.localdomain> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 26 1月, 2010 1 次提交
-
-
由 Thomas Gleixner 提交于
commit 0f8e8ef7 (clocksource: Simplify clocksource watchdog resume logic) introduced a potential kgdb dead lock. When the kernel is stopped by kgdb inside code which holds watchdog_lock then kgdb dead locks in clocksource_resume_watchdog(). clocksource_resume_watchdog() is called from kbdg via clocksource_touch_watchdog() to avoid that the clock source watchdog marks TSC unstable after the kernel has been stopped. Solve this by replacing spin_lock with a spin_trylock and just return in case the lock is held. Not resetting the watchdog might result in TSC becoming marked unstable, but that's an acceptable penalty for using kgdb. The timekeeping is anyway easily screwed up by kgdb when the system uses either jiffies or a clock source which wraps in short intervals (e.g. pm_timer wraps about every 4.6s), so we really do not have to worry about that occasional TSC marked unstable side effect. The second caller of clocksource_resume_watchdog() is clocksource_resume(). The trylock is safe here as well because the system is UP at this point, interrupts are disabled and nothing else can hold watchdog_lock(). Reported-by: NJason Wessel <jason.wessel@windriver.com> LKML-Reference: <1264480000-6997-4-git-send-email-jason.wessel@windriver.com> Cc: kgdb-bugreport@lists.sourceforge.net Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 18 1月, 2010 1 次提交
-
-
由 Xiaotian Feng 提交于
Marc reported that the BUG_ON in clockevents_notify() triggers on his system. This happens because the kernel tries to remove an active clock event device (used for broadcasting) from the device list. The handling of devices which can be used as per cpu device and as a global broadcast device is suboptimal. The simplest solution for now (and for stable) is to check whether the device is used as global broadcast device, but this needs to be revisited. [ tglx: restored the cpuweight check and massaged the changelog ] Reported-by: NMarc Dionne <marc.c.dionne@gmail.com> Tested-by: NMarc Dionne <marc.c.dionne@gmail.com> Signed-off-by: NXiaotian Feng <dfeng@redhat.com> LKML-Reference: <1262834564-13033-1-git-send-email-dfeng@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: stable@kernel.org
-
- 23 12月, 2009 1 次提交
-
-
由 Linus Torvalds 提交于
This reverts commit 7bc7d637, as requested by John Stultz. Quoting John: "Petr Titěra reported an issue where he saw odd atime regressions with 2.6.33 where there were a full second worth of nanoseconds in the nanoseconds field. He also reviewed the time code and narrowed down the problem: unhandled overflow of the nanosecond field caused by rounding up the sub-nanosecond accumulated time. Details: * At the end of update_wall_time(), we currently round up the sub-nanosecond portion of accumulated time when storing it into xtime. This was added to avoid time inconsistencies caused when the sub-nanosecond portion was truncated when storing into xtime. Unfortunately we don't handle the possible second overflow caused by that rounding. * Previously the xtime_cache code hid this overflow by normalizing the xtime value when storing into the xtime_cache. * We could try to handle the second overflow after the rounding up, but since this affects the timekeeping's internal state, this would further complicate the next accumulation cycle, causing small errors in ntp steering. As much as I'd like to get rid of it, the xtime_cache code is known to work. * The correct fix is really to include the sub-nanosecond portion in the timekeeping accessor function, so we don't need to round up at during accumulation. This would greatly simplify the accumulation code. Unfortunately, we can't do this safely until the last three non-GENERIC_TIME arches (sparc32, arm, cris) are converted (those patches are in -mm) and we kill off the spots where arches set xtime directly. This is all 2.6.34 material, so I think reverting the xtime_cache change is the best approach for now. Many thanks to Petr for both reporting and finding the issue!" Reported-by: NPetr Titěra <P.Titera@century.cz> Requested-by: Njohn stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 12月, 2009 1 次提交
-
-
由 Rusty Russell 提交于
struct cpumask will be undefined soon with CONFIG_CPUMASK_OFFSTACK=y, to avoid them being declared on the stack. cpumask_bits() does what we want here (of course, this code is crap). Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> To: Thomas Gleixner <tglx@linutronix.de>
-
- 16 12月, 2009 1 次提交
-
-
由 Barry Song 提交于
ktime will overflow from 03:14:07 UTC on Tuesday, 19 January 2038, ktime_add() in timecompare_update() will overflow a half earlier. As a result, wrong offset will be gotten, then cause some strange problems. Signed-off-by: NBarry Song <21cnbao@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Patrick Ohly <patrick.ohly@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: John Stultz <johnstul@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 12月, 2009 3 次提交
-
-
由 Thomas Gleixner 提交于
Convert locks which cannot be sleeping locks in preempt-rt to raw_spinlocks. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPeter Zijlstra <peterz@infradead.org> Acked-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPeter Zijlstra <peterz@infradead.org> Acked-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
Convert locks which cannot be sleeping locks in preempt-rt to raw_spinlocks. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPeter Zijlstra <peterz@infradead.org> Acked-by: NIngo Molnar <mingo@elte.hu>
-
- 11 12月, 2009 1 次提交
-
-
由 Thomas Gleixner 提交于
Xiaotian Feng triggered a list corruption in the clock events list on CPU hotplug and debugged the root cause. If a CPU registers more than one per cpu clock event device, then only the active clock event device is removed on CPU_DEAD. The unused devices are kept in the clock events device list. On CPU up the clock event devices are registered again, which means that we list_add an already enqueued list_head. That results in list corruption. Resolve this by removing all devices which are associated to the dead CPU on CPU_DEAD. Reported-by: NXiaotian Feng <dfeng@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NXiaotian Feng <dfeng@redhat.com> Cc: stable@kernel.org
-
- 10 12月, 2009 1 次提交
-
-
由 Thomas Gleixner 提交于
The hrtimer_interrupt hang logic adjusts min_delta_ns based on the execution time of the hrtimer callbacks. This is error-prone for virtual machines, where a guest vcpu can be scheduled out during the execution of the callbacks (and the callbacks themselves can do operations that translate to blocking operations in the hypervisor), which in can lead to large min_delta_ns rendering the system unusable. Replace the current heuristics with something more reliable. Allow the interrupt code to try 3 times to catch up with the lost time. If that fails use the total time spent in the interrupt handler to defer the next timer interrupt so the system can catch up with other things which got delayed. Limit that deferment to 100ms. The retry events and the maximum time spent in the interrupt handler are recorded and exposed via /proc/timer_list Inspired by a patch from Marcelo. Reported-by: NMichael Tokarev <mjt@tls.msk.ru> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NMarcelo Tosatti <mtosatti@redhat.com> Cc: kvm@vger.kernel.org
-
- 18 11月, 2009 1 次提交
-
-
由 H Hartley Sweeten 提交于
Include "tick-internal.h" in order to pick up the extern function prototype for clockevents_shutdown(). This quiets the following sparse build noise: warning: symbol 'clockevents_shutdown' was not declared. Should it be static? Signed-off-by: NH Hartley Sweeten <hsweeten@visionengravers.com> LKML-Reference: <BD79186B4FD85F4B8E60E381CAEE190901E24550@mi8nycmail19.Mi8.com> Reviewed-by: NYong Zhang <yong.zhang0@gmail.com> Cc: johnstul@us.ibm.com Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 17 11月, 2009 1 次提交
-
-
由 Lin Ming 提交于
Since commit 0a544198 "timekeeping: Move NTP adjusted clock multiplier to struct timekeeper" the clock multiplier of vsyscall is updated with the unmodified clock multiplier of the clock source and not with the NTP adjusted multiplier of the timekeeper. This causes user space observerable time warps: new CLOCK-warp maximum: 120 nsecs, 00000025c337c537 -> 00000025c337c4bf Add a new argument "mult" to update_vsyscall() and hand in the timekeeping internal NTP adjusted multiplier. Signed-off-by: NLin Ming <ming.m.lin@intel.com> Cc: "Zhang Yanmin" <yanmin_zhang@linux.intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Tony Luck <tony.luck@intel.com> LKML-Reference: <1258436990.17765.83.camel@minggr.sh.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 14 11月, 2009 7 次提交
-
-
由 Thomas Gleixner 提交于
powerpc grew a new warning due to the type change of clockevent->mult. The architectures which use parts of the generic time keeping infrastructure tripped over my wrong assumption that clocksource_register is only used when GENERIC_TIME=y. I should have looked and also I should have known better. These renitent Gaul villages are racking my nerves. Some serious deprecating is due. Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Jon Hunter 提交于
In the dynamic tick code, "max_delta_ns" (member of the "clock_event_device" structure) represents the maximum sleep time that can occur between timer events in nanoseconds. The variable, "max_delta_ns", is defined as an unsigned long which is a 32-bit integer for 32-bit machines and a 64-bit integer for 64-bit machines (if -m64 option is used for gcc). The value of max_delta_ns is set by calling the function "clockevent_delta2ns()" which returns a maximum value of LONG_MAX. For a 32-bit machine LONG_MAX is equal to 0x7fffffff and in nanoseconds this equates to ~2.15 seconds. Hence, the maximum sleep time for a 32-bit machine is ~2.15 seconds, where as for a 64-bit machine it will be many years. This patch changes the type of max_delta_ns to be "u64" instead of "unsigned long" so that this variable is a 64-bit type for both 32-bit and 64-bit machines. It also changes the maximum value returned by clockevent_delta2ns() to KTIME_MAX. Hence this allows a 32-bit machine to sleep for longer than ~2.15 seconds. Please note that this patch also changes "min_delta_ns" to be "u64" too and although this is unnecessary, it makes the patch simpler as it avoids to fixup all callers of clockevent_delta2ns(). [ tglx: changed "unsigned long long" to u64 as we use this data type through out the time code ] Signed-off-by: NJon Hunter <jon-hunter@ti.com> Cc: John Stultz <johnstul@us.ibm.com> LKML-Reference: <1250617512-23567-3-git-send-email-jon-hunter@ti.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
The previous patch which limits the sleep time to the maximum deferment time of the time keeping clocksource has some limitations on SMP machines: if all CPUs are idle then for all CPUs the maximum sleep time is limited. Solve this by keeping track of which cpu had the do_timer() duty assigned last and limit the sleep time only for this cpu. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> LKML-Reference: <new-submission> Cc: Jon Hunter <jon-hunter@ti.com> Cc: John Stultz <johnstul@us.ibm.com>
-
由 Jon Hunter 提交于
The dynamic tick allows the kernel to sleep for periods longer than a single tick, but it does not limit the sleep time currently. In the worst case the kernel could sleep longer than the wrap around time of the time keeping clock source which would result in losing track of time. Prevent this by limiting it to the safe maximum sleep time of the current time keeping clock source. The value is calculated when the clock source is registered. [ tglx: simplified the code a bit and massaged the commit msg ] Signed-off-by: NJon Hunter <jon-hunter@ti.com> Cc: John Stultz <johnstul@us.ibm.com> LKML-Reference: <1250617512-23567-2-git-send-email-jon-hunter@ti.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
On some archs local_softirq_pending() has a data type of unsigned long on others its unsigned int. Type cast it to (unsigned int) in the printk to avoid the compiler warning. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> LKML-Reference: <new-submission>
-
由 Thomas Gleixner 提交于
MIPS has two functions to calculcate the mult/shift factors for clock sources and clock events at run time. ARM needs such functions as well. Implement a function which calculates the mult/shift factors based on the frequencies to which and from which is converted. The function also has a parameter to specify the minimum conversion range in seconds. This range is guaranteed not to produce a 64bit overflow when a value is multiplied with the calculated mult factor. The larger the conversion range the less becomes the conversion accuracy. Provide two inline wrappers which handle clock events and clock sources. For clock events the "from" frequency is nano seconds per second which corresponds to 1GHz and "to" is the device frequency. For clock sources "from" is the device frequency and "to" is nano seconds per second. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NMikael Pettersson <mikpe@it.uu.se> Acked-by: NRalf Baechle <ralf@linux-mips.org> Acked-by: NLinus Walleij <linus.walleij@stericsson.com> Cc: John Stultz <johnstul@us.ibm.com> LKML-Reference: <20091111134229.766673305@linutronix.de>
-
由 Thomas Gleixner 提交于
The mult and shift factors of clock events differ in their data type from those of clock sources for no reason. u32 is sufficient for both. shift is always <= 32 and mult is limited to 2^32-1 to avoid 64bit multiplication overflows in the conversion. Preparatory patch for a generic mult/shift factor calculation function. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NMikael Pettersson <mikpe@it.uu.se> Acked-by: NRalf Baechle <ralf@linux-mips.org> Acked-by: NLinus Walleij <linus.walleij@stericsson.com> Cc: John Stultz <johnstul@us.ibm.com> LKML-Reference: <20091111134229.725664788@linutronix.de>
-