- 28 6月, 2011 3 次提交
-
-
由 Tejun Heo 提交于
Currently, mtrr wants stop_machine functionality while a CPU is being brought up. As stop_machine() requires the calling CPU to be active, mtrr implements its own stop_machine using stop_one_cpu() on each online CPU. This doesn't only unnecessarily duplicate complex logic but also introduces a possibility of deadlock when it races against the generic stop_machine(). This patch implements stop_machine_from_inactive_cpu() to serve such use cases. Its functionality is basically the same as stop_machine(); however, it should be called from a CPU which isn't active and doesn't depend on working scheduling on the calling CPU. This is achieved by using busy loops for synchronization and open-coding stop_cpus queuing and waiting with direct invocation of fn() for local CPU inbetween. Signed-off-by: NTejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/20110623182056.982526827@sbsiddha-MOBL3.sc.intel.comSigned-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Tejun Heo 提交于
Refactor the queuing part of the stop cpus work from __stop_cpus() into queue_stop_cpus_work(). The reorganization is to help future improvements to stop_machine() and doesn't introduce any behavior difference. Signed-off-by: NTejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/20110623182056.897818337@sbsiddha-MOBL3.sc.intel.comSigned-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
MTRR rendezvous sequence using stop_one_cpu_nowait() can potentially happen in parallel with another system wide rendezvous using stop_machine(). This can lead to deadlock (The order in which works are queued can be different on different cpu's. Some cpu's will be running the first rendezvous handler and others will be running the second rendezvous handler. Each set waiting for the other set to join for the system wide rendezvous, leading to a deadlock). MTRR rendezvous sequence is not implemented using stop_machine() as this gets called both from the process context aswell as the cpu online paths (where the cpu has not come online and the interrupts are disabled etc). stop_machine() works with only online cpus. For now, take the stop_machine mutex in the MTRR rendezvous sequence that gets called from an online cpu (here we are in the process context and can potentially sleep while taking the mutex). And the MTRR rendezvous that gets triggered during cpu online doesn't need to take this stop_machine lock (as the stop_machine() already ensures that there is no cpu hotplug going on in parallel by doing get_online_cpus()) TBD: Pursue a cleaner solution of extending the stop_machine() infrastructure to handle the case where the calling cpu is still not online and use this for MTRR rendezvous sequence. fixes: https://bugzilla.novell.com/show_bug.cgi?id=672008Reported-by: NVadim Kotelnikov <vadimuzzz@inbox.ru> Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/20110623182056.807230326@sbsiddha-MOBL3.sc.intel.com Cc: stable@kernel.org # 2.6.35+, backport a week or two after this gets more testing in mainline Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 22 6月, 2011 3 次提交
-
-
由 John Stultz 提交于
Toralf Förster and Richard Weinberger noted that if there is no RTC device, the alarm timers core prints out an annoying "ALARM timers will not wake from suspend" message. This warning has been removed in a previous patch, however the issue still remains: The original idea was to support alarm timers even if there was no rtc device, as long as the system didn't go into suspend. However, after further consideration, communicating to the application that alarmtimers are not fully functional seems like the better solution. So this patch makes it so we return -ENOTSUPP to any posix _ALARM clockid calls if there is no backing RTC device on the system. Further this changes the behavior where when there is no rtc device we will check for one on clock_getres, clock_gettime, timer_create, and timer_nsleep instead of on suspend. CC: Toralf Förster <toralf.foerster@gmx.de> CC: Richard Weinberger <richard@nod.at CC: Peter Zijlstra <peterz@infradead.org> CC: Thomas Gleixner <tglx@linutronix.de> Reported-by: NToralf Förster <toralf.foerster@gmx.de> Reported by: Richard Weinberger <richard@nod.at> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 John Stultz 提交于
The alarmtimers code currently picks a rtc device to use at late init time. However, if your rtc driver is loaded as a module, it may be registered after the alarmtimers late init code, leaving the alarmtimers nonfunctional. This patch moves the the rtcdevice selection to when we actually try to use it, allowing us to make use of rtc modules that may have been loaded at any point since bootup. CC: Thomas Gleixner <tglx@linutronix.de> CC: Meelis Roos <mroos@ut.ee> Reported-by: NMeelis Roos <mroos@ut.ee> Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
-
由 Michal Kubecek 提交于
When opening /dev/snapshot device, snapshot_open() creates memory bitmaps which are freed in snapshot_release(). But if any of the callbacks called by pm_notifier_call_chain() returns NOTIFY_BAD, open() fails, snapshot_release() is never called and bitmaps are not freed. Next attempt to open /dev/snapshot then triggers BUG_ON() check in create_basic_memory_bitmaps(). This happens e.g. when vmwatchdog module is active on s390x. Signed-off-by: NMichal Kubecek <mkubecek@suse.cz> Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl> Cc: stable@kernel.org
-
- 18 6月, 2011 1 次提交
-
-
由 David Howells 提交于
____call_usermodehelper() now erases any credentials set by the subprocess_inf::init() function. The problem is that commit 17f60a7d ("capabilites: allow the application of capability limits to usermode helpers") creates and commits new credentials with prepare_kernel_cred() after the call to the init() function. This wipes all keyrings after umh_keys_init() is called. The best way to deal with this is to put the init() call just prior to the commit_creds() call, and pass the cred pointer to init(). That means that umh_keys_init() and suchlike can modify the credentials _before_ they are published and potentially in use by the rest of the system. This prevents request_key() from working as it is prevented from passing the session keyring it set up with the authorisation token to /sbin/request-key, and so the latter can't assume the authority to instantiate the key. This causes the in-kernel DNS resolver to fail with ENOKEY unconditionally. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NEric Paris <eparis@redhat.com> Tested-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 6月, 2011 3 次提交
-
-
由 Takao Indoh 提交于
There is a problem that kdump(2nd kernel) sometimes hangs up due to a pending IPI from 1st kernel. Kernel panic occurs because IPI comes before call_single_queue is initialized. To fix the crash, rename init_call_single_data() to call_function_init() and call it in start_kernel() so that call_single_queue can be initialized before enabling interrupts. The details of the crash are: (1) 2nd kernel boots up (2) A pending IPI from 1st kernel comes when irqs are first enabled in start_kernel(). (3) Kernel tries to handle the interrupt, but call_single_queue is not initialized yet at this point. As a result, in the generic_smp_call_function_single_interrupt(), NULL pointer dereference occurs when list_replace_init() tries to access &q->list.next. Therefore this patch changes the name of init_call_single_data() to call_function_init() and calls it before local_irq_enable() in start_kernel(). Signed-off-by: NTakao Indoh <indou.takao@jp.fujitsu.com> Reviewed-by: NWANG Cong <xiyou.wangcong@gmail.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Acked-by: NVivek Goyal <vgoyal@redhat.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Milton Miller <miltonm@bga.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: kexec@lists.infradead.org Link: http://lkml.kernel.org/r/D6CBEE2F420741indou.takao@jp.fujitsu.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Paul E. McKenney 提交于
The commit "use softirq instead of kthreads except when RCU_BOOST=y" just applied #ifdef in place. This commit is a cleanup that moves the newly #ifdef'ed code to the header file kernel/rcutree_plugin.h. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Thomas Gleixner 提交于
The clocksource watchdog code is interruptible and it has been observed that this can trigger false positives which disable the TSC. The reason is that an interrupt storm or a long running interrupt handler between the read of the watchdog source and the read of the TSC brings the two far enough apart that the delta is larger than the unstable treshold. Move both reads into a short interrupt disabled region to avoid that. Reported-and-tested-by: NVernon Mauery <vernux@us.ibm.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: stable@kernel.org
-
- 16 6月, 2011 3 次提交
-
-
由 Paul E. McKenney 提交于
This patch #ifdefs RCU kthreads out of the kernel unless RCU_BOOST=y, thus eliminating context-switch overhead if RCU priority boosting has not been configured. Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Josh Triplett 提交于
CONFIG_CONSTRUCTORS controls support for running constructor functions at kernel init time. According to commit b99b87f7 ("kernel: constructor support"), gcov (CONFIG_GCOV_KERNEL) needs this. However, CONFIG_CONSTRUCTORS currently defaults to y, with no option to disable it, and CONFIG_GCOV_KERNEL depends on it. Instead, default it to n and have CONFIG_GCOV_KERNEL select it, so that the normal case of CONFIG_GCOV_KERNEL=n will result in CONFIG_CONSTRUCTORS=n. Observed in the short list of =y values in a minimal kernel configuration. Signed-off-by: NJosh Triplett <josh@joshtriplett.org> Acked-by: NWANG Cong <xiyou.wangcong@gmail.com> Acked-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KAMEZAWA Hiroyuki 提交于
The following crash was reported: > Call Trace: > [<ffffffff81139792>] mem_cgroup_from_task+0x15/0x17 > [<ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4 > [<ffffffff810493f3>] ? need_resched+0x23/0x2d > [<ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f > [<ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce > [<ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f > [<ffffffff81134024>] khugepaged+0x5da/0xfaf > [<ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b > [<ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13 > [<ffffffff81078625>] kthread+0xa8/0xb0 > [<ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4 > [<ffffffff814d5664>] kernel_thread_helper+0x4/0x10 > [<ffffffff814ce858>] ? retint_restore_args+0x13/0x13 > [<ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a What happens is that khugepaged tries to charge a huge page against an mm whose last possible owner has already exited, and the memory controller crashes when the stale mm->owner is used to look up the cgroup to charge. mm->owner has never been set to NULL with the last owner going away, but nobody cared until khugepaged came along. Even then it wasn't a problem because the final mmput() on an mm was forced to acquire and release mmap_sem in write-mode, preventing an exiting owner to go away while the mmap_sem was held, and until "692e0b35 mm: thp: optimize memcg charge in khugepaged", the memory cgroup charge was protected by mmap_sem in read-mode. Instead of going back to relying on the mmap_sem to enforce lifetime of a task, this patch ensures that mm->owner is properly set to NULL when the last possible owner is exiting, which the memory controller can handle just fine. [akpm@linux-foundation.org: tweak comments] Signed-off-by: NHugh Dickins <hughd@google.com> Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org> Reported-by: NHugh Dickins <hughd@google.com> Reported-by: NDave Jones <davej@redhat.com> Reviewed-by: NAndrea Arcangeli <aarcange@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 6月, 2011 5 次提交
-
-
由 Steven Rostedt 提交于
On system boot up, the lowest_mask is initialized with an early_initcall(). But RT tasks may wake up on other early_initcall() callers before the lowest_mask is initialized, causing a system crash. Commit "d72bce0e rcu: Cure load woes" was the first commit to wake up RT tasks in early init. Before this commit this bug should not happen. Reported-by: NAndrew Theurer <habanero@linux.vnet.ibm.com> Tested-by: NAndrew Theurer <habanero@linux.vnet.ibm.com> Tested-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org> Acked-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20110614223657.824872966@goodmis.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Hillf Danton 提交于
The RT preempt check tests the wrong task if NEED_RESCHED is set. It currently checks the local CPU task. It is supposed to check the task that is running on the runqueue we are about to wake another task on. Signed-off-by: NHillf Danton <dhillf@gmail.com> Reviewed-by: NYong Zhang <yong.zhang0@gmail.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20110614223657.450239027@goodmis.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Randy Dunlap 提交于
Fix kernel-doc warnings in signal.c: Warning(kernel/signal.c:2374): No description found for parameter 'nset' Warning(kernel/signal.c:2374): Excess function parameter 'set' description in 'sys_rt_sigprocmask' Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Shaohua Li 提交于
Commit a26ac245(rcu: move TREE_RCU from softirq to kthread) introduced performance regression. In an AIM7 test, this commit degraded performance by about 40%. The commit runs rcu callbacks in a kthread instead of softirq. We observed high rate of context switch which is caused by this. Out test system has 64 CPUs and HZ is 1000, so we saw more than 64k context switch per second which is caused by RCU's per-CPU kthread. A trace showed that most of the time the RCU per-CPU kthread doesn't actually handle any callbacks, but instead just does a very small amount of work handling grace periods. This means that RCU's per-CPU kthreads are making the scheduler do quite a bit of work in order to allow a very small amount of RCU-related processing to be done. Alex Shi's analysis determined that this slowdown is due to lock contention within the scheduler. Unfortunately, as Peter Zijlstra points out, the scheduler's real-time semantics require global action, which means that this contention is inherent in real-time scheduling. (Yes, perhaps someone will come up with a workaround -- otherwise, -rt is not going to do well on large SMP systems -- but this patch will work around this issue in the meantime. And "the meantime" might well be forever.) This patch therefore re-introduces softirq processing to RCU, but only for core RCU work. RCU callbacks are still executed in kthread context, so that only a small amount of RCU work runs in softirq context in the common case. This should minimize ksoftirqd execution, allowing us to skip boosting of ksoftirqd for CONFIG_RCU_BOOST=y kernels. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Tested-by: N"Alex,Shi" <alex.shi@intel.com> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Paul E. McKenney 提交于
Make the functions creating the kthreads wake them up. Leverage the fact that the per-node and boost kthreads can run anywhere, thus dispensing with the need to wake them up once the incoming CPU has gone fully online. Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: NDaniel J Blueman <daniel.blueman@gmail.com>
-
- 10 6月, 2011 1 次提交
-
-
由 Jesper Juhl 提交于
In kernel/irq/manage.c::irq_set_irq_wake() we call irq_get_desc_buslock() which may return NULL, but the code dereferences the result unconditionally. irq_set_irq_wake() has lots of callers - I checked a few and I couldn't find anything that guarantees that they won't call it with some input that will cause irq_get_desc_buslock() to return NULL, so I think it's a good thing to test and -EINVAL was the most sane error code in this situation that I could think of. Not all callers test the return value of irq_set_irq_wake(), but those that do take != 0 to mean error as far as I can see, so they should be fine. I guess those that don't test actually should, but that's a different issue. Signed-off-by: NJesper Juhl <jj@chaosbits.net> Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1106092300360.17868@swampdragon.chaosbits.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 09 6月, 2011 1 次提交
-
-
由 Steven Rostedt 提交于
The fix to fix the printk_formats of modules broke the printk_formats of trace_printks in the kernel. The update of what to show via the seq_file was only updated if the passed in fmt was NULL, which happens only on the first iteration. The result was showing the first format every time instead of iterating through the available formats. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 08 6月, 2011 2 次提交
-
-
由 Steven Rostedt 提交于
Revert the commit that removed the disabling of interrupts around the initial modifying of mcount callers to nops, and update the comment. The original comment was outdated and stated that the interrupts were being disabled to prevent kstop machine, which was required with the old ftrace daemon, but was no longer the case. What the comment failed to mention was that interrupts needed to be disabled to keep interrupts from preempting the modifying of the code and then executing the code that was partially modified. Revert the commit and update the comment. Reported-by: NRichard W.M. Jones <rjones@redhat.com> Tested-by: NRichard W.M. Jones <rjones@redhat.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Steven Rostedt 提交于
With gcc 4.6, the self test kprobe function: kprobe_trace_selftest_target() is optimized such that kallsyms does not list it. The kprobes test uses this function to insert a probe and test it. But it will fail the test if the function is not listed in kallsyms. Adding a __used annotation keeps the symbol in the kallsyms table. Suggested-by: NDavid Daney <ddaney@caviumnetworks.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 07 6月, 2011 3 次提交
-
-
由 Peter Zijlstra 提交于
Sergey reported a CONFIG_PROVE_RCU warning in push_rt_task where set_task_cpu() was called with both relevant rq->locks held, which should be sufficient for running tasks since holding its rq->lock will serialize against sched_move_task(). Update the comments and fix the task_group() lockdep test. Reported-and-tested-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1307115427.2353.3456.camel@twinsSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
The main lock_is_held() user is lockdep_assert_held(), avoid false assertions in lockdep_off() sections by unconditionally reporting the lock is taken. [ the reason this is important is a lockdep_assert_held() in ttwu() which triggers a warning under lockdep_off() as in printk() which can trigger another wakeup and lock up due to spinlock recursion, as reported and heroically debugged by Arne Jansen ] Reported-and-tested-by: NArne Jansen <lists@die-jansens.de> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/1307398759.2497.966.camel@laptopSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 GuoWen Li 提交于
kernel/trace/ftrace.c: In function 'ftrace_regex_write.clone.15': kernel/trace/ftrace.c:2743:6: warning: 'ret' may be used uninitialized in this function Signed-off-by: NGuoWen Li <guowen.li.linux@gmail.com> Link: http://lkml.kernel.org/r/201106011918.47939.guowen.li.linux@gmail.comSigned-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 04 6月, 2011 1 次提交
-
-
由 Vince Weaver 提交于
Turns out that distro packages use this file as an indicator of the perf event subsystem - this is easier to check for from scripts than the existence of the system call. This is easy enough to keep around for the kernel, so add a comment to make sure it stays so. Signed-off-by: NVince Weaver <vweaver1@eecs.utk.edu> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: acme@redhat.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1106031751170.29381@cl320.eecs.utk.eduSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 03 6月, 2011 6 次提交
-
-
There is an optimization which does not update the timer if the timer was pending and the expiration time was unchanged. Since commit 3bbb9ec9 ("timers: Introduce the concept of timer slack for legacy timers") this optimization is no longer applied for timers where the expiration time got extended due to the slack value. So we need to check again after the expiration time might have been updated. [ tglx: Made it a single check by applying slack first and sorting out the slack = 0 value (all timeouts < 256 jiffies) early ] Signed-off-by: NSebastian Andrzej Siewior <sebastian@breakpoint.cc> Link: http://lkml.kernel.org/r/20110521105828.GA29442@Chamillionaire.breakpoint.ccSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Mark Brown 提交于
When irq_alloc_descs() is called with no base IRQ specified then it will search for a range of IRQs starting from a specified base address. In the case where an IRQ is specified it still does this search in order to ensure that none of the requested range is already allocated and it still uses the from parameter to specify the base for the search. This means that in the case where a base is specified but from is zero (which is reasonable as any IRQ number is in the range specified by a zero from) the function will get confused and try to allocate the first suitably sized block of free IRQs it finds. Instead use a specified IRQ as the base address for the search, and insist that any from that is specified can support that IRQ. Signed-off-by: NMark Brown <broonie@opensource.wolfsonmicro.com> Link: http://lkml.kernel.org/r/1307037313-15733-1-git-send-email-broonie@opensource.wolfsonmicro.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Linus Walleij 提交于
The genirq changes are initializing descriptors for sparse IRQs quite differently from how non-sparse (stacked?) IRQs are initialized, with the effect that on my platform all IRQs are default-disabled on sparse IRQs and default-enabled if non-sparse IRQs are used, crashing some GPIO driver. Fix this by refactoring the non-sparse IRQs to use the same descriptor init function as the sparse IRQs. Signed-off: Linus Walleij <linus.walleij@linaro.org> Link: http://lkml.kernel.org/r/1306858479-16622-1-git-send-email-linus.walleij@stericsson.com Cc: stable@kernel.org # 2.6.39 Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
The detection of spurios interrupts is currently limited to first level handler. In force-threaded mode we never notice if the threaded irq does not feel responsible. This patch catches the return value of the threaded handler and forwards it to the spurious detector. If the primary handler returns only IRQ_WAKE_THREAD then the spourious detector ignores it because it gets called again from the threaded handler. [ tglx: Report the erroneous return value early and bail out ] Signed-off-by: NSebastian Andrzej Siewior <sebastian@breakpoint.cc> Link: http://lkml.kernel.org/r/1306824972-27067-2-git-send-email-sebastian@breakpoint.ccSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
In forced threaded mode (or with an explicit threaded handler) we only see the primary handler, but not the threaded handler. Signed-off-by: NSebastian Andrzej Siewior <sebastian@breakpoint.cc> Link: http://lkml.kernel.org/r/1306824972-27067-1-git-send-email-sebastian@breakpoint.ccSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
For UP it's stupid to request an initialized cpumask for the clock event devices. Though we need the mask set even on UP to avoid a horrible ifdeffery especially in the broadcast code. For SMP we can at least try to survive with a warning and set the cpumask of the cpu we're running on. That gives a decent chance to bring the machine up and retrieve the debug info. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Linus Walleij <linus.walleij@linaro.org Cc: Lee Jones <lee.jones@linaro.org> Cc: Russell King - ARM Linux <linux@arm.linux.org.uk> Cc: Stephen Boyd <sboyd@codeaurora.org>
-
- 31 5月, 2011 4 次提交
-
-
由 Peter Zijlstra 提交于
Ben changed the cgroup API in commit f780bdb7 (cgroups: add per-thread subsystem callbacks) in an incompatible way, but forgot to convert the perf cgroup bits. Avoid compile warnings and runtime splats and convert perf too ;-) Acked-by: NBen Blum <bblum@andrew.cmu.edu> Cc: Stephane Eranian <eranian@google.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1306767651.1200.2990.camel@twinsSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
While looking over the code I found that with the ttwu rework the nr_wakeups_migrate test broke since we now switch cpus prior to calling ttwu_stat(), hence the test is always true. Cure this by passing the migration state in wake_flags. Also move the whole test under CONFIG_SMP, its hard to migrate tasks on UP :-) Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-pwwxl7gdqs5676f1d4cx6pj7@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Markus reported that commit 317f3941 ("sched: Move the second half of ttwu() to the remote cpu") caused some accounting funnies on his AMD Phenom II X4, such as weird 'top' results. It turns out that this is due to non-synced TSC and the queued remote wakeups stopped coupeling the two relevant cpu clocks, which leads to wakeups seeing time jumps, which in turn lead to skewed runtime stats. Add an explicit call to sched_clock_cpu() to couple the per-cpu clocks to restore the normal flow of time. Reported-and-tested-by: NMarkus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1306835745.2353.3.camel@twinsSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Commit cc3ce517 (rcu: Start RCU kthreads in TASK_INTERRUPTIBLE state) fudges a sleeping task' state, resulting in the scheduler seeing a TASK_UNINTERRUPTIBLE task going to sleep, but a TASK_INTERRUPTIBLE task waking up. The result is unbalanced load calculation. The problem that patch tried to address is that the RCU threads could stay in UNINTERRUPTIBLE state for quite a while and triggering the hung task detector due to on-demand wake-ups. Cure the problem differently by always giving the tasks at least one wake-up once the CPU is fully up and running, this will kick them out of the initial UNINTERRUPTIBLE state and into the regular INTERRUPTIBLE wait state. [ The alternative would be teaching kthread_create() to start threads as INTERRUPTIBLE but that needs a tad more thought. ] Reported-by: NDamien Wyart <damien.wyart@free.fr> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NPaul E. McKenney <paul.mckenney@linaro.org> Link: http://lkml.kernel.org/r/1306755291.1200.2872.camel@twinsSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 30 5月, 2011 1 次提交
-
-
由 Linus Torvalds 提交于
Thomas Gleixner reports that we now have a boot crash triggered by CONFIG_CPUMASK_OFFSTACK=y: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<c11ae035>] find_next_bit+0x55/0xb0 Call Trace: [<c11addda>] cpumask_any_but+0x2a/0x70 [<c102396b>] flush_tlb_mm+0x2b/0x80 [<c1022705>] pud_populate+0x35/0x50 [<c10227ba>] pgd_alloc+0x9a/0xf0 [<c103a3fc>] mm_init+0xec/0x120 [<c103a7a3>] mm_alloc+0x53/0xd0 which was introduced by commit de03c72c ("mm: convert mm->cpu_vm_cpumask into cpumask_var_t"), and is due to wrong ordering of mm_init() vs mm_init_cpumask Thomas wrote a patch to just fix the ordering of initialization, but I hate the new double allocation in the fork path, so I ended up instead doing some more radical surgery to clean it all up. Reported-by: NThomas Gleixner <tglx@linutronix.de> Reported-by: NIngo Molnar <mingo@elte.hu> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 29 5月, 2011 1 次提交
-
-
由 Tim Chen 提交于
Thanks to the reviews and comments by Rafael, James, Mark and Andi. Here's version 2 of the patch incorporating your comments and also some update to my previous patch comments. I noticed that before entering idle state, the menu idle governor will look up the current pm_qos target value according to the list of qos requests received. This look up currently needs the acquisition of a lock to access the list of qos requests to find the qos target value, slowing down the entrance into idle state due to contention by multiple cpus to access this list. The contention is severe when there are a lot of cpus waking and going into idle. For example, for a simple workload that has 32 pair of processes ping ponging messages to each other, where 64 cpu cores are active in test system, I see the following profile with 37.82% of cpu cycles spent in contention of pm_qos_lock: - 37.82% swapper [kernel.kallsyms] [k] _raw_spin_lock_irqsave - _raw_spin_lock_irqsave - 95.65% pm_qos_request menu_select cpuidle_idle_call - cpu_idle 99.98% start_secondary A better approach will be to cache the updated pm_qos target value so reading it does not require lock acquisition as in the patch below. With this patch the contention for pm_qos_lock is removed and I saw a 2.2X increase in throughput for my message passing workload. cc: stable@kernel.org Signed-off-by: NTim Chen <tim.c.chen@linux.intel.com> Acked-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJames Bottomley <James.Bottomley@suse.de> Acked-by: Nmark gross <markgross@thegnar.org> Signed-off-by: NLen Brown <len.brown@intel.com>
-
- 28 5月, 2011 2 次提交
-
-
由 Paul E. McKenney 提交于
Upon creation, kthreads are in TASK_UNINTERRUPTIBLE state, which can result in softlockup warnings. Because some of RCU's kthreads can legitimately be idle indefinitely, start them in TASK_INTERRUPTIBLE state in order to avoid those warnings. Suggested-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
It is not necessary to use waitqueues for the RCU kthreads because we always know exactly which thread is to be awakened. In addition, wake_up() only issues an actual wakeup when there is a thread waiting on the queue, which was why there was an extra explicit wake_up_process() to get the RCU kthreads started. Eliminating the waitqueues (and wake_up()) in favor of wake_up_process() eliminates the need for the initial wake_up_process() and also shrinks the data structure size a bit. The wakeup logic is placed in a new rcu_wait() macro. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-