- 09 2月, 2011 1 次提交
-
-
由 Thomas Gleixner 提交于
ksoftirqd() calls do_softirq() which switches stacks on several architectures. That makes no sense at all. ksoftirqd's stack is sufficient. Call __do_softirq() directly. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: NDavid Miller <davem@davemloft.net> Cc: Paul Mundt <lethal@linux-sh.org> Reviewed-by: NFrank Rowand <frank.rowand@am.sony.com> LKML-Reference: <alpine.LFD.2.00.1102021704530.31804@localhost6.localdomain6>
-
- 14 1月, 2011 1 次提交
-
-
由 Amerigo Wang 提交于
For arch which needs USE_GENERIC_SMP_HELPERS, it has to select USE_GENERIC_SMP_HELPERS, rather than leaving a choice to user, since they don't provide their own implementions. Also, move on_each_cpu() to kernel/smp.c, it is strange to put it in kernel/softirq.c. For arch which doesn't use USE_GENERIC_SMP_HELPERS, e.g. blackfin, only on_each_cpu() is compiled. Signed-off-by: NAmerigo Wang <amwang@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 1月, 2011 1 次提交
-
-
由 Peter Zijlstra 提交于
Function-scope statics are discouraged because they are easily overlooked and can cause subtle bugs/races due to their global (non-SMP safe) nature. Linus noticed that we did this for sched_param - at minimum make the const. Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: Message-ID: <AANLkTinotRxScOHEb0HgFgSpGPkq_6jKTv5CfvnQM=ee@mail.gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 17 12月, 2010 1 次提交
-
-
由 Christoph Lameter 提交于
__get_cpu_var() can be replaced with this_cpu_read and will then use a single read instruction with implied address calculation to access the correct per cpu instance. However, the address of a per cpu variable passed to __this_cpu_read() cannot be determined (since it's an implied address conversion through segment prefixes). Therefore apply this only to uses of __get_cpu_var where the address of the variable is not used. Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Hugh Dickins <hughd@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Acked-by: NH. Peter Anvin <hpa@zytor.com> Signed-off-by: NChristoph Lameter <cl@linux.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 23 10月, 2010 1 次提交
-
-
由 KOSAKI Motohiro 提交于
Andrew Morton pointed out almost all sched_setscheduler() callers are using fixed parameters and can be converted to static. It reduces runtime memory use a little. Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reported-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NJames Morris <jmorris@namei.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 21 10月, 2010 1 次提交
-
-
由 Thomas Gleixner 提交于
With the addition of trace_softirq_raise() the softirq tracepoint got even more convoluted. Why the tracepoints take two pointers to assign an integer is beyond my comprehension. But adding an extra case which treats the first pointer as an unsigned long when the second pointer is NULL including the back and forth type casting is just horrible. Convert the softirq tracepoints to take a single unsigned int argument for the softirq vector number and fix the call sites. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> LKML-Reference: <alpine.LFD.2.00.1010191428560.6815@localhost6.localdomain6> Acked-by: NPeter Zijlstra <peterz@infradead.org> Acked-by: mathieu.desnoyers@efficios.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org>
-
- 19 10月, 2010 3 次提交
-
-
由 Venkatesh Pallipadi 提交于
When CPU is idle and on first interrupt, irq_enter calls tick_check_idle() to notify interruption from idle. But, there is a problem if this call is done after __irq_enter, as all routines in __irq_enter may find stale time due to yet to be done tick_check_idle. Specifically, trace calls in __irq_enter when they use global clock and also account_system_vtime change in this patch as it wants to use sched_clock_cpu() to do proper irq timing. But, tick_check_idle was moved after __irq_enter intentionally to prevent problem of unneeded ksoftirqd wakeups by the commit ee5f80a9: irq: call __irq_enter() before calling the tick_idle_check Impact: avoid spurious ksoftirqd wakeups Moving tick_check_idle() before __irq_enter and wrapping it with local_bh_enable/disable would solve both the problems. Fixed-by: NYong Zhang <yong.zhang0@gmail.com> Signed-off-by: NVenkatesh Pallipadi <venki@google.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1286237003-12406-9-git-send-email-venki@google.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Venkatesh Pallipadi 提交于
To account softirq time cleanly in scheduler, we need to identify whether softirq is invoked in ksoftirqd context or softirq at hardirq tail context. Add PF_KSOFTIRQD for that purpose. As all PF flag bits are currently taken, create space by moving one of the infrequently used bits (PF_THREAD_BOUND) down in task_struct to be along with some other state fields. Signed-off-by: NVenkatesh Pallipadi <venki@google.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1286237003-12406-4-git-send-email-venki@google.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Venkatesh Pallipadi 提交于
Peter Zijlstra found a bug in the way softirq time is accounted in VIRT_CPU_ACCOUNTING on this thread: http://lkml.indiana.edu/hypermail//linux/kernel/1009.2/01366.html The problem is, softirq processing uses local_bh_disable internally. There is no way, later in the flow, to differentiate between whether softirq is being processed or is it just that bh has been disabled. So, a hardirq when bh is disabled results in time being wrongly accounted as softirq. Looking at the code a bit more, the problem exists in !VIRT_CPU_ACCOUNTING as well. As account_system_time() in normal tick based accouting also uses softirq_count, which will be set even when not in softirq with bh disabled. Peter also suggested solution of using 2*SOFTIRQ_OFFSET as irq count for local_bh_{disable,enable} and using just SOFTIRQ_OFFSET while softirq processing. The patch below does that and adds API in_serving_softirq() which returns whether we are currently processing softirq or not. Also changes one of the usages of softirq_count in net/sched/cls_cgroup.c to in_serving_softirq. Looks like many usages of in_softirq really want in_serving_softirq. Those changes can be made individually on a case by case basis. Signed-off-by: NVenkatesh Pallipadi <venki@google.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1286237003-12406-2-git-send-email-venki@google.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 12 10月, 2010 2 次提交
-
-
由 Thomas Gleixner 提交于
This function should have not been there in the first place. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
sparse irq sets up NR_IRQS_LEGACY irq descriptors and archs then go ahead and allocate more. Use the unused return value of arch_probe_nr_irqs() to let the architecture return the number of early allocations. Fix up all users. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NIngo Molnar <mingo@elte.hu>
-
- 22 9月, 2010 1 次提交
-
-
由 Thomas Gleixner 提交于
No users outside of kernel/softirq.c Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 05 6月, 2010 1 次提交
-
-
由 Akinobu Mita 提交于
The commit 80b5184c ("kernel/: convert cpu notifier to return encapsulate errno value") changed the return value of cpu notifier callbacks. Those callbacks don't return NOTIFY_BAD on failures anymore. But there are a few callbacks which are called directly at init time and checking the return value. I forgot to change BUG_ON checking by the direct callers in the commit. Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 28 5月, 2010 1 次提交
-
-
由 Akinobu Mita 提交于
By the previous modification, the cpu notifier can return encapsulate errno value. This converts the cpu notifiers for kernel/*.c Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 5月, 2010 1 次提交
-
-
由 Paul E. McKenney 提交于
The addition of preemptible RCU to treercu resulted in a bit of confusion and inefficiency surrounding the handling of context switches for RCU-sched and for RCU-preempt. For RCU-sched, a context switch is a quiescent state, pure and simple, just like it always has been. For RCU-preempt, a context switch is in no way a quiescent state, but special handling is required when a task blocks in an RCU read-side critical section. However, the callout from the scheduler and the outer loop in ksoftirqd still calls something named rcu_sched_qs(), whose name is no longer accurate. Furthermore, when rcu_check_callbacks() notes an RCU-sched quiescent state, it ends up unnecessarily (though harmlessly, aside from the performance hit) enqueuing the current task if it happens to be running in an RCU-preempt read-side critical section. This not only increases the maximum latency of scheduler_tick(), it also needlessly increases the overhead of the next outermost rcu_read_unlock() invocation. This patch addresses this situation by separating the notion of RCU's context-switch handling from that of RCU-sched's quiescent states. The context-switch handling is covered by rcu_note_context_switch() in general and by rcu_preempt_note_context_switch() for preemptible RCU. This permits rcu_sched_qs() to handle quiescent states and only quiescent states. It also reduces the maximum latency of scheduler_tick(), though probably by much less than a microsecond. Finally, it means that tasks within preemptible-RCU read-side critical sections avoid incurring the overhead of queuing unless there really is a context switch. Suggested-by: NLai Jiangshan <laijs@cn.fujitsu.com> Acked-by: NLai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org>
-
- 04 2月, 2010 1 次提交
-
-
由 Peter Zijlstra 提交于
hrtimers callbacks are always done from hardirq context, either the jiffy tick interrupt or the hrtimer device interrupt. [ there is currently one exception that can still call a hrtimer callback from softirq, but even in that case this will still work correctly. ] Reported-by: NWei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Yury Polyanskiy <ypolyans@princeton.edu> Tested-by: NWei Yongjun <yjwei@cn.fujitsu.com> Acked-by: NDavid S. Miller <davem@davemloft.net> LKML-Reference: <1265120401.24455.306.camel@laptop> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 02 11月, 2009 1 次提交
-
-
由 Lai Jiangshan 提交于
Currently, rcu_irq_exit() is invoked only for CONFIG_NO_HZ, while rcu_irq_enter() is invoked unconditionally. This patch moves rcu_irq_exit() out from under CONFIG_NO_HZ so that the calls are balanced. This patch has no effect on the behavior of the kernel because both rcu_irq_enter() and rcu_irq_exit() are empty for !CONFIG_NO_HZ, but the code is easier to understand if the calls are obviously balanced in all cases. Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <12567428891605-git-send-email-> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 29 10月, 2009 1 次提交
-
-
由 Tejun Heo 提交于
This patch updates percpu related symbols under kernel/ and mm/ such that percpu symbols are unique and don't clash with local symbols. This serves two purposes of decreasing the possibility of global percpu symbol collision and allowing dropping per_cpu__ prefix from percpu symbols. * kernel/lockdep.c: s/lock_stats/cpu_lock_stats/ * kernel/sched.c: s/init_rq_rt/init_rt_rq_var/ (any better idea?) s/sched_group_cpus/sched_groups/ * kernel/softirq.c: s/ksoftirqd/run_ksoftirqd/a * kernel/softlockup.c: s/(*)_timestamp/softlockup_\1_ts/ s/watchdog_task/softlockup_watchdog/ s/timestamp/ts/ for local variables * kernel/time/timer_stats: s/lookup_lock/tstats_lookup_lock/ * mm/slab.c: s/reap_work/slab_reap_work/ s/reap_node/slab_reap_node/ * mm/vmstat.c: local variable changed to avoid collision with vmstat_work Partly based on Rusty Russell's "alloc_percpu: rename percpu vars which cause name clashes" patch. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: N(slab/vmstat) Christoph Lameter <cl@linux-foundation.org> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Nick Piggin <npiggin@suse.de>
-
- 18 9月, 2009 1 次提交
-
-
由 Li Zefan 提交于
With BLOCK_IOPOLL_SOFTIRQ added, softirq_to_name[] and show_softirq_name() needs to be updated. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> LKML-Reference: <4AB20398.8070209@cn.fujitsu.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 23 8月, 2009 1 次提交
-
-
由 Paul E. McKenney 提交于
Make RCU-sched, RCU-bh, and RCU-preempt be underlying implementations, with "RCU" defined in terms of one of the three. Update the outdated rcu_qsctr_inc() names, as these functions no longer increment anything. Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: akpm@linux-foundation.org Cc: mathieu.desnoyers@polymtl.ca Cc: josht@linux.vnet.ibm.com Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org LKML-Reference: <12509746132696-git-send-email-> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 22 7月, 2009 1 次提交
-
-
由 Peter Zijlstra 提交于
commit ca109491 (hrtimer: removing all ur callback modes) moved all hrtimer callbacks into hard interrupt context when high resolution timers are active. That breaks code which relied on the assumption that the callback happens in softirq context. Provide a generic infrastructure which combines tasklets and hrtimers together to provide an in-softirq hrtimer experience. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: torvalds@linux-foundation.org Cc: kaber@trash.net Cc: David Miller <davem@davemloft.net> LKML-Reference: <1248265724.27058.1366.camel@twins> Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 19 6月, 2009 1 次提交
-
-
由 Keika Kobayashi 提交于
Statistics for softirq doesn't exist. It will be helpful like statistics for interrupts. This patch introduces counting the number of softirq, which will be exported in /proc/softirqs. When softirq handler consumes much CPU time, /proc/stat is like the following. $ while :; do cat /proc/stat | head -n1 ; sleep 10 ; done cpu 88 0 408 739665 583 28 2 0 0 cpu 450 0 1090 740970 594 28 1294 0 0 ^^^^ softirq In such a situation, /proc/softirqs shows us which softirq handler is invoked. We can see the increase rate of softirqs. <before> $ cat /proc/softirqs CPU0 CPU1 CPU2 CPU3 HI 0 0 0 0 TIMER 462850 462805 462782 462718 NET_TX 0 0 0 365 NET_RX 2472 2 2 40 BLOCK 0 0 381 1164 TASKLET 0 0 0 224 SCHED 462654 462689 462698 462427 RCU 3046 2423 3367 3173 <after> $ cat /proc/softirqs CPU0 CPU1 CPU2 CPU3 HI 0 0 0 0 TIMER 463361 465077 465056 464991 NET_TX 53 0 1 365 NET_RX 3757 2 2 40 BLOCK 0 0 398 1170 TASKLET 0 0 0 224 SCHED 463074 464318 464612 463330 RCU 3505 2948 3947 3673 When CPU TIME of softirq is high, the rates of increase is the following. TIMER : 220/sec : CPU1-3 NET_TX : 5/sec : CPU0 NET_RX : 120/sec : CPU0 SCHED : 40-200/sec : all CPU RCU : 45-58/sec : all CPU The rates of increase in an idle mode is the following. TIMER : 250/sec SCHED : 250/sec RCU : 2/sec It seems many softirqs for receiving packets and rcu are invoked. This gives us help for checking system. Signed-off-by: NKeika Kobayashi <kobayashi.kk@ncos.nec.co.jp> Reviewed-by: NHiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Eric Dumazet <dada1@cosmosbay.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 6月, 2009 1 次提交
-
-
由 Vegard Nossum 提交于
Rationale: kmemcheck needs to be able to schedule a tasklet without touching any dynamically allocated memory _at_ _all_ (since that would lead to a recursive page fault). This tasklet is used for writing the error reports to the kernel log. The new scheduling function avoids touching any other tasklets by inserting the new tasklist as the head of the "tasklet_hi" list instead of on the tail. Also don't wake up the softirq thread lest the scheduler access some tracked memory and we go down with a recursive page fault. In this case, we'd better just wait for the maximum time of 1/HZ for the message to appear. Signed-off-by: NVegard Nossum <vegard.nossum@gmail.com>
-
- 29 4月, 2009 1 次提交
-
-
由 Heiko Carstens 提交于
"tracing: create automated trace defines" causes this compile error on s390, as reported by Sachin Sant against linux-next: kernel/built-in.o: In function `__do_softirq': (.text+0x1c680): undefined reference to `__tracepoint_softirq_entry' This happens because the definitions of the softirq tracepoints were moved from kernel/softirq.c to kernel/irq/handle.c. Since s390 doesn't support generic hardirqs handle.c doesn't get compiled and the definitions are missing. So move the tracepoints to softirq.c again. [ Impact: fix build failure on s390 ] Reported-by: NSachin Sant <sachinp@in.ibm.com> Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: fweisbec@gmail.com LKML-Reference: <20090429135139.5fac79b8@osiris.boeblingen.de.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 28 4月, 2009 1 次提交
-
-
由 Yinghai Lu 提交于
This simplifies the node awareness of the code. All our allocators only deal with a NUMA node ID locality not with CPU ids anyway - so there's no need to maintain (and transform) a CPU id all across the IRq layer. v2: keep move_irq_desc related [ Impact: cleanup, prepare IRQ code to be NUMA-aware ] Signed-off-by: NYinghai Lu <yinghai@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@goop.org> LKML-Reference: <49F65536.2020300@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 17 4月, 2009 1 次提交
-
-
由 H Hartley Sweeten 提交于
Fix sparse warning in kernel/softirq.c. warning: do-while statement is not a compound statement Signed-off-by: NH Hartley Sweeten <hsweeten@visionengravers.com> LKML-Reference: <BD79186B4FD85F4B8E60E381CAEE1909015F9033@mi8nycmail19.Mi8.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 15 4月, 2009 2 次提交
-
-
由 Steven Rostedt 提交于
Impact: clean up Create a sub directory in include/trace called events to keep the trace point headers in their own separate directory. Only headers that declare trace points should be defined in this directory. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Zhao Lei <zhaolei@cn.fujitsu.com> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Steven Rostedt 提交于
This patch lowers the number of places a developer must modify to add new tracepoints. The current method to add a new tracepoint into an existing system is to write the trace point macro in the trace header with one of the macros TRACE_EVENT, TRACE_FORMAT or DECLARE_TRACE, then they must add the same named item into the C file with the macro DEFINE_TRACE(name) and then add the trace point. This change cuts out the needing to add the DEFINE_TRACE(name). Every file that uses the tracepoint must still include the trace/<type>.h file, but the one C file must also add a define before the including of that file. #define CREATE_TRACE_POINTS #include <trace/mytrace.h> This will cause the trace/mytrace.h file to also produce the C code necessary to implement the trace point. Note, if more than one trace/<type>.h is used to create the C code it is best to list them all together. #define CREATE_TRACE_POINTS #include <trace/foo.h> #include <trace/bar.h> #include <trace/fido.h> Thanks to Mathieu Desnoyers and Christoph Hellwig for coming up with the cleaner solution of the define above the includes over my first design to have the C code include a "special" header. This patch converts sched, irq and lockdep and skb to use this new method. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Zhao Lei <zhaolei@cn.fujitsu.com> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 31 3月, 2009 1 次提交
-
-
由 Peter Zijlstra 提交于
It appears I inadvertly introduced rq->lock recursion to the hrtimer_start() path when I delegated running already expired timers to softirq context. This patch fixes it by introducing a __hrtimer_start_range_ns() method that will not use raise_softirq_irqoff() but __raise_softirq_irqoff() which avoids the wakeup. It then also changes schedule() to check for pending softirqs and do the wakeup then, I'm not quite sure I like this last bit, nor am I convinced its really needed. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org LKML-Reference: <20090313112301.096138802@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 13 3月, 2009 4 次提交
-
-
由 Steven Rostedt 提交于
Impact: clean up It is redundant to have 'SOFTIRQ' in the softirq names. Reported-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
-
由 Jason Baron 提交于
Introduce softirq entry/exit tracepoints. These are useful for augmenting existing tracers, and to figure out softirq frequencies and timings. [ s/irq_softirq_/softirq_/ for trace point names and Fixed printf format in TRACE_FORMAT macro - Steven Rostedt ] LKML-Reference: <20090312183603.GC3352@redhat.com> Signed-off-by: NJason Baron <jbaron@redhat.com> Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
-
由 Jason Baron 提交于
Create a 'softirq_to_name' array, which is indexed by softirq #, so that we can easily convert between the softirq index # and its name, in order to get more meaningful output messages. LKML-Reference: <20090312183336.GB3352@redhat.com> Signed-off-by: NJason Baron <jbaron@redhat.com> Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
-
由 Ingo Molnar 提交于
Impact: cleanup The naming clashes with upcoming softirq tracepoints, so rename the APIs to lockdep_*(). Requested-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 05 3月, 2009 1 次提交
-
-
由 Eric Dumazet 提交于
If a machine is flooded by network frames, a cpu can loop 100% of its time inside ksoftirqd() without calling schedule(). This can delay RCU grace period to insane values. Adding rcu_qsctr_inc() call in ksoftirqd() solves this problem. Paul: "This regression was a result of the recent change from "schedule()" to "cond_resched()", which got rid of that quiescent state in the common case where a reschedule is not needed". Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Reviewed-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 25 2月, 2009 1 次提交
-
-
由 Peter Zijlstra 提交于
Oleg noticed that we don't strictly need CSD_FLAG_WAIT, rework the code so that we can use CSD_FLAG_LOCK for both purposes. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nick Piggin <npiggin@suse.de> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 23 1月, 2009 1 次提交
-
-
由 Steven Rostedt 提交于
Impact: fix to preempt trace triggering lockdep check_flag failure In local_bh_disable, the use of add_preempt_count causes the preempt tracer to start recording the time preemption is off. But because it already modified the preempt_count to show softirqs disabled, and before it called the lockdep code to handle this, it causes a state that lockdep can not handle. The preempt tracer will reset the ring buffer on start of a trace, and the ring buffer reset code does a spin_lock_irqsave. This calls into lockdep and lockdep will fail when it detects the invalid state of having softirqs disabled but the internal current->softirqs_enabled is still set. The fix is to manually add the SOFTIRQ_OFFSET to preempt count and call the preempt tracer code outside the lockdep critical area. Thanks to Peter Zijlstra for suggesting this solution. Signed-off-by: NSteven Rostedt <srostedt@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 13 1月, 2009 1 次提交
-
-
由 Yinghai Lu 提交于
Impact: save RAM with large NR_CPUS, get smaller nr_irqs Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NMike Travis <travis@sgi.com>
-
- 01 1月, 2009 1 次提交
-
-
由 Rusty Russell 提交于
Impact: Remove obsolete API usage any_online_cpu() is a good name, but it takes a cpumask_t, not a pointer. There are several places where any_online_cpu() doesn't really want a mask arg at all. Replace all callers with cpumask_any() and cpumask_any_and(). Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NMike Travis <travis@sgi.com>
-
- 29 12月, 2008 1 次提交
-
-
由 Yinghai Lu 提交于
GCC has a bug with __weak alias functions: if the functions are in the same compilation unit as their call site, GCC can decide to inline them - and thus rob the linker of the opportunity to override the weak alias with the real thing. So move all the IRQ handling related __weak symbols to kernel/irq/chip.c. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 19 12月, 2008 1 次提交
-
-
由 Paul E. McKenney 提交于
This patch fixes a long-standing performance bug in classic RCU that results in massive internal-to-RCU lock contention on systems with more than a few hundred CPUs. Although this patch creates a separate flavor of RCU for ease of review and patch maintenance, it is intended to replace classic RCU. This patch still handles stress better than does mainline, so I am still calling it ready for inclusion. This patch is against the -tip tree. Nevertheless, experience on an actual 1000+ CPU machine would still be most welcome. Most of the changes noted below were found while creating an rcutiny (which should permit ejecting the current rcuclassic) and while doing detailed line-by-line documentation. Updates from v9 (http://lkml.org/lkml/2008/12/2/334): o Fixes from remainder of line-by-line code walkthrough, including comment spelling, initialization, undesirable narrowing due to type conversion, removing redundant memory barriers, removing redundant local-variable initialization, and removing redundant local variables. I do not believe that any of these fixes address the CPU-hotplug issues that Andi Kleen was seeing, but please do give it a whirl in case the machine is smarter than I am. A writeup from the walkthrough may be found at the following URL, in case you are suffering from terminal insomnia or masochism: http://www.kernel.org/pub/linux/kernel/people/paulmck/tmp/rcutree-walkthrough.2008.12.16a.pdf o Made rcutree tracing use seq_file, as suggested some time ago by Lai Jiangshan. o Added a .csv variant of the rcudata debugfs trace file, to allow people having thousands of CPUs to drop the data into a spreadsheet. Tested with oocalc and gnumeric. Updated documentation to suit. Updates from v8 (http://lkml.org/lkml/2008/11/15/139): o Fix a theoretical race between grace-period initialization and force_quiescent_state() that could occur if more than three jiffies were required to carry out the grace-period initialization. Which it might, if you had enough CPUs. o Apply Ingo's printk-standardization patch. o Substitute local variables for repeated accesses to global variables. o Fix comment misspellings and redundant (but harmless) increments of ->n_rcu_pending (this latter after having explicitly added it). o Apply checkpatch fixes. Updates from v7 (http://lkml.org/lkml/2008/10/10/291): o Fixed a number of problems noted by Gautham Shenoy, including the cpu-stall-detection bug that he was having difficulty convincing me was real. ;-) o Changed cpu-stall detection to wait for ten seconds rather than three in order to reduce false positive, as suggested by Ingo Molnar. o Produced a design document (http://lwn.net/Articles/305782/). The act of writing this document uncovered a number of both theoretical and "here and now" bugs as noted below. o Fix dynticks_nesting accounting confusion, simplify WARN_ON() condition, fix kerneldoc comments, and add memory barriers in dynticks interface functions. o Add more data to tracing. o Remove unused "rcu_barrier" field from rcu_data structure. o Count calls to rcu_pending() from scheduling-clock interrupt to use as a surrogate timebase should jiffies stop counting. o Fix a theoretical race between force_quiescent_state() and grace-period initialization. Yes, initialization does have to go on for some jiffies for this race to occur, but given enough CPUs... Updates from v6 (http://lkml.org/lkml/2008/9/23/448): o Fix a number of checkpatch.pl complaints. o Apply review comments from Ingo Molnar and Lai Jiangshan on the stall-detection code. o Fix several bugs in !CONFIG_SMP builds. o Fix a misspelled config-parameter name so that RCU now announces at boot time if stall detection is configured. o Run tests on numerous combinations of configurations parameters, which after the fixes above, now build and run correctly. Updates from v5 (http://lkml.org/lkml/2008/9/15/92, bad subject line): o Fix a compiler error in the !CONFIG_FANOUT_EXACT case (blew a changeset some time ago, and finally got around to retesting this option). o Fix some tracing bugs in rcupreempt that caused incorrect totals to be printed. o I now test with a more brutal random-selection online/offline script (attached). Probably more brutal than it needs to be on the people reading it as well, but so it goes. o A number of optimizations and usability improvements: o Make rcu_pending() ignore the grace-period timeout when there is no grace period in progress. o Make force_quiescent_state() avoid going for a global lock in the case where there is no grace period in progress. o Rearrange struct fields to improve struct layout. o Make call_rcu() initiate a grace period if RCU was idle, rather than waiting for the next scheduling clock interrupt. o Invoke rcu_irq_enter() and rcu_irq_exit() only when idle, as suggested by Andi Kleen. I still don't completely trust this change, and might back it out. o Make CONFIG_RCU_TRACE be the single config variable manipulated for all forms of RCU, instead of the prior confusion. o Document tracing files and formats for both rcupreempt and rcutree. Updates from v4 for those missing v5 given its bad subject line: o Separated dynticks interface so that NMIs and irqs call separate functions, greatly simplifying it. In particular, this code no longer requires a proof of correctness. ;-) o Separated dynticks state out into its own per-CPU structure, avoiding the duplicated accounting. o The case where a dynticks-idle CPU runs an irq handler that invokes call_rcu() is now correctly handled, forcing that CPU out of dynticks-idle mode. o Review comments have been applied (thank you all!!!). For but one example, fixed the dynticks-ordering issue that Manfred pointed out, saving me much debugging. ;-) o Adjusted rcuclassic and rcupreempt to handle dynticks changes. Attached is an updated patch to Classic RCU that applies a hierarchy, greatly reducing the contention on the top-level lock for large machines. This passes 10-hour concurrent rcutorture and online-offline testing on 128-CPU ppc64 without dynticks enabled, and exposes some timekeeping bugs in presence of dynticks (exciting working on a system where "sleep 1" hangs until interrupted...), which were fixed in the 2.6.27 kernel. It is getting more reliable than mainline by some measures, so the next version will be against -tip for inclusion. See also Manfred Spraul's recent patches (or his earlier work from 2004 at http://marc.info/?l=linux-kernel&m=108546384711797&w=2). We will converge onto a common patch in the fullness of time, but are currently exploring different regions of the design space. That said, I have already gratefully stolen quite a few of Manfred's ideas. This patch provides CONFIG_RCU_FANOUT, which controls the bushiness of the RCU hierarchy. Defaults to 32 on 32-bit machines and 64 on 64-bit machines. If CONFIG_NR_CPUS is less than CONFIG_RCU_FANOUT, there is no hierarchy. By default, the RCU initialization code will adjust CONFIG_RCU_FANOUT to balance the hierarchy, so strongly NUMA architectures may choose to set CONFIG_RCU_FANOUT_EXACT to disable this balancing, allowing the hierarchy to be exactly aligned to the underlying hardware. Up to two levels of hierarchy are permitted (in addition to the root node), allowing up to 16,384 CPUs on 32-bit systems and up to 262,144 CPUs on 64-bit systems. I just know that I am going to regret saying this, but this seems more than sufficient for the foreseeable future. (Some architectures might wish to set CONFIG_RCU_FANOUT=4, which would limit such architectures to 64 CPUs. If this becomes a real problem, additional levels can be added, but I doubt that it will make a significant difference on real hardware.) In the common case, a given CPU will manipulate its private rcu_data structure and the rcu_node structure that it shares with its immediate neighbors. This can reduce both lock and memory contention by multiple orders of magnitude, which should eliminate the need for the strange manipulations that are reported to be required when running Linux on very large systems. Some shortcomings: o More bugs will probably surface as a result of an ongoing line-by-line code inspection. Patches will be provided as required. o There are probably hangs, rcutorture failures, &c. Seems quite stable on a 128-CPU machine, but that is kind of small compared to 4096 CPUs. However, seems to do better than mainline. Patches will be provided as required. o The memory footprint of this version is several KB larger than rcuclassic. A separate UP-only rcutiny patch will be provided, which will reduce the memory footprint significantly, even compared to the old rcuclassic. One such patch passes light testing, and has a memory footprint smaller even than rcuclassic. Initial reaction from various embedded guys was "it is not worth it", so am putting it aside. Credits: o Manfred Spraul for ideas, review comments, and bugs spotted, as well as some good friendly competition. ;-) o Josh Triplett, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers, Lai Jiangshan, Andi Kleen, Andy Whitcroft, and Andrew Morton for reviews and comments. o Thomas Gleixner for much-needed help with some timer issues (see patches below). o Jon M. Tollefson, Tim Pepper, Andrew Theurer, Jose R. Santos, Andy Whitcroft, Darrick Wong, Nishanth Aravamudan, Anton Blanchard, Dave Kleikamp, and Nathan Lynch for keeping machines alive despite my heavy abuse^Wtesting. Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-