- 06 2月, 2007 1 次提交
-
-
由 Horms 提交于
This moves the ia64 implementation of machine_shutdown() from machine_kexec.c to process.c, which is in keeping with the implelmentation on other architectures, and seems like a much more appropriate home for it. Signed-off-by: NSimon Horman <horms@verge.net.au> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
- 23 12月, 2006 1 次提交
-
-
由 Ingo Molnar 提交于
Fernando Lopez-Lezcano reported frequent scheduling latencies and audio xruns starting at the 2.6.18-rt kernel, and those problems persisted all until current -rt kernels. The latencies were serious and unjustified by system load, often in the milliseconds range. After a patient and heroic multi-month effort of Fernando, where he tested dozens of kernels, tried various configs, boot options, test-patches of mine and provided latency traces of those incidents, the following 'smoking gun' trace was captured by him: _------=> CPU# / _-----=> irqs-off | / _----=> need-resched || / _---=> hardirq/softirq ||| / _--=> preempt-depth |||| / ||||| delay cmd pid ||||| time | caller \ / ||||| \ | / IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup (try_to_wake_up) IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup <<...>-5856> (37 0) IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup (c01262ba 0 0) IRQ_19-1479 1D..1 0us : resched_task (try_to_wake_up) IRQ_19-1479 1D..1 0us : __spin_unlock_irqrestore (try_to_wake_up) ... <idle>-0 1...1 11us!: default_idle (cpu_idle) ... <idle>-0 0Dn.1 602us : smp_apic_timer_interrupt (c0103baf 1 0) ... <...>-5856 0D..2 618us : __switch_to (__schedule) <...>-5856 0D..2 618us : __schedule <<idle>-0> (20 162) <...>-5856 0D..2 619us : __spin_unlock_irq (__schedule) <...>-5856 0...1 619us : trace_stop_sched_switched (__schedule) <...>-5856 0D..1 619us : trace_stop_sched_switched <<...>-5856> (37 0) what is visible in this trace is that CPU#1 ran try_to_wake_up() for PID:5856, it placed PID:5856 on CPU#0's runqueue and ran resched_task() for CPU#0. But it decided to not send an IPI that no CPU - due to TS_POLLING. But CPU#0 never woke up after its NEED_RESCHED bit was set, and only rescheduled to PID:5856 upon the next lapic timer IRQ. The result was a 600+ usecs latency and a missed wakeup! the bug turned out to be an idle-wakeup bug introduced into the mainline kernel this summer via an optimization in the x86_64 tree: commit 495ab9c0 Author: Andi Kleen <ak@suse.de> Date: Mon Jun 26 13:59:11 2006 +0200 [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status During some profiling I noticed that default_idle causes a lot of memory traffic. I think that is caused by the atomic operations to clear/set the polling flag in thread_info. There is actually no reason to make this atomic - only the idle thread does it to itself, other CPUs only read it. So I moved it into ti->status. the problem is this type of change: if (!hlt_counter && boot_cpu_data.hlt_works_ok) { - clear_thread_flag(TIF_POLLING_NRFLAG); + current_thread_info()->status &= ~TS_POLLING; smp_mb__after_clear_bit(); while (!need_resched()) { local_irq_disable(); this changes clear_thread_flag() to an explicit clearing of TS_POLLING. clear_thread_flag() is defined as: clear_bit(flag, &ti->flags); and clear_bit() is a LOCK-ed atomic instruction on all x86 platforms: static inline void clear_bit(int nr, volatile unsigned long * addr) { __asm__ __volatile__( LOCK_PREFIX "btrl %1,%0" hence smp_mb__after_clear_bit() is defined as a simple compile barrier: #define smp_mb__after_clear_bit() barrier() but the explicit TS_POLLING clearing introduced by the patch: + current_thread_info()->status &= ~TS_POLLING; is not an atomic op! So the clearing of the TS_POLLING bit is freely reorderable with the reading of the NEED_RESCHED bit - and both now reside in different memory addresses. CPU idle wakeup very much depends on ordered memory ops, the clearing of the TS_POLLING flag must always be done before we test need_resched() and hit the idle instruction(s). [Symmetrically, the wakeup code needs to set NEED_RESCHED before it tests the TS_POLLING flag, so memory ordering is paramount.] Fernando's dual-core Athlon64 system has a sufficiently advanced memory ordering model so that it triggered this scenario very often. ( And it also turned out that the reason why these latencies never triggered on my testsystems is that i routinely use idle=poll, which was the only idle variant not affected by this bug. ) The fix is to change the smp_mb__after_clear_bit() to an smp_mb(), to act as an absolute barrier between the TS_POLLING write and the NEED_RESCHED read. This affects almost all idling methods (default, ACPI, APM), on all 3 x86 architectures: i386, x86_64, ia64. Signed-off-by: NIngo Molnar <mingo@elte.hu> Tested-by: NFernando Lopez-Lezcano <nando@ccrma.Stanford.EDU> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 02 10月, 2006 1 次提交
-
-
由 Arnd Bergmann 提交于
The last in-kernel user of errno is gone, so we should remove the definition and everything referring to it. This also removes the now-unused lib/execve.c file that was introduced earlier. Also remove every trace of __KERNEL_SYSCALLS__ that still remained in the kernel. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Cc: Andi Kleen <ak@muc.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Hirokazu Takata <takata.hirokazu@renesas.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Chris Zankel <chris@zankel.net> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 01 7月, 2006 1 次提交
-
-
由 Jörn Engel 提交于
Signed-off-by: NJörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: NAdrian Bunk <bunk@stusta.de>
-
- 27 6月, 2006 1 次提交
-
-
由 Andi Kleen 提交于
During some profiling I noticed that default_idle causes a lot of memory traffic. I think that is caused by the atomic operations to clear/set the polling flag in thread_info. There is actually no reason to make this atomic - only the idle thread does it to itself, other CPUs only read it. So I moved it into ti->status. Converted i386/x86-64/ia64 for now because that was the easiest way to fix ACPI which also manipulates these flags in its idle function. Cc: Nick Piggin <npiggin@novell.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Len Brown <len.brown@intel.com> Signed-off-by: NAndi Kleen <ak@suse.de> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 27 3月, 2006 1 次提交
-
-
由 bibo mao 提交于
When kretprobe probes the schedule() function, if the probed process exits then schedule() will never return, so some kretprobe instances will never be recycled. In this patch the parent process will recycle retprobe instances of the probed function and there will be no memory leak of kretprobe instances. Signed-off-by: Nbibo mao <bibo.mao@intel.com> Cc: Masami Hiramatsu <hiramatu@sdl.hitachi.co.jp> Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 13 1月, 2006 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 15 12月, 2005 1 次提交
-
-
由 Robin Holt 提交于
Missed this when fixing the SET_PERSONALITY change. Signed-off-by: NRobin Holt <holt@sgi.com> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
- 07 12月, 2005 1 次提交
-
-
由 Robin Holt 提交于
We have a customer application which trips a bug. The problem arises when a driver attempts to call do_munmap on an area which is mapped, but because current->thread.task_size has been set to 0xC0000000, the call to do_munmap fails thinking it is an unmap beyond the user's address space. The comment in fs/binfmt_elf.c in load_elf_library() before the call to SET_PERSONALITY() indicates that task_size must not be changed for the running application until flush_thread, but is for ia64 executing ia32 binaries. This patch moves the setting of task_size from SET_PERSONALITY() to flush_thread() as indicated. The customer application no longer is able to trip the bug. Signed-off-by: NRobin Holt <holt@sgi.com> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
- 24 11月, 2005 1 次提交
-
-
由 Jim Keniston 提交于
Fix a bug in kprobes that can cause an Oops or even a crash when a return probe is installed on one of the following functions: sys_execve, do_execve, load_*_binary, flush_old_exec, or flush_thread. The fix is to remove the call to kprobe_flush_task() in flush_thread(). This fix has been tested on all architectures for which the return-probes feature has been implemented (i386, x86_64, ppc64, ia64). Please apply. BACKGROUND Up to now, we have called kprobe_flush_task() under two situations: when a task exits, and when it execs. Flushing kretprobe_instances on exit is correct because (a) do_exit() doesn't return, and (b) one or more return-probed functions may be active when a task calls do_exit(). Neither is the case for sys_execve() and its callees. Initially, the mistaken call to kprobe_flush_task() on exec was harmless because we put the "real" return address of each active probed function back in the stack, just to be safe, when we recycled its kretprobe_instance. When support for ppc64 and ia64 was added, this safety measure couldn't be employed, and was eventually dropped even for i386 and x86_64. sys_execve() and its callees were informally blacklisted for return probes until this fix was developed. Acked-by: NPrasanna S Panchamukhi <prasanna@in.ibm.com> Signed-off-by: NJim Keniston <jkenisto@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 16 11月, 2005 1 次提交
-
-
由 Chen, Kenneth W 提交于
Our performance validation on 2.6.15-rc1 caught a disastrous performance regression on ia64 with netperf (-98%) and volanomark (-58%) compares to previous kernel version 2.6.14-git7. See the following chart (result group 1 & 2). http://kernel-perf.sourceforge.net/results.machine_id=26.html We have root caused it to commit 64c7c8f8 This changeset broke the ia64 task resched notification. In sched.c:resched_task(), a reschedule IPI is conditioned upon TIF_POLLING_NRFLAG. However, the above changeset unconditionally set the polling thread flag for idle tasks regardless whether pal_halt_light is in use or not. As a result, resched IPI is not sent from resched_task(). And since the default behavior on ia64 is to use pal_halt_light, we end up delaying the rescheduling task until next timer tick, and thus cause the performance regression. This fixes the performance bug. I'm glad our performance suite is turning up bad performance bug like this in time. Signed-off-by: NKen Chen <kenneth.w.chen@intel.com> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 09 11月, 2005 2 次提交
-
-
由 Nick Piggin 提交于
Make some changes to the NEED_RESCHED and POLLING_NRFLAG to reduce confusion, and make their semantics rigid. Improves efficiency of resched_task and some cpu_idle routines. * In resched_task: - TIF_NEED_RESCHED is only cleared with the task's runqueue lock held, and as we hold it during resched_task, then there is no need for an atomic test and set there. The only other time this should be set is when the task's quantum expires, in the timer interrupt - this is protected against because the rq lock is irq-safe. - If TIF_NEED_RESCHED is set, then we don't need to do anything. It won't get unset until the task get's schedule()d off. - If we are running on the same CPU as the task we resched, then set TIF_NEED_RESCHED and no further action is required. - If we are running on another CPU, and TIF_POLLING_NRFLAG is *not* set after TIF_NEED_RESCHED has been set, then we need to send an IPI. Using these rules, we are able to remove the test and set operation in resched_task, and make clear the previously vague semantics of POLLING_NRFLAG. * In idle routines: - Enter cpu_idle with preempt disabled. When the need_resched() condition becomes true, explicitly call schedule(). This makes things a bit clearer (IMO), but haven't updated all architectures yet. - Many do a test and clear of TIF_NEED_RESCHED for some reason. According to the resched_task rules, this isn't needed (and actually breaks the assumption that TIF_NEED_RESCHED is only cleared with the runqueue lock held). So remove that. Generally one less locked memory op when switching to the idle thread. - Many idle routines clear TIF_POLLING_NRFLAG, and only set it in the inner most polling idle loops. The above resched_task semantics allow it to be set until before the last time need_resched() is checked before going into a halt requiring interrupt wakeup. Many idle routines simply never enter such a halt, and so POLLING_NRFLAG can be always left set, completely eliminating resched IPIs when rescheduling the idle task. POLLING_NRFLAG width can be increased, to reduce the chance of resched IPIs. Signed-off-by: NNick Piggin <npiggin@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Con Kolivas <kernel@kolivas.org> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Nick Piggin 提交于
Run idle threads with preempt disabled. Also corrected a bugs in arm26's cpu_idle (make it actually call schedule()). How did it ever work before? Might fix the CPU hotplugging hang which Nigel Cunningham noted. We think the bug hits if the idle thread is preempted after checking need_resched() and before going to sleep, then the CPU offlined. After calling stop_machine_run, the CPU eventually returns from preemption and into the idle thread and goes to sleep. The CPU will continue executing previous idle and have no chance to call play_dead. By disabling preemption until we are ready to explicitly schedule, this bug is fixed and the idle threads generally become more robust. From: alexs <ashepard@u.washington.edu> PPC build fix From: Yoichi Yuasa <yuasa@hh.iij4u.or.jp> MIPS build fix Signed-off-by: NNick Piggin <npiggin@suse.de> Signed-off-by: NYoichi Yuasa <yuasa@hh.iij4u.or.jp> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 08 11月, 2005 1 次提交
-
-
由 Keith Owens 提交于
notify_die() added for MCA_{MONARCH,SLAVE,RENDEZVOUS}_{ENTER,PROCESS,LEAVE} and INIT_{MONARCH,SLAVE}_{ENTER,PROCESS,LEAVE}. We need multiple notification points for these events because they can take many seconds to run which has nasty effects on the behaviour of the rest of the system. DIE_SS replaced by a generic DIE_FAULT which checks the vector number, to allow interception of faults other than SS. DIE_MACHINE_{HALT,RESTART} added to allow last minute close down processing, especially when the halt/restart routines are called from error handlers. DIE_OOPS added. The check for kprobe's break numbers has been moved from traps.c to kprobes.c, allowing DIE_BREAK to be used for any additional break numbers, i.e. it is no longer kprobes specific. Hooks for kernel debuggers and kernel dumpers added, ENTER and LEAVE. Both of these disable the system for long periods which impact on watchdogs and heartbeat systems in general. More patches to come that use these events to reset watchdogs and heartbeats. unregister_die_notifier() added and both routines exported. Requested by Dean Nelson. Lock removed from {un,}register_die_notifier. notifier_chain_register() already takes a lock. Also the generic notifier chain locking is being reworked to distinguish between callbacks that can block and those that cannot, the lock in {un,}register_die_notifier would interfere with that change. http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2 Leading white space removed from arch/ia64/kernel/kprobes.c. Typo in mca.c in original version of this patch found & fixed by Dean Nelson. Signed-off-by: NKeith Owens <kaos@sgi.com> Acked-by: NDean Nelson <dcn@sgi.com> Acked-by: NAnil Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
- 09 8月, 2005 1 次提交
-
-
由 Ken Chen 提交于
this changeset broke the "nohalt" kernel boot option. 8df5a500 default_idle() is looking at new variable can_do_pal_halt. However, that variable did not get cleared upon "nohalt" boot option. Result is that "nohalt" option is ignored until perfmon is exercised. Signed-off-by: NKen Chen <kenneth.w.chen@intel.com> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
- 27 7月, 2005 1 次提交
-
-
由 Eric W. Biederman 提交于
machine_restart, machine_halt and machine_power_off are machine specific hooks deep into the reboot logic, that modules have no business messing with. Usually code should be calling kernel_restart, kernel_halt, kernel_power_off, or emergency_restart. So don't export machine_restart, machine_halt, and machine_power_off so we can catch buggy users. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 12 7月, 2005 1 次提交
-
-
由 Venkatesh Pallipadi 提交于
http://bugzilla.kernel.org/show_bug.cgi?id=4233Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: NLen Brown <len.brown@intel.com>
-
- 28 6月, 2005 1 次提交
-
-
由 Rusty Lynch 提交于
The following patch implements function return probes for ia64 using the revised design. With this new design we no longer need to do some of the odd hacks previous required on the last ia64 return probe port that I sent out for comments. Note that this new implementation still does not resolve the problem noted by Keith Owens where backtrace data is lost after a return probe is hit. Changes include: * Addition of kretprobe_trampoline to act as a dummy function for instrumented functions to return to, and for the return probe infrastructure to place a kprobe on on, gaining control so that the return probe handler can be called, and so that the instruction pointer can be moved back to the original return address. * Addition of arch_init(), allowing a kprobe to be registered on kretprobe_trampoline * Addition of trampoline_probe_handler() which is used as the pre_handler for the kprobe inserted on kretprobe_implementation. This is the function that handles the details for calling the return probe handler function and returning control back at the original return address * Addition of arch_prepare_kretprobe() which is setup as the pre_handler for a kprobe registered at the beginning of the target function by kernel/kprobes.c so that a return probe instance can be setup when a caller enters the target function. (A return probe instance contains all the needed information for trampoline_probe_handler to do it's job.) * Hooks added to the exit path of a task so that we can cleanup any left-over return probe instances (i.e. if a task dies while inside a targeted function then the return probe instance was reserved at the beginning of the function but the function never returns so we need to mark the instance as unused.) Signed-off-by: NRusty Lynch <rusty.lynch@intel.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 04 5月, 2005 3 次提交
-
-
由 Tony Luck 提交于
Signed-off-by: NTony Luck <tony.luck@intel.com>
-
由 Stephane Eranian 提交于
The pmu_active test is based on the values of PSR.up. THIS IS THE PROBLEM as it does not take into account the lazy restore logic which is as follow (simplified): context switch out: save PMDs clear psr.up release ownership context switch in: if (ctx->last_cpu == smp_processor_id() && ctx->cpu_activation == cpu_activation) { set psr.up return } restore PMD restore PMC ctx->last_cpu = smp_processor_id(); ctx->activation = ++cpu_activation; set psr.up The key here is that on context switch out, we clear psr.up and on context switch in we check if nobody else used the PMU on that processor since last time we came. In that case, we assume the PMD/PMC are ours and we simply reactivate. The Caliper problem is that between the moment we context switch out and the moment we come back, nobody effectively used the PMU BUT the processor went idle. Normally this would have no incidence but PAL_HALT does alter the PMU registers. In default_idle(), the test on psr.up is not strong enough to cover this case and we go into PAL which trashed the PMU resgisters. When we come back we falsely assume that this is our state yet it is corrupted. Very nasty indeed. To avoid the problem it is necessary to forbid going to PAL_HALT as soon as perfmon installs some valid state in the PMU registers. This happens with an application attaches a context to a thread or CPU. It is not enough to check the psr/dcr bits. Hence I propose the attached patch. It adds a callback in process.c to modify the condition to enter PAL on idle. Basically, now it is conditional to pal_halt=1 AND perfmon saying it is okay. Signed-off-by: NTony Luck <tony.luck@intel.com>
-
由 Zwane Mwaikambo 提交于
Andi noted that during normal runtime cpu_idle_map is bounced around a lot, and occassionally at a higher frequency than the timer interrupt wakeup which we normally exit pm_idle from. So switch to a percpu variable. I didn't move things to the slow path because it would involve adding scheduler code to wakeup the idle thread on the cpus we're waiting for. Signed-off-by: NZwane Mwaikambo <zwane@arm.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
- 23 4月, 2005 1 次提交
-
-
由 Ashok Raj 提交于
This patch is required to support cpu removal for IPF systems. Existing code just fakes the real offline by keeping it run the idle thread, and polling for the bit to re-appear in the cpu_state to get out of the idle loop. For the cpu-offline to work correctly, we need to pass control of this CPU back to SAL so it can continue in the boot-rendez mode. This gives the SAL control to not pick this cpu as the monarch processor for global MCA events, and addition does not wait for this cpu to checkin with SAL for global MCA events as well. The handoff is implemented as documented in SAL specification section 3.2.5.1 "OS_BOOT_RENDEZ to SAL return State" Signed-off-by: NAshok Raj <ashok.raj@intel.com> Signed-off-by: NTony Luck <tony.luck@intel.com>
-
- 17 4月, 2005 1 次提交
-
-
由 Linus Torvalds 提交于
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
-