- 28 6月, 2006 1 次提交
-
-
由 David Wilder 提交于
With this patch, kdump uses the firmware soft-reset NMI for two purposes: 1) Initiate the kdump (take a crash dump) by issuing a soft-reset. 2) Break a CPU out of a deadlock condition that is detected during kdump processing. When a soft-reset is initiated each CPU will enter system_reset_exception() and set its corresponding bit in the global bit-array cpus_in_sr then call die(). When die() finds the CPU's bit set in cpu_in_sr crash_kexec() is called to initiate a crash dump. The first CPU to enter crash_kexec() is called the "crashing CPU". All other CPUs are "secondary CPUs". The secondary CPU's pass through to crash_kexec_secondary() and sleep. The crashing CPU waits for all CPUs to enter via soft-reset then boots the kdump kernel (see crash_soft_reset_check()) When the system crashes due to a panic or exception, crash_kexec() is called by panic() or die(). The crashing CPU sends an IPI to all other CPUs to notify them of the pending shutdown. If a CPU is in a deadlock or hung state with interrupts disabled, the IPI will not be delivered. The result being, that the kdump kernel is not booted. This problem is solved with the use of a firmware generated soft-reset. After the crashing_cpu has issued the IPI, it waits for 10 sec for all CPUs to enter crash_ipi_callback(). A CPU signifies its entry to crash_ipi_callback() by setting its corresponding bit in the cpus_in_crash bit array. After 10 sec, if one or more CPUs have not set their bit in cpus_in_crash we assume that the CPU(s) is deadlocked. The operator is then prompted to generate a soft-reset to break the deadlock. Each CPU enters the soft reset handler as described above. Two conditions must be handled at this point: 1) The system crashed because the operator generated a soft-reset. See 2) The system had crashed before the soft-reset was generated ( in the case of a Panic or oops). The first CPU to enter crash_kexec() uses the state of the kexec_lock to determine this state. If kexec_lock is already held then condition 2 is true and crash_kexec_secondary() is called, else; this CPU is flagged as the crashing CPU, the kexec_lock is acquired and crash_kexec() proceeds as described above. Each additional CPUs responding to the soft-reset will pass through crash_kexec() to kexec_secondary(). All secondary CPUs call crash_ipi_callback() readying them self's for the shutdown. When ready they clear their bit in cpus_in_sr. The crashing CPU waits in kexec_secondary() until all other CPUs have cleared their bits in cpus_in_sr. The kexec kernel boot is then started. Signed-off-by: NHaren Myneni <haren@us.ibm.com> Signed-off-by: NDavid Wilder <dwilder@us.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 27 6月, 2006 1 次提交
-
-
由 Lee Revell 提交于
In a testament to the utter simplicity and logic of the English language ;-), I found a single correct use - in kernel/panic.c - and 10-15 incorrect ones. Signed-Off-By: NLee Revell <rlrevell@joe-job.com> Signed-off-by: NAdrian Bunk <bunk@stusta.de>
-
- 22 4月, 2006 1 次提交
-
-
由 Michael Ellerman 提交于
We've seen several bugs caused by interrupt weirdness in the kdump kernel. Panicking from an interrupt handler means we fail to EOI the interrupt, and so the second kernel never gets that interrupt ever again. We also see hangs on JS20 where we take interrupts in the second kernel early during boot. This patch fixes both those problems, and although it adds more code to the crash path I think it is the best solution. Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 24 2月, 2006 1 次提交
-
-
由 Haren Myneni 提交于
The panic CPU is waiting forever due to some large timeout value if some CPU is not responding to an IPI. This patch fixes the problem - the maximum waiting period will be 10 seconds and then the kdump boot will go ahead. Signed-off-by: NHaren Myneni <haren@us.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 15 1月, 2006 1 次提交
-
-
由 Haren Myneni 提交于
- This contains the arch specific changes for the following the kdump generic fixes which were already accepted in the upstream. . Capturing CPU registers (for the case of 'panic' and invoking the dump using 'sysrq-trigger') from a function (stack frame) which will be not be available during the kdump boot. Hence, might result in invalid stack trace. . Dynamically allocating per cpu ELF notes section instead of statically for NR_CPUS. - Fix the compiler warning in prom_init.c. Signed-off-by: NHaren Myneni <haren@us.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 11 1月, 2006 1 次提交
-
-
由 Haren Myneni 提交于
This patch fixes the compilation error (shown below) when CONFIG_SMP=n. arch/powerpc/kernel/crash.c: In function `crash_kexec_prepare_cpus': arch/powerpc/kernel/crash.c:236: error: implicit declaration of function `smp_release_cpus' Signed-off-by: NHaren Myneni <haren@us.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 09 1月, 2006 1 次提交
-
-
由 Michael Ellerman 提交于
Implementing the machine_crash_shutdown which will be called by crash_kexec (called in case of a panic, sysrq etc.). Disable the interrupts, shootdown cpus using debugger IPI and collect regs for all CPUs. elfcorehdr= specifies the location of elf core header stored by the crashed kernel. This command line option will be passed by the kexec-tools to capture kernel. savemaxmem= specifies the actual memory size that the first kernel has and this value will be used for dumping in the capture kernel. This command line option will be passed by the kexec-tools to capture kernel. Signed-off-by: NHaren Myneni <haren@us.ibm.com> Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-