- 05 8月, 2014 1 次提交
-
-
由 Paul Mackerras 提交于
Some people see things like "Exception: 501" in stack traces in dmesg and assume that means that something has gone badly wrong, when in fact "Exception: 501" just means a device interrupt was taken. This changes "Exception" to "interrupt" to make it clearer that we are just recording the fact of a change in control flow rather than some error condition. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 28 7月, 2014 2 次提交
-
-
由 Michael Ellerman 提交于
The previous patch left a bit of a wart in copy_process(). Clean it up a bit by moving the logic out into a helper. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Michael Ellerman 提交于
We now only support cpus that use an SLB, so we don't need an MMU feature to indicate that. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 11 6月, 2014 1 次提交
-
-
由 Sam bobroff 提交于
Correct the DSCR SPR becoming temporarily corrupted if a task is context switched during a transaction. The problem occurs while suspending the task and is caused by saving the DSCR to thread.dscr after it has already been set to the CPU's default value: __switch_to() calls __switch_to_tm() which calls tm_reclaim_task() which calls tm_reclaim_thread() which calls tm_reclaim() where the DSCR is set to the CPU's default __switch_to() calls _switch() where thread.dscr is set to the DSCR When the task is resumed, it's transaction will be doomed (as usual) and the DSCR SPR will be corrupted, although the checkpointed value will be correct. Therefore the DSCR will be immediately corrected by the transaction aborting, unless it has been suspended. In that case the incorrect value can be seen by the task until it resumes the transaction. The fix is to treat the DSCR similarly to the TAR and save it early in __switch_to(). A program exposing the problem is added to the kernel self tests as: tools/testing/selftests/powerpc/tm/tm-resched-dscr. Signed-off-by: NSam Bobroff <sam.bobroff@au1.ibm.com> CC: <stable@vger.kernel.org> [v3.10+] Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 20 5月, 2014 2 次提交
-
-
由 Paul Gortmaker 提交于
Currently, on 8641D, which doesn't set CONFIG_HAVE_HW_BREAKPOINT we get the following splat: BUG: using smp_processor_id() in preemptible [00000000] code: login/1382 caller is set_breakpoint+0x1c/0xa0 CPU: 0 PID: 1382 Comm: login Not tainted 3.15.0-rc3-00041-g2aafe1a4 #1 Call Trace: [decd5d80] [c0008dc4] show_stack+0x50/0x158 (unreliable) [decd5dc0] [c03c6fa0] dump_stack+0x7c/0xdc [decd5de0] [c01f8818] check_preemption_disabled+0xf4/0x104 [decd5e00] [c00086b8] set_breakpoint+0x1c/0xa0 [decd5e10] [c00d4530] flush_old_exec+0x2bc/0x588 [decd5e40] [c011c468] load_elf_binary+0x2ac/0x1164 [decd5ec0] [c00d35f8] search_binary_handler+0xc4/0x1f8 [decd5ef0] [c00d4ee8] do_execve+0x3d8/0x4b8 [decd5f40] [c001185c] ret_from_syscall+0x0/0x38 --- Exception: c01 at 0xfeee554 LR = 0xfeee7d4 The call path in this case is: flush_thread --> set_debug_reg_defaults --> set_breakpoint --> __get_cpu_var Since preemption is enabled in the cleanup of flush thread, and there is no need to disable it, introduce the distinction between set_breakpoint and __set_breakpoint, leaving only the flush_thread instance as the current user of set_breakpoint. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Paul Gortmaker 提交于
None of the callers check the return value, so it might as well not have one at all. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 23 4月, 2014 1 次提交
-
-
由 Anton Blanchard 提交于
Change how we setup registers for ret_from_kernel_thread. In ABIv1, instead of passing a function descriptor in, dereference it and pass the target in directly. Use ppc_global_function_entry to get it right on both ABIv1 and ABIv2. Signed-off-by: NAnton Blanchard <anton@samba.org>
-
- 07 4月, 2014 1 次提交
-
-
由 Michael Neuling 提交于
We can't take an IRQ when we're about to do a trechkpt as our GPR state is set to user GPR values. We've hit this when running some IBM Java stress tests in the lab resulting in the following dump: cpu 0x3f: Vector: 700 (Program Check) at [c000000007eb3d40] pc: c000000000050074: restore_gprs+0xc0/0x148 lr: 00000000b52a8184 sp: ac57d360 msr: 8000000100201030 current = 0xc00000002c500000 paca = 0xc000000007dbfc00 softe: 0 irq_happened: 0x00 pid = 34535, comm = Pooled Thread # R00 = 00000000b52a8184 R16 = 00000000b3e48fda R01 = 00000000ac57d360 R17 = 00000000ade79bd8 R02 = 00000000ac586930 R18 = 000000000fac9bcc R03 = 00000000ade60000 R19 = 00000000ac57f930 R04 = 00000000f6624918 R20 = 00000000ade79be8 R05 = 00000000f663f238 R21 = 00000000ac218a54 R06 = 0000000000000002 R22 = 000000000f956280 R07 = 0000000000000008 R23 = 000000000000007e R08 = 000000000000000a R24 = 000000000000000c R09 = 00000000b6e69160 R25 = 00000000b424cf00 R10 = 0000000000000181 R26 = 00000000f66256d4 R11 = 000000000f365ec0 R27 = 00000000b6fdcdd0 R12 = 00000000f66400f0 R28 = 0000000000000001 R13 = 00000000ada71900 R29 = 00000000ade5a300 R14 = 00000000ac2185a8 R30 = 00000000f663f238 R15 = 0000000000000004 R31 = 00000000f6624918 pc = c000000000050074 restore_gprs+0xc0/0x148 cfar= c00000000004fe28 dont_restore_vec+0x1c/0x1a4 lr = 00000000b52a8184 msr = 8000000100201030 cr = 24804888 ctr = 0000000000000000 xer = 0000000000000000 trap = 700 This moves tm_recheckpoint to a C function and moves the tm_restore_sprs into that function. It then adds IRQ disabling over the trechkpt critical section. It also sets the TEXASR FS in the signals code to ensure this is never set now that we explictly write the TM sprs in tm_recheckpoint. Signed-off-by: NMichael Neuling <mikey@neuling.org> cc: stable@vger.kernel.org Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 07 3月, 2014 1 次提交
-
-
由 Michael Neuling 提交于
When we fork/clone we currently don't copy any of the TM state to the new thread. This results in a TM bad thing (program check) when the new process is switched in as the kernel does a tmrechkpt with TEXASR FS not set. Also, since R1 is from userspace, we trigger the bad kernel stack pointer detection. So we end up with something like this: Bad kernel stack pointer 0 at c0000000000404fc cpu 0x2: Vector: 700 (Program Check) at [c00000003ffefd40] pc: c0000000000404fc: restore_gprs+0xc0/0x148 lr: 0000000000000000 sp: 0 msr: 9000000100201030 current = 0xc000001dd1417c30 paca = 0xc00000000fe00800 softe: 0 irq_happened: 0x01 pid = 0, comm = swapper/2 WARNING: exception is not recoverable, can't continue The below fixes this by flushing the TM state before we copy the task_struct to the clone. To do this we go through the tmreclaim patch, which removes the checkpointed registers from the CPU and transitions the CPU out of TM suspend mode. Hence we need to call tmrechkpt after to restore the checkpointed state and the TM mode for the current task. To make this fail from userspace is simply: tbegin li r0, 2 sc <boom> Kudos to Adhemerval Zanella Neto for finding this. Signed-off-by: NMichael Neuling <mikey@neuling.org> cc: Adhemerval Zanella Neto <azanella@br.ibm.com> cc: stable@vger.kernel.org Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 29 1月, 2014 1 次提交
-
-
由 Andreas Schwab 提交于
This fixes a logic error that caused a failure to update the hw breakpoint registers when not using the hw-breakpoint interface. Signed-off-by: NAndreas Schwab <schwab@linux-m68k.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 15 1月, 2014 2 次提交
-
-
由 Paul Mackerras 提交于
Currently, when we have a process using the transactional memory facilities on POWER8 (that is, the processor is in transactional or suspended state), and the process enters the kernel and the kernel then uses the floating-point or vector (VMX/Altivec) facility, we end up corrupting the user-visible FP/VMX/VSX state. This happens, for example, if a page fault causes a copy-on-write operation, because the copy_page function will use VMX to do the copy on POWER8. The test program below demonstrates the bug. The bug happens because when FP/VMX state for a transactional process is stored in the thread_struct, we store the checkpointed state in .fp_state/.vr_state and the transactional (current) state in .transact_fp/.transact_vr. However, when the kernel wants to use FP/VMX, it calls enable_kernel_fp() or enable_kernel_altivec(), which saves the current state in .fp_state/.vr_state. Furthermore, when we return to the user process we return with FP/VMX/VSX disabled. The next time the process uses FP/VMX/VSX, we don't know which set of state (the current register values, .fp_state/.vr_state, or .transact_fp/.transact_vr) we should be using, since we have no way to tell if we are still in the same transaction, and if not, whether the previous transaction succeeded or failed. Thus it is necessary to strictly adhere to the rule that if FP has been enabled at any point in a transaction, we must keep FP enabled for the user process with the current transactional state in the FP registers, until we detect that it is no longer in a transaction. Similarly for VMX; once enabled it must stay enabled until the process is no longer transactional. In order to keep this rule, we add a new thread_info flag which we test when returning from the kernel to userspace, called TIF_RESTORE_TM. This flag indicates that there is FP/VMX/VSX state to be restored before entering userspace, and when it is set the .tm_orig_msr field in the thread_struct indicates what state needs to be restored. The restoration is done by restore_tm_state(). The TIF_RESTORE_TM bit is set by new giveup_fpu/altivec_maybe_transactional helpers, which are called from enable_kernel_fp/altivec, giveup_vsx, and flush_fp/altivec_to_thread instead of giveup_fpu/altivec. The other thing to be done is to get the transactional FP/VMX/VSX state from .fp_state/.vr_state when doing reclaim, if that state has been saved there by giveup_fpu/altivec_maybe_transactional. Having done this, we set the FP/VMX bit in the thread's MSR after reclaim to indicate that that part of the state is now valid (having been reclaimed from the processor's checkpointed state). Finally, in the signal handling code, we move the clearing of the transactional state bits in the thread's MSR a bit earlier, before calling flush_fp_to_thread(), so that we don't unnecessarily set the TIF_RESTORE_TM bit. This is the test program: /* Michael Neuling 4/12/2013 * * See if the altivec state is leaked out of an aborted transaction due to * kernel vmx copy loops. * * gcc -m64 htm_vmxcopy.c -o htm_vmxcopy * */ /* We don't use all of these, but for reference: */ int main(int argc, char *argv[]) { long double vecin = 1.3; long double vecout; unsigned long pgsize = getpagesize(); int i; int fd; int size = pgsize*16; char tmpfile[] = "/tmp/page_faultXXXXXX"; char buf[pgsize]; char *a; uint64_t aborted = 0; fd = mkstemp(tmpfile); assert(fd >= 0); memset(buf, 0, pgsize); for (i = 0; i < size; i += pgsize) assert(write(fd, buf, pgsize) == pgsize); unlink(tmpfile); a = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0); assert(a != MAP_FAILED); asm __volatile__( "lxvd2x 40,0,%[vecinptr] ; " // set 40 to initial value TBEGIN "beq 3f ;" TSUSPEND "xxlxor 40,40,40 ; " // set 40 to 0 "std 5, 0(%[map]) ;" // cause kernel vmx copy page TABORT TRESUME TEND "li %[res], 0 ;" "b 5f ;" "3: ;" // Abort handler "li %[res], 1 ;" "5: ;" "stxvd2x 40,0,%[vecoutptr] ; " : [res]"=r"(aborted) : [vecinptr]"r"(&vecin), [vecoutptr]"r"(&vecout), [map]"r"(a) : "memory", "r0", "r3", "r4", "r5", "r6", "r7"); if (aborted && (vecin != vecout)){ printf("FAILED: vector state leaked on abort %f != %f\n", (double)vecin, (double)vecout); exit(1); } munmap(a, size); close(fd); printf("PASSED!\n"); return 0; } Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Paul Gortmaker 提交于
None of these files are actually using any __init type directives and hence don't need to include <linux/init.h>. Most are just a left over from __devinit and __cpuinit removal, or simply due to code getting copied from one driver to the next. The one instance where we add an include for init.h covers off a case where that file was implicitly getting it from another header which itself didn't need it. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 08 1月, 2014 1 次提交
-
-
由 Joseph Myers 提交于
The e500 SPE floating-point emulation code clears existing exceptions (__FPU_FPSCR &= ~FP_EX_MASK;) before ORing in the exceptions from the emulated operation. However, these exception bits are the "sticky", cumulative exception bits, and should only be cleared by the user program setting SPEFSCR, not implicitly by any floating-point instruction (whether executed purely by the hardware or emulated). The spurious clearing of these bits shows up as missing exceptions in glibc testing. Fixing this, however, is not as simple as just not clearing the bits, because while the bits may be from previous floating-point operations (in which case they should not be cleared), the processor can also set the sticky bits itself before the interrupt for an exception occurs, and this can happen in cases when IEEE 754 semantics are that the sticky bit should not be set. Specifically, the "invalid" sticky bit is set in various cases with non-finite operands, where IEEE 754 semantics do not involve raising such an exception, and the "underflow" sticky bit is set in cases of exact underflow, whereas IEEE 754 semantics are that this flag is set only for inexact underflow. Thus, for correct emulation the kernel needs to know the setting of these two sticky bits before the instruction being emulated. When a floating-point operation raises an exception, the kernel can note the state of the sticky bits immediately afterwards. Some <fenv.h> functions that affect the state of these bits, such as fesetenv and feholdexcept, need to use prctl with PR_GET_FPEXC and PR_SET_FPEXC anyway, and so it is natural to record the state of those bits during that call into the kernel and so avoid any need for a separate call into the kernel to inform it of a change to those bits. Thus, the interface I chose to use (in this patch and the glibc port) is that one of those prctl calls must be made after any userspace change to those sticky bits, other than through a floating-point operation that traps into the kernel anyway. feclearexcept and fesetexceptflag duly make those calls, which would not be required were it not for this issue. The previous EGLIBC port, and the uClibc code copied from it, is fundamentally broken as regards any use of prctl for floating-point exceptions because it didn't use the PR_FP_EXC_SW_ENABLE bit in its prctl calls (and did various worse things, such as passing a pointer when prctl expected an integer). If you avoid anything where prctl is used, the clearing of sticky bits still means it will never give anything approximating correct exception semantics with existing kernels. I don't believe the patch makes things any worse for existing code that doesn't try to inform the kernel of changes to sticky bits - such code may get incorrect exceptions in some cases, but it would have done so anyway in other cases. Signed-off-by: NJoseph Myers <joseph@codesourcery.com> Signed-off-by: NScott Wood <scottwood@freescale.com>
-
- 11 12月, 2013 1 次提交
-
-
由 Scott Wood 提交于
Commit ce11e48b ("KVM: PPC: E500: Add userspace debug stub support") added "struct thread_struct" to the stack of kvmppc_vcpu_run(). thread_struct is 1152 bytes on my build, compared to 48 bytes for the recently-introduced "struct debug_reg". Use the latter instead. This fixes the following error: cc1: warnings being treated as errors arch/powerpc/kvm/booke.c: In function 'kvmppc_vcpu_run': arch/powerpc/kvm/booke.c:760:1: error: the frame size of 1424 bytes is larger than 1024 bytes make[2]: *** [arch/powerpc/kvm/booke.o] Error 1 make[1]: *** [arch/powerpc/kvm] Error 2 make[1]: *** Waiting for unfinished jobs.... Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 21 11月, 2013 4 次提交
-
-
由 Anton Blanchard 提交于
If TM is not active there is no need to print PACATMSCRATCH so we can save ourselves a line. Signed-off-by: NAnton Blanchard <anton@samba.org> Acked-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
We waste quite a few lines in our oops output: ... MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI> CR: 28044024 XER: 00000000 SOFTE: 0 CFAR: 0000000000009088 DAR: 000000000000001c, DSISR: 40000000 GPR00: c0000000000c74f0 c00000037cc1b010 c000000000d2bb30 0000000000000000 ... We can do a better job here and remove 3 lines: MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI> CR: 28044024 XER: 00000000 CFAR: 0000000000009088 DAR: 0000000000000010, DSISR: 40000000 SOFTE: 1 GPR00: c0000000000e3d10 c00000037cc2fda0 c000000000d2c3a8 0000000000000001 Also move PACATMSCRATCH up, it doesn't really belong in the stack trace section. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Anton Blanchard 提交于
Machine check exceptions set DAR and DSISR, so print them in our oops output. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Rusty Russell 提交于
No function descriptor, but we set r12 up and set TIF_RESTOREALL as it normally isn't restored on return from syscall. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 31 10月, 2013 1 次提交
-
-
由 Michael Neuling 提交于
We currently turn IRQs off in __switch_to(0 but this is unnecessary as it's already disabled in the caller. This removes the IRQ disable but adds a check to make sure it is really off in case this changes in future. Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 19 10月, 2013 3 次提交
-
-
由 Bharat Bhushan 提交于
KVM need this function when switching from vcpu to user-space thread. My subsequent patch will use this function. Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com> Acked-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NScott Wood <scottwood@freescale.com>
-
由 Bharat Bhushan 提交于
This way we can use same data type struct with KVM and also help in using other debug related function. Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com> Acked-by: NMichael Neuling <mikey@neuling.org> [scottwood@freescale.com: removed obvious debug_reg comment] Signed-off-by: NScott Wood <scottwood@freescale.com>
-
由 Bharat Bhushan 提交于
Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com> Acked-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NScott Wood <scottwood@freescale.com>
-
- 17 10月, 2013 3 次提交
-
-
由 Bharat Bhushan 提交于
KVM need this function when switching from vcpu to user-space thread. My subsequent patch will use this function. Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com> Acked-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Bharat Bhushan 提交于
This way we can use same data type struct with KVM and also help in using other debug related function. Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Bharat Bhushan 提交于
Signed-off-by: NBharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 11 10月, 2013 2 次提交
-
-
由 Paul Mackerras 提交于
This provides a facility which is intended for use by KVM, where the contents of the FP/VSX and VMX (Altivec) registers can be saved away to somewhere other than the thread_struct when kernel code wants to use floating point or VMX instructions. This is done by providing a pointer in the thread_struct to indicate where the state should be saved to. The giveup_fpu() and giveup_altivec() functions test these pointers and save state to the indicated location if they are non-NULL. Note that the MSR_FP/VEC bits in task->thread.regs->msr are still used to indicate whether the CPU register state is live, even when an alternate save location is being used. This also provides load_fp_state() and load_vr_state() functions, which load up FP/VSX and VMX state from memory into the CPU registers, and corresponding store_fp_state() and store_vr_state() functions, which store FP/VSX and VMX state into memory from the CPU registers. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Paul Mackerras 提交于
This creates new 'thread_fp_state' and 'thread_vr_state' structures to store FP/VSX state (including FPSCR) and Altivec/VSX state (including VSCR), and uses them in the thread_struct. In the thread_fp_state, the FPRs and VSRs are represented as u64 rather than double, since we rarely perform floating-point computations on the values, and this will enable the structures to be used in KVM code as well. Similarly FPSCR is now a u64 rather than a structure of two 32-bit values. This takes the offsets out of the macros such as SAVE_32FPRS, REST_32FPRS, etc. This enables the same macros to be used for normal and transactional state, enabling us to delete the transactional versions of the macros. This also removes the unused do_load_up_fpu and do_load_up_altivec, which were in fact buggy since they didn't create large enough stack frames to account for the fact that load_up_fpu and load_up_altivec are not designed to be called from C and assume that their caller's stack frame is an interrupt frame. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 25 9月, 2013 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
We've been keeping that field in thread_struct for a while, it contains the "limit" of the current stack pointer and is meant to be used for detecting stack overflows. It has a few problems however: - First, it was never actually *used* on 64-bit. Set and updated but not actually exploited - When switching stack to/from irq and softirq stacks, it's update is racy unless we hard disable interrupts, which is costly. This is fine on 32-bit as we don't soft-disable there but not on 64-bit. Thus rather than fixing 2 in order to implement 1 in some hypothetical future, let's remove the code completely from 64-bit. In order to avoid a clutter of ifdef's, we remove the updates from C code completely during interrupt stack switching, and instead maintain it from the asm helper that is used to do the stack switching in the first place. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 14 8月, 2013 1 次提交
-
-
由 Kevin Hao 提交于
In the current kernel, the function flush_fp_to_thread() is not dependent on CONFIG_PPC_FPU. So most invocations of this function is not wrapped by CONFIG_PPC_FPU. Even through we don't really save the FPRs to the thread struct if CONFIG_PPC_FPU is not enabled, but there does have some runtime overhead such as the check for tsk->thread.regs and preempt disable and enable. It really make no sense to do that. So make it a nop when CONFIG_PPC_FPU is disabled. Also remove the wrapped #ifdef CONFIG_PPC_FPU when invoking this function. Signed-off-by: NKevin Hao <haokexin@gmail.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 09 8月, 2013 1 次提交
-
-
由 Michael Neuling 提交于
This moves us to save the Target Address Register (TAR) a earlier in __switch_to. It introduces a new function save_tar() to do this. We need to save the TAR earlier as we will overwrite it in the transactional memory reclaim/recheckpoint path. We are going to do this in a subsequent patch which will fix saving the TAR register when it's modified inside a transaction. Signed-off-by: NMichael Neuling <mikey@neuling.org> Cc: <stable@vger.kernel.org> [v3.10] Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 01 7月, 2013 1 次提交
-
-
由 Michael Ellerman 提交于
Add support for EBB (Event Based Branches) on 64-bit book3s. See the included documentation for more details. EBBs are a feature which allows the hardware to branch directly to a specified user space address when a PMU event overflows. This can be used by programs for self-monitoring with no kernel involvement in the inner loop. Most of the logic is in the generic book3s code, primarily to avoid a proliferation of PMU callbacks. Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 15 6月, 2013 1 次提交
-
-
由 Michael Ellerman 提交于
It's possible for us to crash when running with ftrace enabled, eg: Bad kernel stack pointer bffffd12 at c00000000000a454 cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40] pc: c00000000000a454: resume_kernel+0x34/0x60 lr: c00000000000335c: performance_monitor_common+0x15c/0x180 sp: bffffd12 msr: 8000000000001032 dar: bffffd12 dsisr: 42000000 If we look at current's stack (paca->__current->stack) we see it is equal to c0000002ecab0000. Our stack is 16K, and comparing to paca->kstack (c0000002ecab3e30) we can see that we have overflowed our kernel stack. This leads to us writing over our struct thread_info, and in this case we have corrupted thread_info->flags and set _TIF_EMULATE_STACK_STORE. Dumping the stack we see: 3:mon> t c0000002ecab0000 [c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70 [c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180 --- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30 [c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable) [c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90 [c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34 [c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300 [c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 [c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable) [c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280 [c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40 [c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 ... and so on __ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry path. At that point the irq state is not consistent, ie. interrupts are hard disabled (by the exception entry), but the paca soft-enabled flag may be out of sync. This leads to the local_irq_restore() in trace_graph_entry() actually enabling interrupts, which we do not want. Because we have not yet reprogrammed the decrementer we immediately take another decrementer exception, and recurse. The fix is twofold. Firstly make sure we call DISABLE_INTS before calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles the irq state in the paca with the hardware, making it safe again to call local_irq_save/restore(). Although that should be sufficient to fix the bug, we also mark the runlatch routines as notrace. They are called very early in the exception entry and we are asking for trouble tracing them. They are also fairly uninteresting and tracing them just adds unnecessary overhead. [ This regression was introduced by fe1952fc "powerpc: Rework runlatch code" by myself --BenH ] CC: <stable@vger.kernel.org> [v3.4+] Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 10 6月, 2013 1 次提交
-
-
由 Michael Neuling 提交于
When introducing support for DABRX in 4474ef05, we broke older 32-bit CPUs that don't have that register. Some CPUs have a DABR but not DABRX. Configuration are: - No 32bit CPUs have DABRX but some have DABR. - POWER4+ and below have the DABR but no DABRX. - 970 and POWER5 and above have DABR and DABRX. - POWER8 has DAWR, hence no DABRX. This introduces CPU_FTR_DABRX and sets it on appropriate CPUs. We use the top 64 bits for CPU FTR bits since only 64 bit CPUs have this. Processors that don't have the DABRX will still work as they will fall back to software filtering these breakpoints via perf_exclude_event(). Signed-off-by: NMichael Neuling <mikey@neuling.org> Reported-by: N"Gorelik, Jacob (335F)" <jacob.gorelik@jpl.nasa.gov> cc: stable@vger.kernel.org (v3.9 only) Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 14 5月, 2013 2 次提交
-
-
由 Scott Wood 提交于
MSR_DE is not cleared on entry to the kernel, and we don't clear it explicitly outside of debug code. If we have MSR_DE set in prime_debug_regs(), and the new thread has events enabled in DBCR0 (e.g. ICMP is set in thread->dbsr0, even though it was cleared in the real DBCR0 when the thread got scheduled out), we'll end up taking a debug exception in the kernel when DBCR0 is loaded. DSRR0 will not point to an exception vector, and the kernel ends up hanging at kernel_dbg_exc. Fix this by always clearing MSR_DE when we load new debug state. Another observed source of kernel_dbg_exc hangs is with the branch taken event. If this event is active, but we take a non-debug trap (e.g. a TLB miss or an asynchronous interrupt) before the next branch. We end up taking a branch-taken debug exception on the initial branch instruction of the exception vector, but because the debug exception is DBSR_BT rather than DBSR_IC we branch to kernel_dbg_exc before even checking the DSRR0 address. Fix this by checking for DBSR_BT as well as DBSR_IC, which is what 32-bit does and what the comments suggest was intended in the 64-bit code as well. Signed-off-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Li Zhong 提交于
Saw this warning again, and this time from the ret_from_fork path. It seems we could clear the back chain earlier in copy_thread(), which could cover both path, and also fix potential lockdep usage in schedule_tail(), or exception occurred before we clear the back chain. Signed-off-by: NLi Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 01 5月, 2013 2 次提交
-
-
由 Tejun Heo 提交于
show_regs() is inherently arch-dependent but it does make sense to print generic debug information and some archs already do albeit in slightly different forms. This patch introduces a generic function to print debug information from show_regs() so that different archs print out the same information and it's much easier to modify what's printed. show_regs_print_info() prints out the same debug info as dump_stack() does plus task and thread_info pointers. * Archs which didn't print debug info now do. alpha, arc, blackfin, c6x, cris, frv, h8300, hexagon, ia64, m32r, metag, microblaze, mn10300, openrisc, parisc, score, sh64, sparc, um, xtensa * Already prints debug info. Replaced with show_regs_print_info(). The printed information is superset of what used to be there. arm, arm64, avr32, mips, powerpc, sh32, tile, unicore32, x86 * s390 is special in that it used to print arch-specific information along with generic debug info. Heiko and Martin think that the arch-specific extra isn't worth keeping s390 specfic implementation. Converted to use the generic version. Note that now all archs print the debug info before actual register dumps. An example BUG() dump follows. kernel BUG at /work/os/work/kernel/workqueue.c:4841! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7 Hardware name: empty empty/S3992, BIOS 080011 10/26/2007 task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000 RIP: 0010:[<ffffffff8234a07e>] [<ffffffff8234a07e>] init_workqueues+0x4/0x6 RSP: 0000:ffff88007c861ec8 EFLAGS: 00010246 RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001 RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650 0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760 Call Trace: [<ffffffff81000312>] do_one_initcall+0x122/0x170 [<ffffffff82335e5d>] kernel_init_freeable+0x9b/0x1c8 [<ffffffff81c47760>] ? rest_init+0x140/0x140 [<ffffffff81c4776e>] kernel_init+0xe/0xf0 [<ffffffff81c6be9c>] ret_from_fork+0x7c/0xb0 [<ffffffff81c47760>] ? rest_init+0x140/0x140 ... v2: Typo fix in x86-32. v3: CPU number dropped from show_regs_print_info() as dump_stack_print_info() has been updated to print it. s390 specific implementation dropped as requested by s390 maintainers. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NJesper Nilsson <jesper.nilsson@axis.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Sam Ravnborg <sam@ravnborg.org> Acked-by: Chris Metcalf <cmetcalf@tilera.com> [tile bits] Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon bits] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
Both dump_stack() and show_stack() are currently implemented by each architecture. show_stack(NULL, NULL) dumps the backtrace for the current task as does dump_stack(). On some archs, dump_stack() prints extra information - pid, utsname and so on - in addition to the backtrace while the two are identical on other archs. The usages in arch-independent code of the two functions indicate show_stack(NULL, NULL) should print out bare backtrace while dump_stack() is used for debugging purposes when something went wrong, so it does make sense to print additional information on the task which triggered dump_stack(). There's no reason to require archs to implement two separate but mostly identical functions. It leads to unnecessary subtle information. This patch expands the dummy fallback dump_stack() implementation in lib/dump_stack.c such that it prints out debug information (taken from x86) and invokes show_stack(NULL, NULL) and drops arch-specific dump_stack() implementations in all archs except blackfin. Blackfin's dump_stack() does something wonky that I don't understand. Debug information can be printed separately by calling dump_stack_print_info() so that arch-specific dump_stack() implementation can still emit the same debug information. This is used in blackfin. This patch brings the following behavior changes. * On some archs, an extra level in backtrace for show_stack() could be printed. This is because the top frame was determined in dump_stack() on those archs while generic dump_stack() can't do that reliably. It can be compensated by inlining dump_stack() but not sure whether that'd be necessary. * Most archs didn't use to print debug info on dump_stack(). They do now. An example WARN dump follows. WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505() Hardware name: empty Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #9 0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48 ffffffff8108f50f ffffffff82228240 0000000000000040 ffffffff8234a03c 0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58 Call Trace: [<ffffffff81c614dc>] dump_stack+0x19/0x1b [<ffffffff8108f50f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff8108f56a>] warn_slowpath_null+0x1a/0x20 [<ffffffff8234a071>] init_workqueues+0x35/0x505 ... v2: CPU number added to the generic debug info as requested by s390 folks and dropped the s390 specific dump_stack(). This loses %ksp from the debug message which the maintainers think isn't important enough to keep the s390-specific dump_stack() implementation. dump_stack_print_info() is moved to kernel/printk.c from lib/dump_stack.c. Because linkage is per objecct file, dump_stack_print_info() living in the same lib file as generic dump_stack() means that archs which implement custom dump_stack() - at this point, only blackfin - can't use dump_stack_print_info() as that will bring in the generic version of dump_stack() too. v1 The v1 patch broke build on blackfin due to this issue. The build breakage was reported by Fengguang Wu. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NVineet Gupta <vgupta@synopsys.com> Acked-by: NJesper Nilsson <jesper.nilsson@axis.com> Acked-by: NVineet Gupta <vgupta@synopsys.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390 bits] Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Sam Ravnborg <sam@ravnborg.org> Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon bits] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 4月, 2013 1 次提交
-
-
由 Oleg Nesterov 提交于
arch_dup_task_struct() does flush_ptrace_hw_breakpoint(src), this destroys the parent's breakpoints for no reason. We should clear child->thread.ptrace_bps[] copied by dup_task_struct() instead. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 10 4月, 2013 1 次提交
-
-
由 Michael Neuling 提交于
We can't compile a kernel with CONFIG_ALTIVEC=n when CONFIG_PPC_TRANSACTIONAL_MEM=y. We currently get: arch/powerpc/kernel/tm.S:320: Error: unsupported relocation against THREAD_VSCR arch/powerpc/kernel/tm.S:323: Error: unsupported relocation against THREAD_VR0 arch/powerpc/kernel/tm.S:323: Error: unsupported relocation against THREAD_VR0 etc. The below fixes this with a sprinkling of #ifdefs. This was found by mpe with kisskb: http://kisskb.ellerman.id.au/kisskb/buildresult/8539442/Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
-
- 15 2月, 2013 1 次提交
-
-
由 Michael Neuling 提交于
This hooks the new transactional memory code into context switching, FP/VMX/VMX unavailable and exception return. Signed-off-by: NMatt Evans <matt@ozlabs.org> Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-