1. 02 Feb, 2015 1 commit
    • powerpc/kernel: Make syscall_exit a local label · 4c3b2168
      Michael Ellerman committed
      Currently when we back trace something that is in a syscall we see
      something like this:
      
      [c000000000000000] [c000000000000000] SyS_read+0x6c/0x110
      [c000000000000000] [c000000000000000] syscall_exit+0x0/0x98
      
      Although it's entirely correct, seeing syscall_exit at the bottom can be
      confusing - we were exiting from a syscall and then called SyS_read() ?
      
      If we instead change syscall_exit to be a local label we get something
      more intuitive:
      
      [c0000001fa46fde0] [c00000000026719c] SyS_read+0x6c/0x110
      [c0000001fa46fe30] [c000000000009264] system_call+0x38/0xd0
      
      i.e. we were handling a system call, and it was SyS_read().
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
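The effect described in the commit above can be sketched with a toy symbolizer. This is not the kernel's kallsyms code; it only assumes that a backtrace names the closest symbol at or below an address and that local labels never reach the symbol table. The start address of system_call is inferred from the second trace (0x9264 - 0x38), and placing syscall_exit at 0x9264 is an assumption based on the `+0x0` offset in the first trace. The `struct sym` type and `symbolize()` helper are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One entry of a hypothetical symbol table. */
struct sym {
	uint64_t addr;
	const char *name;
};

/*
 * Write "name+0xoff" for addr, using the closest symbol at or below it.
 * Returns 0 on success, -1 if no symbol precedes addr.
 */
int symbolize(const struct sym *tab, size_t n, uint64_t addr,
	      char *out, size_t outlen)
{
	const struct sym *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (tab[i].addr <= addr && (!best || tab[i].addr >= best->addr))
			best = &tab[i];
	}
	if (!best)
		return -1;
	snprintf(out, outlen, "%s+0x%llx", best->name,
		 (unsigned long long)(addr - best->addr));
	return 0;
}

/* Before the change: syscall_exit is a global symbol (address assumed). */
const struct sym with_global_label[] = {
	{ 0xc00000000000922cULL, "system_call" },
	{ 0xc000000000009264ULL, "syscall_exit" },
};

/* After the change: the local label is not in the table at all. */
const struct sym with_local_label[] = {
	{ 0xc00000000000922cULL, "system_call" },
};
```

Looking up the return address 0xc000000000009264 in the first table resolves to syscall_exit at offset 0, while the second table attributes the same address to system_call at offset 0x38, which matches the two traces quoted in the commit message.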
  2. 23 Jan, 2015 1 commit
  3. 10 Nov, 2014 2 commits
  4. 31 Oct, 2014 1 commit
    • powerpc: do_notify_resume can be called with bad thread_info flags argument · 808be314
      Anton Blanchard committed
      Back in 7230c564 ("powerpc: Rework lazy-interrupt handling") we
      added a call out to restore_interrupts() (written in C) before calling
      do_notify_resume:
      
              bl      restore_interrupts
              addi    r3,r1,STACK_FRAME_OVERHEAD
              bl      do_notify_resume
      
      Unfortunately do_notify_resume takes two arguments, the second one
      being the thread_info flags:
      
      void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)
      
      We do populate r4 (the second argument) earlier, but
      restore_interrupts() is free to muck it up all it wants. My guess is
      the gcc compiler gods shone down on us and its register allocator
      never used r4. Sometimes, rarely, luck is on our side.
      
      LLVM on the other hand did trample r4.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
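The hazard in the commit above is a volatile argument register being loaded before an intervening C call. A minimal sketch in plain C, modelling r3/r4 as struct fields, is shown here. All names are hypothetical, the clobber value is arbitrary, and reloading the flags after the call is just one possible remedy, not necessarily what the actual patch does.

```c
#include <assert.h>

#define TIF_SIGPENDING_MODEL 0x2UL	/* hypothetical flag value */

/* Toy machine state: r3 and r4 carry the first two arguments. */
struct abi_cpu {
	unsigned long r3;
	unsigned long r4;
	unsigned long ti_flags;	/* stands in for thread_info flags */
};

/* A C callee may use any volatile register; this one scribbles on r4. */
void restore_interrupts_model(struct abi_cpu *cpu)
{
	cpu->r4 = 0xdeadbeefUL;
}

/* Returns the thread_info_flags value the callee would see in r4. */
unsigned long do_notify_resume_model(const struct abi_cpu *cpu)
{
	return cpu->r4;
}

/* Problematic order: r4 is populated early, then a C call intervenes. */
unsigned long notify_buggy(struct abi_cpu *cpu)
{
	cpu->r4 = cpu->ti_flags;
	restore_interrupts_model(cpu);
	return do_notify_resume_model(cpu);
}

/* One possible remedy: (re)load r4 after the intervening call. */
unsigned long notify_reloaded(struct abi_cpu *cpu)
{
	restore_interrupts_model(cpu);
	cpu->r4 = cpu->ti_flags;
	return do_notify_resume_model(cpu);
}
```

Whether the buggy order misbehaves in practice depends entirely on the compiler's register allocation for the callee, which is why the problem stayed hidden with gcc and surfaced with LLVM.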
  5. 05 Aug, 2014 1 commit
  6. 28 Jul, 2014 1 commit
  7. 11 Jun, 2014 1 commit
    • powerpc: Correct DSCR during TM context switch · 96d01610
      Sam Bobroff committed
      Correct the DSCR SPR becoming temporarily corrupted if a task is
      context switched during a transaction.
      
      The problem occurs while suspending the task and is caused by saving
      the DSCR to thread.dscr after it has already been set to the CPU's
      default value:
      
      __switch_to() calls __switch_to_tm()
      	which calls tm_reclaim_task()
      	which calls tm_reclaim_thread()
      	which calls tm_reclaim()
      		where the DSCR is set to the CPU's default
      __switch_to() calls _switch()
      		where thread.dscr is set to the DSCR
      
      When the task is resumed, its transaction will be doomed (as usual)
      and the DSCR SPR will be corrupted, although the checkpointed value
      will be correct. Therefore the DSCR will be immediately corrected by
      the transaction aborting, unless it has been suspended. In that case
      the incorrect value can be seen by the task until it resumes the
      transaction.
      
      The fix is to treat the DSCR similarly to the TAR and save it early
      in __switch_to().
      
      A program exposing the problem is added to the kernel self tests as:
      tools/testing/selftests/powerpc/tm/tm-resched-dscr.
      Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
      CC: <stable@vger.kernel.org> [v3.10+]
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
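The ordering problem in the commit above can be modelled in a few lines of portable C. This is only a sketch: the structs, the function names and the default value are hypothetical stand-ins for the SPR, thread.dscr and the reclaim path, not kernel code.

```c
#include <assert.h>

#define CPU_DEFAULT_DSCR_MODEL 0UL	/* hypothetical CPU default */

struct dscr_cpu { unsigned long dscr; };	/* models the DSCR SPR */
struct dscr_thread { unsigned long dscr; };	/* models thread.dscr */

/* Models tm_reclaim(): leaves the SPR holding the CPU's default value. */
void tm_reclaim_model(struct dscr_cpu *cpu)
{
	cpu->dscr = CPU_DEFAULT_DSCR_MODEL;
}

/* Old order: reclaim first, then the late save captures the default. */
void switch_out_buggy(struct dscr_cpu *cpu, struct dscr_thread *t)
{
	tm_reclaim_model(cpu);
	t->dscr = cpu->dscr;
}

/* Fixed order: save the DSCR early, before reclaim can overwrite it. */
void switch_out_fixed(struct dscr_cpu *cpu, struct dscr_thread *t)
{
	t->dscr = cpu->dscr;
	tm_reclaim_model(cpu);
}
```

With the SPR holding a task value of 5, the buggy order stores the default (0) in the thread structure, while the fixed order preserves 5.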
  8. 28 May, 2014 1 commit
    • powerpc: Fix regression of per-CPU DSCR setting · 1739ea9e
      Sam Bobroff committed
      Since commit "efcac658 powerpc: Per process DSCR + some fixes (try#4)"
      it is no longer possible to set the DSCR on a per-CPU basis.
      
      The old behaviour was to manipulate the DSCR SPR directly but this is no
      longer sufficient: the value is quickly overwritten by context switching.
      
      This patch stores the per-CPU DSCR value in a kernel variable rather than
      directly in the SPR and it is used whenever a process has not set the DSCR
      itself. The sysfs interface (/sys/devices/system/cpu/cpuN/dscr) is unchanged.
      
      Writes to the old global default (/sys/devices/system/cpu/dscr_default)
      now set all of the per-CPU values and reads return the last written value.
      
      The new per-CPU default is added to the paca_struct and is used everywhere
      outside of sysfs.c instead of the old global default.
      Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
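The selection rule described in the commit above might look roughly like the following model. It assumes a fixed number of CPUs and uses hypothetical names (`pacas`, `store_dscr_default`, `dscr_for`); the `dscr_inherit` field stands for "the process has set the DSCR itself". It is a sketch of the described behaviour, not the kernel implementation.

```c
#include <assert.h>

#define NR_CPUS_MODEL 4	/* hypothetical CPU count */

struct paca_model { unsigned long dscr_default; };
struct percpu_thread { unsigned long dscr; int dscr_inherit; };

struct paca_model pacas[NR_CPUS_MODEL];
unsigned long dscr_default_last_written;

/* Models a write to dscr_default: sets every per-CPU value. */
void store_dscr_default(unsigned long val)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		pacas[cpu].dscr_default = val;
	dscr_default_last_written = val;	/* reads return this value */
}

/* Models a write to cpuN/dscr: only that CPU's default changes. */
void store_cpu_dscr(int cpu, unsigned long val)
{
	pacas[cpu].dscr_default = val;
}

/* Value to load into the SPR when thread t runs on the given CPU. */
unsigned long dscr_for(const struct percpu_thread *t, int cpu)
{
	return t->dscr_inherit ? t->dscr : pacas[cpu].dscr_default;
}
```

Because the value is chosen at switch time from a kernel variable, a per-CPU setting survives context switches instead of being overwritten in the SPR.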
  9. 23 Apr, 2014 6 commits
  10. 15 Jan, 2014 2 commits
    • powerpc: Don't corrupt transactional state when using FP/VMX in kernel · d31626f7
      Paul Mackerras committed
      Currently, when we have a process using the transactional memory
      facilities on POWER8 (that is, the processor is in transactional
      or suspended state), and the process enters the kernel and the
      kernel then uses the floating-point or vector (VMX/Altivec) facility,
      we end up corrupting the user-visible FP/VMX/VSX state.  This
      happens, for example, if a page fault causes a copy-on-write
      operation, because the copy_page function will use VMX to do the
      copy on POWER8.  The test program below demonstrates the bug.
      
      The bug happens because when FP/VMX state for a transactional process
      is stored in the thread_struct, we store the checkpointed state in
      .fp_state/.vr_state and the transactional (current) state in
      .transact_fp/.transact_vr.  However, when the kernel wants to use
      FP/VMX, it calls enable_kernel_fp() or enable_kernel_altivec(),
      which saves the current state in .fp_state/.vr_state.  Furthermore,
      when we return to the user process we return with FP/VMX/VSX
      disabled.  The next time the process uses FP/VMX/VSX, we don't know
      which set of state (the current register values, .fp_state/.vr_state,
      or .transact_fp/.transact_vr) we should be using, since we have no
      way to tell if we are still in the same transaction, and if not,
      whether the previous transaction succeeded or failed.
      
      Thus it is necessary to strictly adhere to the rule that if FP has
      been enabled at any point in a transaction, we must keep FP enabled
      for the user process with the current transactional state in the
      FP registers, until we detect that it is no longer in a transaction.
      Similarly for VMX; once enabled it must stay enabled until the
      process is no longer transactional.
      
      In order to keep this rule, we add a new thread_info flag which we
      test when returning from the kernel to userspace, called TIF_RESTORE_TM.
      This flag indicates that there is FP/VMX/VSX state to be restored
      before entering userspace, and when it is set the .tm_orig_msr field
      in the thread_struct indicates what state needs to be restored.
      The restoration is done by restore_tm_state().  The TIF_RESTORE_TM
      bit is set by new giveup_fpu/altivec_maybe_transactional helpers,
      which are called from enable_kernel_fp/altivec, giveup_vsx, and
      flush_fp/altivec_to_thread instead of giveup_fpu/altivec.
      
      The other thing to be done is to get the transactional FP/VMX/VSX
      state from .fp_state/.vr_state when doing reclaim, if that state
      has been saved there by giveup_fpu/altivec_maybe_transactional.
      Having done this, we set the FP/VMX bit in the thread's MSR after
      reclaim to indicate that that part of the state is now valid
      (having been reclaimed from the processor's checkpointed state).
      
      Finally, in the signal handling code, we move the clearing of the
      transactional state bits in the thread's MSR a bit earlier, before
      calling flush_fp_to_thread(), so that we don't unnecessarily set
      the TIF_RESTORE_TM bit.
      
      This is the test program:
      
      /* Michael Neuling 4/12/2013
       *
       * See if the altivec state is leaked out of an aborted transaction due to
       * kernel vmx copy loops.
       *
       *   gcc -m64 htm_vmxcopy.c -o htm_vmxcopy
       *
       */
      
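      /*
       * The #include and #define lines are missing from this listing. The
       * ones below are a reconstruction (an assumption, not the author's
       * original text): the headers the code needs, plus raw encodings of
       * the POWER8 TM instructions so no special assembler support is needed.
       */
      #include <assert.h>
      #include <stdint.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      #include <unistd.h>
      #include <sys/mman.h>
      
      /* We don't use all of these, but for reference: */
      #define TBEGIN		".long 0x7C00051D ;"	/* tbegin. */
      #define TEND		".long 0x7C00055D ;"	/* tend. */
      #define TSUSPEND	".long 0x7C0005DD ;"	/* tsuspend. */
      #define TRESUME		".long 0x7C2005DD ;"	/* tresume. */
      #define TABORT		".long 0x7C00071D ;"	/* tabort. 0 */
      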
      int main(int argc, char *argv[])
      {
      	long double vecin = 1.3;
      	long double vecout;
      	unsigned long pgsize = getpagesize();
      	int i;
      	int fd;
      	int size = pgsize*16;
      	char tmpfile[] = "/tmp/page_faultXXXXXX";
      	char buf[pgsize];
      	char *a;
      	uint64_t aborted = 0;
      
      	fd = mkstemp(tmpfile);
      	assert(fd >= 0);
      
      	memset(buf, 0, pgsize);
      	for (i = 0; i < size; i += pgsize)
      		assert(write(fd, buf, pgsize) == pgsize);
      
      	unlink(tmpfile);
      
      	a = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);
      	assert(a != MAP_FAILED);
      
      	asm __volatile__(
      		"lxvd2x 40,0,%[vecinptr] ; " // set 40 to initial value
      		TBEGIN
      		"beq	3f ;"
      		TSUSPEND
      		"xxlxor 40,40,40 ; " // set 40 to 0
      		"std	5, 0(%[map]) ;" // cause kernel vmx copy page
      		TABORT
      		TRESUME
      		TEND
      		"li	%[res], 0 ;"
      		"b	5f ;"
      		"3: ;" // Abort handler
      		"li	%[res], 1 ;"
      		"5: ;"
      		"stxvd2x 40,0,%[vecoutptr] ; "
      		: [res]"=r"(aborted)
      		: [vecinptr]"r"(&vecin),
      		  [vecoutptr]"r"(&vecout),
      		  [map]"r"(a)
      		: "memory", "r0", "r3", "r4", "r5", "r6", "r7");
      
      	if (aborted && (vecin != vecout)){
      		printf("FAILED: vector state leaked on abort %f != %f\n",
      		       (double)vecin, (double)vecout);
      		exit(1);
      	}
      
      	munmap(a, size);
      
      	close(fd);
      
      	printf("PASSED!\n");
      	return 0;
      }
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
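The TIF_RESTORE_TM rule from the commit above can be illustrated with a heavily simplified model. The bit values, struct layout and function names here are hypothetical, and real register save/restore is reduced to toggling an MSR bit. The sketch only shows the control flow: giving up FP inside a transaction records what must be restored, and the return-to-user path re-enables it.

```c
#include <assert.h>

#define TIF_RESTORE_TM_MODEL	(1UL << 0)	/* hypothetical flag bit */
#define MSR_FP_MODEL		(1UL << 13)	/* model FP-enable bit */
#define MSR_VEC_MODEL		(1UL << 25)	/* model VMX-enable bit */

struct tm_thread {
	unsigned long ti_flags;		/* thread_info flags */
	unsigned long tm_orig_msr;	/* what to restore on exit */
	unsigned long msr;		/* user MSR */
	int in_transaction;		/* transactional or suspended */
};

/* Kernel wants the FPU: take it, but remember a transactional owner. */
void giveup_fpu_maybe_transactional_model(struct tm_thread *t)
{
	if (t->in_transaction && (t->msr & MSR_FP_MODEL)) {
		t->tm_orig_msr = t->msr;
		t->ti_flags |= TIF_RESTORE_TM_MODEL;
	}
	t->msr &= ~MSR_FP_MODEL;
}

/* Re-enable whatever facilities were live in the transaction. */
void restore_tm_state_model(struct tm_thread *t)
{
	t->ti_flags &= ~TIF_RESTORE_TM_MODEL;
	t->msr |= t->tm_orig_msr & (MSR_FP_MODEL | MSR_VEC_MODEL);
}

/* Exit path: the flag is tested before entering userspace. */
void return_to_user_model(struct tm_thread *t)
{
	if (t->ti_flags & TIF_RESTORE_TM_MODEL)
		restore_tm_state_model(t);
}
```

A transactional task therefore returns to userspace with FP still enabled, while a non-transactional task keeps the usual lazy behaviour of returning with FP disabled.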
    • Move processing of MCE queued event out from syscall exit path. · 30c82635
      Mahesh Salgaonkar committed
      Hugh Dickins reported that b5ff4211
      "powerpc/book3s: Queue up and process delayed MCE events" breaks the
      PowerMac G5 boot. This patch fixes it by moving the MCE event processing
      away from syscall exit, which was the wrong place for it to begin with,
      and using the irq work framework to delay processing of MCE events.
      
      Reported-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  11. 05 Dec, 2013 1 commit
  12. 06 Nov, 2013 1 commit
  13. 11 Oct, 2013 2 commits
  14. 27 Aug, 2013 1 commit
  15. 14 Aug, 2013 2 commits
  16. 09 Aug, 2013 2 commits
    • powerpc: Save the TAR register earlier · c2d52644
      Michael Neuling committed
      This moves us to save the Target Address Register (TAR) earlier in
      __switch_to.  It introduces a new function save_tar() to do this.
      
      We need to save the TAR earlier as we will overwrite it in the transactional
      memory reclaim/recheckpoint path.  We are going to do this in a subsequent
      patch which will fix saving the TAR register when it's modified inside a
      transaction.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Cc: <stable@vger.kernel.org> [v3.10]
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Fix context switch DSCR on POWER8 · 2517617e
      Michael Neuling committed
      POWER8 allows the DSCR to be accessed directly from userspace via a new SPR
      number, 0x3, rather than 0x11 (SPR number 0x11 is still used on POWER8 but,
      as on POWER7, is only accessible in HV and OS modes).  Currently, we allow
      this by setting the H/FSCR DSCR bit on boot.
      
      Unfortunately this doesn't work, as the kernel needs to see the DSCR change so
      that it knows to no longer restore the system wide version of DSCR on context
      switch (ie. to set thread.dscr_inherit).
      
      This clears the H/FSCR DSCR bit initially.  If a process then accesses the DSCR
      (via SPR 0x3), it'll trap into the kernel where we set thread.dscr_inherit in
      facility_unavailable_exception().
      
      We also change _switch() so that we set or clear the H/FSCR DSCR bit based on
      the thread.dscr_inherit.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Cc: <stable@vger.kernel.org> [v3.10]
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
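The trap-then-enable scheme in the commit above can be sketched as a small state machine. The bit position and all names are hypothetical. Re-enabling the facility bit inside the trap handler is an assumption of this sketch: the commit message only says the handler sets thread.dscr_inherit and that _switch() sets or clears the bit.

```c
#include <assert.h>

#define FSCR_DSCR_MODEL (1UL << 2)	/* hypothetical facility bit */

struct fscr_cpu { unsigned long fscr; };
struct fscr_thread { int dscr_inherit; };

/*
 * Models a userspace access to the DSCR via SPR 0x3.
 * Returns 1 if the access trapped into the kernel, 0 if it went direct.
 */
int user_access_dscr(struct fscr_cpu *cpu, struct fscr_thread *t)
{
	if (cpu->fscr & FSCR_DSCR_MODEL)
		return 0;

	/* Models facility_unavailable_exception(): note the access. */
	t->dscr_inherit = 1;
	cpu->fscr |= FSCR_DSCR_MODEL;	/* assumption: enable for retry */
	return 1;
}

/* Models _switch(): the bit follows the incoming thread's dscr_inherit. */
void switch_to_model(struct fscr_cpu *cpu, const struct fscr_thread *next)
{
	if (next->dscr_inherit)
		cpu->fscr |= FSCR_DSCR_MODEL;
	else
		cpu->fscr &= ~FSCR_DSCR_MODEL;
}
```

The first access by a thread traps once so the kernel can observe it; later accesses, and later time slices of the same thread, proceed without trapping, while threads that never touched the DSCR keep the bit clear.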
  17. 20 Jun, 2013 1 commit
    • powerpc: Restore dbcr0 on user space exit · 13d543cd
      Bharat Bhushan committed
      On BookE, (Branch Taken + Single Step) is the same as Branch Taken
      on BookS, and in Linux we simulate the BookS behavior on BookE as well.
      When doing so, in Branch Taken handling we want to set DBCR0_IC, but
      we update current->thread.dbcr0 and not the hardware DBCR0.
      
      On 64-bit, current->thread.dbcr0 (and the other debug registers)
      is synchronized ONLY in the context switch flow. So if, after handling
      Branch Taken in the debug exception, we return to user space
      without a context switch, the single stepping change (DBCR0_ICMP)
      is not written to the hardware DBCR0 and the Instruction Complete
      exception does not happen.
      
      This fix makes ptrace work reliably on BookE PowerPC.
      
      lmbench latency test (lat_syscall) results (they vary a little
      on each run):
      
      1) ./lat_syscall <action> /dev/shm/uImage
      
      action:	Open	read	write	stat	fstat	null
      Before:	3.8618	0.2017	0.2851	1.6789	0.2256	0.0856
      After:	3.8580	0.2017	0.2851	1.6955	0.2255	0.0856
      
      2) ./lat_syscall -P 2 -N 10 <action> /dev/shm/uImage
      action:	Open	read	write	stat	fstat	null
      Before:	4.1388	0.2238	0.3066	1.7106	0.2256	0.0856
      After:	4.1413	0.2236	0.3062	1.7107	0.2256	0.0856
      
      [ Slightly modified to avoid extra branch in the fast path
        on Book3S and fix build on all non-BookE 64-bit -- BenH
      ]
      Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
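The gap between the software copy and the hardware register in the commit above can be modelled as follows. The bit values and names are hypothetical. Gating the write on a "debug mode enabled" bit is an assumption of this sketch, suggested by the note about keeping the fast path cheap, not something the commit message spells out.

```c
#include <assert.h>

#define DBCR0_IDM_MODEL	(1UL << 30)	/* model bit: debug mode enabled */
#define DBCR0_IC_MODEL	(1UL << 27)	/* model bit: instruction complete */

struct booke_thread { unsigned long dbcr0; };	/* current->thread.dbcr0 */
struct booke_cpu { unsigned long dbcr0; };	/* hardware DBCR0 */

/* Branch Taken handling: only the software copy is updated. */
void branch_taken_handler_model(struct booke_thread *t)
{
	t->dbcr0 |= DBCR0_IC_MODEL;
}

/* Old exit path: nothing syncs the copy without a context switch. */
void exit_to_user_old(struct booke_cpu *cpu, const struct booke_thread *t)
{
	(void)cpu;
	(void)t;
}

/* Fixed exit path: write the thread's value back when debugging is on. */
void exit_to_user_fixed(struct booke_cpu *cpu, const struct booke_thread *t)
{
	if (t->dbcr0 & DBCR0_IDM_MODEL)
		cpu->dbcr0 = t->dbcr0;
}
```

With the old exit path the hardware register never sees the instruction-complete bit, so the single step silently does not fire; the fixed path makes the change visible before userspace resumes.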
  18. 10 Jun, 2013 1 commit
  19. 01 Jun, 2013 1 commit
  20. 24 May, 2013 1 commit
  21. 14 May, 2013 2 commits
  22. 02 May, 2013 2 commits
  23. 15 Apr, 2013 2 commits
    • powerpc: add a missing label in resume_kernel · d8b92292
      Kevin Hao committed
      Label 0 was omitted in patch a9c4e541 (powerpc/kprobe: Complete
      kprobe and migrate exception frame). As a result, the kernel
      branches to an undetermined address if there really is a conflict when
      updating the thread flags.
      Signed-off-by: Kevin Hao <haokexin@gmail.com>
      Cc: stable@vger.kernel.org
      Acked-By: Tiejun Chen <tiejun.chen@windriver.com>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
    • powerpc: Fix audit crash due to save/restore PPR changes · 05e38e5d
      Alistair Popple committed
      The current mainline crashes when hitting userspace with the following:
      
      kernel BUG at kernel/auditsc.c:1769!
      cpu 0x1: Vector: 700 (Program Check) at [c000000023883a60]
          pc: c0000000001047a8: .__audit_syscall_entry+0x38/0x130
          lr: c00000000000ed64: .do_syscall_trace_enter+0xc4/0x270
          sp: c000000023883ce0
         msr: 8000000000029032
        current = 0xc000000023800000
        paca    = 0xc00000000f080380   softe: 0        irq_happened: 0x01
          pid   = 1629, comm = start_udev
      kernel BUG at kernel/auditsc.c:1769!
      enter ? for help
      [c000000023883d80] c00000000000ed64 .do_syscall_trace_enter+0xc4/0x270
      [c000000023883e30] c000000000009b08 syscall_dotrace+0xc/0x38
       --- Exception: c00 (System Call) at 0000008010ec50dc
      
      Bisecting found the following patch caused it:
      
      commit 44e9309f
      Author: Haren Myneni <haren@linux.vnet.ibm.com>
      powerpc: Implement PPR save/restore
      
      It was found this patch corrupted r9 when calling
      SET_DEFAULT_THREAD_PPR()
      
      Using r10 as a scratch register instead of r9 solved the problem.
      Signed-off-by: Alistair Popple <alistair@popple.id.au>
      Acked-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
  24. 11 Apr, 2013 1 commit
  25. 15 Feb, 2013 1 commit
  26. 08 Feb, 2013 1 commit
  27. 29 Jan, 2013 1 commit