1. 10 Nov 2014, 1 commit
  2. 31 Oct 2014, 1 commit
    • powerpc: do_notify_resume can be called with bad thread_info flags argument · 808be314
      Authored by Anton Blanchard
      Back in 7230c564 ("powerpc: Rework lazy-interrupt handling") we
      added a call out to restore_interrupts() (written in C) before calling
      do_notify_resume:
      
              bl      restore_interrupts
              addi    r3,r1,STACK_FRAME_OVERHEAD
              bl      do_notify_resume
      
      Unfortunately do_notify_resume takes two arguments, the second one
      being the thread_info flags:
      
      void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)
      
      We do populate r4 (the second argument) earlier, but
      restore_interrupts() is free to muck it up all it wants. My guess is
      the gcc compiler gods shone down on us and its register allocator
      never used r4. Sometimes, rarely, luck is on our side.
      
      LLVM on the other hand did trample r4.
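      The fix is to carry the flags across the C call in a non-volatile
      register, roughly like this (a sketch; the exact register is
      illustrative here):
      
              mr      r30,r4          /* non-volatile: survives the C call */
              bl      restore_interrupts
              mr      r4,r30          /* put the thread_info flags back */
              addi    r3,r1,STACK_FRAME_OVERHEAD
              bl      do_notify_resume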
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  3. 05 Aug 2014, 1 commit
  4. 28 Jul 2014, 1 commit
  5. 11 Jun 2014, 1 commit
    • powerpc: Correct DSCR during TM context switch · 96d01610
      Authored by Sam Bobroff
      Correct the DSCR SPR becoming temporarily corrupted if a task is
      context switched during a transaction.
      
      The problem occurs while suspending the task and is caused by saving
      the DSCR to thread.dscr after it has already been set to the CPU's
      default value:
      
      __switch_to() calls __switch_to_tm()
      	which calls tm_reclaim_task()
      	which calls tm_reclaim_thread()
      	which calls tm_reclaim()
      		where the DSCR is set to the CPU's default
      __switch_to() calls _switch()
      		where thread.dscr is set to the DSCR
      
      When the task is resumed, its transaction will be doomed (as usual)
      and the DSCR SPR will be corrupted, although the checkpointed value
      will be correct. Therefore the DSCR will be immediately corrected by
      the transaction aborting, unless it has been suspended. In that case
      the incorrect value can be seen by the task until it resumes the
      transaction.
      
      The fix is to treat the DSCR similarly to the TAR and save it early
      in __switch_to().
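      As a sketch, the idea looks like this (hypothetical helper name; the
      patch folds the DSCR into the existing early-save path next to the TAR):
      
              static inline void save_early_sprs(struct thread_struct *prev)
              {
                      /* Capture the DSCR before tm_reclaim() resets it to
                       * the CPU default, mirroring the early TAR save. */
                      if (cpu_has_feature(CPU_FTR_DSCR))
                              prev->dscr = mfspr(SPRN_DSCR);
              }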
      
      A program exposing the problem is added to the kernel self tests as:
      tools/testing/selftests/powerpc/tm/tm-resched-dscr.
      Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
      CC: <stable@vger.kernel.org> [v3.10+]
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  6. 28 May 2014, 1 commit
    • powerpc: Fix regression of per-CPU DSCR setting · 1739ea9e
      Authored by Sam Bobroff
      Since commit "efcac658 powerpc: Per process DSCR + some fixes (try#4)"
      it is no longer possible to set the DSCR on a per-CPU basis.
      
      The old behaviour was to manipulate the DSCR SPR directly, but this
      is no longer sufficient: the value is quickly overwritten by context
      switching.
      
      This patch stores the per-CPU DSCR value in a kernel variable rather than
      directly in the SPR and it is used whenever a process has not set the DSCR
      itself. The sysfs interface (/sys/devices/system/cpu/cpuN/dscr) is unchanged.
      
      Writes to the old global default (/sys/devices/system/cpu/dscr_default)
      now set all of the per-CPU values and reads return the last written value.
      
      The new per-CPU default is added to the paca_struct and is used everywhere
      outside of sysfs.c instead of the old global default.
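      Condensed to a sketch, the context switch then chooses the value like
      this (the paca field name is an assumption based on the description):
      
              /* A thread that set its own DSCR wins; otherwise use this
               * CPU's default, kept in the paca rather than the SPR. */
              u64 dscr = new->thread.dscr_inherit ? new->thread.dscr
                                                  : get_paca()->dscr_default;
              mtspr(SPRN_DSCR, dscr);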
      Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  7. 23 Apr 2014, 6 commits
  8. 15 Jan 2014, 2 commits
    • powerpc: Don't corrupt transactional state when using FP/VMX in kernel · d31626f7
      Authored by Paul Mackerras
      Currently, when we have a process using the transactional memory
      facilities on POWER8 (that is, the processor is in transactional
      or suspended state), and the process enters the kernel and the
      kernel then uses the floating-point or vector (VMX/Altivec) facility,
      we end up corrupting the user-visible FP/VMX/VSX state.  This
      happens, for example, if a page fault causes a copy-on-write
      operation, because the copy_page function will use VMX to do the
      copy on POWER8.  The test program below demonstrates the bug.
      
      The bug happens because when FP/VMX state for a transactional process
      is stored in the thread_struct, we store the checkpointed state in
      .fp_state/.vr_state and the transactional (current) state in
      .transact_fp/.transact_vr.  However, when the kernel wants to use
      FP/VMX, it calls enable_kernel_fp() or enable_kernel_altivec(),
      which saves the current state in .fp_state/.vr_state.  Furthermore,
      when we return to the user process we return with FP/VMX/VSX
      disabled.  The next time the process uses FP/VMX/VSX, we don't know
      which set of state (the current register values, .fp_state/.vr_state,
      or .transact_fp/.transact_vr) we should be using, since we have no
      way to tell if we are still in the same transaction, and if not,
      whether the previous transaction succeeded or failed.
      
      Thus it is necessary to strictly adhere to the rule that if FP has
      been enabled at any point in a transaction, we must keep FP enabled
      for the user process with the current transactional state in the
      FP registers, until we detect that it is no longer in a transaction.
      Similarly for VMX; once enabled it must stay enabled until the
      process is no longer transactional.
      
      In order to keep this rule, we add a new thread_info flag which we
      test when returning from the kernel to userspace, called TIF_RESTORE_TM.
      This flag indicates that there is FP/VMX/VSX state to be restored
      before entering userspace, and when it is set the .tm_orig_msr field
      in the thread_struct indicates what state needs to be restored.
      The restoration is done by restore_tm_state().  The TIF_RESTORE_TM
      bit is set by new giveup_fpu/altivec_maybe_transactional helpers,
      which are called from enable_kernel_fp/altivec, giveup_vsx, and
      flush_fp/altivec_to_thread instead of giveup_fpu/altivec.
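      Condensed to a sketch, the exit-to-user check amounts to:
      
              /* Returning to userspace: if kernel FP/VMX use forced a
               * reclaim, put the transactional state back first. */
              if (test_thread_flag(TIF_RESTORE_TM))
                      restore_tm_state(regs);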
      
      The other thing to be done is to get the transactional FP/VMX/VSX
      state from .fp_state/.vr_state when doing reclaim, if that state
      has been saved there by giveup_fpu/altivec_maybe_transactional.
      Having done this, we set the FP/VMX bit in the thread's MSR after
      reclaim to indicate that that part of the state is now valid
      (having been reclaimed from the processor's checkpointed state).
      
      Finally, in the signal handling code, we move the clearing of the
      transactional state bits in the thread's MSR a bit earlier, before
      calling flush_fp_to_thread(), so that we don't unnecessarily set
      the TIF_RESTORE_TM bit.
      
      This is the test program:
      
      /* Michael Neuling 4/12/2013
       *
       * See if the altivec state is leaked out of an aborted transaction due to
       * kernel vmx copy loops.
       *
       *   gcc -m64 htm_vmxcopy.c -o htm_vmxcopy
       *
       */
      
      /* We don't use all of these, but for reference: */
      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      #include <stdint.h>
      #include <unistd.h>
      #include <assert.h>
      #include <sys/mman.h>
      
      /* Raw opcodes for the TM instructions (tbegin., tend., tsuspend.,
       * tresume., tabort.), for assemblers that lack the mnemonics: */
      #define TBEGIN   ".long 0x7C00051D ;"
      #define TEND     ".long 0x7C00055D ;"
      #define TSUSPEND ".long 0x7C0005DD ;"
      #define TRESUME  ".long 0x7C2005DD ;"
      #define TABORT   ".long 0x7C00071D ;"
      
      int main(int argc, char *argv[])
      {
      	long double vecin = 1.3;
      	long double vecout;
      	unsigned long pgsize = getpagesize();
      	int i;
      	int fd;
      	int size = pgsize*16;
      	char tmpfile[] = "/tmp/page_faultXXXXXX";
      	char buf[pgsize];
      	char *a;
      	uint64_t aborted = 0;
      
      	fd = mkstemp(tmpfile);
      	assert(fd >= 0);
      
      	memset(buf, 0, pgsize);
      	for (i = 0; i < size; i += pgsize)
      		assert(write(fd, buf, pgsize) == pgsize);
      
      	unlink(tmpfile);
      
      	a = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);
      	assert(a != MAP_FAILED);
      
      	asm __volatile__(
      		"lxvd2x 40,0,%[vecinptr] ; " // set 40 to initial value
      		TBEGIN
      		"beq	3f ;"
      		TSUSPEND
      		"xxlxor 40,40,40 ; " // set 40 to 0
      		"std	5, 0(%[map]) ;" // cause kernel vmx copy page
      		TABORT
      		TRESUME
      		TEND
      		"li	%[res], 0 ;"
      		"b	5f ;"
      		"3: ;" // Abort handler
      		"li	%[res], 1 ;"
      		"5: ;"
      		"stxvd2x 40,0,%[vecoutptr] ; "
      		: [res]"=r"(aborted)
      		: [vecinptr]"r"(&vecin),
      		  [vecoutptr]"r"(&vecout),
      		  [map]"r"(a)
      		: "memory", "r0", "r3", "r4", "r5", "r6", "r7");
      
      	if (aborted && (vecin != vecout)){
      		printf("FAILED: vector state leaked on abort %f != %f\n",
      		       (double)vecin, (double)vecout);
      		exit(1);
      	}
      
      	munmap(a, size);
      
      	close(fd);
      
      	printf("PASSED!\n");
      	return 0;
      }
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • Move processing of MCE queued event out from syscall exit path · 30c82635
      Authored by Mahesh Salgaonkar
      Hugh Dickins reported that b5ff4211 ("powerpc/book3s: Queue up and
      process delayed MCE events") breaks the PowerMac G5 boot. This patch
      fixes it by moving the MCE event processing away from the syscall
      exit path, which was the wrong place for it in the first place, and
      using the irq work framework to defer processing of the MCE event.
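      A minimal sketch of the irq_work shape (only the irq_work API names
      are real; the MCE-specific names are illustrative):
      
              #include <linux/irq_work.h>
      
              static void mce_process_work(struct irq_work *w)
              {
                      /* Process the queued MCE events shortly after the
                       * machine check, instead of on syscall exit. */
              }
              static struct irq_work mce_event_irq_work = {
                      .func = mce_process_work,
              };
      
              /* From the machine check handler: */
              irq_work_queue(&mce_event_irq_work);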
      
      Reported-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  9. 05 Dec 2013, 1 commit
  10. 06 Nov 2013, 1 commit
  11. 11 Oct 2013, 2 commits
  12. 27 Aug 2013, 1 commit
  13. 14 Aug 2013, 2 commits
  14. 09 Aug 2013, 2 commits
    • powerpc: Save the TAR register earlier · c2d52644
      Authored by Michael Neuling
      This moves us to save the Target Address Register (TAR) earlier in
      __switch_to().  It introduces a new function save_tar() to do this.
      
      We need to save the TAR earlier as we will overwrite it in the transactional
      memory reclaim/recheckpoint path.  We are going to do this in a subsequent
      patch which will fix saving the TAR register when it's modified inside a
      transaction.
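      In C terms the helper amounts to (a sketch; the in-tree version may
      differ in detail):
      
              static inline void save_tar(struct thread_struct *prev)
              {
                      /* Capture the TAR before the TM reclaim/recheckpoint
                       * path gets a chance to overwrite it. */
                      prev->tar = mfspr(SPRN_TAR);
              }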
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Cc: <stable@vger.kernel.org> [v3.10]
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Fix context switch DSCR on POWER8 · 2517617e
      Authored by Michael Neuling
      POWER8 allows the DSCR to be accessed directly from userspace via a
      new SPR number, 0x3 (rather than 0x11; SPR 0x11 is still present on
      POWER8 but, as on POWER7, is only accessible in HV and OS modes).
      Currently, we allow this by setting the H/FSCR DSCR bit on boot.
      
      Unfortunately this doesn't work, as the kernel needs to see the DSCR change so
      that it knows to no longer restore the system wide version of DSCR on context
      switch (ie. to set thread.dscr_inherit).
      
      This clears the H/FSCR DSCR bit initially.  If a process then accesses the DSCR
      (via SPR 0x3), it'll trap into the kernel where we set thread.dscr_inherit in
      facility_unavailable_exception().
      
      We also change _switch() so that we set or clear the H/FSCR DSCR bit based on
      the thread.dscr_inherit.
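      Condensed sketch of the trap side (illustrative; the real handler
      covers the other facilities too):
      
              /* The first userspace access to the DSCR via SPR 0x3 traps
               * here because FSCR[DSCR] starts out clear. */
              current->thread.dscr_inherit = 1;
              /* Stop trapping this task's DSCR accesses from now on. */
              mtspr(SPRN_FSCR, mfspr(SPRN_FSCR) | FSCR_DSCR);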
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Cc: <stable@vger.kernel.org> [v3.10]
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  15. 20 Jun 2013, 1 commit
    • powerpc: Restore dbcr0 on user space exit · 13d543cd
      Authored by Bharat Bhushan
      On BookE, (Branch Taken + Single Step) behaves the same as Branch
      Taken does on BookS, and in Linux we simulate the BookS behaviour for
      BookE as well. When doing so, the Branch Taken handling wants to set
      DBCR0_IC, but it updates only current->thread.dbcr0 and not the
      hardware DBCR0.
      
      Now, on 64-bit, current->thread.dbcr0 (and the other debug registers)
      is synchronized to hardware ONLY in the context switch path. So if,
      after handling Branch Taken in the debug exception, we return to user
      space without a context switch, the single-stepping change
      (DBCR0_ICMP) never gets written to the hardware DBCR0 and the
      Instruction Complete exception never happens.
      
      This fixes using ptrace reliably on BookE PowerPC.
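      The shape of the fix, as a C-level sketch (the actual change lives in
      the exception-return assembly):
      
              /* On return to userspace: if debug events are armed, push
               * the software copy into the hardware register. */
              if (current->thread.dbcr0 & DBCR0_IDM)
                      mtspr(SPRN_DBCR0, current->thread.dbcr0);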
      
      lmbench latency test (lat_syscall) results (they vary a little from
      run to run):
      
      1) ./lat_syscall <action> /dev/shm/uImage
      
      action:	Open	read	write	stat	fstat	null
      Before:	3.8618	0.2017	0.2851	1.6789	0.2256	0.0856
      After:	3.8580	0.2017	0.2851	1.6955	0.2255	0.0856
      
      2) ./lat_syscall -P 2 -N 10 <action> /dev/shm/uImage
      action:	Open	read	write	stat	fstat	null
      Before:	4.1388	0.2238	0.3066	1.7106	0.2256	0.0856
      After:	4.1413	0.2236	0.3062	1.7107	0.2256	0.0856
      
      [ Slightly modified to avoid extra branch in the fast path
        on Book3S and fix build on all non-BookE 64-bit -- BenH
      ]
      Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  16. 10 Jun 2013, 1 commit
  17. 01 Jun 2013, 1 commit
  18. 24 May 2013, 1 commit
  19. 14 May 2013, 2 commits
  20. 02 May 2013, 2 commits
  21. 15 Apr 2013, 2 commits
  22. 11 Apr 2013, 1 commit
  23. 15 Feb 2013, 1 commit
  24. 08 Feb 2013, 1 commit
  25. 29 Jan 2013, 1 commit
  26. 28 Jan 2013, 1 commit
    • cputime: Generic on-demand virtual cputime accounting · abf917cd
      Authored by Frederic Weisbecker
      If we want to stop the tick beyond idle, we need to be able to
      account cputime without using the tick.
      
      Virtual-based cputime accounting solves that problem by hooking into
      the kernel/user boundaries.
      
      However, implementing CONFIG_VIRT_CPU_ACCOUNTING requires low-level
      arch hooks and involves more overhead. But we already have a generic
      context tracking subsystem, which is required anyway for RCU by archs
      that plan to shut down the tick outside idle.
      
      This patch implements a generic virtual based cputime
      accounting that relies on these generic kernel/user hooks.
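      As a sketch of one such hook (vtime_user_enter() follows the naming
      in this series; the body below is a simplified assumption, and
      account_kernel_span() is a hypothetical helper):
      
              void vtime_user_enter(struct task_struct *tsk)
              {
                      /* Kernel->user boundary: everything since the last
                       * snapshot was kernel time; charge it and restart
                       * the clock. */
                      u64 delta = sched_clock() - tsk->vtime_snap;
                      tsk->vtime_snap += delta;
                      account_kernel_span(tsk, delta);
              }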
      
      There are some upsides of doing this:
      
      - This requires no arch code to implement CONFIG_VIRT_CPU_ACCOUNTING
      if context tracking is already built (already necessary for RCU in full
      tickless mode).
      
      - We can rely on the generic context tracking subsystem to dynamically
      (de)activate the hooks, so that we can switch anytime between virtual
      and tick based accounting. This way we don't have the overhead
      of the virtual accounting when the tick is running periodically.
      
      And one downside:
      
      - There is probably more overhead than a native virtual based cputime
      accounting. But this relies on hooks that are already set anyway.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
  27. 10 Jan 2013, 2 commits
    • powerpc: Implement PPR save/restore · 44e9309f
      Authored by Haren Myneni
      
      When a task enters kernel space, its user-defined priority (PPR) is
      saved into the PACA at the beginning of the first-level exception
      vector and then copied from the PACA to the thread_info in the
      second-level vector. The PPR is restored from the thread_info before
      the task exits kernel space.
      
      POWER7/POWER8 temporarily raise the thread priority during an
      exception, until the code executes an HMT_* call, but they do not
      modify the PPR register itself. So we save the PPR value as soon as a
      scratch register is available and only then call HMT_MEDIUM to raise
      the priority. This feature is supported on POWER7 and later
      processors.
      
      We save/restore the PPR for all exception vectors except system call
      entry; glibc saves and restores it around system calls. So the
      default PPR value (3) is set on system call exit, when the task
      returns to user space.
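      The first-level save, sketched (register choice and save-area offset
      are illustrative):
      
              mfspr   r9,SPRN_PPR                     /* read priority early */
              std     r9,PACA_EXGEN+EX_PPR(r13)       /* stash it in the paca */
              HMT_MEDIUM                              /* now raise priority */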
      Signed-off-by: Haren Myneni <haren@us.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Move branch instruction from ACCOUNT_CPU_USER_ENTRY to caller · 5d75b264
      Authored by Haren Myneni
      
      The first instruction in ACCOUNT_CPU_USER_ENTRY is a 'beq' which
      checks for exceptions coming from kernel mode. The PPR value is saved
      immediately after ACCOUNT_CPU_USER_ENTRY and likewise only applies to
      user-level exceptions, so the branch instruction is moved out into
      the caller code where it can guard both.
      Signed-off-by: Haren Myneni <haren@us.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>