1. 12 1月, 2015 1 次提交
    • M
      powerpc: Work around gcc bug in current_thread_info() · a87e810f
      Michael Ellerman 提交于
      In commit a3e5b356 "powerpc: Don't use local named register variable
      in current_thread_info" Anton changed the way we did current_thread_info()
      to accommodate LLVM, and it was not meant to have any effect elsewhere.
      
      Unfortunately it has exposed a gcc bug, where r1 gets copied into
      another register and then gcc uses that register to restore the toc
      after a function call, even when that register is volatile and has been
      clobbered by the function call.
      
      We could revert Anton's patch, but it's not clear the original code is
      safe either, we may just have been lucky.
      
      The cleanest solution is to just use the existing CURRENT_THREAD_INFO()
      asm macro, and call it using inline asm.
      
      Segher points out we don't need volatile on the asm, if the result of
      the shift is unused it's fine for the compiler to elide it.
      
      Fixes: a3e5b356 ("powerpc: Don't use local named register variable in current_thread_info")
      Reported-by: NAlexander Graf <agraf@suse.de>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      a87e810f
  2. 10 11月, 2014 1 次提交
  3. 15 1月, 2014 2 次提交
    • P
      powerpc: Don't corrupt transactional state when using FP/VMX in kernel · d31626f7
      Paul Mackerras 提交于
      Currently, when we have a process using the transactional memory
      facilities on POWER8 (that is, the processor is in transactional
      or suspended state), and the process enters the kernel and the
      kernel then uses the floating-point or vector (VMX/Altivec) facility,
      we end up corrupting the user-visible FP/VMX/VSX state.  This
      happens, for example, if a page fault causes a copy-on-write
      operation, because the copy_page function will use VMX to do the
      copy on POWER8.  The test program below demonstrates the bug.
      
      The bug happens because when FP/VMX state for a transactional process
      is stored in the thread_struct, we store the checkpointed state in
      .fp_state/.vr_state and the transactional (current) state in
      .transact_fp/.transact_vr.  However, when the kernel wants to use
      FP/VMX, it calls enable_kernel_fp() or enable_kernel_altivec(),
      which saves the current state in .fp_state/.vr_state.  Furthermore,
      when we return to the user process we return with FP/VMX/VSX
      disabled.  The next time the process uses FP/VMX/VSX, we don't know
      which set of state (the current register values, .fp_state/.vr_state,
      or .transact_fp/.transact_vr) we should be using, since we have no
      way to tell if we are still in the same transaction, and if not,
      whether the previous transaction succeeded or failed.
      
      Thus it is necessary to strictly adhere to the rule that if FP has
      been enabled at any point in a transaction, we must keep FP enabled
      for the user process with the current transactional state in the
      FP registers, until we detect that it is no longer in a transaction.
      Similarly for VMX; once enabled it must stay enabled until the
      process is no longer transactional.
      
      In order to keep this rule, we add a new thread_info flag which we
      test when returning from the kernel to userspace, called TIF_RESTORE_TM.
      This flag indicates that there is FP/VMX/VSX state to be restored
      before entering userspace, and when it is set the .tm_orig_msr field
      in the thread_struct indicates what state needs to be restored.
      The restoration is done by restore_tm_state().  The TIF_RESTORE_TM
      bit is set by new giveup_fpu/altivec_maybe_transactional helpers,
      which are called from enable_kernel_fp/altivec, giveup_vsx, and
      flush_fp/altivec_to_thread instead of giveup_fpu/altivec.
      
      The other thing to be done is to get the transactional FP/VMX/VSX
      state from .fp_state/.vr_state when doing reclaim, if that state
      has been saved there by giveup_fpu/altivec_maybe_transactional.
      Having done this, we set the FP/VMX bit in the thread's MSR after
      reclaim to indicate that that part of the state is now valid
      (having been reclaimed from the processor's checkpointed state).
      
      Finally, in the signal handling code, we move the clearing of the
      transactional state bits in the thread's MSR a bit earlier, before
      calling flush_fp_to_thread(), so that we don't unnecessarily set
      the TIF_RESTORE_TM bit.
      
      This is the test program:
      
      /* Michael Neuling 4/12/2013
       *
       * See if the altivec state is leaked out of an aborted transaction due to
       * kernel vmx copy loops.
       *
       *   gcc -m64 htm_vmxcopy.c -o htm_vmxcopy
       *
       */
      
      /* We don't use all of these, but for reference: */
      
      int main(int argc, char *argv[])
      {
      	long double vecin = 1.3;
      	long double vecout;
      	unsigned long pgsize = getpagesize();
      	int i;
      	int fd;
      	int size = pgsize*16;
      	char tmpfile[] = "/tmp/page_faultXXXXXX";
      	char buf[pgsize];
      	char *a;
      	uint64_t aborted = 0;
      
      	fd = mkstemp(tmpfile);
      	assert(fd >= 0);
      
      	memset(buf, 0, pgsize);
      	for (i = 0; i < size; i += pgsize)
      		assert(write(fd, buf, pgsize) == pgsize);
      
      	unlink(tmpfile);
      
      	a = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);
      	assert(a != MAP_FAILED);
      
      	asm __volatile__(
      		"lxvd2x 40,0,%[vecinptr] ; " // set 40 to initial value
      		TBEGIN
      		"beq	3f ;"
      		TSUSPEND
      		"xxlxor 40,40,40 ; " // set 40 to 0
      		"std	5, 0(%[map]) ;" // cause kernel vmx copy page
      		TABORT
      		TRESUME
      		TEND
      		"li	%[res], 0 ;"
      		"b	5f ;"
      		"3: ;" // Abort handler
      		"li	%[res], 1 ;"
      		"5: ;"
      		"stxvd2x 40,0,%[vecoutptr] ; "
      		: [res]"=r"(aborted)
      		: [vecinptr]"r"(&vecin),
      		  [vecoutptr]"r"(&vecout),
      		  [map]"r"(a)
      		: "memory", "r0", "r3", "r4", "r5", "r6", "r7");
      
      	if (aborted && (vecin != vecout)){
      		printf("FAILED: vector state leaked on abort %f != %f\n",
      		       (double)vecin, (double)vecout);
      		exit(1);
      	}
      
      	munmap(a, size);
      
      	close(fd);
      
      	printf("PASSED!\n");
      	return 0;
      }
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      d31626f7
    • P
      powerpc: Reclaim two unused thread_info flag bits · ae39c58c
      Paul Mackerras 提交于
      TIF_PERFMON_WORK and TIF_PERFMON_CTXSW are completely unused.  They
      appear to be related to the old perfmon2 code, which has been
      superseded by the perf_event infrastructure.  This removes their
      definitions so that the bits can be used for other purposes.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      ae39c58c
  4. 21 11月, 2013 1 次提交
  5. 14 11月, 2013 1 次提交
  6. 14 5月, 2013 1 次提交
  7. 08 4月, 2013 1 次提交
  8. 01 10月, 2012 1 次提交
    • A
      sanitize tsk_is_polling() · 16a80163
      Al Viro 提交于
      Make default just return 0.  The current default (checking
      TIF_POLLING_NRFLAG) is taken to architectures that need it;
      ones that don't do polling in their idle threads don't need
      to defined TIF_POLLING_NRFLAG at all.
      
      ia64 defined both TS_POLLING (used by its tsk_is_polling())
      and TIF_POLLING_NRFLAG (not used at all).  Killed the latter...
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      16a80163
  9. 18 9月, 2012 1 次提交
  10. 05 9月, 2012 1 次提交
    • A
      powerpc: Uprobes port to powerpc · 8b7b80b9
      Ananth N Mavinakayanahalli 提交于
      This is the port of uprobes to powerpc. Usage is similar to x86.
      
      [root@xxxx ~]# ./bin/perf probe -x /lib64/libc.so.6 malloc
      Added new event:
        probe_libc:malloc    (on 0xb4860)
      
      You can now use it in all perf tools, such as:
      
      	perf record -e probe_libc:malloc -aR sleep 1
      
      [root@xxxx ~]# ./bin/perf record -e probe_libc:malloc -aR sleep 20
      [ perf record: Woken up 22 times to write data ]
      [ perf record: Captured and wrote 5.843 MB perf.data (~255302 samples) ]
      [root@xxxx ~]# ./bin/perf report --stdio
      ...
      
          69.05%           tar  libc-2.12.so   [.] malloc
          28.57%            rm  libc-2.12.so   [.] malloc
           1.32%  avahi-daemon  libc-2.12.so   [.] malloc
           0.58%          bash  libc-2.12.so   [.] malloc
           0.28%          sshd  libc-2.12.so   [.] malloc
           0.08%    irqbalance  libc-2.12.so   [.] malloc
           0.05%         bzip2  libc-2.12.so   [.] malloc
           0.04%         sleep  libc-2.12.so   [.] malloc
           0.03%    multipathd  libc-2.12.so   [.] malloc
           0.01%      sendmail  libc-2.12.so   [.] malloc
           0.01%     automount  libc-2.12.so   [.] malloc
      
      The trap_nr addition patch is a prereq.
      Signed-off-by: NAnanth N Mavinakayanahalli <ananth@in.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      8b7b80b9
  11. 11 7月, 2012 1 次提交
  12. 02 6月, 2012 2 次提交
  13. 09 5月, 2012 1 次提交
  14. 08 5月, 2012 1 次提交
  15. 09 3月, 2012 1 次提交
  16. 22 11月, 2011 1 次提交
  17. 26 5月, 2011 1 次提交
    • I
      powerpc/ftrace: Implement raw syscall tracepoints on PowerPC · 02424d89
      Ian Munsie 提交于
      This patch implements the raw syscall tracepoints on PowerPC and exports
      them for ftrace syscalls to use.
      
      To minimise reworking existing code, I slightly re-ordered the thread
      info flags such that the new TIF_SYSCALL_TRACEPOINT bit would still fit
      within the 16 bits of the andi. instruction's UI field. The instructions
      in question are in /arch/powerpc/kernel/entry_{32,64}.S to and the
      _TIF_SYSCALL_T_OR_A with the thread flags to see if system call tracing
      is enabled.
      
      In the case of 64bit PowerPC, arch_syscall_addr and
      arch_syscall_match_sym_name are overridden to allow ftrace syscalls to
      work given the unusual system call table structure and symbol names that
      start with a period.
      Signed-off-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      02424d89
  18. 25 5月, 2011 1 次提交
    • P
      powerpc: mmu_gather rework · d6bf29b4
      Peter Zijlstra 提交于
      Fix up powerpc to the new mmu_gather stuff.
      
      PPC has an extra batching queue to RCU free the actual pagetable
      allocations, use the ARCH extentions for that for now.
      
      For the ppc64_tlb_batch, which tracks the vaddrs to unhash from the
      hardware hash-table, keep using per-cpu arrays but flush on context switch
      and use a TLF bit to track the lazy_mmu state.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d6bf29b4
  19. 23 3月, 2011 1 次提交
  20. 14 5月, 2010 1 次提交
  21. 01 2月, 2010 1 次提交
  22. 11 7月, 2009 1 次提交
  23. 23 2月, 2009 1 次提交
  24. 15 2月, 2009 1 次提交
    • Y
      powerpc/44x: Support for 256KB PAGE_SIZE · e1240122
      Yuri Tikhonov 提交于
      This patch adds support for 256KB pages on ppc44x-based boards.
      
      For simplification of implementation with 256KB pages we still assume
      2-level paging. As a side effect this leads to wasting extra memory space
      reserved for PTE tables: only 1/4 of pages allocated for PTEs are
      actually used. But this may be an acceptable trade-off to achieve the
      high performance we have with big PAGE_SIZEs in some applications (e.g.
      RAID).
      
      Also with 256KB PAGE_SIZE we increase THREAD_SIZE up to 32KB to minimize
      the risk of stack overflows in the cases of on-stack arrays, which size
      depends on the page size (e.g. multipage BIOs, NTFS, etc.).
      
      With 256KB PAGE_SIZE we need to decrease the PKMAP_ORDER at least down
      to 9, otherwise all high memory (2 ^ 10 * PAGE_SIZE == 256MB) we'll be
      occupied by PKMAP addresses leaving no place for vmalloc. We do not
      separate PKMAP_ORDER for 256K from 16K/64K PAGE_SIZE here; actually that
      value of 10 in support for 16K/64K had been selected rather intuitively.
      Thus now for all cases of PAGE_SIZE on ppc44x (including the default, 4KB,
      one) we have 512 pages for PKMAP.
      
      Because ELF standard supports only page sizes up to 64K, then you should
      use binutils later than 2.17.50.0.3 with '-zmax-page-size' set to 256K
      for building applications, which are to be run with the 256KB-page sized
      kernel. If using the older binutils, then you should patch them like follows:
      
      	--- binutils/bfd/elf32-ppc.c.orig
      	+++ binutils/bfd/elf32-ppc.c
      
      	-#define ELF_MAXPAGESIZE                0x10000
      	+#define ELF_MAXPAGESIZE                0x40000
      
      One more restriction we currently have with 256KB page sizes is inability
      to use shmem safely, so, for now, the 256KB is available only if you turn
      the CONFIG_SHMEM option off (another variant is to use BROKEN).
      Though, if you need shmem with 256KB pages, you can always remove the !SHMEM
      dependency in 'config PPC_256K_PAGES', and use the workaround available here:
       http://lkml.org/lkml/2008/12/19/20Signed-off-by: NYuri Tikhonov <yur@emcraft.com>
      Signed-off-by: NIlya Yanok <yanok@emcraft.com>
      Signed-off-by: NJosh Boyer <jwboyer@linux.vnet.ibm.com>
      e1240122
  25. 04 8月, 2008 1 次提交
  26. 28 7月, 2008 1 次提交
  27. 26 7月, 2008 1 次提交
  28. 16 5月, 2008 1 次提交
    • P
      [POWERPC] Defer processing of interrupts when the CPU wakes from sleep mode · a560643e
      Paul Mackerras 提交于
      This provides a way to defer processing of an interrupt that wakes the
      processor out of sleep mode.  On 32-bit platforms that use an
      interrupt to wake the processor, we have to have interrupts enabled in
      hardware at the point where we go to sleep, otherwise the processor
      will never wake up.  However, because interrupts are logically
      disabled at this point, we don't want to process the interrupt
      straight away.
      
      This is handled by setting the _TLF_SLEEPING flag.  When we get an
      interrupt and _TLF_SLEEPING is set, we firstly clear the MSR_EE
      (external interrupt enable) bit in the saved MSR value, and secondly
      we then return to the address in the link register, like we do for
      _TLF_NAPPING, but without actually handling the interrupt.
      
      Note that this is handled somewhat differently on powerbooks, so this
      new code will only be used on non-Apple machines.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      a560643e
  29. 14 5月, 2008 1 次提交
  30. 24 4月, 2008 1 次提交
  31. 01 8月, 2007 1 次提交
  32. 14 6月, 2007 1 次提交
  33. 14 12月, 2006 1 次提交
    • R
      [PATCH] PM: Fix SMP races in the freezer · 8a102eed
      Rafael J. Wysocki 提交于
      Currently, to tell a task that it should go to the refrigerator, we set the
      PF_FREEZE flag for it and send a fake signal to it.  Unfortunately there
      are two SMP-related problems with this approach.  First, a task running on
      another CPU may be updating its flags while the freezer attempts to set
      PF_FREEZE for it and this may leave the task's flags in an inconsistent
      state.  Second, there is a potential race between freeze_process() and
      refrigerator() in which freeze_process() running on one CPU is reading a
      task's PF_FREEZE flag while refrigerator() running on another CPU has just
      set PF_FROZEN for the same task and attempts to reset PF_FREEZE for it.  If
      the refrigerator wins the race, freeze_process() will state that PF_FREEZE
      hasn't been set for the task and will set it unnecessarily, so the task
      will go to the refrigerator once again after it's been thawed.
      
      To solve first of these problems we need to stop using PF_FREEZE to tell
      tasks that they should go to the refrigerator.  Instead, we can introduce a
      special TIF_*** flag and use it for this purpose, since it is allowed to
      change the other tasks' TIF_*** flags and there are special calls for it.
      
      To avoid the freeze_process()-refrigerator() race we can make
      freeze_process() to always check the task's PF_FROZEN flag after it's read
      its "freeze" flag.  We should also make sure that refrigerator() will
      always reset the task's "freeze" flag after it's set PF_FROZEN for it.
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Acked-by: NPavel Machek <pavel@ucw.cz>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      8a102eed
  34. 26 4月, 2006 1 次提交
  35. 18 4月, 2006 1 次提交
    • P
      powerpc: Use correct sequence for putting CPU into nap mode · f39224a8
      Paul Mackerras 提交于
      We weren't using the recommended sequence for putting the CPU into
      nap mode.  When I changed the idle loop, for some reason 7447A cpus
      started hanging when we put them into nap mode.  Changing to the
      recommended sequence fixes that.
      
      The complexity here is that the recommended sequence is a loop that
      keeps putting the cpu back into nap mode.  Clearly we need some way
      to break out of the loop when an interrupt (external interrupt,
      decrementer, performance monitor) occurs.  Here we use a bit in
      the thread_info struct to indicate that we need this, and the exception
      entry code notices this and arranges for the exception to return
      to the value in the link register, thus breaking out of the loop.
      We use a new `local_flags' field in the thread_info which we can
      alter without needing to use an atomic update sequence.
      
      The PPC970 has the same recommended sequence, so we do the same thing
      there too.
      
      This also fixes a bug in the kernel stack overflow handling code on
      32-bit, since it was causing a value that we needed in a register to
      get trashed.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      f39224a8
  36. 08 3月, 2006 1 次提交
    • P
      powerpc: Fix various syscall/signal/swapcontext bugs · 1bd79336
      Paul Mackerras 提交于
      A careful reading of the recent changes to the system call entry/exit
      paths revealed several problems, plus some things that could be
      simplified and improved:
      
      * 32-bit wasn't testing the _TIF_NOERROR bit in the syscall fast exit
        path, so it was only doing anything with it once it saw some other
        bit being set.  In other words, the noerror behaviour would apply to
        the next system call where we had to reschedule or deliver a signal,
        which is not necessarily the current system call.
      
      * 32-bit wasn't doing the call to ptrace_notify in the syscall exit
        path when the _TIF_SINGLESTEP bit was set.
      
      * _TIF_RESTOREALL was in both _TIF_USER_WORK_MASK and
        _TIF_PERSYSCALL_MASK, which is odd since _TIF_RESTOREALL is only set
        by system calls.  I took it out of _TIF_USER_WORK_MASK.
      
      * On 64-bit, _TIF_RESTOREALL wasn't causing the non-volatile registers
        to be restored (unless perhaps a signal was delivered or the syscall
        was traced or single-stepped).  Thus the non-volatile registers
        weren't restored on exit from a signal handler.  We probably got
        away with it mostly because signal handlers written in C wouldn't
        alter the non-volatile registers.
      
      * On 32-bit I simplified the code and made it more like 64-bit by
        making the syscall exit path jump to ret_from_except to handle
        preemption and signal delivery.
      
      * 32-bit was calling do_signal unnecessarily when _TIF_RESTOREALL was
        set - but I think because of that 32-bit was actually restoring the
        non-volatile registers on exit from a signal handler.
      
      * I changed the order of enabling interrupts and saving the
        non-volatile registers before calling do_syscall_trace_leave; now we
        enable interrupts first.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      1bd79336
  37. 24 2月, 2006 1 次提交
    • A
      [PATCH] powerpc: Fix runlatch performance issues · cb2c9b27
      Anton Blanchard 提交于
      The runlatch SPR can take a lot of time to write. My original runlatch
      code would set it on every exception entry even though most of the time
      this was not required. It would also continually set it in the idle
      loop, which is an issue on an SMT capable processor.
      
      Now we cache the runlatch value in a threadinfo bit, and only check for
      it in decrementer and hardware interrupt exceptions as well as the idle
      loop. Boot on POWER3, POWER5 and iseries, and compile tested on pmac32.
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      cb2c9b27
  38. 08 2月, 2006 1 次提交