1. 03 7月, 2017 5 次提交
  2. 15 6月, 2017 5 次提交
  3. 27 4月, 2017 1 次提交
  4. 24 4月, 2017 1 次提交
  5. 23 4月, 2017 1 次提交
  6. 18 4月, 2017 1 次提交
    • R
      powerpc/kprobe: Fix oops when kprobed on 'stdu' instruction · 9e1ba4f2
      Ravi Bangoria 提交于
      If we set a kprobe on a 'stdu' instruction on powerpc64, we see a kernel
      OOPS:
      
        Bad kernel stack pointer cd93c840 at c000000000009868
        Oops: Bad kernel stack pointer, sig: 6 [#1]
        ...
        GPR00: c000001fcd93cb30 00000000cd93c840 c0000000015c5e00 00000000cd93c840
        ...
        NIP [c000000000009868] resume_kernel+0x2c/0x58
        LR [c000000000006208] program_check_common+0x108/0x180
      
      On a 64-bit system when the user probes on a 'stdu' instruction, the kernel does
      not emulate actual store in emulate_step() because it may corrupt the exception
      frame. So the kernel does the actual store operation in exception return code
      i.e. resume_kernel().
      
      resume_kernel() loads the saved stack pointer from memory using lwz, which only
      loads the low 32-bits of the address, causing the kernel crash.
      
      Fix this by loading the 64-bit value instead.
      
      Fixes: be96f633 ("powerpc: Split out instruction analysis part of emulate_step()")
      Cc: stable@vger.kernel.org # v3.18+
      Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Reviewed-by: NAnanth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      [mpe: Change log massage, add stable tag]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      9e1ba4f2
  7. 20 9月, 2016 1 次提交
  8. 29 8月, 2016 1 次提交
  9. 08 8月, 2016 1 次提交
  10. 01 8月, 2016 1 次提交
  11. 09 7月, 2016 1 次提交
  12. 14 6月, 2016 1 次提交
    • M
      powerpc: Define and use PPC64_ELF_ABI_v2/v1 · f55d9665
      Michael Ellerman 提交于
      We're approaching 20 locations where we need to check for ELF ABI v2.
      That's fine, except the logic is a bit awkward, because we have to check
      that _CALL_ELF is defined and then what its value is.
      
      So check it once in asm/types.h and define PPC64_ELF_ABI_v2 when ELF ABI
      v2 is detected.
      
      We also have a few places where what we're really trying to check is
      that we are using the 64-bit v1 ABI, ie. function descriptors. So also
      add a #define for that, which simplifies several checks.
      Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      f55d9665
  13. 11 5月, 2016 1 次提交
  14. 27 4月, 2016 1 次提交
    • C
      powerpc: Add support for userspace P9 copy paste · 8a649045
      Chris Smart 提交于
      The copy paste facility introduced in POWER9 provides an optimised
      mechanism for a userspace application to copy a cacheline. This is
      provided by a pair of instructions, copy and paste, while a third,
      cp_abort (copy paste abort), provides a clean up of the state in case of
      a failure.
      
      The copy instruction will read a 128 byte cacheline and store it in an
      internal buffer. The subsequent paste instruction will store this
      internal buffer to memory and set a CR field if the paste succeeds.
      
      Since the state of the copy paste buffer is internal (and not
      architecturally visible), in the unlikely event of a context switch, the
      state cannot be stored and the paste should therefore fail.
      
      The cp_abort instruction exists to fail and clean up any such
      interrupted copy paste sequence and is to be called by the kernel as
      part of the context switch. Doing so prevents data from a preceding copy
      in one process leaking into the paste of another.
      
      This code enables use of the cp_abort instruction if a supported
      processor is detected.
      
      NOTE: this is for userspace only, not in kernel, and does not deal
      with KVM guests.
      
      Patch created with much assistance from Michael Neuling
      <mikey@neuling.org>
      Signed-off-by: NChris Smart <chris@distroguy.com>
      Reviewed-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      8a649045
  15. 14 4月, 2016 1 次提交
    • M
      powerpc/livepatch: Add live patching support on ppc64le · 85baa095
      Michael Ellerman 提交于
      Add the kconfig logic & assembly support for handling live patched
      functions. This depends on DYNAMIC_FTRACE_WITH_REGS, which in turn
      depends on the new -mprofile-kernel ftrace ABI, which is only supported
      currently on ppc64le.
      
      Live patching is handled by a special ftrace handler. This means it runs
      from ftrace_caller(). The live patch handler modifies the NIP so as to
      redirect the return from ftrace_caller() to the new patched function.
      
      However there is one particularly tricky case we need to handle.
      
      If a function A calls another function B, and it is known at link time
      that they share the same TOC, then A will not save or restore its TOC,
      and will call the local entry point of B.
      
      When we live patch B, we replace it with a new function C, which may
      not have the same TOC as A. At live patch time it's too late to modify A
      to do the TOC save/restore, so the live patching code must interpose
      itself between A and C, and do the TOC save/restore that A omitted.
      
      An additionaly complication is that the livepatch code can not create a
      stack frame in order to save the TOC. That is because if C takes > 8
      arguments, or is varargs, A will have written the arguments for C in
      A's stack frame.
      
      To solve this, we introduce a "livepatch stack" which grows upward from
      the base of the regular stack, and is used to store the TOC & LR when
      calling a live patched function.
      
      When the patched function returns, we retrieve the real LR & TOC from
      the livepatch stack, restore them, and pop the livepatch "stack frame".
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: NTorsten Duwe <duwe@suse.de>
      Reviewed-by: NBalbir Singh <bsingharora@gmail.com>
      85baa095
  16. 16 3月, 2016 1 次提交
    • C
      powerpc: Fix unrecoverable SLB miss during restore_math() · 6e669f08
      Cyril Bur 提交于
      Commit 70fe3d98 "powerpc: Restore FPU/VEC/VSX if previously used" introduces a
      call to restore_math() late in the syscall return path, after MSR_RI has been
      cleared. The MSR_RI flag is used to indicate whether the kernel can take
      another exception or not. A cleared MSR_RI flag indicates that the kernel
      cannot.
      
      Unfortunately when a machine is under SLB pressure an SLB miss can occur
      in restore_math() which (with MSR_RI cleared) leads to an unrecoverable
      exception.
      
        Unrecoverable exception 4100 at c0000000000088d8
        cpu 0x0: Vector: 4100  at [c0000003fa473b20]
            pc: c0000000000088d8: .load_vr_state+0x70/0x110
            lr: c00000000000f710: .restore_math+0x130/0x188
            sp: c0000003fa473da0
           msr: 9000000002003030
          current = 0xc0000007f876f180
          paca    = 0xc00000000fff0000	 softe: 0	 irq_happened: 0x01
            pid   = 1944, comm = K08umountfs
        [link register   ] c00000000000f710 .restore_math+0x130/0x188
        [c0000003fa473da0] c0000003fa473e30 (unreliable)
        [c0000003fa473e30] c000000000007b6c system_call+0x84/0xfc
      
      The clearing of MSR_RI is actually an optimisation to avoid multiple MSR
      writes, what must be disabled are interrupts. See comment in entry_64.S:
      
        /*
         * For performance reasons we clear RI the same time that we
         * clear EE. We only need to clear RI just before we restore r13
         * below, but batching it with EE saves us one expensive mtmsrd call.
         * We have to be careful to restore RI if we branch anywhere from
         * here (eg syscall_exit_work).
         */
      
      At the point of calling restore_math() r13 has not been restored, as such, the
      quick fix of turning MSR_RI back on for the call to restore_math() will
      eliminate the occurrence of an unrecoverable exception.
      
      We'd like to do a better fix in future.
      
      Fixes: 70fe3d98 ("powerpc: Restore FPU/VEC/VSX if previously used")
      Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      6e669f08
  17. 07 3月, 2016 1 次提交
    • T
      powerpc/ftrace: Add support for -mprofile-kernel ftrace ABI · 15308664
      Torsten Duwe 提交于
      The gcc switch -mprofile-kernel defines a new ABI for calling _mcount()
      very early in the function with minimal overhead.
      
      Although mprofile-kernel has been available since GCC 3.4, there were
      bugs which were only fixed recently. Currently it is known to work in
      GCC 4.9, 5 and 6.
      
      Additionally there are two possible code sequences generated by the
      flag, the first uses mflr/std/bl and the second is optimised to omit the
      std. Currently only gcc 6 has the optimised sequence. This patch
      supports both sequences.
      
      Initial work started by Vojtech Pavlik, used with permission.
      
      Key changes:
       - rework _mcount() to work for both the old and new ABIs.
       - implement new versions of ftrace_caller() and ftrace_graph_caller()
         which deal with the new ABI.
       - updates to __ftrace_make_nop() to recognise the new mcount calling
         sequence.
       - updates to __ftrace_make_call() to recognise the nop'ed sequence.
       - implement ftrace_modify_call().
       - updates to the module loader to surpress the toc save in the module
         stub when calling mcount with the new ABI.
      Reviewed-by: NBalbir Singh <bsingharora@gmail.com>
      Signed-off-by: NTorsten Duwe <duwe@suse.de>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      15308664
  18. 02 3月, 2016 1 次提交
    • C
      powerpc: Restore FPU/VEC/VSX if previously used · 70fe3d98
      Cyril Bur 提交于
      Currently the FPU, VEC and VSX facilities are lazily loaded. This is not
      a problem unless a process is using these facilities.
      
      Modern versions of GCC are very good at automatically vectorising code,
      new and modernised workloads make use of floating point and vector
      facilities, even the kernel makes use of vectorised memcpy.
      
      All this combined greatly increases the cost of a syscall since the
      kernel uses the facilities sometimes even in syscall fast-path making it
      increasingly common for a thread to take an *_unavailable exception soon
      after a syscall, not to mention potentially taking all three.
      
      The obvious overcompensation to this problem is to simply always load
      all the facilities on every exit to userspace. Loading up all FPU, VEC
      and VSX registers every time can be expensive and if a workload does
      avoid using them, it should not be forced to incur this penalty.
      
      An 8bit counter is used to detect if the registers have been used in the
      past and the registers are always loaded until the value wraps to back
      to zero.
      
      Several versions of the assembly in entry_64.S were tested:
      
        1. Always calling C.
        2. Performing a common case check and then calling C.
        3. A complex check in asm.
      
      After some benchmarking it was determined that avoiding C in the common
      case is a performance benefit (option 2). The full check in asm (option
      3) greatly complicated that codepath for a negligible performance gain
      and the trade-off was deemed not worth it.
      Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
      [mpe: Move load_vec in the struct to fill an existing hole, reword change log]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      
      fixup
      70fe3d98
  19. 17 12月, 2015 2 次提交
  20. 01 12月, 2015 3 次提交
  21. 29 7月, 2015 2 次提交
    • M
      powerpc/kernel: Change the do_syscall_trace_enter() API · d3837414
      Michael Ellerman 提交于
      The API for calling do_syscall_trace_enter() is currently sensible
      enough, it just returns the (modified) syscall number.
      
      However once we enable seccomp filter it will get more complicated. When
      seccomp filter runs, the seccomp kernel code (via SECCOMP_RET_ERRNO), or
      a ptracer (via SECCOMP_RET_TRACE), may reject the syscall and *may* or may
      *not* set a return value in r3.
      
      That means the assembler that calls do_syscall_trace_enter() can not
      blindly return ENOSYS, it needs to only return ENOSYS if a return value
      has not already been set.
      
      There is no way to implement that logic with the current API. So change
      the do_syscall_trace_enter() API to make it deal with the return code
      juggling, and the assembler can then just return whatever return code it
      is given.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: NKees Cook <keescook@chromium.org>
      d3837414
    • M
      powerpc/kernel: Switch to using MAX_ERRNO · c3525940
      Michael Ellerman 提交于
      Currently on powerpc we have our own #define for the highest (negative)
      errno value, called _LAST_ERRNO. This is defined to be 516, for reasons
      which are not clear.
      
      The generic code, and x86, use MAX_ERRNO, which is defined to be 4095.
      
      In particular seccomp uses MAX_ERRNO to restrict the value that a
      seccomp filter can return.
      
      Currently with the mismatch between _LAST_ERRNO and MAX_ERRNO, a seccomp
      tracer wanting to return 600, expecting it to be seen as an error, would
      instead find on powerpc that userspace sees a successful syscall with a
      return value of 600.
      
      To avoid this inconsistency, switch powerpc to use MAX_ERRNO.
      
      We are somewhat confident that generic syscalls that can return a
      non-error value above negative MAX_ERRNO have already been updated to
      use force_successful_syscall_return().
      
      I have also checked all the powerpc specific syscalls, and believe that
      none of them expect to return a non-error value between -MAX_ERRNO and
      -516. So this change should be safe ...
      Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: NKees Cook <keescook@chromium.org>
      c3525940
  22. 19 6月, 2015 1 次提交
    • S
      powerpc/tm: Abort syscalls in active transactions · b4b56f9e
      Sam bobroff 提交于
      This patch changes the syscall handler to doom (tabort) active
      transactions when a syscall is made and return very early without
      performing the syscall and keeping side effects to a minimum (no CPU
      accounting or system call tracing is performed). Also included is a
      new HWCAP2 bit, PPC_FEATURE2_HTM_NOSC, to indicate this
      behaviour to userspace.
      
      Currently, the system call instruction automatically suspends an
      active transaction which causes side effects to persist when an active
      transaction fails.
      
      This does change the kernel's behaviour, but in a way that was
      documented as unsupported.  It doesn't reduce functionality as
      syscalls will still be performed after tsuspend; it just requires that
      the transaction be explicitly suspended.  It also provides a
      consistent interface and makes the behaviour of user code
      substantially the same across powerpc and platforms that do not
      support suspended transactions (e.g. x86 and s390).
      
      Performance measurements using
      http://ozlabs.org/~anton/junkcode/null_syscall.c indicate the cost of
      a normal (non-aborted) system call increases by about 0.25%.
      Signed-off-by: NSam Bobroff <sam.bobroff@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b4b56f9e
  23. 07 6月, 2015 1 次提交
  24. 30 4月, 2015 1 次提交
    • M
      Revert "powerpc/tm: Abort syscalls in active transactions" · 68fc378c
      Michael Ellerman 提交于
      This reverts commit feba4036.
      
      Although the principle of this change is good, the implementation has a
      few issues.
      
      Firstly we can sometimes fail to abort a syscall because r12 may have
      been clobbered by C code if we went down the virtual CPU accounting
      path, or if syscall tracing was enabled.
      
      Secondly we have decided that it is safer to abort the syscall even
      earlier in the syscall entry path, so that we avoid the syscall tracing
      path when we are transactional.
      
      So that we have time to thoroughly test those changes we have decided to
      revert this for this merge window and will merge the fixed version in
      the next window.
      
      NB. Rather than reverting the selftest we just drop tm-syscall from
      TEST_PROGS so that it's not run by default.
      
      Fixes: feba4036 ("powerpc/tm: Abort syscalls in active transactions")
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      68fc378c
  25. 11 4月, 2015 1 次提交
    • S
      powerpc/tm: Abort syscalls in active transactions · feba4036
      Sam bobroff 提交于
      This patch changes the syscall handler to doom (tabort) active
      transactions when a syscall is made and return immediately without
      performing the syscall.
      
      Currently, the system call instruction automatically suspends an
      active transaction which causes side effects to persist when an active
      transaction fails.
      
      This does change the kernel's behaviour, but in a way that was
      documented as unsupported. It doesn't reduce functionality because
      syscalls will still be performed after tsuspend. It also provides a
      consistent interface and makes the behaviour of user code
      substantially the same across powerpc and platforms that do not
      support suspended transactions (e.g. x86 and s390).
      
      Performance measurements using
      http://ozlabs.org/~anton/junkcode/null_syscall.c
      indicate the cost of a system call increases by about 0.5%.
      Signed-off-by: NSam Bobroff <sam.bobroff@au1.ibm.com>
      Acked-By: NMichael Neuling <mikey@neuling.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      feba4036
  26. 28 3月, 2015 1 次提交
    • M
      powerpc: Add a proper syscall for switching endianness · 529d235a
      Michael Ellerman 提交于
      We currently have a "special" syscall for switching endianness. This is
      syscall number 0x1ebe, which is handled explicitly in the 64-bit syscall
      exception entry.
      
      That has a few problems, firstly the syscall number is outside of the
      usual range, which confuses various tools. For example strace doesn't
      recognise the syscall at all.
      
      Secondly it's handled explicitly as a special case in the syscall
      exception entry, which is complicated enough without it.
      
      As a first step toward removing the special syscall, we need to add a
      regular syscall that implements the same functionality.
      
      The logic is simple, it simply toggles the MSR_LE bit in the userspace
      MSR. This is the same as the special syscall, with the caveat that the
      special syscall clobbers fewer registers.
      
      This version clobbers r9-r12, XER, CTR, and CR0-1,5-7.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      529d235a
  27. 02 2月, 2015 2 次提交
    • M
      powerpc: Remove old compile time disabled syscall tracing code · a4bcbe6a
      Michael Ellerman 提交于
      We have code to do syscall tracing which is disabled at compile time by
      default. It's not been touched since the dawn of time (ie. v2.6.12).
      
      There are now better ways to do syscall tracing, ie. using the
      raw_syscall, or syscall tracepoints.
      
      For the specific case of tracing syscalls at boot on a system that
      doesn't get to userspace, you can boot with:
      
        trace_event=syscalls tp_printk=on
      
      Which will trace syscalls from boot, and echo all output to the console.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      a4bcbe6a
    • M
      powerpc/kernel: Make syscall_exit a local label · 4c3b2168
      Michael Ellerman 提交于
      Currently when we back trace something that is in a syscall we see
      something like this:
      
      [c000000000000000] [c000000000000000] SyS_read+0x6c/0x110
      [c000000000000000] [c000000000000000] syscall_exit+0x0/0x98
      
      Although it's entirely correct, seeing syscall_exit at the bottom can be
      confusing - we were exiting from a syscall and then called SyS_read() ?
      
      If we instead change syscall_exit to be a local label we get something
      more intuitive:
      
      [c0000001fa46fde0] [c00000000026719c] SyS_read+0x6c/0x110
      [c0000001fa46fe30] [c000000000009264] system_call+0x38/0xd0
      
      ie. we were handling a system call, and it was SyS_read().
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      4c3b2168