1. 13 Jan, 2018 1 commit
    • E
      signal/powerpc: Document conflicts with SI_USER and SIGFPE and SIGTRAP · cf4674c4
      Eric W. Biederman committed
      Setting si_code to 0 results in userspace seeing an si_code of 0.
      This is the same si_code as SI_USER.  Posix and common sense require
      that SI_USER not be a signal-specific si_code.  As such, this use of 0
      for the si_code is a pretty horribly broken ABI.
      
      Further, use of si_code == 0 guaranteed that copy_siginfo_to_user saw a
      value of __SI_KILL and now sees a value of SIL_KILL, with the result
      that the uid and pid fields are copied, which might copy the si_addr
      field by accident but certainly not by design.  This makes for a very
      flaky implementation.
      
      Utilizing FPE_FIXME and TRAP_FIXME, siginfo_layout() will now return
      SIL_FAULT and the appropriate fields will be reliably copied.
      
      Possible ABI fixes include:
      - Send the signal without siginfo
      - Don't generate a signal
      - Possibly assign and use an appropriate si_code
      - Don't handle cases which can't happen
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Kumar Gala <kumar.gala@freescale.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc:  linuxppc-dev@lists.ozlabs.org
      Ref: 9bad068c24d7 ("[PATCH] ppc32: support for e500 and 85xx")
      Ref: 0ed70f6105ef ("PPC32: Provide proper siginfo information on various exceptions.")
      History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
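The layout problem described in cf4674c4 can be illustrated with a minimal sketch. This is a simplified model, not the kernel's siginfo_layout(): the enum, the `FIXME_CODE` macro, and both function names are hypothetical, and the "naive" rule only models fault signals, where any positive si_code is taken to describe a fault.

```c
#include <assert.h>
#include <signal.h>

/* Simplified stand-ins for the kernel's siginfo layouts (hypothetical). */
enum layout { LAYOUT_KILL, LAYOUT_FAULT };

/* Hypothetical placeholder in the spirit of FPE_FIXME/TRAP_FIXME:
 * marks "a fault signal that was sent with si_code 0". */
#define FIXME_CODE 0

/* Naive rule: an si_code <= 0 looks user-originated, so the pid/uid
 * layout is chosen. A fault sent with si_code 0 ends up here. */
enum layout naive_layout(int sig, int si_code)
{
    assert(sig > 0);
    return si_code <= 0 ? LAYOUT_KILL : LAYOUT_FAULT;
}

/* Patched rule: SIGFPE/SIGTRAP carrying the placeholder code are
 * classified as faults, so the fault fields are the ones copied. */
enum layout patched_layout(int sig, int si_code)
{
    if ((sig == SIGFPE || sig == SIGTRAP) && si_code == FIXME_CODE)
        return LAYOUT_FAULT;
    return naive_layout(sig, si_code);
}
```

In this model, `naive_layout(SIGFPE, 0)` selects the pid/uid layout while `patched_layout(SIGFPE, 0)` selects the fault layout; other signals with si_code 0 are unaffected.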
  2. 06 Nov, 2017 5 commits
    • C
      powerpc: Remove facility loadups on transactional {fp, vec, vsx} unavailable · 6f700d38
      Cyril Bur committed
      After handling a transactional FP, Altivec or VSX unavailable exception,
      the return to userspace code will detect that the TIF_RESTORE_TM bit is
      set and call restore_tm_state(). restore_tm_state() will call
      restore_math() to ensure that the correct facilities are loaded.
      
      This means that all the loadup code in {fp,altivec,vsx}_unavailable_tm()
      is doing pointless work and can simply be removed.
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • C
      powerpc: Always save/restore checkpointed regs during treclaim/trecheckpoint · eb5c3f1c
      Cyril Bur committed
      Lazy save and restore of FP/Altivec means that a userspace process can
      be sent to userspace with FP or Altivec disabled and loaded only as
      required (by way of an FP/Altivec unavailable exception). Transactional
      Memory complicates this situation as a transaction could be started
      without FP/Altivec being loaded up. This causes the hardware to
      checkpoint incorrect registers. Handling FP/Altivec unavailable
      exceptions while a thread is transactional requires a reclaim and
      recheckpoint to ensure the CPU has correct state for both sets of
      registers.
      
      tm_reclaim() has optimisations to not always save the FP/Altivec
      registers to the checkpointed save area. This was originally done
      because the caller might have information that the checkpointed
      registers aren't valid due to lazy save and restore. We've also been a
      little vague as to how tm_reclaim() leaves the FP/Altivec state since it
      doesn't necessarily always save it to the thread struct. This has led
      to an (incorrect) assumption that it leaves the checkpointed state on
      the CPU.
      
      tm_recheckpoint() has similar optimisations in reverse. It may not
      always reload the checkpointed FP/Altivec registers from the thread
      struct before the trecheckpoint. It is therefore quite unclear where it
      expects to get the state from. This didn't help with the assumption
      made about tm_reclaim().
      
      These optimisations sit in what is by definition a slow path. If a
      process has to go through a reclaim/recheckpoint then its transaction
      will be doomed on returning to userspace. This means that the process
      will be unable to complete its transaction and be forced to its failure
      handler. This is already an out-of-line case for userspace. Furthermore,
      the cost of copying 64 times 128 bits from registers isn't very high[0]
      (at all) on modern processors. As such it appears these optimisations
      have only served to increase code complexity and are unlikely to have
      had a measurable performance impact.
      
      Our transactional memory handling has been riddled with bugs. A cause
      of this has been difficulty in following the code flow; code complexity
      has not been our friend here. It makes sense to remove these
      optimisations in favour of a (hopefully) more stable implementation.
      
      This patch does mean that sometimes the assembly will needlessly save
      'junk' registers which will subsequently get overwritten with the
      correct value by the C code which calls the assembly function. This
      small inefficiency is far outweighed by the reduction in complexity for
      general TM code, context switching paths, and transactional facility
      unavailable exception handler.
      
      0: I tried to measure it once for other work and found that it was
      hiding in the noise of everything else I was working with. I find it
      exceedingly likely this will be the case here.
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
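To put the "64 times 128 bits" figure from eb5c3f1c in concrete terms, here is a small sketch of an unconditional save. The type and function names are hypothetical and the layout is not the kernel's thread struct; it only shows that the full copy is 1 KiB with no conditions to reason about.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_REGS 64                          /* 64 registers, per the commit message */

typedef struct { uint8_t bytes[16]; } reg128_t;   /* one 128-bit register */

/* Hypothetical checkpointed save area: 64 x 16 bytes = 1024 bytes. */
struct ckpt_area { reg128_t reg[NUM_REGS]; };

_Static_assert(sizeof(struct ckpt_area) == 1024, "64 x 128 bits is 1 KiB");

/* Always copy every register; returns the number of bytes copied. */
size_t save_checkpointed_always(struct ckpt_area *dst, const reg128_t *live)
{
    assert(dst != NULL && live != NULL);
    memcpy(dst->reg, live, sizeof dst->reg);
    return sizeof dst->reg;
}
```

The design choice mirrored here is the one the commit argues for: one unconditional copy in a slow path instead of caller-dependent branches.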
    • C
      powerpc: Force reload for recheckpoint during tm {fp, vec, vsx} unavailable exception · 91381b9c
      Cyril Bur committed
      Lazy save and restore of FP/Altivec means that a userspace process can
      be sent to userspace with FP or Altivec disabled and loaded only as
      required (by way of an FP/Altivec unavailable exception). Transactional
      Memory complicates this situation as a transaction could be started
      without FP/Altivec being loaded up. This causes the hardware to
      checkpoint incorrect registers. Handling FP/Altivec unavailable
      exceptions while a thread is transactional requires a reclaim and
      recheckpoint to ensure the CPU has correct state for both sets of
      registers.
      
      tm_reclaim() has optimisations to not always save the FP/Altivec
      registers to the checkpointed save area. This was originally done
      because the caller might have information that the checkpointed
      registers aren't valid due to lazy save and restore. We've also been a
      little vague as to how tm_reclaim() leaves the FP/Altivec state since it
      doesn't necessarily always save it to the thread struct. This has led
      to an (incorrect) assumption that it leaves the checkpointed state on
      the CPU.
      
      tm_recheckpoint() has similar optimisations in reverse. It may not
      always reload the checkpointed FP/Altivec registers from the thread
      struct before the trecheckpoint. It is therefore quite unclear where it
      expects to get the state from. This didn't help with the assumption
      made about tm_reclaim().
      
      This patch is a minimal fix for ease of backporting. A more correct fix
      which removes the msr parameter to tm_reclaim() and tm_recheckpoint()
      altogether has been upstreamed to apply on top of this patch.
      
      Fixes: dc310669 ("powerpc: tm: Always use fp_state and vr_state to
      store live registers")
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • C
      powerpc: Don't enable FP/Altivec if not checkpointed · a7771176
      Cyril Bur committed
      Lazy save and restore of FP/Altivec means that a userspace process can
      be sent to userspace with FP or Altivec disabled and loaded only as
      required (by way of an FP/Altivec unavailable exception). Transactional
      Memory complicates this situation as a transaction could be started
      without FP/Altivec being loaded up. This causes the hardware to
      checkpoint incorrect registers. Handling FP/Altivec unavailable
      exceptions while a thread is transactional requires a reclaim and
      recheckpoint to ensure the CPU has correct state for both sets of
      registers.
      
      Lazy save and restore of FP/Altivec cannot be done if a process is
      transactional. If a facility was enabled it must remain enabled whenever
      a thread is transactional.
      
      Commit dc16b553 ("powerpc: Always restore FPU/VEC/VSX if hardware
      transactional memory in use") ensures that the facilities are always
      enabled if a thread is transactional. A bug in the introduced code may
      cause it to inadvertently enable a facility that was (and should remain)
      disabled. The problem with this extraneous enablement is that the
      registers for the erroneously enabled facility have not been correctly
      recheckpointed - the recheckpointing code assumed the facility would
      remain disabled.
      
      Further compounding the issue, the transactional {fp,altivec,vsx}
      unavailable code has been incorrectly using the MSR to enable
      facilities. The presence of the {FP,VEC,VSX} bit in regs->msr only
      indicates whether the registers are live on the CPU, not whether the
      kernel should load them before returning to userspace. This has worked
      due to the bug mentioned above.
      
      This causes transactional threads which return to their failure handler
      to observe incorrect checkpointed registers. Perhaps an example will
      help illustrate the problem:
      
      A userspace process is running and uses both FP and Altivec registers.
      This process then continues to run for some time without touching
      either set of registers. The kernel subsequently disables the
      facilities as part of lazy save and restore. The userspace process then
      performs a tbegin and the CPU checkpoints 'junk' FP and Altivec
      registers. The process then performs a floating point instruction
      triggering a fp unavailable exception in the kernel.
      
      The kernel then loads the FP registers - and only the FP registers.
      Since the thread is transactional it must perform a reclaim and
      recheckpoint to ensure both the checkpointed registers and the
      transactional registers are correct. It then (correctly) enables
      MSR[FP] for the process. Later (on exception exit) the kernel also
      (inadvertently) enables MSR[VEC]. The process is then returned to
      userspace.
      
      Since the act of loading the FP registers doomed the transaction we know
      the CPU will fail the transaction, restore its checkpointed registers, and
      return the process to its failure handler. The problem is that we're
      now running with Altivec enabled and the 'junk' checkpointed registers
      are restored. The kernel had only recheckpointed FP.
      
      This patch solves this by only activating FP/Altivec if userspace was
      using them when it entered the kernel and not simply if the process is
      transactional.
      
      Fixes: dc16b553 ("powerpc: Always restore FPU/VEC/VSX if hardware
      transactional memory in use")
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
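The FP/Altivec example in a7771176 can be modelled with a few bitmask operations. This is only a sketch under stated assumptions: the `FAC_*` bits are illustrative and are not the real MSR bit positions, and the three functions are hypothetical helpers rather than kernel code. `valid` stands for the facilities whose registers were correctly loaded and recheckpointed.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative facility bits; NOT the real MSR layout. */
#define FAC_FP  (1u << 0)
#define FAC_VEC (1u << 1)
#define FAC_ALL (FAC_FP | FAC_VEC)

/* Buggy policy: a transactional thread gets every facility enabled
 * on return to userspace, whether or not it was recheckpointed. */
unsigned int enable_buggy(unsigned int valid, bool transactional)
{
    assert((valid & ~FAC_ALL) == 0);
    return transactional ? FAC_ALL : valid;
}

/* Fixed policy: enable only the facilities that are actually valid,
 * regardless of whether the thread is transactional. */
unsigned int enable_fixed(unsigned int valid, bool transactional)
{
    assert((valid & ~FAC_ALL) == 0);
    (void)transactional;
    return valid;
}

/* Facilities that are enabled but still hold 'junk' checkpointed state. */
unsigned int junk_facilities(unsigned int enabled, unsigned int valid)
{
    return enabled & ~valid;
}
```

With `valid == FAC_FP` (only FP was reloaded by the unavailable handler), the buggy policy leaves `FAC_VEC` enabled with junk state, while the fixed policy leaves no junk facility enabled.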
    • M
      powerpc/tm: Don't check for WARN in TM Bad Thing handling · 632f0574
      Michael Ellerman committed
      Currently when we take a TM Bad Thing program check exception, we
      search the bug table to see if the program check was generated by a
      WARN/WARN_ON etc.
      
      That makes no sense: the WARN macros use trap instructions, which
      should never generate a TM Bad Thing exception. If they ever did that
      would be a bug and we should oops.
      
      We do have some hand-coded bugs in tm.S, using EMIT_BUG_ENTRY, but
      those are all BUGs not WARNs, and they all use trap instructions
      anyway. Almost certainly this check was incorrectly copied from the
      REASON_TRAP handling in the same function.
      
      Remove it.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Acked-By: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  3. 27 Sep, 2017 1 commit
    • M
      powerpc/64s: Add workaround for P9 vector CI load issue · 5080332c
      Michael Neuling committed
      POWER9 DD2.1 and earlier have an issue where some cache-inhibited
      vector loads will return bad data. The workaround has two parts: a
      firmware/microcode part triggers HMI interrupts when hitting such
      loads, and this patch then emulates the instructions in Linux.
      
      The affected instructions are limited to lxvd2x, lxvw4x, lxvb16x and
      lxvh8x.
      
      When an instruction triggers the HMI, all threads in the core will be
      sent to the HMI handler, not just the one running the vector load.
      
      In general, these spurious HMIs are detected by the emulation code and
      we just return back to the running process. Unfortunately, if a
      spurious interrupt occurs on a vector load that's to normal memory we
      have no way to detect that it's spurious (unless we walk the page
      tables, which is very expensive). In this case we emulate the load, but
      we need to do so using a vector load itself to ensure 128-bit atomicity is
      preserved.
      
      Some additional debugfs emulated instruction counters are added also.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      [mpe: Switch CONFIG_PPC_BOOK3S_64 to CONFIG_VSX to unbreak the build]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  4. 31 Aug, 2017 3 commits
    • N
      powerpc: Machine check interrupt is a non-maskable interrupt · b96672dd
      Nicholas Piggin committed
      Use nmi_enter similarly to system reset interrupts. This uses the NMI
      printk buffers and turns off various debugging facilities, which
      helps avoid tripping on ourselves or other CPUs.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • N
      powerpc/powernv: Use kernel crash path for machine checks · 6fcd6baa
      Nicholas Piggin committed
      There are quite a few machine check exceptions that can be caused by
      kernel bugs. To make debugging easier, use the kernel crash path in
      cases of synchronous machine checks that occur in kernel mode, if that
      would not result in the machine going straight to panic or crash dump.
      
      There is a downside here that die()ing the process in kernel mode can
      still leave the system unstable. panic_on_oops will always force the
      system to fail-stop, so systems where that behaviour is important will
      still do the right thing.
      
      As a test, when triggering an i-side 0111b error (ifetch from foreign
      address) in kernel mode process context on POWER9, the kernel currently
      dies quickly like this:
      
        Severe Machine check interrupt [Not recovered]
          NIP [ffff000000000000]: 0xffff000000000000
          Initiator: CPU
          Error type: Real address [Instruction fetch (foreign)]
        [  127.426651616,0] OPAL: Reboot requested due to Platform error.
            Effective[  127.426693712,3] OPAL: Reboot requested due to Platform error. address: ffff000000000000
        opal: Reboot type 1 not supported
        Kernel panic - not syncing: PowerNV Unrecovered Machine Check
        CPU: 56 PID: 4425 Comm: syscall Tainted: G   M            4.12.0-rc1-13857-ga4700a26-dirty #35
        Call Trace:
        [  128.017988928,4] IPMI: BUG: Dropping ESEL on the floor due to
          buggy/mising code in OPAL for this BMC
          Rebooting in 10 seconds..
        Trying to free IRQ 496 from IRQ context!
      
      After this patch, the process is killed and the kernel continues with
      this message, which gives enough information to identify the offending
      branch (i.e., with CFAR):
      
        Severe Machine check interrupt [Not recovered]
          NIP [ffff000000000000]: 0xffff000000000000
          Initiator: CPU
          Error type: Real address [Instruction fetch (foreign)]
            Effective address: ffff000000000000
        Oops: Machine check, sig: 7 [#1]
        SMP NR_CPUS=2048
        NUMA
        PowerNV
        Modules linked in: iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 ...
        CPU: 22 PID: 4436 Comm: syscall Tainted: G   M            4.12.0-rc1-13857-ga4700a26-dirty #36
        task: c000000932300000 task.stack: c000000932380000
        NIP: ffff000000000000 LR: 00000000217706a4 CTR: ffff000000000000
        REGS: c00000000fc8fd80 TRAP: 0200   Tainted: G   M             (4.12.0-rc1-13857-ga4700a26-dirty)
        MSR: 90000000001c1003 <SF,HV,ME,RI,LE>
          CR: 24000484  XER: 20000000
        CFAR: c000000000004c80 DAR: 0000000021770a90 DSISR: 0a000000 SOFTE: 1
        GPR00: 0000000000001ebe 00007fffce4818b0 0000000021797f00 0000000000000000
        GPR04: 00007fff8007ac24 0000000044000484 0000000000004000 00007fff801405e8
        GPR08: 900000000280f033 0000000024000484 0000000000000000 0000000000000030
        GPR12: 9000000000001003 00007fff801bc370 0000000000000000 0000000000000000
        GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR28: 00007fff801b0000 0000000000000000 00000000217707a0 00007fffce481918
        NIP [ffff000000000000] 0xffff000000000000
        LR [00000000217706a4] 0x217706a4
        Call Trace:
        Instruction dump:
        XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
        XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • N
      powerpc: Do not send system reset request through the oops path · 4388c9b3
      Nicholas Piggin committed
      A system reset is a request to crash / debug the system rather than
      necessarily caused by encountering a BUG. So there is no need to
      serialize all CPUs behind the die lock, adding taints to all
      subsequent traces beyond the first, breaking console locks, etc.
      
      The system reset runs in NMI context, which has its own printk buffers
      to prevent output being interleaved. It is therefore better to have all
      secondaries print out their debug output as quickly as possible; the
      primary will flush out all printk buffers during panic().
      
      So remove the 0x100 path from die, and move it into system_reset. Name
      the crash/dump reasons "System Reset".
      
      This gives "not tainted" traces when crashing an untainted kernel. It
      also gives the panic reason as "System Reset" as opposed to "Fatal
      exception in interrupt" (or "die oops" for fadump).
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  5. 29 Aug, 2017 1 commit
  6. 28 Aug, 2017 3 commits
  7. 10 Aug, 2017 10 commits
  8. 03 Jul, 2017 1 commit
  9. 03 May, 2017 1 commit
    • M
      powerpc/book3s/mce: Move add_taint() later in virtual mode · d93b0ac0
      Mahesh Salgaonkar committed
      machine_check_early() gets called in real mode. The very first time
      add_taint() is called, it prints a warning, which ends up making an OPAL
      call (using the OPAL_CALL wrapper) to write it to the console. If we get
      our very first machine check while we are in OPAL, we are doomed. OPAL_CALL
      overwrites the PACASAVEDMSR in r13, and in this case, when we are done with
      MCE handling, the original OPAL call will use this new MSR on its way
      back to opal_return. This usually leads to unexpected behaviour or a
      kernel panic. Instead, move the add_taint() call later, into virtual
      mode, where it is safe to call.
      
      This is broken with the current FW level. We have been lucky so far not to
      take the very first MCE while in OPAL, but it is easily reproducible on Mambo.
      
      Fixes: 27ea2c42 ("powerpc: Set the correct kernel taint on machine check errors.")
      Cc: stable@vger.kernel.org # v4.2+
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  10. 28 Apr, 2017 2 commits
  11. 13 Apr, 2017 2 commits
  12. 11 Apr, 2017 1 commit
  13. 02 Mar, 2017 1 commit
  14. 25 Dec, 2016 1 commit
  15. 02 Dec, 2016 1 commit
    • B
      powerpc: Don't print misleading facility name in facility unavailable exception · 93c2ec0f
      Balbir Singh committed
      The current facility_strings[] are correct when the trap address is
      0xf80 (hypervisor facility unavailable). When the trap address is
      0xf60 (facility unavailable) IC (Interruption Cause) a.k.a status in the
      code is undefined for values 0 and 1.
      
      Add a check to prevent printing the (misleading) facility name for IC 0
      and 1 when we came in via 0xf60. In all cases, print the actual IC
      value, to avoid any confusion.
      
      This hasn't been seen on real hardware, only on qemu, which was
      misreporting an exception.
      Signed-off-by: Balbir Singh <bsingharora@gmail.com>
      [mpe: Fix indentation, combine printks(), massage change log]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  16. 30 Nov, 2016 1 commit
  17. 23 Nov, 2016 1 commit
  18. 18 Nov, 2016 2 commits
  19. 14 Nov, 2016 2 commits
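The check described in 93c2ec0f can be sketched as a lookup that declines to return a name when it would be misleading. This is an illustrative model, not the kernel's handler: the function name, the trap macros, and the three-entry name table are assumptions made for the example, and a caller would still print the numeric IC value in every case.

```c
#include <assert.h>
#include <stddef.h>

#define TRAP_FAC_UNAVAIL    0xf60u  /* facility unavailable */
#define TRAP_HV_FAC_UNAVAIL 0xf80u  /* hypervisor facility unavailable */

/* Illustrative subset of names indexed by IC; not the kernel's full table. */
static const char *const facility_names[] = { "FPU", "VMX/VSX", "DSCR" };

/* Returns the name to print for this IC, or NULL if no name should be
 * printed (unknown IC, or IC 0/1 arriving via 0xf60 where it is undefined). */
const char *facility_name_to_print(unsigned int trap, unsigned int ic)
{
    const size_t count = sizeof(facility_names) / sizeof(facility_names[0]);

    assert(trap == TRAP_FAC_UNAVAIL || trap == TRAP_HV_FAC_UNAVAIL);

    if (ic >= count)
        return NULL;    /* no name known for this IC */
    if (trap == TRAP_FAC_UNAVAIL && ic <= 1)
        return NULL;    /* IC 0 and 1 are undefined for 0xf60 */
    return facility_names[ic];
}
```

Returning NULL rather than a placeholder string keeps the decision "print a name or not" separate from the unconditional printing of the IC value.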