1. 04 10月, 2016 1 次提交
    • C
      powerpc: tm: Always use fp_state and vr_state to store live registers · dc310669
      Cyril Bur 提交于
      There is currently an inconsistency as to how the entire CPU register
      state is saved and restored when a thread uses transactional memory
      (TM).
      
      Using transactional memory results in the CPU having duplicated
      (almost) all of its register state. This duplication results in a set
      of registers which can be considered 'live', those being currently
      modified by the instructions being executed and another set that is
      frozen at a point in time.
      
      On context switch, both sets of state have to be saved and (later)
      restored. These two states are often called a variety of different
      things. Common terms for the state which only exists after the CPU has
      entered a transaction (performed a TBEGIN instruction) in hardware are
      'transactional' or 'speculative'.
      
      Between a TBEGIN and a TEND or TABORT (or an event that causes the
      hardware to abort), regardless of the use of TSUSPEND the
      transactional state can be referred to as the live state.
      
      The second state is often to referred to as the 'checkpointed' state
      and is a duplication of the live state when the TBEGIN instruction is
      executed. This state is kept in the hardware and will be rolled back
      to on transaction failure.
      
      Currently all the registers stored in pt_regs are ALWAYS the live
      registers, that is, when a thread has transactional registers their
      values are stored in pt_regs and the checkpointed state is in
      ckpt_regs. A strange opposite is true for fp_state/vr_state. When a
      thread is non transactional fp_state/vr_state holds the live
      registers. When a thread has initiated a transaction fp_state/vr_state
      holds the checkpointed state and transact_fp/transact_vr become the
      structure which holds the live state (at this point it is a
      transactional state).
      
      This method creates confusion as to where the live state is, in some
      circumstances it requires extra work to determine where to put the
      live state and prevents the use of common functions designed (probably
      before TM) to save the live state.
      
      With this patch pt_regs, fp_state and vr_state all represent the
      same thing and the other structures [pending rename] are for
      checkpointed state.
      Acked-by: NSimon Guo <wei.guo.simon@gmail.com>
      Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      dc310669
  2. 15 7月, 2016 1 次提交
    • M
      powerpc/tm: Fix stack pointer corruption in __tm_recheckpoint() · 6bcb8014
      Michael Neuling 提交于
      At the start of __tm_recheckpoint() we save the kernel stack pointer
      (r1) in SPRG SCRATCH0 (SPRG2) so that we can restore it after the
      trecheckpoint.
      
      Unfortunately, the same SPRG is used in the SLB miss handler.  If an
      SLB miss is taken between the save and restore of r1 to the SPRG, the
      SPRG is changed and hence r1 is also corrupted.  We can end up with
      the following crash when we start using r1 again after the restore
      from the SPRG:
      
        Oops: Bad kernel stack pointer, sig: 6 [#1]
        SMP NR_CPUS=2048 NUMA pSeries
        CPU: 658 PID: 143777 Comm: htm_demo Tainted: G            EL   X 4.4.13-0-default #1
        task: c0000b56993a7810 ti: c00000000cfec000 task.ti: c0000b56993bc000
        NIP: c00000000004f188 LR: 00000000100040b8 CTR: 0000000010002570
        REGS: c00000000cfefd40 TRAP: 0300   Tainted: G            EL   X  (4.4.13-0-default)
        MSR: 8000000300001033 <SF,ME,IR,DR,RI,LE>  CR: 02000424  XER: 20000000
        CFAR: c000000000008468 DAR: 00003ffd84e66880 DSISR: 40000000 SOFTE: 0
        PACATMSCRATCH: 00003ffbc865e680
        GPR00: fffffffcfabc4268 00003ffd84e667a0 00000000100d8c38 000000030544bb80
        GPR04: 0000000000000002 00000000100cf200 0000000000000449 00000000100cf100
        GPR08: 000000000000c350 0000000000002569 0000000000002569 00000000100d6c30
        GPR12: 00000000100d6c28 c00000000e6a6b00 00003ffd84660000 0000000000000000
        GPR16: 0000000000000003 0000000000000449 0000000010002570 0000010009684f20
        GPR20: 0000000000800000 00003ffd84e5f110 00003ffd84e5f7a0 00000000100d0f40
        GPR24: 0000000000000000 0000000000000000 0000000000000000 00003ffff0673f50
        GPR28: 00003ffd84e5e960 00000000003d0f00 00003ffd84e667a0 00003ffd84e5e680
        NIP [c00000000004f188] restore_gprs+0x110/0x17c
        LR [00000000100040b8] 0x100040b8
        Call Trace:
        Instruction dump:
        f8a1fff0 e8e700a8 38a00000 7ca10164 e8a1fff8 e821fff0 7c0007dd 7c421378
        7db142a6 7c3242a6 38800002 7c810164 <e9c100e0> e9e100e8 ea0100f0 ea2100f8
      
      We hit this on large memory machines (> 2TB) but it can also be hit on
      smaller machines when 1TB segments are disabled.
      
      To hit this, you also need to be virtualised to ensure SLBs are
      periodically removed by the hypervisor.
      
      This patches moves the saving of r1 to the SPRG to the region where we
      are guaranteed not to take any further SLB misses.
      
      Fixes: 98ae22e1 ("powerpc: Add helper functions for transactional memory context switching")
      Cc: stable@vger.kernel.org # v3.9+
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Acked-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      6bcb8014
  3. 29 6月, 2016 1 次提交
    • M
      powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0 · 190ce869
      Michael Neuling 提交于
      Currently we have 2 segments that are bolted for the kernel linear
      mapping (ie 0xc000... addresses). This is 0 to 1TB and also the kernel
      stacks. Anything accessed outside of these regions may need to be
      faulted in. (In practice machines with TM always have 1T segments)
      
      If a machine has < 2TB of memory we never fault on the kernel linear
      mapping as these two segments cover all physical memory. If a machine
      has > 2TB of memory, there may be structures outside of these two
      segments that need to be faulted in. This faulting can occur when
      running as a guest as the hypervisor may remove any SLB that's not
      bolted.
      
      When we treclaim and trecheckpoint we have a window where we need to
      run with the userspace GPRs. This means that we no longer have a valid
      stack pointer in r1. For this window we therefore clear MSR RI to
      indicate that any exceptions taken at this point won't be able to be
      handled. This means that we can't take segment misses in this RI=0
      window.
      
      In this RI=0 region, we currently access the thread_struct for the
      process being context switched to or from. This thread_struct access
      may cause a segment fault since it's not guaranteed to be covered by
      the two bolted segment entries described above.
      
      We've seen this with a crash when running as a guest with > 2TB of
      memory on PowerVM:
      
        Unrecoverable exception 4100 at c00000000004f138
        Oops: Unrecoverable exception, sig: 6 [#1]
        SMP NR_CPUS=2048 NUMA pSeries
        CPU: 1280 PID: 7755 Comm: kworker/1280:1 Tainted: G                 X 4.4.13-46-default #1
        task: c000189001df4210 ti: c000189001d5c000 task.ti: c000189001d5c000
        NIP: c00000000004f138 LR: 0000000010003a24 CTR: 0000000010001b20
        REGS: c000189001d5f730 TRAP: 4100   Tainted: G                 X  (4.4.13-46-default)
        MSR: 8000000100001031 <SF,ME,IR,DR,LE>  CR: 24000048  XER: 00000000
        CFAR: c00000000004ed18 SOFTE: 0
        GPR00: ffffffffc58d7b60 c000189001d5f9b0 00000000100d7d00 000000003a738288
        GPR04: 0000000000002781 0000000000000006 0000000000000000 c0000d1f4d889620
        GPR08: 000000000000c350 00000000000008ab 00000000000008ab 00000000100d7af0
        GPR12: 00000000100d7ae8 00003ffe787e67a0 0000000000000000 0000000000000211
        GPR16: 0000000010001b20 0000000000000000 0000000000800000 00003ffe787df110
        GPR20: 0000000000000001 00000000100d1e10 0000000000000000 00003ffe787df050
        GPR24: 0000000000000003 0000000000010000 0000000000000000 00003fffe79e2e30
        GPR28: 00003fffe79e2e68 00000000003d0f00 00003ffe787e67a0 00003ffe787de680
        NIP [c00000000004f138] restore_gprs+0xd0/0x16c
        LR [0000000010003a24] 0x10003a24
        Call Trace:
        [c000189001d5f9b0] [c000189001d5f9f0] 0xc000189001d5f9f0 (unreliable)
        [c000189001d5fb90] [c00000000001583c] tm_recheckpoint+0x6c/0xa0
        [c000189001d5fbd0] [c000000000015c40] __switch_to+0x2c0/0x350
        [c000189001d5fc30] [c0000000007e647c] __schedule+0x32c/0x9c0
        [c000189001d5fcb0] [c0000000007e6b58] schedule+0x48/0xc0
        [c000189001d5fce0] [c0000000000deabc] worker_thread+0x22c/0x5b0
        [c000189001d5fd80] [c0000000000e7000] kthread+0x110/0x130
        [c000189001d5fe30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
        Instruction dump:
        7cb103a6 7cc0e3a6 7ca222a6 78a58402 38c00800 7cc62838 08860000 7cc000a6
        38a00006 78c60022 7cc62838 0b060000 <e8c701a0> 7ccff120 e8270078 e8a70098
        ---[ end trace 602126d0a1dedd54 ]---
      
      This fixes this by copying the required data from the thread_struct to
      the stack before we clear MSR RI. Then once we clear RI, we only access
      the stack, guaranteeing there's no segment miss.
      
      We also tighten the region over which we set RI=0 on the treclaim()
      path. This may have a slight performance impact since we're adding an
      mtmsr instruction.
      
      Fixes: 090b9284 ("powerpc/tm: Clear MSR RI in non-recoverable TM code")
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Reviewed-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      190ce869
  4. 07 6月, 2015 1 次提交
  5. 16 3月, 2015 1 次提交
  6. 28 5月, 2014 1 次提交
    • S
      powerpc: Fix regression of per-CPU DSCR setting · 1739ea9e
      Sam bobroff 提交于
      Since commit "efcac658 powerpc: Per process DSCR + some fixes (try#4)"
      it is no longer possible to set the DSCR on a per-CPU basis.
      
      The old behaviour was to minipulate the DSCR SPR directly but this is no
      longer sufficient: the value is quickly overwritten by context switching.
      
      This patch stores the per-CPU DSCR value in a kernel variable rather than
      directly in the SPR and it is used whenever a process has not set the DSCR
      itself. The sysfs interface (/sys/devices/system/cpu/cpuN/dscr) is unchanged.
      
      Writes to the old global default (/sys/devices/system/cpu/dscr_default)
      now set all of the per-CPU values and reads return the last written value.
      
      The new per-CPU default is added to the paca_struct and is used everywhere
      outside of sysfs.c instead of the old global default.
      Signed-off-by: NSam Bobroff <sam.bobroff@au1.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      1739ea9e
  7. 28 4月, 2014 2 次提交
  8. 23 4月, 2014 2 次提交
  9. 07 4月, 2014 1 次提交
    • M
      powerpc/tm: Disable IRQ in tm_recheckpoint · e6b8fd02
      Michael Neuling 提交于
      We can't take an IRQ when we're about to do a trechkpt as our GPR state is set
      to user GPR values.
      
      We've hit this when running some IBM Java stress tests in the lab resulting in
      the following dump:
      
        cpu 0x3f: Vector: 700 (Program Check) at [c000000007eb3d40]
            pc: c000000000050074: restore_gprs+0xc0/0x148
            lr: 00000000b52a8184
            sp: ac57d360
           msr: 8000000100201030
          current = 0xc00000002c500000
          paca    = 0xc000000007dbfc00     softe: 0     irq_happened: 0x00
            pid   = 34535, comm = Pooled Thread #
        R00 = 00000000b52a8184   R16 = 00000000b3e48fda
        R01 = 00000000ac57d360   R17 = 00000000ade79bd8
        R02 = 00000000ac586930   R18 = 000000000fac9bcc
        R03 = 00000000ade60000   R19 = 00000000ac57f930
        R04 = 00000000f6624918   R20 = 00000000ade79be8
        R05 = 00000000f663f238   R21 = 00000000ac218a54
        R06 = 0000000000000002   R22 = 000000000f956280
        R07 = 0000000000000008   R23 = 000000000000007e
        R08 = 000000000000000a   R24 = 000000000000000c
        R09 = 00000000b6e69160   R25 = 00000000b424cf00
        R10 = 0000000000000181   R26 = 00000000f66256d4
        R11 = 000000000f365ec0   R27 = 00000000b6fdcdd0
        R12 = 00000000f66400f0   R28 = 0000000000000001
        R13 = 00000000ada71900   R29 = 00000000ade5a300
        R14 = 00000000ac2185a8   R30 = 00000000f663f238
        R15 = 0000000000000004   R31 = 00000000f6624918
        pc  = c000000000050074 restore_gprs+0xc0/0x148
        cfar= c00000000004fe28 dont_restore_vec+0x1c/0x1a4
        lr  = 00000000b52a8184
        msr = 8000000100201030   cr  = 24804888
        ctr = 0000000000000000   xer = 0000000000000000   trap =  700
      
      This moves tm_recheckpoint to a C function and moves the tm_restore_sprs into
      that function.  It then adds IRQ disabling over the trechkpt critical section.
      It also sets the TEXASR FS in the signals code to ensure this is never set now
      that we explictly write the TM sprs in tm_recheckpoint.
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      cc: stable@vger.kernel.org
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      e6b8fd02
  10. 30 10月, 2013 1 次提交
  11. 11 10月, 2013 1 次提交
    • P
      powerpc: Put FP/VSX and VR state into structures · de79f7b9
      Paul Mackerras 提交于
      This creates new 'thread_fp_state' and 'thread_vr_state' structures
      to store FP/VSX state (including FPSCR) and Altivec/VSX state
      (including VSCR), and uses them in the thread_struct.  In the
      thread_fp_state, the FPRs and VSRs are represented as u64 rather
      than double, since we rarely perform floating-point computations
      on the values, and this will enable the structures to be used
      in KVM code as well.  Similarly FPSCR is now a u64 rather than
      a structure of two 32-bit values.
      
      This takes the offsets out of the macros such as SAVE_32FPRS,
      REST_32FPRS, etc.  This enables the same macros to be used for normal
      and transactional state, enabling us to delete the transactional
      versions of the macros.   This also removes the unused do_load_up_fpu
      and do_load_up_altivec, which were in fact buggy since they didn't
      create large enough stack frames to account for the fact that
      load_up_fpu and load_up_altivec are not designed to be called from C
      and assume that their caller's stack frame is an interrupt frame.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      de79f7b9
  12. 03 10月, 2013 2 次提交
  13. 14 8月, 2013 1 次提交
    • P
      powerpc: Fix VRSAVE handling · 408a7e08
      Paul Mackerras 提交于
      Since 2002, the kernel has not saved VRSAVE on exception entry and
      restored it on exit; rather, VRSAVE gets context-switched in _switch.
      This means that when executing in process context in the kernel, the
      userspace VRSAVE value is live in the VRSAVE register.
      
      However, the signal code assumes that current->thread.vrsave holds
      the current VRSAVE value, which is incorrect.  Therefore, this
      commit changes it to use the actual VRSAVE register instead.  (It
      still uses current->thread.vrsave as a temporary location to store
      it in, as __get_user and __put_user can only transfer to/from a
      variable, not an SPR.)
      
      This also modifies the transactional memory code to save and restore
      VRSAVE regardless of whether VMX is enabled in the MSR.  This is
      because accesses to VRSAVE are not controlled by the MSR.VEC bit,
      but can happen at any time.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      408a7e08
  14. 09 8月, 2013 1 次提交
    • M
      powerpc/tm: Fix context switching TAR, PPR and DSCR SPRs · 28e61cc4
      Michael Neuling 提交于
      If a transaction is rolled back, the Target Address Register (TAR), Processor
      Priority Register (PPR) and Data Stream Control Register (DSCR) should be
      restored to the checkpointed values before the transaction began.  Any changes
      to these SPRs inside the transaction should not be visible in the abort
      handler.
      
      Currently Linux doesn't save or restore the checkpointed TAR, PPR or DSCR.  If
      we preempt a processes inside a transaction which has modified any of these, on
      process restore, that same transaction may be aborted we but we won't see the
      checkpointed versions of these SPRs.
      
      This adds checkpointed versions of these SPRs to the thread_struct and adds the
      save/restore of these three SPRs to the treclaim/trechkpt code.
      
      Without this if any of these SPRs are modified during a transaction, users may
      incorrectly see a speculated SPR value even if the transaction is aborted.
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Cc: <stable@vger.kernel.org> [v3.10]
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      28e61cc4
  15. 30 6月, 2013 1 次提交
  16. 10 4月, 2013 1 次提交
  17. 15 2月, 2013 1 次提交