• M
    powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0 · 190ce869
    Michael Neuling 提交于
    Currently we have 2 segments that are bolted for the kernel linear
    mapping (ie 0xc000... addresses). This is 0 to 1TB and also the kernel
    stacks. Anything accessed outside of these regions may need to be
    faulted in. (In practice machines with TM always have 1T segments)
    
    If a machine has < 2TB of memory we never fault on the kernel linear
    mapping as these two segments cover all physical memory. If a machine
    has > 2TB of memory, there may be structures outside of these two
    segments that need to be faulted in. This faulting can occur when
    running as a guest as the hypervisor may remove any SLB that's not
    bolted.
    
    When we treclaim and trecheckpoint we have a window where we need to
    run with the userspace GPRs. This means that we no longer have a valid
    stack pointer in r1. For this window we therefore clear MSR RI to
    indicate that any exceptions taken at this point won't be able to be
    handled. This means that we can't take segment misses in this RI=0
    window.
    
    In this RI=0 region, we currently access the thread_struct for the
    process being context switched to or from. This thread_struct access
    may cause a segment fault since it's not guaranteed to be covered by
    the two bolted segment entries described above.
    
    We've seen this with a crash when running as a guest with > 2TB of
    memory on PowerVM:
    
      Unrecoverable exception 4100 at c00000000004f138
      Oops: Unrecoverable exception, sig: 6 [#1]
      SMP NR_CPUS=2048 NUMA pSeries
      CPU: 1280 PID: 7755 Comm: kworker/1280:1 Tainted: G                 X 4.4.13-46-default #1
      task: c000189001df4210 ti: c000189001d5c000 task.ti: c000189001d5c000
      NIP: c00000000004f138 LR: 0000000010003a24 CTR: 0000000010001b20
      REGS: c000189001d5f730 TRAP: 4100   Tainted: G                 X  (4.4.13-46-default)
      MSR: 8000000100001031 <SF,ME,IR,DR,LE>  CR: 24000048  XER: 00000000
      CFAR: c00000000004ed18 SOFTE: 0
      GPR00: ffffffffc58d7b60 c000189001d5f9b0 00000000100d7d00 000000003a738288
      GPR04: 0000000000002781 0000000000000006 0000000000000000 c0000d1f4d889620
      GPR08: 000000000000c350 00000000000008ab 00000000000008ab 00000000100d7af0
      GPR12: 00000000100d7ae8 00003ffe787e67a0 0000000000000000 0000000000000211
      GPR16: 0000000010001b20 0000000000000000 0000000000800000 00003ffe787df110
      GPR20: 0000000000000001 00000000100d1e10 0000000000000000 00003ffe787df050
      GPR24: 0000000000000003 0000000000010000 0000000000000000 00003fffe79e2e30
      GPR28: 00003fffe79e2e68 00000000003d0f00 00003ffe787e67a0 00003ffe787de680
      NIP [c00000000004f138] restore_gprs+0xd0/0x16c
      LR [0000000010003a24] 0x10003a24
      Call Trace:
      [c000189001d5f9b0] [c000189001d5f9f0] 0xc000189001d5f9f0 (unreliable)
      [c000189001d5fb90] [c00000000001583c] tm_recheckpoint+0x6c/0xa0
      [c000189001d5fbd0] [c000000000015c40] __switch_to+0x2c0/0x350
      [c000189001d5fc30] [c0000000007e647c] __schedule+0x32c/0x9c0
      [c000189001d5fcb0] [c0000000007e6b58] schedule+0x48/0xc0
      [c000189001d5fce0] [c0000000000deabc] worker_thread+0x22c/0x5b0
      [c000189001d5fd80] [c0000000000e7000] kthread+0x110/0x130
      [c000189001d5fe30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
      Instruction dump:
      7cb103a6 7cc0e3a6 7ca222a6 78a58402 38c00800 7cc62838 08860000 7cc000a6
      38a00006 78c60022 7cc62838 0b060000 <e8c701a0> 7ccff120 e8270078 e8a70098
      ---[ end trace 602126d0a1dedd54 ]---
    
    This fixes this by copying the required data from the thread_struct to
    the stack before we clear MSR RI. Then once we clear RI, we only access
    the stack, guaranteeing there's no segment miss.
    
    We also tighten the region over which we set RI=0 on the treclaim()
    path. This may have a slight performance impact since we're adding an
    mtmsr instruction.
    
    Fixes: 090b9284 ("powerpc/tm: Clear MSR RI in non-recoverable TM code")
    Signed-off-by: NMichael Neuling <mikey@neuling.org>
    Reviewed-by: NCyril Bur <cyrilbur@gmail.com>
    Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
    190ce869
tm.S 12.1 KB