1. 11 4月, 2017 3 次提交
    • G
      powerpc/powernv: Recover correct PACA on wakeup from a stop on P9 DD1 · 17ed4c8f
      Gautham R. Shenoy 提交于
      POWER9 DD1.0 hardware has a bug where the SPRs of a thread waking up
      from stop 0,1,2 with ESL=1 can endup being misplaced in the core. Thus
      the HSPRG0 of a thread waking up from can contain the paca pointer of
      its sibling.
      
      This patch implements a context recovery framework within threads of a
      core, by provisioning space in paca_struct for saving every sibling
      threads's paca pointers. Basically, we should be able to arrive at the
      right paca pointer from any of the thread's existing paca pointer.
      
      At bootup, during powernv idle-init, we save the paca address of every
      CPU in each one its siblings paca_struct in the slot corresponding to
      this CPU's index in the core.
      
      On wakeup from a stop, the thread will determine its index in the core
      from the TIR register and recover its PACA pointer by indexing into
      the correct slot in the provisioned space in the current PACA.
      
      Furthermore, ensure that the NVGPRs are restored from the stack on the
      way out by setting the NAPSTATELOST in paca.
      
      [Changelog written with inputs from svaidy@linux.vnet.ibm.com]
      Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
      Reviewed-by: NNicholas Piggin <npiggin@gmail.com>
      [mpe: Call it a bug]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      17ed4c8f
    • G
      powerpc/powernv: Move CPU-Offline idle state invocation from smp.c to idle.c · a7cd88da
      Gautham R. Shenoy 提交于
      Move the piece of code in powernv/smp.c::pnv_smp_cpu_kill_self() which
      transitions the CPU to the deepest available platform idle state to a
      new function named pnv_cpu_offline() in powernv/idle.c. The rationale
      behind this code movement is that the data required to determine the
      deepest available platform state resides in powernv/idle.c.
      Reviewed-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      a7cd88da
    • M
      powerpc: Create asm/debugfs.h and move powerpc_debugfs_root there · 7644d581
      Michael Ellerman 提交于
      powerpc_debugfs_root is the dentry representing the root of the
      "powerpc" directory tree in debugfs.
      
      Currently it sits in asm/debug.h, a long with some other things that
      have "debug" in the name, but are otherwise unrelated.
      
      Pull it out into a separate header, which also includes linux/debugfs.h,
      and convert all the users to include debugfs.h instead of debug.h.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      7644d581
  2. 04 4月, 2017 1 次提交
    • A
      powerpc/powernv: Introduce address translation services for Nvlink2 · 1ab66d1f
      Alistair Popple 提交于
      Nvlink2 supports address translation services (ATS) allowing devices
      to request address translations from an mmu known as the nest MMU
      which is setup to walk the CPU page tables.
      
      To access this functionality certain firmware calls are required to
      setup and manage hardware context tables in the nvlink processing unit
      (NPU). The NPU also manages forwarding of TLB invalidates (known as
      address translation shootdowns/ATSDs) to attached devices.
      
      This patch exports several methods to allow device drivers to register
      a process id (PASID/PID) in the hardware tables and to receive
      notification of when a device should stop issuing address translation
      requests (ATRs). It also adds a fault handler to allow device drivers
      to demand fault pages in.
      Signed-off-by: NAlistair Popple <alistair@popple.id.au>
      [mpe: Fix up comment formatting, use flush_tlb_mm()]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      1ab66d1f
  3. 03 4月, 2017 1 次提交
    • M
      powerpc/book3s: Print task info if we take a machine check in user mode · 63f44d65
      Michael Ellerman 提交于
      For an MCE (Machine Check Exception) that hits while in user mode
      MSR(PR=1), print the task info to the console MCE error log. This may
      help to identify an application that triggered the MCE.
      
      After this patch the MCE console looks like:
      
        Severe Machine check interrupt [Recovered]
          NIP: [0000000010039778] PID: 762 Comm: ebizzy
          Initiator: CPU
          Error type: SLB [Multihit]
            Effective address: 0000000010039778
      
        Severe Machine check interrupt [Not recovered]
          NIP: [0000000010039778] PID: 763 Comm: ebizzy
          Initiator: CPU
          Error type: UE [Page table walk ifetch]
            Effective address: 0000000010039778
        ebizzy[763]: unhandled signal 7 at 0000000010039778 nip 0000000010039778 lr 0000000010001b44 code 30004
      Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      63f44d65
  4. 01 4月, 2017 4 次提交
    • A
      powerpc/mm: Enable mappings above 128TB · f4ea6dcb
      Aneesh Kumar K.V 提交于
      Not all user space application is ready to handle wide addresses. It's
      known that at least some JIT compilers use higher bits in pointers to
      encode their information. It collides with valid pointers with 512TB
      addresses and leads to crashes.
      
      To mitigate this, we are not going to allocate virtual address space
      above 128TB by default.
      
      But userspace can ask for allocation from full address space by
      specifying hint address (with or without MAP_FIXED) above 128TB.
      
      If hint address set above 128TB, but MAP_FIXED is not specified, we try
      to look for unmapped area by specified address. If it's already
      occupied, we look for unmapped area in *full* address space, rather than
      from 128TB window.
      
      This approach helps to easily make application's memory allocator aware
      about large address space without manually tracking allocated virtual
      address space.
      
      This is going to be a per mmap decision. ie, we can have some mmaps with
      larger addresses and other that do not.
      
      A sample memory layout looks like:
      
        10000000-10010000 r-xp 00000000 fc:00 9057045          /home/max_addr_512TB
        10010000-10020000 r--p 00000000 fc:00 9057045          /home/max_addr_512TB
        10020000-10030000 rw-p 00010000 fc:00 9057045          /home/max_addr_512TB
        10029630000-10029660000 rw-p 00000000 00:00 0          [heap]
        7fff834a0000-7fff834b0000 rw-p 00000000 00:00 0
        7fff834b0000-7fff83670000 r-xp 00000000 fc:00 9177190  /lib/powerpc64le-linux-gnu/libc-2.23.so
        7fff83670000-7fff83680000 r--p 001b0000 fc:00 9177190  /lib/powerpc64le-linux-gnu/libc-2.23.so
        7fff83680000-7fff83690000 rw-p 001c0000 fc:00 9177190  /lib/powerpc64le-linux-gnu/libc-2.23.so
        7fff83690000-7fff836a0000 rw-p 00000000 00:00 0
        7fff836a0000-7fff836c0000 r-xp 00000000 00:00 0        [vdso]
        7fff836c0000-7fff83700000 r-xp 00000000 fc:00 9177193  /lib/powerpc64le-linux-gnu/ld-2.23.so
        7fff83700000-7fff83710000 r--p 00030000 fc:00 9177193  /lib/powerpc64le-linux-gnu/ld-2.23.so
        7fff83710000-7fff83720000 rw-p 00040000 fc:00 9177193  /lib/powerpc64le-linux-gnu/ld-2.23.so
        7fffdccf0000-7fffdcd20000 rw-p 00000000 00:00 0        [stack]
        1000000000000-1000000010000 rw-p 00000000 00:00 0
        1ffff83710000-1ffff83720000 rw-p 00000000 00:00 0
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      f4ea6dcb
    • A
      powerpc/pseries: Skip using reserved virtual address range · 82228e36
      Aneesh Kumar K.V 提交于
      Now that we use all the available virtual address range, we need to make
      sure we don't generate VSID such that it overlaps with the reserved vsid
      range. Reserved vsid range include the virtual address range used by the
      adjunct partition and also the VRMA virtual segment. We find the context
      value that can result in generating such a VSID and reserve it early in
      boot.
      
      We don't look at the adjunct range, because for now we disable the
      adjunct usage in a Linux LPAR via CAS interface.
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      [mpe: Rewrite hash__reserve_context_id(), move the rest into pseries]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      82228e36
    • A
      powerpc/mm/hash: Store addr_limit in PACA · bb183221
      Aneesh Kumar K.V 提交于
      We optmize the slice page size array copy to paca by copying only the
      range based on addr_limit. This will require us to not look at page size
      array beyond addr_limit in PACA on slb fault. To enable that copy task
      size to paca which will be used during slb fault.
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      [mpe: Rename from task_size to addr_limit, consolidate #ifdefs]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      bb183221
    • A
      powerpc/mm: Add addr_limit to mm_context and use it to derive max slice index · 957b778a
      Aneesh Kumar K.V 提交于
      In the followup patch, we will increase the slice array size to handle
      512TB range, but will limit the max addr to 128TB. Avoid doing
      unnecessary computation and avoid doing slice mask related operation
      above address limit.
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      957b778a
  5. 31 3月, 2017 17 次提交
  6. 21 3月, 2017 2 次提交
  7. 20 3月, 2017 3 次提交
  8. 16 3月, 2017 1 次提交
  9. 10 3月, 2017 5 次提交
  10. 06 3月, 2017 2 次提交
    • M
      powerpc/64: Fix L1D cache shape vector reporting L1I values · 9c7a0086
      Michael Ellerman 提交于
      It seems we didn't pay quite enough attention when testing the new cache
      shape vectors, which means we didn't notice the bug where the vector for
      the L1D was using the L1I values. Fix it, resulting in eg:
      
        L1I  cache size:     0x8000      32768B         32K
        L1I  line size:        0x80       8-way associative
        L1D  cache size:    0x10000      65536B         64K
        L1D  line size:        0x80       8-way associative
      
      Fixes: 98a5f361 ("powerpc: Add new cache geometry aux vectors")
      Cut-and-paste-bug-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Badly-reviewed-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      9c7a0086
    • S
      powerpc: Update to new option-vector-5 format for CAS · 014d02cb
      Suraj Jitindar Singh 提交于
      On POWER9 the ibm,client-architecture-support (CAS) negotiation process
      has been updated to change how the host to guest negotiation is done for
      the new hash/radix mmu as well as the nest mmu, process tables and guest
      translation shootdown (GTSE).
      
      This is documented in the unreleased PAPR ACR "CAS option vector
      additions for P9".
      
      The host tells the guest which options it supports in
      ibm,arch-vec-5-platform-support. The guest then chooses a subset of these
      to request in the CAS call and these are agreed to in the
      ibm,architecture-vec-5 property of the chosen node.
      
      Thus we read ibm,arch-vec-5-platform-support and make our selection before
      calling CAS. We then parse the ibm,architecture-vec-5 property of the
      chosen node to check whether we should run as hash or radix.
      
      ibm,arch-vec-5-platform-support format:
      
      index value pairs: <index, val> ... <index, val>
      
      index: Option vector 5 byte number
      val:   Some representation of supported values
      Signed-off-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com>
      Acked-by: NPaul Mackerras <paulus@ozlabs.org>
      [mpe: Don't print about unknown options, be consistent with OV5_FEAT]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      014d02cb
  11. 04 3月, 2017 1 次提交