1. 21 Apr, 2019 21 commits
    • powerpc/mm: Add helpers for accessing hash translation related variables · 60458fba
      Aneesh Kumar K.V committed
      We want to switch to allocating these variables at runtime, only when hash
      translation is enabled. Add helpers so that both book3s and nohash can be
      adapted to the upcoming change easily.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      60458fba
    • powerpc/mm: Remove PPC_MM_SLICES #ifdef for book3s64 · 4f40b15f
      Aneesh Kumar K.V committed
      Book3s64 always has PPC_MM_SLICES enabled, so remove the unnecessary #ifdef.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      4f40b15f
    • powerpc/mm: Fix build error with FLATMEM book3s64 config · 6161a373
      Aneesh Kumar K.V committed
      The current value of MAX_PHYSMEM_BITS cannot work with 32 bit configs.
      We used to have MAX_PHYSMEM_BITS not defined without SPARSEMEM and 32
      bit configs never expected a value to be set for MAX_PHYSMEM_BITS.
      
      Dependent code such as zsmalloc derived the right values based on other
      fields. Instead of finding a value that works with different configs,
      use new values only for book3s_64. For 64 bit booke, use the definition
      of MAX_PHYSMEM_BITS as per commit a7df61a0 ("[PATCH] ppc64: Increase sparsemem defaults").
      That change was done in 2005 and hopefully will work with book3e 64.
      
      Fixes: 8bc08689 ("powerpc/mm: Only define MAX_PHYSMEM_BITS in SPARSEMEM configurations")
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      6161a373
    • powerpc/32s: Implement Kernel Userspace Access Protection · a68c31fc
      Christophe Leroy committed
      This patch implements Kernel Userspace Access Protection for
      book3s/32.
      
      Due to limitations of the processor page protection capabilities,
      the protection is only against writing. Read protection cannot be
      achieved using page protection.
      
      The previous patch modifies the page protection so that RW user
      pages are RW for Key 0 and RO for Key 1, and it sets Key 0 for
      both user and kernel.
      
      This patch changes userspace segment registers so that they are set
      to Ku 0 and Ks 1. When the kernel needs to write to RW pages, the
      associated segment register is then changed to Ks 0 in order to allow
      write access to the kernel.
      
      In order to avoid having to read all segment registers when
      locking/unlocking the access, some data is kept in the thread_struct
      and saved on stack on exceptions. The field identifies both the
      first unlocked segment and the first segment following the last
      unlocked one. When no segment is unlocked, it contains value 0.
      
      As the hash_page() function is not able to easily determine if a
      protfault is due to a bad kernel access to userspace, protfaults
      need to be handled by handle_page_fault when KUAP is set.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      [mpe: Drop allow_read/write_to/from_user() as they're now in kup.h,
            and adapt allow_user_access() to do nothing when to == NULL]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      a68c31fc
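The bookkeeping field described in the commit above (first unlocked segment plus first segment following the last unlocked one, 0 when nothing is unlocked) can be sketched as a small Python model. This is not the kernel's code: the bit layout, the helper names `encode_unlocked` and `unlocked_segments`, and the restriction to ranges that end below segment 15 are assumptions made for illustration. The only hardware fact relied on is that each of the 16 segment registers covers 256 MB of the 32-bit address space.

```python
SEGMENT_SHIFT = 28  # one segment register covers 256 MB (2**28 bytes)


def encode_unlocked(addr, size):
    """Pack the unlocked range [addr, addr + size) into one 32-bit value.

    Hypothetical layout: the top nibble holds the first unlocked segment,
    the low nibble holds the first segment after the last unlocked one.
    A value of 0 means that no segment is unlocked.
    """
    if size == 0:
        return 0
    first = addr >> SEGMENT_SHIFT
    after_last = ((addr + size - 1) >> SEGMENT_SHIFT) + 1
    return (first << SEGMENT_SHIFT) | (after_last & 0xF)


def unlocked_segments(value):
    """Decode the packed value into the range of unlocked segment numbers."""
    if value == 0:
        return range(0)
    return range(value >> SEGMENT_SHIFT, value & 0xF)
```

With this layout, re-locking on exception entry only has to walk `unlocked_segments(value)` instead of reading all sixteen segment registers.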
    • powerpc/32s: Prepare Kernel Userspace Access Protection · f342adca
      Christophe Leroy committed
      This patch prepares Kernel Userspace Access Protection for
      book3s/32.
      
      Due to limitations of the processor page protection capabilities,
      the protection is only against writing. Read protection cannot be
      achieved using page protection.
      
      book3s/32 provides the following values for PP bits:
      
      PP00 provides RW for Key 0 and NA for Key 1
      PP01 provides RW for Key 0 and RO for Key 1
      PP10 provides RW for all
      PP11 provides RO for all
      
      Today PP10 is used for RW pages and PP11 for RO pages, and the user
      segment registers' Kp and Ks are set to 1. This patch modifies
      page protection to use PP01 for RW pages and sets user segment
      registers to Kp 0 and Ks 0.
      
      This will allow setting up userspace write access protection by
      setting Ks to 1 in the following patch.
      
      Kernel space segment registers remain unchanged.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      f342adca
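The PP table in the commit above can be read as a lookup from (PP bits, key) to access rights. The following is a plain Python model of that table, not kernel code; `PP_ACCESS` and `access` are names invented for this sketch. It assumes the usual segment register convention that Ks supplies the key in supervisor mode and Kp supplies it in user mode.

```python
# Rights per PP value, copied from the table in the commit message.
# Tuple layout: (rights with Key 0, rights with Key 1).
PP_ACCESS = {
    0b00: ("RW", "NA"),
    0b01: ("RW", "RO"),
    0b10: ("RW", "RW"),
    0b11: ("RO", "RO"),
}


def access(pp, supervisor, ks, kp):
    """Return the rights a page with the given PP bits grants.

    The key comes from the segment register: Ks in supervisor mode,
    Kp in user mode.
    """
    key = ks if supervisor else kp
    return PP_ACCESS[pp][key]
```

For a RW user page mapped with PP01, `access(0b01, True, 0, 0)` gives the kernel `"RW"`, while flipping Ks to 1 as the following patch does gives `"RO"` and leaves user mode (`supervisor=False`) at `"RW"`.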
    • powerpc/32s: Implement Kernel Userspace Execution Prevention. · 31ed2b13
      Christophe Leroy committed
      To implement Kernel Userspace Execution Prevention, this patch
      sets NX bit on all user segments on kernel entry and clears NX bit
      on all user segments on kernel exit.
      
      Note that powerpc 601 doesn't have the NX bit, so KUEP will not
      work on it. A warning is displayed at startup.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      31ed2b13
    • powerpc/8xx: Add Kernel Userspace Access Protection · 2679f9bd
      Christophe Leroy committed
      This patch adds Kernel Userspace Access Protection on the 8xx.
      
      When a page is RO or RW, it is set RO or RW for Key 0 and NA
      for Key 1.
      
      Up to now, the User group is defined with Key 0 for both User and
      Supervisor.
      
      By changing the group to Key 0 for User and Key 1 for Supervisor,
      this patch prevents the Kernel from being able to access user data.
      
      At exception entry, the kernel saves SPRN_MD_AP in the regs struct
      and reapplies the protection. At exception exit, it restores SPRN_MD_AP
      with the value saved on exception entry.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      [mpe: Drop allow_read/write_to/from_user() as they're now in kup.h]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      2679f9bd
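The effect of changing the User group's keys, as described in the commit above, can be illustrated with a tiny Python model. It is a sketch only: the dictionaries and the `rights` helper are hypothetical names, and the model covers just the rule stated in the message (a page keeps its RO/RW permission for Key 0 and is NA for Key 1).

```python
# Key selected by the User group for each privilege level.
OLD_USER_GROUP = {"user": 0, "supervisor": 0}  # before the patch
NEW_USER_GROUP = {"user": 0, "supervisor": 1}  # after the patch


def rights(page_perm, group, supervisor):
    """Return the rights on a user page ("RO" or "RW") for the current mode.

    Key 0 sees the page permission unchanged, Key 1 gets no access.
    """
    key = group["supervisor"] if supervisor else group["user"]
    return page_perm if key == 0 else "NA"
```

Under `NEW_USER_GROUP` the kernel gets `"NA"` on user pages by default, which is why access has to be opened explicitly around user copies and reapplied on exception entry.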
    • powerpc/8xx: Add Kernel Userspace Execution Prevention · 06fbe81b
      Christophe Leroy committed
      This patch adds Kernel Userspace Execution Prevention on the 8xx.
      
      When a page is Executable, it is set Executable for Key 0 and NX
      for Key 1.
      
      Up to now, the User group is defined with Key 0 for both User and
      Supervisor.
      
      By changing the group to Key 0 for User and Key 1 for Supervisor,
      this patch prevents the Kernel from being able to execute user code.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      06fbe81b
    • powerpc/8xx: Only define APG0 and APG1 · c341a108
      Christophe Leroy committed
      Since the 8xx implements hardware page table walk assistance,
      the PGD entries always point to a 4k aligned page, so the 2 upper
      bits of the APG are not clobbered anymore and remain 0. Therefore
      only APG0 and APG1 are used and need a definition. We set the
      other APGs to the lowest permission level.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      c341a108
    • powerpc/32: Prepare for Kernel Userspace Access Protection · e2fb9f54
      Christophe Leroy committed
      This patch adds ASM macros for saving, restoring and checking
      the KUAP state, and modifies setup_32 to call them on exceptions
      from kernel.
      
      The macros are defined as empty by default for when CONFIG_PPC_KUAP
      is not selected and/or for platforms which don't (yet) handle KUAP.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      e2fb9f54
    • powerpc/32: Remove MSR_PR test when returning from syscall · e291b6d5
      Christophe Leroy committed
      Syscalls come from user only, so we can account time without checking
      whether we are returning to kernel or user, as it will only be user.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      e291b6d5
    • powerpc/mm: Detect bad KUAP faults · 5e5be3ae
      Michael Ellerman committed
      When KUAP is enabled we have logic to detect page faults that occur
      outside of a valid user access region and are blocked by the AMR.
      
      What we don't have at the moment is logic to detect a fault *within* a
      valid user access region, that has been incorrectly blocked by AMR.
      This is not meant to ever happen, but it can if we incorrectly
      save/restore the AMR, or if the AMR was overwritten for some other
      reason.
      
      Currently if that happens we assume it's just a regular fault that
      will be corrected by handling the fault normally, so we just return.
      But there is nothing the fault handling code can do to fix it, so the
      fault just happens again and we spin forever, leading to soft lockups.
      
      So add some logic to detect that case and WARN() if we ever see it.
      Arguably it should be a BUG(), but it's more polite to fail the access
      and let the kernel continue, rather than taking down the box. There
      should be no data integrity issue with failing the fault rather than
      BUG'ing, as we're just going to disallow an access that should have
      been allowed.
      
      To make the code a little easier to follow, unroll the condition at
      the end of bad_kernel_fault() and comment each case, before adding the
      call to bad_kuap_fault().
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      5e5be3ae
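The cases distinguished in the commit above can be laid out as an unrolled decision, one branch per case. The sketch below is a simplified Python model, not the kernel's bad_kernel_fault(): the function name `classify_kernel_fault`, its two boolean inputs and the returned labels are invented for illustration, and the real code looks at more state than this.

```python
HANDLE_NORMALLY = "handle normally"
FAIL = "fail the access"
FAIL_AND_WARN = "fail the access and WARN"


def classify_kernel_fault(in_user_access_region, blocked_by_amr):
    """Classify a kernel-mode fault on a userspace address.

    in_user_access_region: the fault happened inside a valid user access
        (for example a copy_to_user() in progress).
    blocked_by_amr: the access was refused because of the AMR setting.
    """
    if not in_user_access_region:
        # The kernel touched user memory outside a trusted path.
        return FAIL
    if blocked_by_amr:
        # Valid region, yet the AMR blocks it. Retrying can never succeed,
        # so fail instead of faulting forever, and warn about the bug.
        return FAIL_AND_WARN
    # An ordinary fault that normal fault handling can resolve.
    return HANDLE_NORMALLY
```

The middle branch is the new one: without it the fault would fall through to normal handling, recur immediately, and spin until a soft lockup is reported.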
    • powerpc/64s: Implement KUAP for Radix MMU · 890274c2
      Michael Ellerman committed
      Kernel Userspace Access Prevention utilises a feature of the Radix MMU
      which disallows read and write access to userspace addresses. By
      utilising this, the kernel is prevented from accessing user data from
      outside of trusted paths that perform proper safety checks, such as
      copy_{to/from}_user() and friends.
      
      Userspace access is disabled from early boot and is only enabled when
      performing an operation like copy_{to/from}_user(). The register that
      controls this (AMR) does not prevent userspace from accessing itself,
      so there is no need to save and restore when entering and exiting
      userspace.
      
      When entering the kernel from the kernel we save AMR and if it is not
      blocking user access (because eg. we faulted doing a user access) we
      reblock user access for the duration of the exception (ie. the page
      fault) and then restore the AMR when returning back to the kernel.
      
      This feature can be tested by using the lkdtm driver (CONFIG_LKDTM=y)
      and performing the following:
      
        # (echo ACCESS_USERSPACE) > [debugfs]/provoke-crash/DIRECT
      
      If enabled, this should send SIGSEGV to the thread.
      
      We also add paranoid checking of AMR in switch and syscall return
      under CONFIG_PPC_KUAP_DEBUG.
      Co-authored-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Russell Currey <ruscur@russell.cc>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      890274c2
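The open/close discipline described in the commit above, including the save, re-block and restore around a nested exception, can be modelled in a few lines of Python. This is a toy model under stated assumptions: the `Cpu` class, its method names and the two-state `amr` attribute are hypothetical, and the real AMR is a register with per-key bits rather than a flag.

```python
from contextlib import contextmanager

AMR_BLOCKED = "blocked"
AMR_ALLOWED = "allowed"


class Cpu:
    def __init__(self):
        # User access is disabled from early boot.
        self.amr = AMR_BLOCKED

    @contextmanager
    def user_access(self):
        """Open user access for the duration of a trusted copy."""
        self.amr = AMR_ALLOWED
        try:
            yield
        finally:
            self.amr = AMR_BLOCKED

    @contextmanager
    def kernel_exception(self):
        """Exception taken from the kernel: save AMR, re-block, restore."""
        saved = self.amr
        self.amr = AMR_BLOCKED
        try:
            yield
        finally:
            self.amr = saved

    def load_user(self, memory, addr):
        """Read user memory; refused unless access is currently open."""
        if self.amr != AMR_ALLOWED:
            raise PermissionError("user access blocked by AMR")
        return memory[addr]
```

A page fault taken in the middle of `user_access()` runs with access blocked again and returns to the copy with access still open, which matches the behaviour the message describes for faults during a user access.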
    • powerpc/lib: Refactor __patch_instruction() to use __put_user_asm() · ef296729
      Russell Currey committed
      __patch_instruction() is called in early boot, and uses
      __put_user_size(), which includes the allow/prevent calls to enforce
      KUAP, which could either be called too early, or in the Radix case,
      forced to use "early_" versions of functions just to safely handle
      this one case.
      
      __put_user_asm() does not do this, and thus is safe to use both in
      early boot, and later on since in this case it should only ever be
      touching kernel memory.
      
      __patch_instruction() was previously refactored to use
      __put_user_size() in order to be able to return -EFAULT, which would
      allow the kernel to patch instructions in userspace, which should
      never happen. This has the functional change of causing faults on
      userspace addresses if KUAP is turned on, which should never happen in
      practice.
      
      A future enhancement could be to double check the patch address is
      definitely allowed to be tampered with by the kernel.
      Signed-off-by: Russell Currey <ruscur@russell.cc>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      ef296729
    • powerpc/mm/radix: Use KUEP API for Radix MMU · 1bb2bae2
      Russell Currey committed
      Execution protection already exists on radix; this just refactors
      the radix init to provide the KUEP setup function instead.
      
      Thus, the only functional change is that it can now be disabled.
      Signed-off-by: Russell Currey <ruscur@russell.cc>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      1bb2bae2
    • powerpc/64: Setup KUP on secondary CPUs · b28c9750
      Russell Currey committed
      Some platforms (i.e. Radix MMU) need per-CPU initialisation for KUP.
      
      Any platforms that only want to do KUP initialisation once
      globally can just check to see if they're running on the boot CPU, or
      check if whatever setup they need has already been performed.
      
      Note that this is only for 64-bit.
      Signed-off-by: Russell Currey <ruscur@russell.cc>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      b28c9750
    • powerpc: Add a framework for Kernel Userspace Access Protection · de78a9c4
      Christophe Leroy committed
      This patch implements a framework for Kernel Userspace Access
      Protection.
      
      Then subarches will have the possibility to provide their own
      implementation by providing setup_kuap() and
      allow/prevent_user_access().
      
      Some platforms will need to know the area accessed and whether it is
      accessed for read, write or both. Therefore source, destination and
      size are handed over to the two functions.
      
      mpe: Rename to allow/prevent rather than unlock/lock, and add
      read/write wrappers. Drop the 32-bit code for now until we have an
      implementation for it. Add kuap to pt_regs for 64-bit as well as
      32-bit. Don't split strings, use pr_crit_ratelimited().
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Russell Currey <ruscur@russell.cc>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      de78a9c4
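The shape of the framework in the commit above (two platform hooks taking destination, source and size, plus read/write wrappers) can be sketched as follows. This is a Python stand-in for illustration only: the hook bodies merely record their arguments in a hypothetical `calls` list, and `None` plays the role of a NULL pointer for the direction that is not being accessed.

```python
calls = []  # hypothetical log of what the platform hooks received


def allow_user_access(to, frm, size):
    """Platform hook: open access for a write to `to` and/or a read of `frm`."""
    calls.append(("allow", to, frm, size))


def prevent_user_access(to, frm, size):
    """Platform hook: close the access opened by allow_user_access()."""
    calls.append(("prevent", to, frm, size))


def allow_read_from_user(frm, size):
    allow_user_access(None, frm, size)


def prevent_read_from_user(frm, size):
    prevent_user_access(None, frm, size)


def allow_write_to_user(to, size):
    allow_user_access(to, None, size)


def prevent_write_to_user(to, size):
    prevent_user_access(to, None, size)
```

Passing both pointers through a single pair of hooks lets a write-only implementation ignore calls where `to` is empty, while a platform that blocks both directions can ignore the pointers entirely.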
    • powerpc: Add skeleton for Kernel Userspace Execution Prevention · 0fb1c25a
      Christophe Leroy committed
      This patch adds a skeleton for Kernel Userspace Execution Prevention.
      
      Then subarches implementing it have to define CONFIG_PPC_HAVE_KUEP
      and provide setup_kuep() function.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      [mpe: Don't split strings, use pr_crit_ratelimited()]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      0fb1c25a
    • powerpc: Add framework for Kernel Userspace Protection · 69795cab
      Christophe Leroy committed
      This patch adds a skeleton for Kernel Userspace Protection
      functionalities like Kernel Userspace Access Protection and Kernel
      Userspace Execution Prevention.
      
      The subsequent implementation of KUAP for radix makes use of an MMU
      feature in order to patch out assembly when KUAP is disabled or
      unsupported. This won't work unless there's an entry point for KUP
      support before the feature magic happens, so for PPC64 setup_kup() is
      called early in setup.
      
      On PPC32, feature_fixup() is done too early to allow the same.
      Suggested-by: Russell Currey <ruscur@russell.cc>
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      69795cab
    • powerpc/powernv/idle: Restore AMR/UAMOR/AMOR after idle · 53a712ba
      Michael Ellerman committed
      In order to implement KUAP (Kernel Userspace Access Protection) on
      Power9 we will be using the AMR, and therefore indirectly the
      UAMOR/AMOR.
      
      So save/restore these regs in the idle code.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      53a712ba
    • powerpc/powernv/idle: Restore IAMR after idle · a3f3072d
      Russell Currey committed
      Without restoring the IAMR after idle, execution prevention on POWER9
      with Radix MMU is overwritten and the kernel can freely execute
      userspace without faulting.
      
      This is necessary when returning from any stop state that modifies
      user state, as well as hypervisor state.
      
      To test how this fails without this patch, load the lkdtm driver and
      do the following:
      
        $ echo EXEC_USERSPACE > /sys/kernel/debug/provoke-crash/DIRECT
      
      which won't fault, then boot the kernel with powersave=off, where it
      will fault. Applying this patch will fix this.
      
      Fixes: 3b10d009 ("powerpc/mm/radix: Prevent kernel execution of user space")
      Cc: stable@vger.kernel.org # v4.10+
      Signed-off-by: Russell Currey <ruscur@russell.cc>
      Reviewed-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      a3f3072d
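The failure mode fixed by the commit above comes down to a register whose contents do not survive a deep stop state. A minimal Python model, with a hypothetical `Core` class and a boolean standing in for the IAMR value, shows the difference between idling with and without the restore.

```python
class Core:
    """Toy core whose IAMR contents are lost in a deep stop state."""

    def __init__(self):
        # Set up at boot so that the kernel cannot execute user pages.
        self.iamr_blocks_user_exec = True

    def deep_idle(self, restore_iamr):
        saved = self.iamr_blocks_user_exec
        # Register state is lost while the core is stopped.
        self.iamr_blocks_user_exec = False
        if restore_iamr:
            # What the fix adds: put the saved value back on wakeup.
            self.iamr_blocks_user_exec = saved

    def exec_userspace_faults(self):
        """True when a kernel attempt to execute user memory would fault."""
        return self.iamr_blocks_user_exec
```

Without the restore, `exec_userspace_faults()` turns false after the first deep idle, which corresponds to the EXEC_USERSPACE test not faulting on an unpatched kernel.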
  2. 20 Apr, 2019 19 commits