1. 05 Jun, 2017 4 commits
  2. 02 Jun, 2017 6 commits
  3. 17 May, 2017 1 commit
    • powerpc/mm: Fix crash in page table dump with huge pages · bfb9956a
      Michael Ellerman committed
      The page table dump code doesn't know about huge pages, so currently
      it crashes (or walks random memory, usually leading to a crash), if it
      finds a huge page. On Book3S we only see huge pages in the Linux page
      tables when we're using the P9 Radix MMU.
      
      Teaching the code to properly handle huge pages is a bit more involved,
      so for now just prevent the crash.
      
      Cc: stable@vger.kernel.org # v4.10+
      Fixes: 8eb07b18 ("powerpc/mm: Dump linux pagetables")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  4. 09 May, 2017 1 commit
  5. 03 May, 2017 1 commit
    • powerpc/mm/radix: Drop support for CPUs without lockless tlbie · 3c9ac2bc
      Michael Ellerman committed
      Currently the radix TLB code includes support for CPUs that do *not*
      have MMU_FTR_LOCKLESS_TLBIE. On those CPUs we are required to take a
      global spinlock before issuing a tlbie.
      
      Radix can only be built for 64-bit Book3s CPUs, and of those, only
      POWER4, 970, Cell and PA6T do not have MMU_FTR_LOCKLESS_TLBIE. Although
      it's possible to build a kernel with Radix support that can also boot on
      those CPUs, we happen to know that in reality none of those CPUs support
      the Radix MMU, so the code can never actually run on those CPUs.
      
      So remove the native_tlbie_lock in the Radix TLB code.
      
      Note that there is another lock of the same name in the hash code, which
      is unaffected by this patch.
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  6. 27 Apr, 2017 6 commits
  7. 26 Apr, 2017 1 commit
  8. 21 Apr, 2017 1 commit
    • powerpc/mm: Add support for runtime configuration of ASLR limits · 9fea59bd
      Michael Ellerman committed
      Add powerpc support for mmap_rnd_bits and mmap_rnd_compat_bits, which are two
      sysctls that allow a user to configure the number of bits of randomness used for
      ASLR.
      
      Because of the way the Kconfig for ARCH_MMAP_RND_BITS is defined, we have to
      construct at least the MIN value in Kconfig, vs in a header which would be more
      natural. Given that we just go ahead and do it all in Kconfig.
      
      At least according to the code (the documentation makes no mention of it), the
      value is defined as the number of bits of randomisation *of the page*, not the
      address. This makes some sense, with larger page sizes more of the low bits are
      forced to zero, which would reduce the randomisation if we didn't take the
      PAGE_SIZE into account. However it does mean the min/max values have to change
      depending on the PAGE_SIZE in order to actually limit the amount of address
      space consumed by the randomisation.
      
      The result of that is that we have to define the default values based on both
      32-bit vs 64-bit, but also the configured PAGE_SIZE. Furthermore now that we
      have 128TB address space support on Book3S, we also have to take that into
      account.
      
      Finally we can wire up the value in arch_mmap_rnd().
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
      Tested-by: Bhupesh Sharma <bhsharma@redhat.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
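As a rough illustration of the page-granular definition described in this commit, the following C sketch computes how much address space a given number of randomisation bits covers for a given page size, and how a raw random value becomes a page-aligned offset. The helper names `rnd_span_bytes` and `rnd_offset` are hypothetical, and the bit counts used in the note below are arbitrary examples, not the powerpc Kconfig defaults.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of address space covered when rnd_bits bits of randomness are
 * applied to the page number rather than to the byte address.
 * page_shift is log2(PAGE_SIZE): 12 for 4K pages, 16 for 64K pages.
 */
static uint64_t rnd_span_bytes(unsigned int rnd_bits, unsigned int page_shift)
{
	assert(rnd_bits + page_shift < 64);
	return UINT64_C(1) << (rnd_bits + page_shift);
}

/* Turn a raw random value into a page-aligned offset below that span. */
static uint64_t rnd_offset(uint64_t random, unsigned int rnd_bits,
			   unsigned int page_shift)
{
	assert(rnd_bits + page_shift < 64);
	return (random % (UINT64_C(1) << rnd_bits)) << page_shift;
}
```

With 4K pages, 18 bits cover 1GB of address space. With 64K pages the same 18 bits would cover 16GB, so 14 bits are enough to stay within 1GB. This is why the limits have to vary with PAGE_SIZE.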
  9. 19 Apr, 2017 3 commits
    • powerpc/iommu: Do not call PageTransHuge() on tail pages · e889e96e
      Alexey Kardashevskiy committed
      The CMA page migration code does not support compound pages at
      the moment, so it performs a few tests before proceeding to the actual
      page migration.
      
      One of the tests - PageTransHuge() - has VM_BUG_ON_PAGE(PageTail()) as
      it is designed to be called on head pages only. Since we also test for
      PageCompound(), and it contains PageTail() and PageHead(), we can
      simplify the check by leaving just PageCompound() and therefore avoid
      possible VM_BUG_ON_PAGE.
      
      Fixes: 2e5bbb54 ("KVM: PPC: Book3S HV: Migrate pinned pages out of CMA")
      Cc: stable@vger.kernel.org # v4.9+
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mmap: Any hint > 128TB searches the full VA space · 321f7d29
      Aneesh Kumar K.V committed
      As part of the new large address space support, processes start out life with a
      128TB virtual address space. However when calling mmap() a process can pass a
      hint address, and if that hint is > 128TB the kernel will use the full 512TB
      address space to try and satisfy the mmap() request.
      
      Currently we have a check that the hint is > 128TB and < 512TB (TASK_SIZE),
      which was added as an optimisation to avoid updating addr_limit unnecessarily
      and also to avoid calling slice_flush_segments() on all CPUs more than
      necessary.
      
      However this has the user-visible side effect that an mmap() hint above 512TB
      does not search the full address space unless a preceding mmap() used a hint
      value > 128TB && < 512TB.
      
      So fix it to treat any hint above 128TB as a hint to search the full address
      space. Instead of checking the hint against TASK_SIZE, we check whether the
      addr_limit is already == TASK_SIZE.
      
      This also brings the ABI in line with what is proposed on x86, i.e. that a hint
      address above 128TB up to and including (2^64)-1 is an indication to search the
      full address space.
      
      Fixes: f4ea6dcb (powerpc/mm: Enable mappings above 128TB)
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
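The difference between the old and new checks described in this commit can be sketched as a small userspace model. This is not the kernel code: `widen_old`, `widen_new` and the constant names are hypothetical, and the model only captures the decision "should this mmap() hint widen the search to the full address space", given the current addr_limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TB             (UINT64_C(1) << 40)
#define DEFAULT_LIMIT  (128 * TB)   /* window a process starts with */
#define FULL_TASK_SIZE (512 * TB)   /* stands in for TASK_SIZE */

/* Old behaviour: only hints strictly between 128TB and 512TB widen. */
static bool widen_old(uint64_t hint, uint64_t addr_limit)
{
	assert(addr_limit <= FULL_TASK_SIZE);
	return hint > DEFAULT_LIMIT && hint < FULL_TASK_SIZE;
}

/*
 * New behaviour: any hint above 128TB widens, and the update is skipped
 * only when addr_limit has already been raised to the full size.
 */
static bool widen_new(uint64_t hint, uint64_t addr_limit)
{
	assert(addr_limit <= FULL_TASK_SIZE);
	return hint > DEFAULT_LIMIT && addr_limit != FULL_TASK_SIZE;
}
```

In this model a first mmap() with a 600TB hint is ignored by the old check but widens the search under the new one, which matches the user-visible change the commit message describes. Once addr_limit equals the full size, the new check returns false, preserving the optimisation of not updating it again.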
    • powerpc/mm/radix: Use mm->task_size for boundary checking instead of addr_limit · be77e999
      Aneesh Kumar K.V committed
      We don't init addr_limit correctly for 32 bit applications. So default to using
      mm->task_size for boundary condition checking. We use addr_limit to only control
      free space search. This makes sure that we do the right thing with 32 bit
      applications.
      
      We should consolidate the usage of TASK_SIZE/mm->task_size and
      mm->context.addr_limit later.
      
      This partially reverts commit fbfef902 (powerpc/mm: Switch some
      TASK_SIZE checks to use mm_context addr_limit).
      
      Fixes: fbfef902 ("powerpc/mm: Switch some TASK_SIZE checks to use mm_context addr_limit")
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  10. 13 Apr, 2017 1 commit
  11. 12 Apr, 2017 3 commits
  12. 11 Apr, 2017 5 commits
  13. 05 Apr, 2017 1 commit
  14. 04 Apr, 2017 1 commit
    • powerpc/powernv: Introduce address translation services for Nvlink2 · 1ab66d1f
      Alistair Popple committed
      Nvlink2 supports address translation services (ATS), allowing devices
      to request address translations from an MMU known as the nest MMU,
      which is set up to walk the CPU page tables.
      
      To access this functionality, certain firmware calls are required to
      set up and manage hardware context tables in the nvlink processing unit
      (NPU). The NPU also manages forwarding of TLB invalidates (known as
      address translation shootdowns/ATSDs) to attached devices.
      
      This patch exports several methods to allow device drivers to register
      a process id (PASID/PID) in the hardware tables and to receive
      notification of when a device should stop issuing address translation
      requests (ATRs). It also adds a fault handler to allow device drivers
      to demand fault pages in.
      Signed-off-by: Alistair Popple <alistair@popple.id.au>
      [mpe: Fix up comment formatting, use flush_tlb_mm()]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  15. 03 Apr, 2017 2 commits
    • powerpc/mm: Remove stale comment about the DART hole · f6f9195b
      Oliver O'Halloran committed
      The code to fix the problem it describes was removed in commit
      c40785ad ("powerpc/dart: Use a cachable DART"), and it uses the
      stupid comment style. Away it goooooooooooooes!
      Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Avoid taking a data miss on every userspace instruction miss · a7a9dcd8
      Anton Blanchard committed
      Early on in do_page_fault() we call store_updates_sp(), regardless of
      the type of exception. For an instruction miss this doesn't make
      sense, because we only use this information to detect if a data miss
      is the result of a stack expansion instruction or not.
      
      Worse still, it results in a data miss within every userspace
      instruction miss handler, because we try and load the very instruction
      we are about to install a pte for!
      
      A simple exec microbenchmark runs 6% faster on POWER8 with this fix:
      
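      #include <stdio.h>
      #include <stdlib.h>
      #include <unistd.h>
      
      int main(int argc, char *argv[])
      {
      	unsigned long left;
      	char leftstr[32];
      
      	/* Expect the remaining iteration count as the only argument */
      	if (argc < 2)
      		return 1;
      
      	left = strtoul(argv[1], NULL, 10);
      	if (left-- == 0)
      		return 0;
      
      	/* Re-exec ourselves with the decremented count */
      	snprintf(leftstr, sizeof(leftstr), "%lu", left);
      	execlp(argv[0], argv[0], leftstr, (char *)NULL);
      	perror("exec failed");
      
      	return 1;
      }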
      
      Pass the number of iterations on the command line (eg 10000) and time
      how long it takes to execute.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  16. 01 Apr, 2017 3 commits
    • powerpc/mm: Enable mappings above 128TB · f4ea6dcb
      Aneesh Kumar K.V committed
      Not all user space applications are ready to handle wide addresses. It's
      known that at least some JIT compilers use higher bits in pointers to
      encode their information. This collides with valid pointers in the 512TB
      address space and leads to crashes.
      
      To mitigate this, we are not going to allocate virtual address space
      above 128TB by default.
      
      But userspace can ask for allocation from the full address space by
      specifying a hint address (with or without MAP_FIXED) above 128TB.
      
      If the hint address is set above 128TB but MAP_FIXED is not specified, we
      try to look for an unmapped area at the specified address. If it's already
      occupied, we look for an unmapped area in the *full* address space, rather
      than in the 128TB window.
      
      This approach makes it easy to make an application's memory allocator aware
      of the large address space without manually tracking allocated virtual
      address space.
      
      This is going to be a per-mmap decision, i.e. we can have some mmaps with
      larger addresses and others that do not.
      
      A sample memory layout looks like:
      
        10000000-10010000 r-xp 00000000 fc:00 9057045          /home/max_addr_512TB
        10010000-10020000 r--p 00000000 fc:00 9057045          /home/max_addr_512TB
        10020000-10030000 rw-p 00010000 fc:00 9057045          /home/max_addr_512TB
        10029630000-10029660000 rw-p 00000000 00:00 0          [heap]
        7fff834a0000-7fff834b0000 rw-p 00000000 00:00 0
        7fff834b0000-7fff83670000 r-xp 00000000 fc:00 9177190  /lib/powerpc64le-linux-gnu/libc-2.23.so
        7fff83670000-7fff83680000 r--p 001b0000 fc:00 9177190  /lib/powerpc64le-linux-gnu/libc-2.23.so
        7fff83680000-7fff83690000 rw-p 001c0000 fc:00 9177190  /lib/powerpc64le-linux-gnu/libc-2.23.so
        7fff83690000-7fff836a0000 rw-p 00000000 00:00 0
        7fff836a0000-7fff836c0000 r-xp 00000000 00:00 0        [vdso]
        7fff836c0000-7fff83700000 r-xp 00000000 fc:00 9177193  /lib/powerpc64le-linux-gnu/ld-2.23.so
        7fff83700000-7fff83710000 r--p 00030000 fc:00 9177193  /lib/powerpc64le-linux-gnu/ld-2.23.so
        7fff83710000-7fff83720000 rw-p 00040000 fc:00 9177193  /lib/powerpc64le-linux-gnu/ld-2.23.so
        7fffdccf0000-7fffdcd20000 rw-p 00000000 00:00 0        [stack]
        1000000000000-1000000010000 rw-p 00000000 00:00 0
        1ffff83710000-1ffff83720000 rw-p 00000000 00:00 0
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
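From the application side, opting in to the larger address space as described in this commit amounts to passing a hint above 128TB to mmap(). The following is a minimal sketch assuming a 64-bit Linux system. The helper names and the choice of 256TB as the hint are illustrative assumptions, not taken from the commit.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Arbitrary hint at 256TB, above the default 128TB window (64-bit only). */
#define HINT_ABOVE_128TB ((void *)(uintptr_t)(UINT64_C(1) << 48))

/*
 * Request an anonymous private mapping with a high hint and without
 * MAP_FIXED. Returns MAP_FAILED on error, like mmap() itself.
 */
static void *map_with_high_hint(size_t len)
{
	assert(len > 0);
	return mmap(HINT_ABOVE_128TB, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* True if addr lies at or above the 128TB boundary. */
static int is_above_128tb(const void *addr)
{
	return (uintptr_t)addr >= (uintptr_t)(UINT64_C(1) << 47);
}
```

On a kernel with this support, the returned address may lie above 128TB, which `is_above_128tb()` can confirm. On other systems the hint is only advisory, so the mapping is expected to land in the default window instead. A caller should check for MAP_FAILED in either case.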
    • powerpc/pseries: Skip using reserved virtual address range · 82228e36
      Aneesh Kumar K.V committed
      Now that we use all of the available virtual address range, we need to make
      sure we don't generate a VSID that overlaps with the reserved VSID
      range. The reserved VSID range includes the virtual address range used by
      the adjunct partition and also the VRMA virtual segment. We find the context
      value that can result in generating such a VSID and reserve it early in
      boot.
      
      We don't look at the adjunct range, because for now we disable
      adjunct usage in a Linux LPAR via the CAS interface.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      [mpe: Rewrite hash__reserve_context_id(), move the rest into pseries]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>