1. 07 Jul, 2017 1 commit
  2. 04 Jul, 2017 1 commit
  3. 02 Jul, 2017 1 commit
  4. 08 Jun, 2017 1 commit
  5. 05 Jun, 2017 2 commits
  6. 09 May, 2017 1 commit
    • M
      powerpc/mm/book3s/64: Rework page table geometry for lower memory usage · ba95b5d0
      Michael Ellerman committed
      Recently in commit f6eedbba ("powerpc/mm/hash: Increase VA range to 128TB")
      we increased the virtual address space for user processes to 128TB by default,
      and up to 512TB if user space opts in.
      
      This obviously required expanding the range of the Linux page tables. For Book3s
      64-bit using hash and with PAGE_SIZE=64K, we increased the PGD to 2^15 entries.
      This meant we could cover the full address range, while still being able to
      insert a 16G hugepage at the PGD level and a 16M hugepage in the PMD.
      
      The downside of that geometry is that it uses a lot of memory for the PGD, and
      in particular makes the PGD a 4-page allocation, which means it's much more
      likely to fail under memory pressure.
      
      Instead we can make the PMD larger, so that a single PUD entry maps 16G,
      allowing the 16G hugepages to sit at that level in the tree. We're then able to
      split the remaining bits between the PUD and PGD. We make the PGD slightly
      larger as that results in lower memory usage for typical programs.
      
      When THP is enabled the PMD actually doubles in size, to 2^11 entries, or 2^14
      bytes, which is large but still < PAGE_SIZE.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: Balbir Singh <bsingharora@gmail.com>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      ba95b5d0
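The geometry arithmetic in this commit message can be checked with a short sketch. It is not kernel code: it assumes 8-byte page table entries (the message does not state the entry size) and uses only the figures quoted above (64K pages, 16M and 16G hugepages, a 512TB maximum range). The helper names are made up for illustration.

```python
# Sanity check of the numbers quoted in the commit message.
# Assumption: each page table entry is 8 bytes (not stated in the message).
PAGE_SHIFT = 16          # PAGE_SIZE = 64K
ENTRY_BYTES = 8          # assumed entry size

MB, GB, TB = 1 << 20, 1 << 30, 1 << 40


def log2_exact(n):
    """Return log2(n), insisting that n is a power of two."""
    assert n > 0 and n & (n - 1) == 0
    return n.bit_length() - 1


def table_bytes(index_bits):
    """Size in bytes of a table with 2^index_bits entries."""
    return (1 << index_bits) * ENTRY_BYTES


# Old geometry: a PGD with 2^15 entries spans several 64K pages.
old_pgd_pages = table_bytes(15) // (1 << PAGE_SHIFT)

# New geometry: one PMD entry maps 16M, one PUD entry maps 16G.
pte_bits = log2_exact(16 * MB) - PAGE_SHIFT
pmd_bits = log2_exact(16 * GB) - log2_exact(16 * MB)

# Bits left to split between PUD and PGD for the 512TB opt-in range.
remaining_bits = log2_exact(512 * TB) - log2_exact(16 * GB)

# With THP the PMD doubles in size.
thp_pmd_bytes = table_bytes(pmd_bits + 1)
```

Under these assumptions the old PGD works out to a 4-page allocation, the PMD needs 10 index bits, 15 bits remain for the PUD and PGD together, and the doubled THP PMD is 2^14 bytes, which is below the 64K page size.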
  7. 28 Apr, 2017 1 commit
  8. 27 Apr, 2017 1 commit
  9. 12 Apr, 2017 1 commit
    • M
      powerpc/mm: Fix swapper_pg_dir size on 64-bit hash w/64K pages · 03dfee6d
      Michael Ellerman committed
      Recently in commit f6eedbba ("powerpc/mm/hash: Increase VA range to 128TB"),
      we increased H_PGD_INDEX_SIZE to 15 when we're building with 64K pages. This
      makes it larger than RADIX_PGD_INDEX_SIZE (13), which means the logic to
      calculate MAX_PGD_INDEX_SIZE in book3s/64/pgtable.h is wrong.
      
      The end result is that the PGD (Page Global Directory, ie top level page table)
      of the kernel (aka. swapper_pg_dir), is too small.
      
      This generally doesn't lead to a crash, as we don't use the full range in normal
      operation. However if we try to dump the kernel pagetables we can trigger a
      crash because we walk off the end of the pgd into other memory and eventually
      try to dereference something bogus:
      
        $ cat /sys/kernel/debug/kernel_pagetables
        Unable to handle kernel paging request for data at address 0xe8fece0000000000
        Faulting instruction address: 0xc000000000072314
        cpu 0xc: Vector: 380 (Data SLB Access) at [c0000000daa13890]
            pc: c000000000072314: ptdump_show+0x164/0x430
            lr: c000000000072550: ptdump_show+0x3a0/0x430
           dar: e802cf0000000000
        seq_read+0xf8/0x560
        full_proxy_read+0x84/0xc0
        __vfs_read+0x6c/0x1d0
        vfs_read+0xbc/0x1b0
        SyS_read+0x6c/0x110
        system_call+0x38/0xfc
      
      The root cause is that MAX_PGD_INDEX_SIZE isn't actually computed to be
      the max of H_PGD_INDEX_SIZE and RADIX_PGD_INDEX_SIZE. To fix that, move
      the calculation into asm-offsets.c where we can do it easily using
      max().
      
      Fixes: f6eedbba ("powerpc/mm/hash: Increase VA range to 128TB")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      03dfee6d
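To see why the wrong maximum matters, here is a minimal sketch using the two index sizes quoted in the message. The message does not say which value the broken logic produced; the sketch assumes, for illustration only, that the radix size was used.

```python
# Index sizes quoted in the commit message (64K pages).
H_PGD_INDEX_SIZE = 15
RADIX_PGD_INDEX_SIZE = 13

# What the fix computes: the larger of the two index sizes.
max_pgd_index_size = max(H_PGD_INDEX_SIZE, RADIX_PGD_INDEX_SIZE)

# Entries the hash MMU may index versus entries actually reserved if the
# table were sized from the radix value (an assumption for illustration).
entries_needed = 1 << H_PGD_INDEX_SIZE
entries_if_radix_sized = 1 << RADIX_PGD_INDEX_SIZE

# How many times too small the kernel PGD would be in that case.
shortfall_factor = entries_needed // entries_if_radix_sized
```

With these values a table sized from the radix index would hold a quarter of the entries a full walk can touch, which is consistent with the page table dump running off the end of the PGD.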
  10. 04 Apr, 2017 1 commit
    • A
      powerpc/powernv: Introduce address translation services for Nvlink2 · 1ab66d1f
      Alistair Popple committed
      Nvlink2 supports address translation services (ATS) allowing devices
      to request address translations from an MMU known as the nest MMU
      which is set up to walk the CPU page tables.
      
      To access this functionality certain firmware calls are required to
      set up and manage hardware context tables in the nvlink processing unit
      (NPU). The NPU also manages forwarding of TLB invalidates (known as
      address translation shootdowns/ATSDs) to attached devices.
      
      This patch exports several methods to allow device drivers to register
      a process id (PASID/PID) in the hardware tables and to receive
      notification of when a device should stop issuing address translation
      requests (ATRs). It also adds a fault handler to allow device drivers
      to demand fault pages in.
      Signed-off-by: Alistair Popple <alistair@popple.id.au>
      [mpe: Fix up comment formatting, use flush_tlb_mm()]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      1ab66d1f
  11. 01 Apr, 2017 2 commits
  12. 31 Mar, 2017 12 commits
  13. 10 Mar, 2017 3 commits
  14. 01 Mar, 2017 1 commit
    • P
      KVM: PPC: Book3S HV: Fix software walk of guest process page tables · 70cd4c10
      Paul Mackerras committed
      This fixes some bugs in the code that walks the guest's page tables.
      These bugs cause MMIO emulation to fail whenever the guest is in
      virtual mode (MMU on), leading to the guest hanging if it tried to
      access a virtio device.
      
      The first bug was that when reading the guest's process table, we were
      using the whole of arch->process_table, not just the field that contains
      the process table base address.  The second bug was that the mask used
      when reading the process table entry to get the radix tree base address,
      RPDB_MASK, had the wrong value.
      
      Fixes: 9e04ba69 ("KVM: PPC: Book3S HV: Add basic infrastructure for radix guests")
      Fixes: e9983344 ("powerpc/mm/radix: Add partition table format & callback")
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      70cd4c10
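The first bug described above, using a whole register value where only one field holds the base address, can be illustrated with a small sketch. The bit layout and mask below are hypothetical and chosen only for illustration; they are not the real process table or RPDB_MASK definitions, which the message does not give.

```python
# HYPOTHETICAL layout, for illustration only: low bits carry a size field,
# the bits covered by BASE_MASK carry a 4K-aligned table base address.
BASE_MASK = 0x0FFFFFFFFFFFF000   # hypothetical mask, not the kernel's value
SIZE_MASK = 0x1F                 # hypothetical size field


def table_base(reg_value):
    """Extract only the base-address field from the combined value."""
    return reg_value & BASE_MASK


def table_size_field(reg_value):
    """Extract only the size field from the combined value."""
    return reg_value & SIZE_MASK


# A combined value: base 0x12345000 with a size field of 4.
combined = 0x12345000 | 0x4

# Using the whole value as an address would be off by the size bits.
buggy_address = combined
correct_address = table_base(combined)
```

Here `buggy_address` differs from `correct_address` by the size bits, so any lookup that starts from it reads from the wrong location. A mask with the wrong value, the second bug, fails in the same way by letting stray bits into the address.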
  15. 28 Feb, 2017 1 commit
  16. 25 Feb, 2017 1 commit
  17. 15 Feb, 2017 3 commits
  18. 14 Feb, 2017 1 commit
  19. 10 Feb, 2017 1 commit
    • D
      powerpc/pseries: Add support for hash table resizing · dbcf929c
      David Gibson committed
      This adds support for using two hypercalls to change the size of the
      main hash page table while running as a PAPR guest. For now these
      hypercalls are only in experimental qemu versions.
      
      The interface is two part: first H_RESIZE_HPT_PREPARE is used to
      allocate and prepare the new hash table. This may be slow, but can be
      done asynchronously. Then, H_RESIZE_HPT_COMMIT is used to switch to the
      new hash table. This requires that no CPUs be concurrently updating the
      HPT, and so must be run under stop_machine().
      
      This also adds a debugfs file which can be used to manually control
      HPT resizing for testing purposes.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Paul Mackerras <paulus@samba.org>
      [mpe: Rename the debugfs file to "hpt_order"]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      dbcf929c
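The two-part interface described above can be modelled with a toy simulation. Nothing here is the real hypercall API: `FakeHypervisor`, its `"busy"`/`"ready"` results and `resize_hpt` are hypothetical names, and a plain lock stands in for the exclusion that stop_machine() provides in the kernel.

```python
import threading


class FakeHypervisor:
    """Toy stand-in for the hypervisor side of an HPT resize (hypothetical)."""

    def __init__(self, order, prepare_polls=3):
        self.order = order                  # current HPT size as log2(bytes)
        self._pending = None                # order being prepared, if any
        self._prepare_polls = prepare_polls
        self._polls_left = prepare_polls

    def prepare(self, new_order):
        """Model of the prepare step: slow, so it reports 'busy' for a while."""
        if self._pending != new_order:
            self._pending = new_order
            self._polls_left = self._prepare_polls
        if self._polls_left > 0:
            self._polls_left -= 1
            return "busy"
        return "ready"

    def commit(self, new_order):
        """Model of the commit step: only valid after prepare has finished."""
        if self._pending != new_order or self._polls_left > 0:
            raise RuntimeError("commit without a finished prepare")
        self.order = new_order
        self._pending = None


# Stand-in for stop_machine(): nobody may update the HPT during the switch.
hpt_update_lock = threading.Lock()


def resize_hpt(hv, new_order):
    """Poll prepare until ready, then commit with HPT updates excluded."""
    busy_polls = 0
    while hv.prepare(new_order) == "busy":
        busy_polls += 1                     # other work could run here
    with hpt_update_lock:
        hv.commit(new_order)
    return busy_polls
```

The split keeps the slow allocation outside the critical section, so only the short switch to the new table has to run with all other HPT updates stopped.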
  20. 31 Jan, 2017 4 commits