1. 14 Dec 2015, 2 commits
  2. 08 Aug 2015, 1 commit
  3. 25 Jun 2015, 3 commits
  4. 12 May 2015, 1 commit
    • powerpc/thp: Serialize pmd clear against a linux page table walk. · 13bd817b
      Authored by Aneesh Kumar K.V
      Serialize against find_linux_pte_or_hugepte(), which does a lock-less
      lookup in the page tables with local interrupts disabled. For huge
      pages it casts pmd_t to pte_t. Since the format of pte_t differs from
      pmd_t, we want to prevent a transition from a pmd pointing to a page
      table to a pmd pointing to a huge page (and back) while interrupts
      are disabled. Several code paths clear the pmd so they can possibly
      replace it with a page table pointer; such a clear must wait for any
      parallel find_linux_pte_or_hugepte() to finish.
      
      Without this patch, a find_linux_pte_or_hugepte() running in parallel
      with __split_huge_zero_page_pmd(), do_huge_pmd_wp_page_fallback() or
      zap_huge_pmd() can run into the above issue. With
      __split_huge_zero_page_pmd() and do_huge_pmd_wp_page_fallback() we
      clear the hugepage pte before inserting the pmd entry with a regular
      pgtable address; such a clear needs to wait for the parallel
      find_linux_pte_or_hugepte() to finish.
      
      With zap_huge_pmd(), we can hit the issue when a hugepage pte is
      zapped by MADV_DONTNEED while another CPU faults the range in as
      small pages. (A sketch of the serialization follows this entry.)
      Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
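      A minimal sketch of the serialization idea described above, assuming
      the pmdp_clear_flush() shape of that era; kick_all_cpus_sync() is a
      real kernel primitive, but the rest of the body is illustrative:

          /*
           * Lock-less walkers (find_linux_pte_or_hugepte()) run with local
           * interrupts disabled. kick_all_cpus_sync() sends an IPI to every
           * CPU and waits for it to be serviced; a CPU can only service the
           * IPI once interrupts are back on, i.e. once any in-flight
           * lock-less walk that could have seen the old pmd has finished.
           */
          pmd_t pmdp_clear_flush(struct vm_area_struct *vma, unsigned long addr,
                                 pmd_t *pmdp)
          {
                  pmd_t pmd = *pmdp;

                  pmd_clear(pmdp);
                  flush_tlb_range(vma, addr, addr + HPAGE_PMD_SIZE);
                  /* Wait for parallel find_linux_pte_or_hugepte() walkers. */
                  kick_all_cpus_sync();
                  return pmd;
          }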
  5. 10 Apr 2015, 2 commits
  6. 17 Feb 2015, 1 commit
  7. 13 Feb 2015, 1 commit
  8. 05 Dec 2014, 1 commit
    • powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault · aefa5688
      Authored by Aneesh Kumar K.V
      updatepp can get called for a no-HPTE fault when we find from the
      linux page table that the translation was hashed before. In that case
      we are sure that there is no existing translation in the TLB, hence
      we can avoid doing the tlbie.
      
      We could possibly race with a parallel fault filling the TLB, but
      that should be OK because updatepp only ever relaxes permissions, and
      the hash pte permission bits are derived from the linux pte bits. We
      also hold the linux pte busy bits while inserting/updating a hashpte
      entry, so a parallel update of the linux pte is not possible.
      mprotect, on the other hand, goes through ptep_modify_prot_start,
      which causes an hpte invalidate rather than an updatepp. (A sketch of
      the resulting flag check follows this entry.)
      
      Performance numbers:
      We use random_access_bench, written by Anton.
      
      Kernel with THP disabled and a smaller hash page table size, before the fix:
      
          86.60%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_updatepp
           2.10%  random_access_b  random_access_bench              [.] doit
           1.99%  random_access_b  [kernel.kallsyms]                [k] .do_raw_spin_lock
           1.85%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
           1.26%  random_access_b  [kernel.kallsyms]                [k] .native_flush_hash_range
           1.18%  random_access_b  [kernel.kallsyms]                [k] .__delay
           0.69%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
           0.37%  random_access_b  [kernel.kallsyms]                [k] .clear_user_page
           0.34%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
           0.32%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
           0.30%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm
      
      With Fix:
      
          27.54%  random_access_b  random_access_bench              [.] doit
          22.90%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
           5.76%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
           5.20%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
           5.12%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
           4.80%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm
           3.31%  random_access_b  [kernel.kallsyms]                [k] data_access_common
           1.84%  random_access_b  [kernel.kallsyms]                [k] .trace_hardirqs_on_caller
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
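      A minimal sketch of the idea, using the flag-based interface the
      patch moves to; the helper name hpte_decide_tlbie() is hypothetical,
      while the HPTE_LOCAL_UPDATE/HPTE_NOHPTE_UPDATE flags follow the
      patch's approach:

          #define HPTE_LOCAL_UPDATE   0x1  /* use local tlbiel, not tlbie */
          #define HPTE_NOHPTE_UPDATE  0x2  /* fault said no HPTE existed  */

          /* Invalidate the old translation only when one could exist. */
          static void hpte_decide_tlbie(unsigned long vpn, int psize,
                                        int apsize, int ssize,
                                        unsigned long flags)
          {
                  if (flags & HPTE_NOHPTE_UPDATE)
                          return;  /* nothing was ever in the TLB; skip it */
                  tlbie(vpn, psize, apsize, ssize,
                        flags & HPTE_LOCAL_UPDATE);
          }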
  9. 02 Dec 2014, 2 commits
  10. 14 Nov 2014, 1 commit
  11. 10 Nov 2014, 3 commits
  12. 13 Aug 2014, 3 commits
  13. 05 Aug 2014, 1 commit
  14. 24 Mar 2014, 1 commit
  15. 17 Feb 2014, 1 commit
  16. 15 Jan 2014, 1 commit
  17. 10 Jan 2014, 1 commit
    • powerpc: add barrier after writing kernel PTE · 47ce8af4
      Authored by Scott Wood
      There is no barrier between something like ioremap() writing to
      a PTE, and returning the value to a caller that may then store the
      pointer in a place that is visible to other CPUs.  Such callers
      generally don't perform barriers of their own.
      
      Even if callers of ioremap() and similar things did use barriers, the
      most logical choice would be smp_wmb(), which is not architecturally
      sufficient when BookE hardware tablewalk is used. A full sync is
      specified by the architecture. (A sketch follows this entry.)
      
      For userspace mappings, on the other hand, we generally already have
      an lwsync due to locking, and if we occasionally take a spurious
      fault for lack of a full sync with hardware tablewalk, it is not
      fatal: we retry rather than oops.
      Signed-off-by: Scott Wood <scottwood@freescale.com>
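      A minimal sketch of the ordering requirement; publish_kernel_pte() is
      a hypothetical helper, and mb() expands to a full sync on powerpc:

          /* Write a kernel PTE, then publish it to other CPUs. */
          static void publish_kernel_pte(pte_t *ptep, pte_t pte)
          {
                  *ptep = pte;
                  /*
                   * Full sync, not smp_wmb(): a BookE hardware tablewalk
                   * is not an ordinary load by another CPU, and the
                   * architecture specifies a full sync before the new PTE
                   * is guaranteed visible to it.
                   */
                  mb();
          }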
  18. 09 Dec 2013, 1 commit
  19. 15 Nov 2013, 1 commit
  20. 21 Jun 2013, 3 commits
  21. 30 Apr 2013, 1 commit
    • powerpc: Reduce PTE table memory wastage · 5c1f6ee9
      Authored by Aneesh Kumar K.V
      We allocate one page for the last level of the linux page table. With
      THP and a large page size of 16MB, a large part of that page is
      wasted: to map a 16MB area with a 64K base page size we only need 2K
      of PTE space. This patch reduces the wastage by sharing the page
      allocated for the last level of the linux page table among multiple
      pmd entries. We call the smaller chunks PTE page fragments, and the
      allocated page the PTE page.
      
      To support systems which don't have 64K HPTE support, we add another
      2K to each PTE page fragment; the second half of the fragment stores
      the slot and secondary-bit information of an HPTE. With this we now
      have a 4K PTE fragment.
      
      We use a simple approach to share the PTE page: on allocation, we
      bump the PTE page refcount to 16 and hand the page out to the next 16
      pte alloc requests. This should help the NUMA locality of the PTE
      page fragments, assuming that those subsequent pte alloc requests
      mostly come from the same node. We don't try to reuse freed PTE page
      fragments, so we can still waste some space. (A sketch of the
      allocator follows this entry.)
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
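      A minimal sketch of the fragment scheme described above; the names
      (pte_frag_alloc(), mm->context.pte_frag, PTE_FRAG_SIZE) are
      illustrative stand-ins for the real powerpc implementation:

          #define PTE_FRAG_SIZE  4096  /* 2K of PTEs + 2K of HPTE slot info */
          #define PTE_FRAG_NR    16    /* fragments carved from a 64K page  */

          /* Hand out the next 4K fragment of the current PTE page, if any. */
          static void *pte_frag_alloc(struct mm_struct *mm)
          {
                  void *frag;

                  spin_lock(&mm->page_table_lock);
                  frag = mm->context.pte_frag;
                  if (frag) {
                          void *next = frag + PTE_FRAG_SIZE;

                          /* With 64K pages, the 16th fragment ends the page. */
                          if (((unsigned long)next & ~PAGE_MASK) == 0)
                                  next = NULL;
                          mm->context.pte_frag = next;
                  }
                  spin_unlock(&mm->page_table_lock);
                  return frag;  /* NULL: allocate a fresh page, refcount 16 */
          }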
  22. 17 Mar 2013, 1 commit
  23. 17 Sep 2012, 1 commit
  24. 05 Sep 2012, 1 commit
  25. 29 Mar 2012, 1 commit
  26. 01 Nov 2011, 1 commit
  27. 19 May 2011, 2 commits
  28. 09 Dec 2010, 1 commit