1. 05 12月, 2014 1 次提交
    • A
      powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault · aefa5688
      Aneesh Kumar K.V 提交于
      upatepp can get called for a nohpte fault when we find from the linux
      page table that the translation was hashed before. In that case
      we are sure that there is no existing translation, hence we could
      avoid doing tlbie.
      
      We could possibly race with a parallel fault filling the TLB. But
      that should be ok because updatepp is only ever relaxing permissions.
      We also look at linux pte permission bits when filling hash pte
      permission bits. We also hold the linux pte busy bits while
      inserting/updating a hashpte entry, hence a paralle update of
      linux pte is not possible. On the other hand mprotect involves
      ptep_modify_prot_start which cause a hpte invalidate and not updatepp.
      
      Performance number:
      We use randbox_access_bench written by Anton.
      
      Kernel with THP disabled and smaller hash page table size.
      
          86.60%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_updatepp
           2.10%  random_access_b  random_access_bench              [.] doit
           1.99%  random_access_b  [kernel.kallsyms]                [k] .do_raw_spin_lock
           1.85%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
           1.26%  random_access_b  [kernel.kallsyms]                [k] .native_flush_hash_range
           1.18%  random_access_b  [kernel.kallsyms]                [k] .__delay
           0.69%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
           0.37%  random_access_b  [kernel.kallsyms]                [k] .clear_user_page
           0.34%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
           0.32%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
           0.30%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm
      
      With Fix:
      
          27.54%  random_access_b  random_access_bench              [.] doit
          22.90%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
           5.76%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
           5.20%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
           5.12%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
           4.80%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm
           3.31%  random_access_b  [kernel.kallsyms]                [k] data_access_common
           1.84%  random_access_b  [kernel.kallsyms]                [k] .trace_hardirqs_on_caller
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      aefa5688
  2. 02 12月, 2014 2 次提交
  3. 14 11月, 2014 1 次提交
  4. 10 11月, 2014 3 次提交
  5. 13 8月, 2014 3 次提交
  6. 05 8月, 2014 1 次提交
  7. 24 3月, 2014 1 次提交
  8. 17 2月, 2014 1 次提交
  9. 15 1月, 2014 1 次提交
  10. 10 1月, 2014 1 次提交
    • S
      powerpc: add barrier after writing kernel PTE · 47ce8af4
      Scott Wood 提交于
      There is no barrier between something like ioremap() writing to
      a PTE, and returning the value to a caller that may then store the
      pointer in a place that is visible to other CPUs.  Such callers
      generally don't perform barriers of their own.
      
      Even if callers of ioremap() and similar things did use barriers,
      the most logical choise would be smp_wmb(), which is not
      architecturally sufficient when BookE hardware tablewalk is used.  A
      full sync is specified by the architecture.
      
      For userspace mappings, OTOH, we generally already have an lwsync due
      to locking, and if we occasionally take a spurious fault due to not
      having a full sync with hardware tablewalk, it will not be fatal
      because we will retry rather than oops.
      Signed-off-by: NScott Wood <scottwood@freescale.com>
      47ce8af4
  11. 09 12月, 2013 1 次提交
  12. 15 11月, 2013 1 次提交
  13. 21 6月, 2013 3 次提交
  14. 30 4月, 2013 1 次提交
    • A
      powerpc: Reduce PTE table memory wastage · 5c1f6ee9
      Aneesh Kumar K.V 提交于
      We allocate one page for the last level of linux page table. With THP and
      large page size of 16MB, that would mean we are wasting large part
      of that page. To map 16MB area, we only need a PTE space of 2K with 64K
      page size. This patch reduce the space wastage by sharing the page
      allocated for the last level of linux page table with multiple pmd
      entries. We call these smaller chunks PTE page fragments and allocated
      page, PTE page.
      
      In order to support systems which doesn't have 64K HPTE support, we also
      add another 2K to PTE page fragment. The second half of the PTE fragments
      is used for storing slot and secondary bit information of an HPTE. With this
      we now have a 4K PTE fragment.
      
      We use a simple approach to share the PTE page. On allocation, we bump the
      PTE page refcount to 16 and share the PTE page with the next 16 pte alloc
      request. This should help in the node locality of the PTE page fragment,
      assuming that the immediate pte alloc request will mostly come from the
      same NUMA node. We don't try to reuse the freed PTE page fragment. Hence
      we could be waisting some space.
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      5c1f6ee9
  15. 17 3月, 2013 1 次提交
  16. 17 9月, 2012 1 次提交
  17. 05 9月, 2012 1 次提交
  18. 29 3月, 2012 1 次提交
  19. 01 11月, 2011 1 次提交
  20. 19 5月, 2011 2 次提交
  21. 09 12月, 2010 1 次提交
  22. 14 7月, 2010 1 次提交
  23. 07 4月, 2010 1 次提交
  24. 30 3月, 2010 1 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  25. 20 8月, 2009 2 次提交
  26. 11 3月, 2009 1 次提交
  27. 25 7月, 2008 1 次提交
    • B
      powerpc ioremap_prot · a1f242ff
      Benjamin Herrenschmidt 提交于
      This adds ioremap_prot and pte_pgprot() so that one can extract protection
      bits from a PTE and use them to ioremap_prot() (in order to support ptrace
      of VM_IO | VM_PFNMAP as per Rik's patch).
      
      This moves a couple of flag checks around in the ioremap implementations
      of arch/powerpc.  There's a side effect of allowing non-cacheable and
      non-guarded mappings on ppc32 which before would always have _PAGE_GUARDED
      set whenever _PAGE_NO_CACHE is.
      
      (standard ioremap will still set _PAGE_GUARDED, but ioremap_prot will be
      capable of setting such a non guarded mapping).
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NRik van Riel <riel@redhat.com>
      Cc: Dave Airlie <airlied@linux.ie>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a1f242ff
  28. 12 10月, 2007 1 次提交
    • P
      [POWERPC] Use 1TB segments · 1189be65
      Paul Mackerras 提交于
      This makes the kernel use 1TB segments for all kernel mappings and for
      user addresses of 1TB and above, on machines which support them
      (currently POWER5+, POWER6 and PA6T).
      
      We detect that the machine supports 1TB segments by looking at the
      ibm,processor-segment-sizes property in the device tree.
      
      We don't currently use 1TB segments for user addresses < 1T, since
      that would effectively prevent 32-bit processes from using huge pages
      unless we also had a way to revert to using 256MB segments.  That
      would be possible but would involve extra complications (such as
      keeping track of which segment size was used when HPTEs were inserted)
      and is not addressed here.
      
      Parts of this patch were originally written by Ben Herrenschmidt.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      1189be65
  29. 13 9月, 2007 1 次提交
  30. 14 6月, 2007 2 次提交
    • D
      [POWERPC] Remove the dregs of APUS support from arch/powerpc · f21f49ea
      David Gibson 提交于
      APUS (the Amiga Power-Up System) is not supported under arch/powerpc
      and it's unlikely it ever will be.  Therefore, this patch removes the
      fragments of APUS support code from arch/powerpc which have been
      copied from arch/ppc.
      
      A few APUS references are left in asm-powerpc in .h files which are
      still used from arch/ppc.
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      f21f49ea
    • B
      [POWERPC] Rewrite IO allocation & mapping on powerpc64 · 3d5134ee
      Benjamin Herrenschmidt 提交于
      This rewrites pretty much from scratch the handling of MMIO and PIO
      space allocations on powerpc64.  The main goals are:
      
       - Get rid of imalloc and use more common code where possible
       - Simplify the current mess so that PIO space is allocated and
         mapped in a single place for PCI bridges
       - Handle allocation constraints of PIO for all bridges including
         hot plugged ones within the 2GB space reserved for IO ports,
         so that devices on hotplugged busses will now work with drivers
         that assume IO ports fit in an int.
       - Cleanup and separate tracking of the ISA space in the reserved
         low 64K of IO space. No ISA -> Nothing mapped there.
      
      I booted a cell blade with IDE on PIO and MMIO and a dual G5 so
      far, that's it :-)
      
      With this patch, all allocations are done using the code in
      mm/vmalloc.c, though we use the low level __get_vm_area with
      explicit start/stop constraints in order to manage separate
      areas for vmalloc/vmap, ioremap, and PCI IOs.
      
      This greatly simplifies a lot of things, as you can see in the
      diffstat of that patch :-)
      
      A new pair of functions pcibios_map/unmap_io_space() now replace
      all of the previous code that used to manipulate PCI IOs space.
      The allocation is done at mapping time, which is now called from
      scan_phb's, just before the devices are probed (instead of after,
      which is by itself a bug fix). The only other caller is the PCI
      hotplug code for hot adding PCI-PCI bridges (slots).
      
      imalloc is gone, as is the "sub-allocation" thing, but I do beleive
      that hotplug should still work in the sense that the space allocation
      is always done by the PHB, but if you unmap a child bus of this PHB
      (which seems to be possible), then the code should properly tear
      down all the HPTE mappings for that area of the PHB allocated IO space.
      
      I now always reserve the first 64K of IO space for the bridge with
      the ISA bus on it. I have moved the code for tracking ISA in a separate
      file which should also make it smarter if we ever are capable of
      hot unplugging or re-plugging an ISA bridge.
      
      This should have a side effect on platforms like powermac where VGA IOs
      will no longer work. This is done on purpose though as they would have
      worked semi-randomly before. The idea at this point is to isolate drivers
      that might need to access those and fix them by providing a proper
      function to obtain an offset to the legacy IOs of a given bus.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      3d5134ee