1. 30 Apr, 2013 (1 commit)
    • powerpc: Reduce PTE table memory wastage · 5c1f6ee9
      Committed by Aneesh Kumar K.V
      We allocate one page for the last level of the Linux page table. With THP
      and a large page size of 16MB, that means we waste a large part of that
      page: to map a 16MB area with a 64K page size, we only need 2K of PTE
      space. This patch reduces the wastage by sharing the page allocated for
      the last level of the Linux page table among multiple pmd entries. We call
      these smaller chunks PTE page fragments, and the allocated page a PTE page.
      
      To support systems that don't have 64K HPTE support, we also add another
      2K to the PTE page fragment. This second half of the fragment is used for
      storing the slot and secondary bit information of an HPTE. With this we
      now have a 4K PTE fragment.
      
      We use a simple approach to share the PTE page. On allocation, we bump the
      PTE page refcount to 16 and hand the page out to the next 16 pte alloc
      requests. This should help the NUMA node locality of the PTE page
      fragments, assuming that the subsequent pte alloc requests mostly come
      from the same node. We don't try to reuse freed PTE page fragments, so we
      could be wasting some space.
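      
      A rough sketch of that allocation path, with locking omitted; the names
      (pte_frag_alloc, PTE_FRAG_SIZE, PTE_FRAG_NR, context.pte_frag) are
      illustrative, not quoted from the patch:
      
        /* Sketch: carve a 64K page into sixteen 4K PTE fragments. */
        #define PTE_FRAG_SIZE  4096                          /* 2K PTEs + 2K HPTE info */
        #define PTE_FRAG_NR    (PAGE_SIZE / PTE_FRAG_SIZE)   /* 16 with 64K pages */
      
        static pte_t *pte_frag_alloc(struct mm_struct *mm)
        {
                char *frag = mm->context.pte_frag;    /* cursor into shared page */
      
                if (!frag) {
                        frag = (char *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
                        if (!frag)
                                return NULL;
                        /* Take all 16 references up front; each fragment free
                         * does a put_page(), releasing the page at zero. */
                        set_page_count(virt_to_page(frag), PTE_FRAG_NR);
                }
                /* Advance the cursor; NULL once the page is fully handed out. */
                mm->context.pte_frag =
                        ((unsigned long)(frag + PTE_FRAG_SIZE) & (PAGE_SIZE - 1)) ?
                        frag + PTE_FRAG_SIZE : NULL;
                return (pte_t *)frag;
        }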
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  2. 17 Mar, 2013 (1 commit)
    • powerpc: Update kernel VSID range · c60ac569
      Committed by Aneesh Kumar K.V
      This patch changes the kernel VSID range so that VSID_BITS is limited to
      37. This enables us to support 64TB with a 65-bit VA (37 + 28). Without
      this patch we get boot hangs on platforms that only support a 65-bit VA.
      
      With this patch, proto-VSIDs are now generated as follows:
      
      We first generate a 37-bit "proto-VSID". Proto-VSIDs are generated from
      the MMU context id and the effective segment id of the address.
      
      For user processes, the maximum context id is limited to ((1ul << 19) - 5).
      For kernel space, we use the top 4 context ids to map addresses as below
      (a sketch of the resulting computation follows the table):
      0x7fffc -  [ 0xc000000000000000 - 0xc0003fffffffffff ]
      0x7fffd -  [ 0xd000000000000000 - 0xd0003fffffffffff ]
      0x7fffe -  [ 0xe000000000000000 - 0xe0003fffffffffff ]
      0x7ffff -  [ 0xf000000000000000 - 0xf0003fffffffffff ]
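      
      For concreteness, a sketch of the computation this yields; SID_SHIFT and
      ESID_BITS follow from the numbers above, while the helper names are
      assumptions:
      
        #define SID_SHIFT  28   /* 256MB segments */
        #define ESID_BITS  18   /* 37-bit proto-VSID = 19-bit context + 18-bit ESID */
      
        /* Proto-VSID from context id and effective segment id. */
        static unsigned long proto_vsid(unsigned long context, unsigned long ea)
        {
                unsigned long esid = (ea & ((1ul << 46) - 1)) >> SID_SHIFT;
                return (context << ESID_BITS) | esid;    /* 19 + 18 = 37 bits */
        }
      
        /* Kernel addresses use the top 4 context ids, one per region. */
        static unsigned long kernel_context(unsigned long ea)
        {
                return 0x7fffc + ((ea >> 60) - 0xc);   /* 0xc -> 0x7fffc ... 0xf -> 0x7ffff */
        }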
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Tested-by: Geoff Levand <geoff@infradead.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: <stable@vger.kernel.org> [v3.8]
  3. 17 Sep, 2012 (2 commits)
  4. 25 Nov, 2011 (1 commit)
  5. 01 Nov, 2011 (1 commit)
  6. 20 Sep, 2011 (1 commit)
  7. 06 May, 2011 (1 commit)
  8. 04 May, 2011 (1 commit)
  9. 20 Apr, 2011 (1 commit)
  10. 30 Mar, 2010 (1 commit)
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h · 5a0e3ad6
      Committed by Tejun Heo
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h, which
      in turn includes gfp.h, making everything defined by those two files
      universally available and complicating inclusion dependencies.
      
      The percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include
      those headers directly instead of assuming availability.  As this
      conversion needs to touch a large number of source files, the following
      script was used as the basis of the conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following (an illustrative before/after example
      follows the list):
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there, i.e. if only gfp is used,
        gfp.h; if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to place the new include so that its order conforms
        to its surroundings.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered:
        alphabetical, Christmas tree, reverse Christmas tree, or at the end
        if there doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have a fitting include block), it prints
        an error message indicating which .h file needs to be added to the
        file.
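      
      As a hypothetical before/after (the file and function are made up), the
      sweep turns a caller that only compiled because percpu.h dragged in
      slab.h into one that states the dependency directly:
      
        /* before: kmalloc() available only via the implicit
         * percpu.h -> slab.h -> gfp.h inclusion chain */
        #include <linux/percpu.h>
      
        /* after the sweep: */
        #include <linux/percpu.h>
        #include <linux/slab.h>                 /* kmalloc, kfree */
      
        static void *make_buffer(void)
        {
                return kmalloc(64, GFP_KERNEL); /* GFP_KERNEL via slab.h -> gfp.h */
        }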
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition, while for others adding it to an
         implementation .h or embedding .c file was more appropriate.  This
         step added inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed,
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs, requiring slab.h to be added manually.
      
      5. The script was run on all .h files, but without automatically
         editing them, as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored, as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on the arch to make
         things build (like ipr on powerpc/64, which failed due to a missing
         writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given that I had only a couple of failures from the build tests in
      step 7, I'm fairly confident about the coverage of this conversion
      patch.  If there is a breakage, it's likely to be something in one of
      the arch headers, which should be easily discoverable on most builds
      of the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
  11. 09 Feb, 2010 (1 commit)
  12. 08 Dec, 2009 (1 commit)
  13. 02 Dec, 2009 (1 commit)
  14. 27 Nov, 2009 (1 commit)
  15. 05 Nov, 2009 (1 commit)
  16. 21 Dec, 2008 (1 commit)
  17. 17 Aug, 2007 (2 commits)
  18. 09 May, 2007 (1 commit)
    • [POWERPC] Introduce address space "slices" · d0f13e3c
      Committed by Benjamin Herrenschmidt
      The basic issue is to be able to do what hugetlbfs does but with
      different page sizes for some other special filesystems; more
      specifically, my needs are:
      
       - Huge pages
      
       - SPE local store mappings using 64K pages on a 4K base-page-size
      kernel on Cell
      
       - Some special 4K segments in 64K-page kernels for mapping a dodgy
      type of powerpc-specific infiniband hardware that requires 4K MMU
      mappings for various reasons I won't explain here.
      
      The main issues are:
      
       - To maintain/keep track of the page size per "segment", as we can
      only have one page size per segment on powerpc, and segments are 256MB
      divisions of the address space.
      
       - To make sure special mappings stay within their allotted
      "segments" (including the MAP_FIXED case)
      
       - To make sure everybody else doesn't mmap/brk/grow_stack into a
      "segment" that is used for a special mapping
      
      Some of the necessary mechanisms to handle that were present in the
      hugetlbfs code, but mostly in ways not suitable for anything else.
      
      The patch relies on some changes to the generic get_unmapped_area()
      that just got merged.  It still hijacks hugetlb callbacks here and
      there, as the generic code hasn't been entirely cleaned up yet, but
      that shouldn't be a problem.
      
      So what is a slice?  Well, I re-used the mechanism formerly used by our
      hugetlbfs implementation, which divides the address space into
      "meta-segments" that I called "slices".  The division is done using
      256MB slices below 4G and 1T slices above, so the address space is
      currently divided into 16 "low" slices and 16 "high" slices.  (Special
      case: high slice 0 is the area between 4G and 1T.)
      
      Doing so significantly simplifies the tracking of segments and avoids
      having to keep track of all the 256MB segments in the address space.
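      
      A minimal sketch of the slice-index math this implies; the shifts follow
      from 256MB = 2^28 and 1T = 2^40, and the names are assumptions rather
      than the exact macros:
      
        #define SLICE_LOW_SHIFT   28    /* 16 x 256MB low slices cover 0..4G */
        #define SLICE_HIGH_SHIFT  40    /* 1T high slices; slice 0 spans 4G..1T */
      
        static unsigned int low_slice_index(unsigned long addr)   /* addr < 4G */
        {
                return addr >> SLICE_LOW_SHIFT;    /* 0..15 */
        }
      
        static unsigned int high_slice_index(unsigned long addr)  /* addr >= 4G */
        {
                return addr >> SLICE_HIGH_SHIFT;   /* 0..15 */
        }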
      
      While I used the "concepts" of hugetlbfs, I mostly re-implemented
      everything in a more generic way and "ported" hugetlbfs to it.
      
      Slices can have an associated page size, which is encoded in the MMU
      context and used by the SLB miss handler to set the segment sizes.  The
      hash code currently doesn't care; it has a specific check for hugepages,
      though I might add a mechanism to provide per-slice hash mapping
      functions in the future.
      
      The slice code provides a pair of "generic" get_unmapped_area()
      functions (bottom-up and top-down) that should work with any slice
      size.  There is some trickiness here, so I would appreciate people
      having a look at the implementation of these and letting me know if I
      got something wrong.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  19. 01 Jul, 2006 (1 commit)
  20. 29 Jun, 2006 (1 commit)
  21. 15 Jun, 2006 (1 commit)
    • powerpc: Use 64k pages without needing cache-inhibited large pages · bf72aeba
      Committed by Paul Mackerras
      Some POWER5+ machines can do 64k hardware pages for normal memory but
      not for cache-inhibited pages.  This patch lets us use 64k hardware
      pages for most user processes on such machines (assuming the kernel
      has been configured with CONFIG_PPC_64K_PAGES=y).  User processes
      start out using 64k pages and get switched to 4k pages if they use any
      non-cacheable mappings.
      
      With this, we use 64k pages for the vmalloc region and 4k pages for
      the imalloc region.  If anything creates a non-cacheable mapping in
      the vmalloc region, the vmalloc region will get switched to 4k pages.
      I don't know of any driver other than the DRM that would do this,
      though, and these machines don't have AGP.
      
      When a region gets switched from 64k pages to 4k pages, we do not have
      to clear out all the 64k HPTEs from the hash table immediately.  We
      use the _PAGE_COMBO bit in the Linux PTE to indicate whether the page
      was hashed in as a 64k page or a set of 4k pages.  If hash_page is
      trying to insert a 4k page for a Linux PTE and it sees that it has
      already been inserted as a 64k page, it first invalidates the 64k HPTE
      before inserting the 4k HPTE.  The hash invalidation routines also use
      the _PAGE_COMBO bit, to determine whether to look for a 64k HPTE or a
      set of 4k HPTEs to remove.  With those two changes, we can tolerate a
      mix of 4k and 64k HPTEs in the hash table, and they will all get
      removed when the address space is torn down.
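      
      A minimal sketch of the demotion check on the 4k insertion path in
      hash_page; _PAGE_COMBO is from this patch, while _PAGE_HASHPTE and the
      two helpers are assumed names:
      
        static int insert_4k_for_pte(unsigned long ea, unsigned long *ptep)
        {
                unsigned long pte = *ptep;
      
                if ((pte & _PAGE_HASHPTE) && !(pte & _PAGE_COMBO)) {
                        /* Hashed in earlier as a single 64k HPTE: invalidate
                         * it before inserting 4k HPTEs for the same range. */
                        flush_64k_hpte(ea, pte);          /* assumed helper */
                }
                *ptep = pte | _PAGE_COMBO;     /* now tracked as a set of 4k HPTEs */
                return insert_4k_hpte(ea, pte);           /* assumed helper */
        }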
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  22. 10 Oct, 2005 (1 commit)
  23. 26 Sep, 2005 (1 commit)
    • powerpc: Merge enough to start building in arch/powerpc. · 14cf11af
      Committed by Paul Mackerras
      This creates the directory structure under arch/powerpc and a bunch
      of Kconfig files.  It does a first-cut merge of arch/powerpc/mm,
      arch/powerpc/lib and arch/powerpc/platforms/powermac.  This is enough
      to build a 32-bit powermac kernel with ARCH=powerpc.
      
      For now we are getting some unmerged files from arch/ppc/kernel and
      arch/ppc/syslib, or arch/ppc64/kernel.  This makes some minor changes
      to files in those directories and files outside arch/powerpc.
      
      The boot directory is still not merged.  That's going to be interesting.
      Signed-off-by: Paul Mackerras <paulus@samba.org>