1. 09 5月, 2007 1 次提交
    • B
      [POWERPC] Introduce address space "slices" · d0f13e3c
      Benjamin Herrenschmidt 提交于
      The basic issue is to be able to do what hugetlbfs does but with
      different page sizes for some other special filesystems; more
      specifically, my need is:
      
       - Huge pages
      
       - SPE local store mappings using 64K pages on a 4K base page size
      kernel on Cell
      
       - Some special 4K segments in 64K-page kernels for mapping a dodgy
      type of powerpc-specific infiniband hardware that requires 4K MMU
      mappings for various reasons I won't explain here.
      
      The main issues are:
      
       - To maintain/keep track of the page size per "segment" (as we can
      only have one page size per segment on powerpc, which are 256MB
      divisions of the address space).
      
       - To make sure special mappings stay within their allotted
      "segments" (including MAP_FIXED crap)
      
       - To make sure everybody else doesn't mmap/brk/grow_stack into a
      "segment" that is used for a special mapping
      
      Some of the necessary mechanisms to handle that were present in the
      hugetlbfs code, but mostly in ways not suitable for anything else.
      
      The patch relies on some changes to the generic get_unmapped_area()
      that just got merged.  It still hijacks hugetlb callbacks here or
      there as the generic code hasn't been entirely cleaned up yet but
      that shouldn't be a problem.
      
      So what is a slice ?  Well, I re-used the mechanism used formerly by our
      hugetlbfs implementation which divides the address space in
      "meta-segments" which I called "slices".  The division is done using
      256MB slices below 4G, and 1T slices above.  Thus the address space is
      divided currently into 16 "low" slices and 16 "high" slices.  (Special
      case: high slice 0 is the area between 4G and 1T).
      
      Doing so simplifies significantly the tracking of segments and avoids
      having to keep track of all the 256MB segments in the address space.
      
      While I used the "concepts" of hugetlbfs, I mostly re-implemented
      everything in a more generic way and "ported" hugetlbfs to it.
      
      Slices can have an associated page size, which is encoded in the mmu
      context and used by the SLB miss handler to set the segment sizes.  The
      hash code currently doesn't care, it has a specific check for hugepages,
      though I might add a mechanism to provide per-slice hash mapping
      functions in the future.
      
      The slice code provide a pair of "generic" get_unmapped_area() (bottomup
      and topdown) functions that should work with any slice size.  There is
      some trickiness here so I would appreciate people to have a look at the
      implementation of these and let me know if I got something wrong.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      d0f13e3c
  2. 04 12月, 2006 1 次提交
  3. 08 8月, 2006 1 次提交
  4. 01 7月, 2006 1 次提交
  5. 15 6月, 2006 1 次提交
    • P
      powerpc: Use 64k pages without needing cache-inhibited large pages · bf72aeba
      Paul Mackerras 提交于
      Some POWER5+ machines can do 64k hardware pages for normal memory but
      not for cache-inhibited pages.  This patch lets us use 64k hardware
      pages for most user processes on such machines (assuming the kernel
      has been configured with CONFIG_PPC_64K_PAGES=y).  User processes
      start out using 64k pages and get switched to 4k pages if they use any
      non-cacheable mappings.
      
      With this, we use 64k pages for the vmalloc region and 4k pages for
      the imalloc region.  If anything creates a non-cacheable mapping in
      the vmalloc region, the vmalloc region will get switched to 4k pages.
      I don't know of any driver other than the DRM that would do this,
      though, and these machines don't have AGP.
      
      When a region gets switched from 64k pages to 4k pages, we do not have
      to clear out all the 64k HPTEs from the hash table immediately.  We
      use the _PAGE_COMBO bit in the Linux PTE to indicate whether the page
      was hashed in as a 64k page or a set of 4k pages.  If hash_page is
      trying to insert a 4k page for a Linux PTE and it sees that it has
      already been inserted as a 64k page, it first invalidates the 64k HPTE
      before inserting the 4k HPTE.  The hash invalidation routines also use
      the _PAGE_COMBO bit, to determine whether to look for a 64k HPTE or a
      set of 4k HPTEs to remove.  With those two changes, we can tolerate a
      mix of 4k and 64k HPTEs in the hash table, and they will all get
      removed when the address space is torn down.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      bf72aeba
  6. 12 6月, 2006 1 次提交
  7. 09 1月, 2006 3 次提交
    • D
      [PATCH] powerpc: Replace VMALLOCBASE with VMALLOC_START · 14c89e7f
      David Gibson 提交于
      On ppc64, we independently define VMALLOCBASE and VMALLOC_START to be
      the same thing: the start of the vmalloc() area at 0xd000000000000000.
      VMALLOC_START is used much more widely, including in generic code, so
      this patch gets rid of the extraneous VMALLOCBASE.
      
      This does require moving the definitions of region IDs from page_64.h
      to pgtable.h, but they don't clearly belong in the former rather than
      the latter, anyway.  While we're moving them, clean up the definitions
      of the REGION_IDs:
      	- Abolish REGION_SIZE, it was only used once, to define
      REGION_MASK anyway
      	- Define the specific region ids in terms of the REGION_ID()
      macro.
      	- Define KERNEL_REGION_ID in terms of PAGE_OFFSET rather than
      KERNELBASE.  It amounts to the same thing, but conceptually this is
      about the region of the linear mapping (which starts at PAGE_OFFSET)
      rather than of the kernel text itself (which is at KERNELBASE).
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      14c89e7f
    • M
      [PATCH] powerpc: Separate usage of KERNELBASE and PAGE_OFFSET · b5666f70
      Michael Ellerman 提交于
      This patch separates usage of KERNELBASE and PAGE_OFFSET. I haven't
      looked at any of the PPC32 code, if we ever want to support Kdump on
      PPC we'll have to do another audit, ditto for iSeries.
      
      This patch makes PAGE_OFFSET the constant, it'll always be 0xC * 1
      gazillion for 64-bit.
      
      To get a physical address from a virtual one you subtract PAGE_OFFSET,
      _not_ KERNELBASE.
      
      KERNELBASE is the virtual address of the start of the kernel, it's
      often the same as PAGE_OFFSET, but _might not be_.
      
      If you want to know something's offset from the start of the kernel
      you should subtract KERNELBASE.
      Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      b5666f70
    • M
      [PATCH] powerpc: Add a is_kernel_addr() macro · 51fae6de
      Michael Ellerman 提交于
      There's a bunch of code that compares an address with KERNELBASE to see if
      it's a "kernel address", ie. >= KERNELBASE. The proper test is actually to
      compare with PAGE_OFFSET, since we're going to change KERNELBASE soon.
      
      So replace all of them with an is_kernel_addr() macro that does that.
      Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      51fae6de
  8. 07 11月, 2005 1 次提交
    • B
      [PATCH] ppc64: support 64k pages · 3c726f8d
      Benjamin Herrenschmidt 提交于
      Adds a new CONFIG_PPC_64K_PAGES which, when enabled, changes the kernel
      base page size to 64K.  The resulting kernel still boots on any
      hardware.  On current machines with 4K pages support only, the kernel
      will maintain 16 "subpages" for each 64K page transparently.
      
      Note that while real 64K capable HW has been tested, the current patch
      will not enable it yet as such hardware is not released yet, and I'm
      still verifying with the firmware architects the proper to get the
      information from the newer hypervisors.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      3c726f8d
  9. 10 10月, 2005 1 次提交
  10. 06 9月, 2005 1 次提交
    • D
      [PATCH] Invert sense of SLB class bit · 14b34661
      David Gibson 提交于
      Currently, we set the class bit in kernel SLB entries, and clear it on
      user SLB entries.  On POWER5, ERAT entries created in real mode have
      the class bit clear.  So to avoid flushing kernel ERAT entries on each
      context switch, this patch inverts our usage of the class bit, setting
      it on user SLB entries and clearing it on kernel SLB entries.
      
      Booted on POWER5 and G5.
      Signed-off-by: NDavid Gibson <dwg@au1.ibm.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      14b34661
  11. 01 5月, 2005 1 次提交
  12. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4