1. 24 3月, 2009 1 次提交
    • B
      powerpc/mm: Tweak PTE bit combination definitions · 8d1cf34e
      Benjamin Herrenschmidt 提交于
      This patch tweaks the way some PTE bit combinations are defined, in such a
      way that the 32 and 64-bit variant become almost identical and that will
      make it easier to bring in a new common pte-* file for the new variant
      of the Book3-E support.
      
      The combination of bits defining access to kernel pages are now clearly
      separated from the combination used by userspace and the core VM. The
      resulting generated code should remain identical unless I made a mistake.
      
      Note: While at it, I removed a non-sensical statement related to CONFIG_KGDB
      in ppc_mmu_32.c which could cause kernel mappings to be user accessible when
      that option is enabled. Probably something that bitrot.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      8d1cf34e
  2. 11 2月, 2009 1 次提交
    • B
      powerpc/mm: Rework I$/D$ coherency (v3) · 8d30c14c
      Benjamin Herrenschmidt 提交于
      This patch reworks the way we do I and D cache coherency on PowerPC.
      
      The "old" way was split in 3 different parts depending on the processor type:
      
         - Hash with per-page exec support (64-bit and >= POWER4 only) does it
      at hashing time, by preventing exec on unclean pages and cleaning pages
      on exec faults.
      
         - Everything without per-page exec support (32-bit hash, 8xx, and
      64-bit < POWER4) does it for all page going to user space in update_mmu_cache().
      
         - Embedded with per-page exec support does it from do_page_fault() on
      exec faults, in a way similar to what the hash code does.
      
      That leads to confusion, and bugs. For example, the method using update_mmu_cache()
      is racy on SMP where another processor can see the new PTE and hash it in before
      we have cleaned the cache, and then blow trying to execute. This is hard to hit but
      I think it has bitten us in the past.
      
      Also, it's inefficient for embedded where we always end up having to do at least
      one more page fault.
      
      This reworks the whole thing by moving the cache sync into two main call sites,
      though we keep different behaviours depending on the HW capability. The call
      sites are set_pte_at() which is now made out of line, and ptep_set_access_flags()
      which joins the former in pgtable.c
      
      The base idea for Embedded with per-page exec support, is that we now do the
      flush at set_pte_at() time when coming from an exec fault, which allows us
      to avoid the double fault problem completely (we can even improve the situation
      more by implementing TLB preload in update_mmu_cache() but that's for later).
      
      If for some reason we didn't do it there and we try to execute, we'll hit
      the page fault, which will do a minor fault, which will hit ptep_set_access_flags()
      to do things like update _PAGE_ACCESSED or _PAGE_DIRTY if needed, we just make
      this guys also perform the I/D cache sync for exec faults now. This second path
      is the catch all for things that weren't cleaned at set_pte_at() time.
      
      For cpus without per-pag exec support, we always do the sync at set_pte_at(),
      thus guaranteeing that when the PTE is visible to other processors, the cache
      is clean.
      
      For the 64-bit hash with per-page exec support case, we keep the old mechanism
      for now. I'll look into changing it later, once I've reworked a bit how we
      use _PAGE_EXEC.
      
      This is also a first step for adding _PAGE_EXEC support for embedded platforms
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      8d30c14c
  3. 21 12月, 2008 1 次提交
    • B
      powerpc/mm: Rework usage of _PAGE_COHERENT/NO_CACHE/GUARDED · 64b3d0e8
      Benjamin Herrenschmidt 提交于
      Currently, we never set _PAGE_COHERENT in the PTEs, we just OR it in
      in the hash code based on some CPU feature bit.  We also manipulate
      _PAGE_NO_CACHE and _PAGE_GUARDED by hand in all sorts of places.
      
      This changes the logic so that instead, the PTE now contains
      _PAGE_COHERENT for all normal RAM pages thay have I = 0 on platforms
      that need it.  The hash code clears it if the feature bit is not set.
      
      It also adds some clean accessors to setup various valid combinations
      of access flags and change various bits of code to use them instead.
      
      This should help having the PTE actually containing the bit
      combinations that we really want.
      
      I also removed _PAGE_GUARDED from _PAGE_BASE on 44x and instead
      set it explicitely from the TLB miss.  I will ultimately remove it
      completely as it appears that it might not be needed after all
      but in the meantime, having it in the TLB miss makes things a
      lot easier.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: NKumar Gala <galak@kernel.crashing.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      64b3d0e8
  4. 04 8月, 2008 1 次提交
  5. 25 7月, 2008 1 次提交
  6. 14 6月, 2007 1 次提交
  7. 02 5月, 2007 1 次提交
    • D
      [POWERPC] Remove arch/powerpc's dependence on asm-ppc/pg{alloc,table}.h · f88df14b
      David Gibson 提交于
      Currently, all 32-bit powerpc platforms use asm-ppc/pgtable.h and
      asm-ppc/pgalloc.h, even when otherwise compiled with ARCH=powerpc.
      Those asm-ppc files are a fairly nasty tangle of #ifdefs including a
      bunch of things which shouldn't be necessary any more in arch/powerpc.
      
      Cleaning up that mess is going to take a while, but this patch is a
      first step.  It separates the asm-powerpc/pg{alloc,table}.h into 64
      bit and 32 bit versions in asm-powerpc, which the basic .h files in
      asm-powerpc select based on config.  We make a few tiny tweaks to the
      innards of the files along the way, making the outermost ifdefs
      (double-inclusion protection and __KERNEL__) a little cleaner, and
      #including asm-generic/pgtable.h from the top-level
      asm-powerpc/pgtable.h (since both the old 32-bit and 64-bit versions
      ended with such an #include).
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      f88df14b
  8. 24 4月, 2007 1 次提交
    • D
      [POWERPC] Cleanup and fix breakage in tlbflush.h · 62102307
      David Gibson 提交于
      BenH's commit a741e679 in powerpc.git,
      although (AFAICT) only intended to affect ppc64, also has side-effects
      which break 44x.  I think 40x, 8xx and Freescale Book E are also
      affected, though I haven't tested them.
      
      The problem lies in unconditionally removing flush_tlb_pending() from
      the versions of flush_tlb_mm(), flush_tlb_range() and
      flush_tlb_kernel_range() used on ppc64 - which are also used the
      embedded platforms mentioned above.
      
      The patch below cleans up the convoluted #ifdef logic in tlbflush.h,
      in the process restoring the necessary flushes for the software TLB
      platforms.  There are three sets of definitions for the flushing
      hooks: the software TLB versions (revised to avoid using names which
      appear to related to TLB batching), the 32-bit hash based versions
      (external functions) amd the 64-bit hash based versions (which
      implement batching).
      
      It also moves the declaration of update_mmu_cache() to always be in
      tlbflush.h (previously it was in tlbflush.h except for PPC64, where it
      was in pgtable.h).
      
      Booted on Ebony (440GP) and compiled for 64-bit and 32-bit
      multiplatform.
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      62102307
  9. 13 4月, 2007 1 次提交
    • B
      [POWERPC] Make tlb flush batch use lazy MMU mode · a741e679
      Benjamin Herrenschmidt 提交于
      The current tlb flush code on powerpc 64 bits has a subtle race since we
      lost the page table lock due to the possible faulting in of new PTEs
      after a previous one has been removed but before the corresponding hash
      entry has been evicted, which can leads to all sort of fatal problems.
      
      This patch reworks the batch code completely. It doesn't use the mmu_gather
      stuff anymore. Instead, we use the lazy mmu hooks that were added by the
      paravirt code. They have the nice property that the enter/leave lazy mmu
      mode pair is always fully contained by the PTE lock for a given range
      of PTEs. Thus we can guarantee that all batches are flushed on a given
      CPU before it drops that lock.
      
      We also generalize batching for any PTE update that require a flush.
      
      Batching is now enabled on a CPU by arch_enter_lazy_mmu_mode() and
      disabled by arch_leave_lazy_mmu_mode(). The code epects that this is
      always contained within a PTE lock section so no preemption can happen
      and no PTE insertion in that range from another CPU. When batching
      is enabled on a CPU, every PTE updates that need a hash flush will
      use the batch for that flush.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      a741e679
  10. 26 9月, 2006 1 次提交
    • D
      [PATCH] Standardize pxx_page macros · 46a82b2d
      Dave McCracken 提交于
      One of the changes necessary for shared page tables is to standardize the
      pxx_page macros.  pte_page and pmd_page have always returned the struct
      page associated with their entry, while pte_page_kernel and pmd_page_kernel
      have returned the kernel virtual address.  pud_page and pgd_page, on the
      other hand, return the kernel virtual address.
      
      Shared page tables needs pud_page and pgd_page to return the actual page
      structures.  There are very few actual users of these functions, so it is
      simple to standardize their usage.
      
      Since this is basic cleanup, I am submitting these changes as a standalone
      patch.  Per Hugh Dickins' comments about it, I am also changing the
      pxx_page_kernel macros to pxx_page_vaddr to clarify their meaning.
      Signed-off-by: NDave McCracken <dmccr@us.ibm.com>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      46a82b2d
  11. 15 6月, 2006 1 次提交
    • P
      powerpc: Use 64k pages without needing cache-inhibited large pages · bf72aeba
      Paul Mackerras 提交于
      Some POWER5+ machines can do 64k hardware pages for normal memory but
      not for cache-inhibited pages.  This patch lets us use 64k hardware
      pages for most user processes on such machines (assuming the kernel
      has been configured with CONFIG_PPC_64K_PAGES=y).  User processes
      start out using 64k pages and get switched to 4k pages if they use any
      non-cacheable mappings.
      
      With this, we use 64k pages for the vmalloc region and 4k pages for
      the imalloc region.  If anything creates a non-cacheable mapping in
      the vmalloc region, the vmalloc region will get switched to 4k pages.
      I don't know of any driver other than the DRM that would do this,
      though, and these machines don't have AGP.
      
      When a region gets switched from 64k pages to 4k pages, we do not have
      to clear out all the 64k HPTEs from the hash table immediately.  We
      use the _PAGE_COMBO bit in the Linux PTE to indicate whether the page
      was hashed in as a 64k page or a set of 4k pages.  If hash_page is
      trying to insert a 4k page for a Linux PTE and it sees that it has
      already been inserted as a 64k page, it first invalidates the 64k HPTE
      before inserting the 4k HPTE.  The hash invalidation routines also use
      the _PAGE_COMBO bit, to determine whether to look for a 64k HPTE or a
      set of 4k HPTEs to remove.  With those two changes, we can tolerate a
      mix of 4k and 64k HPTEs in the hash table, and they will all get
      removed when the address space is torn down.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      bf72aeba
  12. 26 4月, 2006 1 次提交
  13. 22 3月, 2006 1 次提交
    • D
      [PATCH] hugepage: Fix hugepage logic in free_pgtables() · 9da61aef
      David Gibson 提交于
      free_pgtables() has special logic to call hugetlb_free_pgd_range() instead
      of the normal free_pgd_range() on hugepage VMAs.  However, the test it uses
      to do so is incorrect: it calls is_hugepage_only_range on a hugepage sized
      range at the start of the vma.  is_hugepage_only_range() will return true
      if the given range has any intersection with a hugepage address region, and
      in this case the given region need not be hugepage aligned.  So, for
      example, this test can return true if called on, say, a 4k VMA immediately
      preceding a (nicely aligned) hugepage VMA.
      
      At present we get away with this because the powerpc version of
      hugetlb_free_pgd_range() is just a call to free_pgd_range().  On ia64 (the
      only other arch with a non-trivial is_hugepage_only_range()) we get away
      with it for a different reason; the hugepage area is not contiguous with
      the rest of the user address space, and VMAs are not permitted in between,
      so the test can't return a false positive there.
      
      Nonetheless this should be fixed.  We do that in the patch below by
      replacing the is_hugepage_only_range() test with an explicit test of the
      VMA using is_vm_hugetlb_page().
      
      This in turn changes behaviour for platforms where is_hugepage_only_range()
      returns false always (everything except powerpc and ia64).  We address this
      by ensuring that hugetlb_free_pgd_range() is defined to be identical to
      free_pgd_range() (instead of a no-op) on everything except ia64.  Even so,
      it will prevent some otherwise possible coalescing of calls down to
      free_pgd_range().  Since this only happens for hugepage VMAs, removing this
      small optimization seems unlikely to cause any trouble.
      
      This patch causes no regressions on the libhugetlbfs testsuite - ppc64
      POWER5 (8-way), ppc64 G5 (2-way) and i386 Pentium M (UP).
      Signed-off-by: NDavid Gibson <dwg@au1.ibm.com>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Acked-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      9da61aef
  14. 17 3月, 2006 1 次提交
  15. 09 1月, 2006 2 次提交
    • A
      [PATCH] powerpc: sanitize header files for user space includes · 88ced031
      Arnd Bergmann 提交于
      include/asm-ppc/ had #ifdef __KERNEL__ in all header files that
      are not meant for use by user space, include/asm-powerpc does
      not have this yet.
      
      This patch gets us a lot closer there. There are a few cases
      where I was not sure, so I left them out. I have verified
      that no CONFIG_* symbols are used outside of __KERNEL__
      any more and that there are no obvious compile errors when
      including any of the headers in user space libraries.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      88ced031
    • D
      [PATCH] powerpc: Replace VMALLOCBASE with VMALLOC_START · 14c89e7f
      David Gibson 提交于
      On ppc64, we independently define VMALLOCBASE and VMALLOC_START to be
      the same thing: the start of the vmalloc() area at 0xd000000000000000.
      VMALLOC_START is used much more widely, including in generic code, so
      this patch gets rid of the extraneous VMALLOCBASE.
      
      This does require moving the definitions of region IDs from page_64.h
      to pgtable.h, but they don't clearly belong in the former rather than
      the latter, anyway.  While we're moving them, clean up the definitions
      of the REGION_IDs:
      	- Abolish REGION_SIZE, it was only used once, to define
      REGION_MASK anyway
      	- Define the specific region ids in terms of the REGION_ID()
      macro.
      	- Define KERNEL_REGION_ID in terms of PAGE_OFFSET rather than
      KERNELBASE.  It amounts to the same thing, but conceptually this is
      about the region of the linear mapping (which starts at PAGE_OFFSET)
      rather than of the kernel text itself (which is at KERNELBASE).
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      14c89e7f
  16. 19 11月, 2005 2 次提交
  17. 07 11月, 2005 2 次提交
  18. 30 10月, 2005 1 次提交
    • H
      [PATCH] mm: page fault handlers tidyup · 65500d23
      Hugh Dickins 提交于
      Impose a little more consistency on the page fault handlers do_wp_page,
      do_swap_page, do_anonymous_page, do_no_page, do_file_page: why not pass their
      arguments in the same order, called the same names?
      
      break_cow is all very well, but what it did was inlined elsewhere: easier to
      compare if it's brought back into do_wp_page.
      
      do_file_page's fallback to do_no_page dates from a time when we were testing
      pte_file by using it wherever possible: currently it's peculiar to nonlinear
      vmas, so just check that.  BUG_ON if not?  Better not, it's probably page
      table corruption, so just show the pte: hmm, there's a pte_ERROR macro, let's
      use that for do_wp_page's invalid pfn too.
      
      Hah!  Someone in the ppc64 world noticed pte_ERROR was unused so removed it:
      restored (and say "pud" not "pmd" in its pud_ERROR).
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      65500d23
  19. 29 10月, 2005 1 次提交
    • R
      [PATCH] ppc: make phys_mem_access_prot() work with pfns instead of addresses · 8b150478
      Roland Dreier 提交于
      Change the phys_mem_access_prot() function to take a pfn instead of an
      address.  This allows mmap64() to work on /dev/mem for addresses above 4G
      on 32-bit architectures.  We start with a pfn in mmap_mem(), so there's no
      need to convert to an address; in fact, it's actively bad, since the
      conversion can overflow when the address is above 4G.
      
      Similarly fix the ppc32 page_is_ram() function to avoid a conversion to an
      address by directly comparing to max_pfn.  Working with max_pfn instead of
      high_memory fixes page_is_ram() to give the right answer for highmem pages.
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      8b150478
  20. 30 8月, 2005 1 次提交
  21. 29 8月, 2005 1 次提交
    • D
      [PATCH] Four level pagetables for ppc64 · e28f7faf
      David Gibson 提交于
      Implement 4-level pagetables for ppc64
      
      This patch implements full four-level page tables for ppc64, thereby
      extending the usable user address range to 44 bits (16T).
      
      The patch uses a full page for the tables at the bottom and top level,
      and a quarter page for the intermediate levels.  It uses full 64-bit
      pointers at every level, thus also increasing the addressable range of
      physical memory.  This patch also tweaks the VSID allocation to allow
      matching range for user addresses (this halves the number of available
      contexts) and adds some #if and BUILD_BUG sanity checks.
      Signed-off-by: NDavid Gibson <dwg@au1.ibm.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      e28f7faf
  22. 22 6月, 2005 1 次提交
    • D
      [PATCH] ppc64: Abolish ioremap_mm · 20cee16c
      David Gibson 提交于
      Currently ppc64 has two mm_structs for the kernel, init_mm and also
      ioremap_mm.  The latter really isn't necessary: this patch abolishes it,
      instead restricting vmallocs to the lower 1TB of the init_mm's range and
      placing io mappings in the upper 1TB.  This simplifies the code in a number
      of places and eliminates an unecessary set of pagetables.  It also tweaks
      the unmap/free path a little, allowing us to remove the unmap_im_area() set
      of page table walkers, replacing them with unmap_vm_area().
      Signed-off-by: NDavid Gibson <dwg@au1.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      20cee16c
  23. 06 5月, 2005 1 次提交
    • D
      [PATCH] ppc64: pgtable.h and other header cleanups · 1f8d419e
      David Gibson 提交于
      This patch started as simply removing a few never-used macros from
      asm-ppc64/pgtable.h, then kind of grew.  It now makes a bunch of
      cleanups to the ppc64 low-level header files (with corresponding
      changes to .c files where necessary) such as:
      	- Abolishing never-used macros
      	- Eliminating multiple #defines with the same purpose
      	- Removing pointless macros (cases where just expanding the
      macro everywhere turns out clearer and more sensible)
      	- Removing some cases where macros which could be defined in
      terms of each other weren't
      	- Moving imalloc() related definitions from pgtable.h to their
      own header file (imalloc.h)
      	- Re-arranging headers to group things more logically
      	- Moving all VSID allocation related things to mmu.h, instead
      of being split between mmu.h and mmu_context.h
      	- Removing some reserved space for flags from the PMD - we're
      not using it.
      	- Fix some bugs which broke compile with STRICT_MM_TYPECHECKS.
      Signed-off-by: NDavid Gibson <dwg@au1.ibm.com>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      1f8d419e
  24. 01 5月, 2005 1 次提交
  25. 20 4月, 2005 2 次提交
    • H
      [PATCH] freepgt: arch FIRST_USER_ADDRESS 0 · d455a369
      Hugh Dickins 提交于
      Replace misleading definition of FIRST_USER_PGD_NR 0 by definition of
      FIRST_USER_ADDRESS 0 in all the MMU architectures beyond arm and arm26.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      d455a369
    • H
      [PATCH] freepgt: hugetlb_free_pgd_range · 3bf5ee95
      Hugh Dickins 提交于
      ia64 and ppc64 had hugetlb_free_pgtables functions which were no longer being
      called, and it wasn't obvious what to do about them.
      
      The ppc64 case turns out to be easy: the associated tables are noted elsewhere
      and freed later, safe to either skip its hugetlb areas or go through the
      motions of freeing nothing.  Since ia64 does need a special case, restore to
      ppc64 the special case of skipping them.
      
      The ia64 hugetlb case has been broken since pgd_addr_end went in, though it
      probably appeared to work okay if you just had one such area; in fact it's
      been broken much longer if you consider a long munmap spanning from another
      region into the hugetlb region.
      
      In the ia64 hugetlb region, more virtual address bits are available than in
      the other regions, yet the page tables are structured the same way: the page
      at the bottom is larger.  Here we need to scale down each addr before passing
      it to the standard free_pgd_range.  Was about to write a hugely_scaled_down
      macro, but found htlbpage_to_page already exists for just this purpose.  Fixed
      off-by-one in ia64 is_hugepage_only_range.
      
      Uninline free_pgd_range to make it available to ia64.  Make sure the
      vma-gathering loop in free_pgtables cannot join a hugepage_only_range to any
      other (safe to join huges?  probably but don't bother).
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      3bf5ee95
  26. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4