1. 14 Dec 2015, 1 commit
  2. 18 Aug 2015, 1 commit
  3. 05 Dec 2014, 1 commit
    • powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault · aefa5688
      Aneesh Kumar K.V authored
      updatepp can get called for a NO HPTE fault when we find from the Linux
      page table that the translation was hashed before. In that case
      we are sure that there is no existing translation, hence we can
      avoid doing the tlbie.
      
      We could possibly race with a parallel fault filling the TLB. But
      that should be OK because updatepp only ever relaxes permissions.
      We also look at the Linux PTE permission bits when filling hash PTE
      permission bits, and we hold the Linux PTE busy bits while
      inserting/updating a hash PTE entry, hence a parallel update of the
      Linux PTE is not possible. On the other hand, mprotect involves
      ptep_modify_prot_start, which causes an HPTE invalidate rather than updatepp.
      
      Performance numbers:
      We use random_access_bench, written by Anton.
      
      Kernel with THP disabled and a smaller hash page table size:
      
          86.60%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_updatepp
           2.10%  random_access_b  random_access_bench              [.] doit
           1.99%  random_access_b  [kernel.kallsyms]                [k] .do_raw_spin_lock
           1.85%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
           1.26%  random_access_b  [kernel.kallsyms]                [k] .native_flush_hash_range
           1.18%  random_access_b  [kernel.kallsyms]                [k] .__delay
           0.69%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
           0.37%  random_access_b  [kernel.kallsyms]                [k] .clear_user_page
           0.34%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
           0.32%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
           0.30%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm
      
      With the fix:
      
          27.54%  random_access_b  random_access_bench              [.] doit
          22.90%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
           5.76%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
           5.20%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
           5.12%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
           4.80%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm
           3.31%  random_access_b  [kernel.kallsyms]                [k] data_access_common
           1.84%  random_access_b  [kernel.kallsyms]                [k] .trace_hardirqs_on_caller
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
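      To make the optimization concrete, here is a minimal C sketch of the
      flag-gated flush. This is a sketch only: the flag name follows the
      commit's wording, and both helpers are hypothetical stand-ins for the
      native hash routines.
      
          #define HPTE_NOHPTE_UPDATE 0x2  /* no prior HPTE existed (assumed value) */
          
          /* Hypothetical helpers, declared only so the sketch is self-contained. */
          extern long update_hpte_protection(unsigned long slot, unsigned long newpp,
                                             unsigned long vpn, int psize, int ssize);
          extern void tlbie(unsigned long vpn, int psize, int ssize);
          
          static long hpte_updatepp_sketch(unsigned long slot, unsigned long newpp,
                                           unsigned long vpn, int psize, int ssize,
                                           unsigned long flags)
          {
                  /* Relax the protection bits in the existing hash slot. */
                  long ret = update_hpte_protection(slot, newpp, vpn, psize, ssize);
          
                  /*
                   * If the Linux PTE showed the translation was never hashed in,
                   * no stale TLB entry can exist, so the expensive global tlbie
                   * can be skipped entirely.
                   */
                  if (!(flags & HPTE_NOHPTE_UPDATE))
                          tlbie(vpn, psize, ssize);
          
                  return ret;
          }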
  4. 12 Nov 2014, 1 commit
  5. 23 Apr 2014, 2 commits
  6. 09 Dec 2013, 1 commit
  7. 21 Jun 2013, 1 commit
  8. 30 Apr 2013, 2 commits
  9. 04 Feb 2013, 1 commit
  10. 17 Sep 2012, 1 commit
  11. 10 Jul 2012, 2 commits
  12. 27 Apr 2011, 1 commit
  13. 31 Mar 2011, 1 commit
  14. 23 Jul 2010, 1 commit
  15. 30 Jun 2008, 1 commit
  16. 18 Jun 2008, 1 commit
    • [POWERPC] Clear sub-page HPTE present bits when demoting page size · 65ba6cdc
      Paul Mackerras authored
      When we demote a slice from 64k to 4k, and we are about to insert an
      HPTE for a 4k subpage and we notice that there is an existing 64k
      HPTE, we first invalidate that HPTE before inserting the new 4k
      subpage HPTE.  Since the bits that encode which hash bucket the old
      HPTE was in overlap with the bits that encode which of the 16 subpages
      have HPTEs, we need to clear out the subpage HPTE-present bits before
      starting to insert HPTEs for the 4k subpages.  If we don't do that, we
      can erroneously think that a subpage already has an HPTE when it
      doesn't.
      
      That in itself wouldn't be such a problem except that when we go to
      update the HPTE that we think is present on machines with a
      hypervisor, the hypervisor can tell us that the HPTE we think is there
      is actually there even though it isn't, which can lead to a process
      getting stuck in a loop, continually faulting.  The reason for the
      confusion is that the AVPN (abbreviated virtual page number) we are
      looking for in the HPTE for a 4k subpage can actually match the AVPN
      in a stale HPTE for another 64k page.  For example, the HPTE for
      the 4k subpage at 0x84000f000 will be in the same hash bucket and have
      the same AVPN as the HPTE for the 64k page at 0x8400f0000.
      
      This fixes the code to clear out the subpage HPTE-present bits.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
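      The essence of the fix, as a hedged C sketch (the real change is in the
      hash assembly; the mask name and field layout here are illustrative
      assumptions):
      
          #define SUBPAGE_HPTE_PRESENT_MASK 0xffffUL  /* 16 bits, one per 4k subpage (assumed layout) */
          
          static void demote_clear_subpage_bits_sketch(unsigned long *hidx_word)
          {
                  /*
                   * The old 64k HPTE's hash-bucket encoding shares these bits;
                   * left in place it can masquerade as "subpage already has an
                   * HPTE" and match a stale AVPN, so wipe all 16 bits before
                   * inserting any 4k subpage HPTE.
                   */
                  *hidx_word &= ~SUBPAGE_HPTE_PRESENT_MASK;
          }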
  17. 24 Jan 2008, 1 commit
    • [POWERPC] Provide a way to protect 4k subpages when using 64k pages · fa28237c
      Paul Mackerras authored
      Using 64k pages on 64-bit PowerPC systems makes life difficult for
      emulators that are trying to emulate an ISA, such as x86, that uses a
      smaller page size, since the emulator can no longer use the MMU and
      the normal system calls for controlling page protections.  Of course,
      the emulator can emulate the MMU by checking and possibly remapping
      the address for each memory access in software, but that is pretty
      slow.
      
      This provides a facility for such programs to control the access
      permissions on individual 4k sub-pages of 64k pages.  The idea is
      that the emulator supplies an array of protection masks to apply to a
      specified range of virtual addresses.  These masks are applied at the
      level where hardware PTEs are inserted into the hardware page table
      based on the Linux PTEs, so the Linux PTEs are not affected.  Note
      that this new mechanism does not allow any access that would otherwise
      be prohibited; it can only prohibit accesses that would otherwise be
      allowed.  This new facility is only available on 64-bit PowerPC and
      only when the kernel is configured for 64k pages.
      
      The masks are supplied using a new subpage_prot system call, which
      takes a starting virtual address and length, and a pointer to an array
      of protection masks in memory.  The array has a 32-bit word per 64k
      page to be protected; each 32-bit word consists of 16 2-bit fields,
      for which 0 allows any access (that is otherwise allowed), 1 prevents
      write accesses, and 2 or 3 prevent any access.
      
      Implicit in this is that the regions of the address space that are
      protected are switched to use 4k hardware pages rather than 64k
      hardware pages (on machines with hardware 64k page support).  In fact
      the whole process is switched to use 4k hardware pages when the
      subpage_prot system call is used, but this could be improved in future
      to switch only the affected segments.
      
      The subpage protection bits are stored in a 3 level tree akin to the
      page table tree.  The top level of this tree is stored in a structure
      that is appended to the top level of the page table tree, i.e., the
      pgd array.  Since it will often only be 32-bit addresses (below 4GB)
      that are protected, the pointers to the first four bottom level pages
      are also stored in this structure (each bottom level page contains the
      protection bits for 1GB of address space), so the protection bits for
      addresses below 4GB can be accessed with one fewer load than those
      for higher addresses.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
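      As a usage illustration of the interface described above, a userspace
      caller builds one 32-bit mask word per 64k page and passes it to the new
      system call. Everything below follows the encoding in the commit text
      except the syscall number, which is an assumption:
      
          #include <stdint.h>
          #include <sys/syscall.h>
          #include <unistd.h>
          
          #ifndef __NR_subpage_prot
          #define __NR_subpage_prot 310   /* powerpc syscall number, assumed here */
          #endif
          
          /* Per 64k page: 16 two-bit fields, lowest bits = first 4k subpage.
           * 0 = any otherwise-allowed access, 1 = no writes, 2 or 3 = no access. */
          static long protect_one_64k_page(void *addr)
          {
                  uint32_t map[1] = { 0 };
          
                  map[0] |= 1u << (2 * 3);   /* subpage 3: read-only */
                  map[0] |= 2u << (2 * 7);   /* subpage 7: no access */
          
                  return syscall(__NR_subpage_prot, (unsigned long)addr,
                                 0x10000UL /* one 64k page */, map);
          }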
  18. 29 Oct 2007, 1 commit
    • [POWERPC] powerpc: Fix demotion of segments to 4K pages · f6ab0b92
      Benjamin Herrenschmidt authored
      When demoting a process to use 4K HW pages (instead of 64K), which
      happens under various circumstances such as doing cache inhibited
      mappings on machines that do not support 64K CI pages, the assembly
      hash code calls back into the C function flush_hash_page().  This
      function prototype was recently changed to accommodate 1T segments
      but the assembly call site was not updated, causing applications that
      do demotion to hang.  In addition, when updating the per-CPU PACA for
      the new sizes, we didn't properly update the slice "map", thus causing
      the SLB miss code to re-insert segments for the wrong size.
      
      This fixes both and adds a warning comment next to the C
      implementation to try to avoid problems next time someone changes it.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  19. 12 Oct 2007, 1 commit
    • [POWERPC] Use 1TB segments · 1189be65
      Paul Mackerras authored
      This makes the kernel use 1TB segments for all kernel mappings and for
      user addresses of 1TB and above, on machines which support them
      (currently POWER5+, POWER6 and PA6T).
      
      We detect that the machine supports 1TB segments by looking at the
      ibm,processor-segment-sizes property in the device tree.
      
      We don't currently use 1TB segments for user addresses < 1T, since
      that would effectively prevent 32-bit processes from using huge pages
      unless we also had a way to revert to using 256MB segments.  That
      would be possible but would involve extra complications (such as
      keeping track of which segment size was used when HPTEs were inserted)
      and is not addressed here.
      
      Parts of this patch were originally written by Ben Herrenschmidt.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  20. 03 Aug 2007, 1 commit
    • [POWERPC] Fix special PTE code for secondary hash bucket · 430404ed
      Paul Mackerras authored
      The code for mapping special 4k pages on kernels using a 64kB base
      page size was missing the code for doing the RPN (real page number)
      manipulation when inserting the hardware PTE in the secondary hash
      bucket.  It needs the same code as has already been added to the
      code that inserts the HPTE in the primary hash bucket.  This adds it.
      
      Spotted by Ben Herrenschmidt.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  21. 09 May 2007, 1 commit
  22. 13 Apr 2007, 1 commit
    • [POWERPC] Allow drivers to map individual 4k pages to userspace · 721151d0
      Paul Mackerras authored
      Some drivers have resources that they want to be able to map into
      userspace that are 4k in size.  On a kernel configured with 64k pages
      we currently end up mapping the 4k we want plus another 60k of
      physical address space, which could contain anything.  This can
      introduce security problems, for example in the case of an infiniband
      adaptor where the other 60k could contain registers that some other
      program is using for its communications.
      
      This patch adds a new function, remap_4k_pfn, which drivers can use to
      map a single 4k page to userspace regardless of whether the kernel is
      using a 4k or a 64k page size.  Like remap_pfn_range, it would
      typically be called in a driver's mmap function.  It only maps a
      single 4k page, which on a 64k page kernel appears replicated 16 times
      throughout a 64k page.  On a 4k page kernel it reduces to a call to
      remap_pfn_range.
      
      The way this works on a 64k kernel is that a new bit, _PAGE_4K_PFN,
      gets set on the linux PTE.  This alters the way that __hash_page_4K
      computes the real address to put in the HPTE.  The RPN field of the
      linux PTE becomes the 4k RPN directly rather than being interpreted as
      a 64k RPN.  Since the RPN field is 32 bits, this means that physical
      addresses being mapped with remap_4k_pfn have to be below 2^44,
      i.e. 0x100000000000.
      
      The patch also factors out the code in arch/powerpc/mm/hash_utils_64.c
      that deals with demoting a process to use 4k pages into one function
      that gets called in the various different places where we need to do
      that.  There were some discrepancies between exactly what was done in
      the various places, such as a call to spu_flush_all_slbs in one case
      but not in others.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
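      A sketch of how a driver's mmap handler might use the new helper, per
      the description above (the device and its register pfn are hypothetical):
      
          #include <linux/fs.h>
          #include <linux/mm.h>
          #include <asm/pgtable.h>
          
          static int mydev_mmap(struct file *file, struct vm_area_struct *vma)
          {
                  unsigned long pfn = 0x12345;  /* hypothetical 4k-sized register page */
          
                  /*
                   * Map exactly one 4k page even on a 64k-page kernel; the
                   * _PAGE_4K_PFN path keeps the surrounding 60k of physical
                   * address space out of userspace.
                   */
                  return remap_4k_pfn(vma, vma->vm_start, pfn,
                                      pgprot_noncached(vma->vm_page_prot));
          }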
  23. 01 Jul 2006, 1 commit
  24. 15 Jun 2006, 1 commit
    • powerpc: Use 64k pages without needing cache-inhibited large pages · bf72aeba
      Paul Mackerras authored
      Some POWER5+ machines can do 64k hardware pages for normal memory but
      not for cache-inhibited pages.  This patch lets us use 64k hardware
      pages for most user processes on such machines (assuming the kernel
      has been configured with CONFIG_PPC_64K_PAGES=y).  User processes
      start out using 64k pages and get switched to 4k pages if they use any
      non-cacheable mappings.
      
      With this, we use 64k pages for the vmalloc region and 4k pages for
      the imalloc region.  If anything creates a non-cacheable mapping in
      the vmalloc region, the vmalloc region will get switched to 4k pages.
      I don't know of any driver other than the DRM that would do this,
      though, and these machines don't have AGP.
      
      When a region gets switched from 64k pages to 4k pages, we do not have
      to clear out all the 64k HPTEs from the hash table immediately.  We
      use the _PAGE_COMBO bit in the Linux PTE to indicate whether the page
      was hashed in as a 64k page or a set of 4k pages.  If hash_page is
      trying to insert a 4k page for a Linux PTE and it sees that it has
      already been inserted as a 64k page, it first invalidates the 64k HPTE
      before inserting the 4k HPTE.  The hash invalidation routines also use
      the _PAGE_COMBO bit, to determine whether to look for a 64k HPTE or a
      set of 4k HPTEs to remove.  With those two changes, we can tolerate a
      mix of 4k and 64k HPTEs in the hash table, and they will all get
      removed when the address space is torn down.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
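      A compact model of the _PAGE_COMBO decision described above; the bit
      values and helper functions are illustrative assumptions, and only the
      decision structure mirrors the commit text:
      
          #define _PAGE_HASHPTE 0x0400UL  /* some HPTE exists for this PTE (assumed value) */
          #define _PAGE_COMBO   0x0800UL  /* hashed as 4k subpages, not one 64k page (assumed value) */
          
          extern void invalidate_64k_hpte(unsigned long *ptep);          /* hypothetical */
          extern void insert_4k_hpte(unsigned long *ptep, int subpage);  /* hypothetical */
          
          static void hash_insert_4k_sketch(unsigned long *ptep, int subpage)
          {
                  if ((*ptep & _PAGE_HASHPTE) && !(*ptep & _PAGE_COMBO)) {
                          /*
                           * The old translation was a single 64k HPTE: flush it
                           * before the first 4k subpage HPTE goes in, so the
                           * hash table never holds both for this PTE at once.
                           */
                          invalidate_64k_hpte(ptep);
                          *ptep |= _PAGE_COMBO;
                  }
                  insert_4k_hpte(ptep, subpage);
          }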
  25. 09 Jun 2006, 1 commit
    • [PATCH] powerpc: Fix buglet with MMU hash management · c5cf0e30
      Benjamin Herrenschmidt authored
      Our MMU hash management code would not set the "C" bit (changed bit) in
      the hardware PTE when updating a RO PTE into a RW PTE. That would cause
      the hardware to possibly do a write back to the hash table to set it on
      the first store access, which in addition to being a performance issue,
      might also hit a bug when running with native hash management (non-HV),
      as our code is specifically optimized for the case where no write back
      happens.
      
      Thus there is a very small theoretical window where a hash PTE can become
      corrupted if that HPTE has just been upgraded to read-write, a store
      access happens on it, and that races with another processor evicting
      that same slot. Since eviction (caused by an almost full hash) is
      extremely rare, the bug is fortunately very unlikely to happen.
      
      This is fixed by allowing the update of the protection bits in the native
      hash handling to also set (but not clear) the "C" bit, and, in order to
      also improve performance in the general case, by always setting that
      bit on newly inserted hash PTEs so that write back really never happens.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
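      The gist of the fix as a small sketch (HPTE_R_C is the architected
      Changed bit; the function and parameter names are illustrative):
      
          #define HPTE_R_C 0x0000000000000080UL  /* architected Changed bit */
          
          /* Update path: OR in C along with the relaxed pp bits, so the first
           * store after an RO->RW upgrade never forces a hardware write back
           * into the hash table. The insert path likewise sets C up front. */
          static unsigned long relax_pp_sketch(unsigned long hpte_r,
                                               unsigned long pp_mask,
                                               unsigned long newpp)
          {
                  return (hpte_r & ~pp_mask) | newpp | HPTE_R_C;
          }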
  26. 07 Nov 2005, 1 commit
    • [PATCH] ppc64: support 64k pages · 3c726f8d
      Benjamin Herrenschmidt authored
      Adds a new CONFIG_PPC_64K_PAGES which, when enabled, changes the kernel
      base page size to 64K.  The resulting kernel still boots on any
      hardware.  On current machines with 4K pages support only, the kernel
      will maintain 16 "subpages" for each 64K page transparently.
      
      Note that while real 64K capable HW has been tested, the current patch
      will not enable it yet as such hardware is not released yet, and I'm
      still verifying with the firmware architects the proper way to get the
      information from the newer hypervisors.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  27. 10 Oct 2005, 1 commit
  28. 10 Sep 2005, 1 commit
  29. 30 Aug 2005, 1 commit
    • [PATCH] Remove nested feature sections · 8913ca1c
      David Gibson authored
      The {BEGIN,END}_FTR_SECTION asm macros used in ppc64 to nop out
      sections of code at runtime cannot be nested.  However, we do nest
      them in hash_low.S.  We get away with it there, because there is
      nothing between the BEGIN markers for each section.  However, that's
      confusing to someone reading the code.
      
      This patch removes the nested ifset and ifclr feature sections,
      replacing them with a single feature section in the full mask/value
      form.
      Signed-off-by: David Gibson <dwg@au1.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  30. 14 Jul 2005, 1 commit
  31. 01 May 2005, 1 commit
    • [PATCH] PPC64: Remove hot busy-wait loop in __hash_page · d03853d5
      Olof Johansson authored
      It turns out that our current __hash_page code will do a very hot busy-wait
      loop waiting on _PAGE_BUSY to be cleared.  It even does ldarx/stdcx in the
      loop, which will bounce reservations around like crazy if there's more than
      one CPU spinning on the same PTE (or even another PTE in the same
      reservation granule).  The end result is that each fault takes longer when
      there's contention, which in turn increases the chance of another thread
      hitting the same fault and also piling up.  Not pretty.
      
      There are two options here:
      1. Do an out-of-line busy loop, a la spinlocks, with just loads (no
         reserves).
      2. Just bail and refault if needed.
      
      (2) makes sense here: if the PTE is busy, chances are it's in flux anyway
      and the other code path making the change might just be ready to hash it
      (see the sketch below).
      
      This fixes a stampede seen on a large-ish system where a multithreaded
      HPC app faults in the same text pages on several cpus at the same time.
      Signed-off-by: Olof Johansson <olof@lixom.net>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
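      A standalone model of option (2), using C11 atomics in place of
      ldarx/stdcx.: one lock attempt, no spinning, and a false return means
      the fault is simply retaken. Names and the busy-bit value are
      illustrative:
      
          #include <stdatomic.h>
          #include <stdbool.h>
          
          #define PAGE_BUSY 0x1UL   /* assumed bit, for illustration */
          
          /* Returns true if we now own the PTE; false means bail and refault. */
          static bool try_lock_pte(_Atomic unsigned long *pte)
          {
                  unsigned long old = atomic_load_explicit(pte, memory_order_relaxed);
          
                  if (old & PAGE_BUSY)
                          return false;   /* in flux; let the fault be retaken */
          
                  /* A single compare-and-swap attempt instead of a hot,
                   * reservation-bouncing ldarx/stdcx. loop. */
                  return atomic_compare_exchange_strong_explicit(
                          pte, &old, old | PAGE_BUSY,
                          memory_order_acquire, memory_order_relaxed);
          }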
  32. 17 Apr 2005, 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds authored
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!