1. 17 Aug 2017, 1 commit
    • powerpc/mm: Rename find_linux_pte_or_hugepte() · 94171b19
      Aneesh Kumar K.V authored
      Add newer helpers to make the function usage simpler. It is always
      recommended to use find_current_mm_pte() for walking the page table.
      If we cannot use find_current_mm_pte(), it should be documented why
      the said usage of __find_linux_pte() is safe against a parallel THP
      split.
      
      For now we have KVM code using __find_linux_pte(). This is because kvm
      code ends up calling __find_linux_pte() in real mode with MSR_EE=0 but
      with PACA soft_enabled = 1. We may want to fix that later and make
      sure we keep the MSR_EE and PACA soft_enabled in sync. When we do that
      we can switch kvm to use find_linux_pte().
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      94171b19
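A minimal sketch of the calling pattern recommended above, assuming the helper prototypes from arch/powerpc (pgdir, effective address, is_thp and shift out-parameters); the wrapper function itself is made up for illustration and is not code from the patch:

```c
#include <linux/sched.h>
#include <linux/mm.h>
#include <asm/pte-walk.h>

/*
 * Illustrative wrapper: walk the current mm with find_current_mm_pte() and
 * consume the PTE while interrupts are still disabled, which is what makes
 * the walk safe against a parallel THP split.
 */
static int read_pte_of_current_mm(unsigned long ea, pte_t *out)
{
	unsigned int shift;
	bool is_thp;
	pte_t *ptep;
	int ret = -EFAULT;

	local_irq_disable();
	ptep = find_current_mm_pte(current->mm->pgd, ea, &is_thp, &shift);
	if (ptep) {
		*out = *ptep;	/* use it before interrupts are re-enabled */
		ret = 0;
	}
	local_irq_enable();
	return ret;
}
```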
  2. 05 Jun 2017, 1 commit
  3. 01 May 2016, 2 commits
  4. 12 Oct 2015, 1 commit
    • powerpc/mm: Differentiate between hugetlb and THP during page walk · 891121e6
      Aneesh Kumar K.V authored
      We need to properly identify whether a hugepage is an explicit or
      a transparent hugepage in follow_huge_addr(). We used to depend
      on the hugepage shift argument to do that, but in some cases that
      can give the wrong result. For example:
      
      On finding a transparent hugepage we set the hugepage shift to PMD_SHIFT.
      But we can end up clearing the thp pte via pmdp_huge_get_and_clear().
      We do prevent the pfn page from being reused, via kick_all_cpus_sync(),
      but that happens after we have updated the pte to 0. Hence in
      follow_huge_addr() we can find the hugepage shift set, yet the
      transparent hugepage check fails for a thp pte.
      
      NOTE: We fixed a variant of this race against thp split in commit 691e95fd
      ("powerpc/mm/thp: Make page table walk safe against thp split/collapse").
      
      Without this patch, we may hit the BUG_ON(flags & FOLL_GET) in
      follow_page_mask occasionally.
      
      In the long term, we may want to switch the ppc64 64k page size config
      to enable CONFIG_ARCH_WANT_GENERAL_HUGETLB.
      Reported-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      891121e6
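A sketch of the shape of the check after this change, assuming the walker's is_thp out-parameter described above; the surrounding function and its caller are made up, and the caller is assumed to already hold whatever protection the walk requires (interrupts disabled):

```c
#include <linux/mm.h>
#include <asm/pgtable.h>

/*
 * Illustrative only: distinguish explicit hugetlb pages from transparent
 * hugepages via the walker's is_thp flag instead of inferring THP from
 * hugepage_shift == PMD_SHIFT, which is the inference that races with
 * pmdp_huge_get_and_clear() as described above.
 */
static struct page *example_follow(struct mm_struct *mm, unsigned long address)
{
	unsigned int shift;
	bool is_thp;
	pte_t *ptep, pte;

	ptep = find_linux_pte_or_hugepte(mm->pgd, address, &is_thp, &shift);
	if (!ptep)
		return NULL;

	pte = *ptep;
	if (is_thp || !shift || !pte_present(pte))
		return NULL;	/* not an explicit (hugetlb) hugepage */

	return pte_page(pte);
}
```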
  5. 10 Apr 2015, 1 commit
  6. 13 Aug 2014, 2 commits
  7. 15 Jan 2014, 1 commit
  8. 21 Jun 2013, 2 commits
  9. 04 Jun 2013, 1 commit
  10. 17 Mar 2013, 1 commit
    • powerpc: Update kernel VSID range · c60ac569
      Aneesh Kumar K.V authored
      This patch changes the kernel VSID range so that we limit VSID_BITS to 37.
      This enables us to support 64TB with a 65-bit VA (37 + 28). Without this
      patch we have boot hangs on platforms that only support a 65-bit VA.

      With this patch the proto-VSID is now generated as below:

      We first generate a 37-bit "proto-VSID". Proto-VSIDs are generated
      from the mmu context id and the effective segment id of the address.

      For user processes the max context id is limited to ((1ul << 19) - 5).
      For kernel space, we use the top 4 context ids to map addresses as below:
      0x7fffc -  [ 0xc000000000000000 - 0xc0003fffffffffff ]
      0x7fffd -  [ 0xd000000000000000 - 0xd0003fffffffffff ]
      0x7fffe -  [ 0xe000000000000000 - 0xe0003fffffffffff ]
      0x7ffff -  [ 0xf000000000000000 - 0xf0003fffffffffff ]
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Tested-by: Geoff Levand <geoff@infradead.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: <stable@vger.kernel.org> [v3.8]
      c60ac569
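A small standalone illustration of the mapping quoted above: the four kernel regions 0xc..0xf get the top four context ids just above the user maximum of ((1ul << 19) - 5). The constants come from the commit message; the exact formula is an assumption chosen to reproduce that table:

```c
#include <stdio.h>

#define MAX_USER_CONTEXT	((1ul << 19) - 5)	/* 0x7fffb */

/* Assumed mapping: kernel regions take the next context ids after the user max. */
static unsigned long kernel_context_id(unsigned long ea)
{
	return MAX_USER_CONTEXT + ((ea >> 60) - 0xc) + 1;
}

int main(void)
{
	unsigned long region;

	for (region = 0xc; region <= 0xf; region++)
		printf("0x%05lx - [ 0x%016lx - 0x%016lx ]\n",
		       kernel_context_id(region << 60),
		       region << 60, (region << 60) | 0x3fffffffffffUL);
	return 0;
}
```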
  11. 04 Jan 2013, 1 commit
    • POWERPC: drivers: remove __dev* attributes. · cad5cef6
      Greg Kroah-Hartman authored
      CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
      markings need to be removed.
      
      This change removes the use of __devinit, __devexit_p, __devinitdata,
      __devinitconst, and __devexit from these drivers.
      
      Based on patches originally written by Bill Pemberton, but redone by me
      in order to handle some of the coding style issues better, by hand.
      
      Cc: Bill Pemberton <wfp5p@virginia.edu>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      cad5cef6
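Illustrative only: the mechanical shape of this cleanup on a made-up driver (example_probe/example_remove/example_driver are not names from the patch):

```c
#include <linux/module.h>
#include <linux/platform_device.h>

/* Before:  static int __devinit example_probe(struct platform_device *pdev)
 * After the __dev* section annotations are dropped, only the markings change: */
static int example_probe(struct platform_device *pdev)
{
	return 0;	/* probe body is untouched */
}

static int example_remove(struct platform_device *pdev)
{
	return 0;
}

static struct platform_driver example_driver = {
	.probe	= example_probe,
	.remove	= example_remove,	/* was: .remove = __devexit_p(example_remove) */
	.driver	= { .name = "example" },
};
module_platform_driver(example_driver);
```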
  12. 17 Sep 2012, 1 commit
  13. 25 May 2011, 2 commits
    • mm, powerpc: move the RCU page-table freeing into generic code · 26723911
      Peter Zijlstra authored
      In case other architectures require RCU-freed page tables to implement
      gup_fast() and software-filled hashes and similar things, provide the
      means to do so by moving the logic into generic code.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Requested-by: David Miller <davem@davemloft.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      26723911
    • powerpc: mmu_gather rework · d6bf29b4
      Peter Zijlstra authored
      Fix up powerpc to the new mmu_gather stuff.
      
      PPC has an extra batching queue to RCU-free the actual pagetable
      allocations; use the ARCH extensions for that for now.
      
      For the ppc64_tlb_batch, which tracks the vaddrs to unhash from the
      hardware hash-table, keep using per-cpu arrays but flush on context switch
      and use a TLF bit to track the lazy_mmu state.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d6bf29b4
  14. 02 Mar 2011, 1 commit
  15. 10 Feb 2010, 1 commit
    • powerpc: Fix address masking bug in hpte_need_flush() · 77058e1a
      David Gibson authored
      Commit f71dc176 'Make hpte_need_flush() correctly mask for multiple page
      sizes' introduced a bug which is triggered when a kernel with a 64k base
      page size is run on a system whose hardware does not support 64k hash
      PTEs.  In this case, we emulate 64k pages with multiple 4k hash PTEs;
      however, in hpte_need_flush() we incorrectly mask only the hardware page
      size from the address, instead of the logical page size.  This causes
      things to go wrong when we later attempt to iterate through the hardware
      subpages of the logical page.
      
      This patch corrects the error.  It has been tested on pSeries bare
      metal by Michael Neuling.
      Signed-off-by: David Gibson <dwg@au1.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      77058e1a
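A standalone illustration of the masking described above (not the kernel code itself): with a 64k logical page emulated by 4k hash PTEs, masking with the hardware shift keeps low bits that the logical-page mask is supposed to clear. The example address is arbitrary:

```c
#include <stdio.h>

int main(void)
{
	unsigned long vaddr      = 0x3fff8001a000UL;	/* arbitrary example address */
	unsigned int  hw_shift   = 12;			/* 4k hash PTEs */
	unsigned int  base_shift = 16;			/* 64k logical (base) page */

	/* The bug: masking with the hardware page size leaves 4k-aligned bits set. */
	unsigned long wrong = vaddr & ~((1UL << hw_shift)   - 1);
	/* The fix: mask with the logical page size, so iteration over the
	 * hardware subpages always starts from the 64k page boundary. */
	unsigned long right = vaddr & ~((1UL << base_shift) - 1);

	printf("hardware mask: 0x%lx\nlogical  mask: 0x%lx\n", wrong, right);
	return 0;
}
```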
  16. 30 Oct 2009, 1 commit
  17. 20 Aug 2009, 2 commits
    • powerpc/mm: Move around mmu_gathers definition on 64-bit · a8f7758c
      Benjamin Herrenschmidt authored
      The global structure mmu_gathers, used by generic code, is currently
      defined in multiple places, none of which cover 64-bit Book3E. This
      change moves the definition to one place common to all processors.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      a8f7758c
    • powerpc/mm: Rework & cleanup page table freeing code path · c7cc58a1
      Benjamin Herrenschmidt authored
      This patch used to just add a hook to page table flushing, but pulling on
      that string brought out a whole bunch of issues, so it now does that and
      more:

       - We now make the RCU batching of page freeing SMP only, as I
      believe it was intended initially. We make a few more things compile
      to nothing on !CONFIG_SMP

       - Some macros are turned into functions, though that forced me to
      move a few things out of line due to unsolvable include dependencies;
      however it's probably better that way anyway, it's not -that-
      critical a code path.

       - 32-bit didn't call pte_free_finish() on tlb_flush(), which means
      that it wouldn't push the batch out to RCU for delayed freeing when
      a bunch of page tables had been freed; they would just stay in there
      until the batch got full.
      
      64-bit BookE will use that hook to maintain the virtually linear
      page tables or the indirect entries in the TLB when using the
      HW loader.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      c7cc58a1
  18. 08 Jul 2009, 1 commit
  19. 24 Mar 2009, 1 commit
  20. 16 Dec 2008, 1 commit
  21. 03 Dec 2008, 1 commit
  22. 04 Aug 2008, 1 commit
  23. 25 Jul 2008, 1 commit
  24. 03 Jul 2008, 1 commit
  25. 14 May 2008, 1 commit
  26. 13 Nov 2007, 1 commit
    • [POWERPC] Fix CONFIG_SMP=n build error on ppc64 · 9bafbb0c
      Olof Johansson authored
      The patch "KVM: fix !SMP build error" change the way smp_call_function()
      actually uses the passed in function names on non-SMP builds.  So
      previously it was never caught that the function passed in was never
      actually defined.
      
      This causes a build error on ppc64_defconfig + CONFIG_SMP=n:
      
      arch/powerpc/mm/tlb_64.c: In function 'pgtable_free_now':
      arch/powerpc/mm/tlb_64.c:71: error: 'pte_free_smp_sync' undeclared (first use in this function)
      arch/powerpc/mm/tlb_64.c:71: error: (Each undeclared identifier is reported only once
      arch/powerpc/mm/tlb_64.c:71: error: for each function it appears in.)
      
      So we need to define it even if CONFIG_SMP is off. Either that or ifdef
      out the smp_call_function() call, but that's ugly.
      Signed-off-by: Olof Johansson <olof@lixom.net>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      9bafbb0c
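A sketch of the shape of the fix rather than the exact patch: keep the callback defined even when CONFIG_SMP is off so the identifier passed to smp_call_function() always resolves. The four-argument smp_call_function() and the pgtable_free_now()/pgtable_free() names are paraphrased from the file named in the error log and the kernels of that era:

```c
/* No longer hidden behind #ifdef CONFIG_SMP, so UP builds still see it. */
static void pte_free_smp_sync(void *arg)
{
	/* Safe to do nothing: the IPI itself is the synchronisation. */
}

static void pgtable_free_now(pgtable_free_t pgf)
{
	/* On UP builds this call degenerates, but the symbol must still exist. */
	smp_call_function(pte_free_smp_sync, NULL, 0, 1);
	pgtable_free(pgf);
}
```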
  27. 12 Oct 2007, 1 commit
    • [POWERPC] Use 1TB segments · 1189be65
      Paul Mackerras authored
      This makes the kernel use 1TB segments for all kernel mappings and for
      user addresses of 1TB and above, on machines which support them
      (currently POWER5+, POWER6 and PA6T).
      
      We detect that the machine supports 1TB segments by looking at the
      ibm,processor-segment-sizes property in the device tree.
      
      We don't currently use 1TB segments for user addresses < 1T, since
      that would effectively prevent 32-bit processes from using huge pages
      unless we also had a way to revert to using 256MB segments.  That
      would be possible but would involve extra complications (such as
      keeping track of which segment size was used when HPTEs were inserted)
      and is not addressed here.
      
      Parts of this patch were originally written by Ben Herrenschmidt.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      1189be65
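A hedged sketch of the detection step described above: scan the ibm,processor-segment-sizes property for a 2^40 (1TB) entry. The helper name and the use of of_get_property() are illustrative assumptions, not necessarily how the patch itself reads the property:

```c
#include <linux/of.h>

/*
 * Illustrative helper: report whether a CPU node advertises 1TB (2^40)
 * segments in ibm,processor-segment-sizes.
 */
static bool cpu_supports_1tb_segments(struct device_node *cpu)
{
	const __be32 *prop;
	int len, i;

	prop = of_get_property(cpu, "ibm,processor-segment-sizes", &len);
	if (!prop)
		return false;

	for (i = 0; i < len / 4; i++)
		if (be32_to_cpup(&prop[i]) == 40)	/* segment size = 2^40 bytes */
			return true;
	return false;
}
```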
  28. 14 Jun 2007, 2 commits
    • [POWERPC] Remove the dregs of APUS support from arch/powerpc · f21f49ea
      David Gibson authored
      APUS (the Amiga Power-Up System) is not supported under arch/powerpc
      and it's unlikely it ever will be.  Therefore, this patch removes the
      fragments of APUS support code from arch/powerpc which have been
      copied from arch/ppc.
      
      A few APUS references are left in asm-powerpc in .h files which are
      still used from arch/ppc.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      f21f49ea
    • [POWERPC] Rewrite IO allocation & mapping on powerpc64 · 3d5134ee
      Benjamin Herrenschmidt authored
      This rewrites pretty much from scratch the handling of MMIO and PIO
      space allocations on powerpc64.  The main goals are:
      
       - Get rid of imalloc and use more common code where possible
       - Simplify the current mess so that PIO space is allocated and
         mapped in a single place for PCI bridges
       - Handle allocation constraints of PIO for all bridges including
         hot plugged ones within the 2GB space reserved for IO ports,
         so that devices on hotplugged busses will now work with drivers
         that assume IO ports fit in an int.
       - Cleanup and separate tracking of the ISA space in the reserved
         low 64K of IO space. No ISA -> Nothing mapped there.
      
      I booted a cell blade with IDE on PIO and MMIO and a dual G5 so
      far, that's it :-)
      
      With this patch, all allocations are done using the code in
      mm/vmalloc.c, though we use the low level __get_vm_area with
      explicit start/stop constraints in order to manage separate
      areas for vmalloc/vmap, ioremap, and PCI IOs.
      
      This greatly simplifies a lot of things, as you can see in the
      diffstat of that patch :-)
      
      A new pair of functions pcibios_map/unmap_io_space() now replace
      all of the previous code that used to manipulate PCI IOs space.
      The allocation is done at mapping time, which is now called from
      scan_phb's, just before the devices are probed (instead of after,
      which is by itself a bug fix). The only other caller is the PCI
      hotplug code for hot adding PCI-PCI bridges (slots).
      
      imalloc is gone, as is the "sub-allocation" thing, but I do believe
      that hotplug should still work in the sense that the space allocation
      is always done by the PHB, but if you unmap a child bus of this PHB
      (which seems to be possible), then the code should properly tear
      down all the HPTE mappings for that area of the PHB-allocated IO space.
      
      I now always reserve the first 64K of IO space for the bridge with
      the ISA bus on it. I have moved the code for tracking ISA into a separate
      file, which should also make it smarter if we are ever capable of
      hot-unplugging or re-plugging an ISA bridge.
      
      This should have a side effect on platforms like powermac where VGA IOs
      will no longer work. This is done on purpose though as they would have
      worked semi-randomly before. The idea at this point is to isolate drivers
      that might need to access those and fix them by providing a proper
      function to obtain an offset to the legacy IOs of a given bus.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      3d5134ee
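A sketch, under assumptions, of the allocation idea described above: carve per-PHB PIO space out of a dedicated window using the constrained __get_vm_area() allocator named in the commit message, so it never collides with ordinary vmalloc/ioremap use. The window macros and helper name are made up:

```c
#include <linux/vmalloc.h>
#include <linux/io.h>

#define EXAMPLE_PHB_IO_BASE	0xd000080000000000UL	/* assumed window start */
#define EXAMPLE_PHB_IO_END	0xd000080080000000UL	/* assumed 2GB window end */

/* Illustrative helper: reserve a chunk of the PIO window for one bridge. */
static void __iomem *alloc_phb_io_space(unsigned long size)
{
	struct vm_struct *area;

	area = __get_vm_area(size, VM_IOREMAP,
			     EXAMPLE_PHB_IO_BASE, EXAMPLE_PHB_IO_END);
	if (!area)
		return NULL;

	/* The caller would then map the bridge's IO aperture at area->addr. */
	return (void __iomem *)area->addr;
}
```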
  29. 09 May 2007, 1 commit
  30. 13 Apr 2007, 1 commit
    • [POWERPC] Make tlb flush batch use lazy MMU mode · a741e679
      Benjamin Herrenschmidt authored
      The current tlb flush code on 64-bit powerpc has a subtle race: since we
      lost the page table lock, new PTEs can be faulted in after a previous one
      has been removed but before the corresponding hash entry has been evicted,
      which can lead to all sorts of fatal problems.
      
      This patch reworks the batch code completely. It doesn't use the mmu_gather
      stuff anymore. Instead, we use the lazy mmu hooks that were added by the
      paravirt code. They have the nice property that the enter/leave lazy mmu
      mode pair is always fully contained by the PTE lock for a given range
      of PTEs. Thus we can guarantee that all batches are flushed on a given
      CPU before it drops that lock.
      
      We also generalize batching for any PTE update that requires a flush.
      
      Batching is now enabled on a CPU by arch_enter_lazy_mmu_mode() and
      disabled by arch_leave_lazy_mmu_mode(). The code expects that this is
      always contained within a PTE lock section, so no preemption can happen
      and no PTE insertion in that range can come from another CPU. When
      batching is enabled on a CPU, every PTE update that needs a hash flush
      will use the batch for that flush.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      a741e679
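A hedged sketch of the batching contract described above, not code from the patch: the lazy-MMU enter/leave pair stays strictly inside the PTE lock, so the hash-flush batch is drained before any other CPU can refault PTEs in that range. The update loop and function are illustrative:

```c
#include <linux/mm.h>
#include <linux/spinlock.h>
#include <asm/pgtable.h>

static void update_range(struct mm_struct *mm, pte_t *ptep, spinlock_t *ptl,
			 unsigned long addr, unsigned long end)
{
	spin_lock(ptl);
	arch_enter_lazy_mmu_mode();	/* batching on for this CPU */

	for (; addr < end; addr += PAGE_SIZE, ptep++) {
		/* each cleared PTE that was hashed queues a flush in the batch */
		pte_t old = ptep_get_and_clear(mm, addr, ptep);
		(void)old;
	}

	arch_leave_lazy_mmu_mode();	/* drains the batch: hash entries evicted */
	spin_unlock(ptl);
}
```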
  31. 13 Jul 2006, 1 commit
  32. 01 Jul 2006, 1 commit
  33. 15 Jun 2006, 1 commit
    • powerpc: Use 64k pages without needing cache-inhibited large pages · bf72aeba
      Paul Mackerras authored
      Some POWER5+ machines can do 64k hardware pages for normal memory but
      not for cache-inhibited pages.  This patch lets us use 64k hardware
      pages for most user processes on such machines (assuming the kernel
      has been configured with CONFIG_PPC_64K_PAGES=y).  User processes
      start out using 64k pages and get switched to 4k pages if they use any
      non-cacheable mappings.
      
      With this, we use 64k pages for the vmalloc region and 4k pages for
      the imalloc region.  If anything creates a non-cacheable mapping in
      the vmalloc region, the vmalloc region will get switched to 4k pages.
      I don't know of any driver other than the DRM that would do this,
      though, and these machines don't have AGP.
      
      When a region gets switched from 64k pages to 4k pages, we do not have
      to clear out all the 64k HPTEs from the hash table immediately.  We
      use the _PAGE_COMBO bit in the Linux PTE to indicate whether the page
      was hashed in as a 64k page or a set of 4k pages.  If hash_page is
      trying to insert a 4k page for a Linux PTE and it sees that it has
      already been inserted as a 64k page, it first invalidates the 64k HPTE
      before inserting the 4k HPTE.  The hash invalidation routines also use
      the _PAGE_COMBO bit, to determine whether to look for a 64k HPTE or a
      set of 4k HPTEs to remove.  With those two changes, we can tolerate a
      mix of 4k and 64k HPTEs in the hash table, and they will all get
      removed when the address space is torn down.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      bf72aeba
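A simplified sketch of the hash-insert logic described above, under assumptions: _PAGE_COMBO in the Linux PTE records that the page is hashed as a set of 4k HPTEs, and a demotion from a single 64k HPTE invalidates the old entry first. Only _PAGE_COMBO and _PAGE_HASHPTE are real flags here; the hash helpers are made-up stand-ins:

```c
#include <asm/pgtable.h>

/* Made-up helpers standing in for the real hash-table primitives. */
static void invalidate_64k_hpte(unsigned long ea, unsigned long vsid);
static int insert_4k_hpte(unsigned long ea, unsigned long vsid, pte_t pte);

static int hash_insert_4k(pte_t *ptep, unsigned long ea, unsigned long vsid)
{
	pte_t pte = *ptep;

	if ((pte_val(pte) & _PAGE_HASHPTE) && !(pte_val(pte) & _PAGE_COMBO)) {
		/* Previously hashed as a single 64k HPTE: evict it first. */
		invalidate_64k_hpte(ea, vsid);
	}

	/* Remember that this page is now hashed as a set of 4k HPTEs. */
	*ptep = __pte(pte_val(pte) | _PAGE_COMBO);

	return insert_4k_hpte(ea, vsid, pte);
}
```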
  34. 10 Feb 2006, 1 commit