1. 27 Jan 2018, 1 commit
  2. 11 Apr 2017, 1 commit
    • powerpc/nohash: Fix use of mmu_has_feature() in setup_initial_memory_limit() · 4868e350
      Authored by Michael Ellerman
      setup_initial_memory_limit() is called from early_init_devtree(), which
      runs prior to feature patching. If the kernel is built with CONFIG_JUMP_LABEL=y
      and CONFIG_JUMP_LABEL_FEATURE_CHECKS=y then we will potentially get the
      wrong value.
      
      If we also have CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG=y we get a warning
      and backtrace:
      
        Warning! mmu_has_feature() used prior to jump label init!
        CPU: 0 PID: 0 Comm: swapper Not tainted 4.11.0-rc4-gccN-next-20170331-g6af2434 #1
        Call Trace:
        [c000000000fc3d50] [c000000000a26c30] .dump_stack+0xa8/0xe8 (unreliable)
        [c000000000fc3de0] [c00000000002e6b8] .setup_initial_memory_limit+0xa4/0x104
        [c000000000fc3e60] [c000000000d5c23c] .early_init_devtree+0xd0/0x2f8
        [c000000000fc3f00] [c000000000d5d3b0] .early_setup+0x90/0x11c
        [c000000000fc3f90] [c000000000000520] start_here_multiplatform+0x68/0x80
      
      Fix it by using early_mmu_has_feature().
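
      A minimal sketch of the shape of the fix (the MMU_FTR_TYPE_FSL_E
      check shown here is an assumption about which feature this call
      site tests; the sizing logic is elided):

        /*
         * early_mmu_has_feature() reads the feature word directly rather
         * than through the jump label, so it is safe before feature
         * patching has run.
         */
        void __init setup_initial_memory_limit(phys_addr_t first_memblock_base,
                                               phys_addr_t first_memblock_size)
        {
                if (!early_mmu_has_feature(MMU_FTR_TYPE_FSL_E))
                        return;
                /* ... derive ppc64_rma_size from the first bolted TLB entry ... */
        }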
      
      Fixes: c12e6f24 ("powerpc: Add option to use jump label for mmu_has_feature()")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  3. 10 Dec 2016, 1 commit
    • powerpc/8xx: Implement support of hugepages · 4b914286
      Authored by Christophe Leroy
      The 8xx uses a two-level page table and supports two different Linux
      page sizes (4k and 16k). The 8xx also supports two hugepage sizes,
      512k and 8M. In order to support them on Linux we define two
      different page table layouts.
      
      The page size is encoded in the PGD entry, using the PS field
      (bits 28-29), as sketched after this list:
      00 : Small pages (4k or 16k)
      01 : 512k pages
      10 : reserved
      11 : 8M pages
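
      A hedged sketch of the encoding in C (mask values follow the bit
      numbers above; the pmd_is_8m() helper is hypothetical):

        #define _PMD_PAGE_MASK   0x000c   /* PS field, bits 28-29 */
        #define _PMD_PAGE_512K   0x0004   /* 01: 512k pages */
        #define _PMD_PAGE_8M     0x000c   /* 11: 8M pages */

        static inline int pmd_is_8m(pmd_t pmd)
        {
                return (pmd_val(pmd) & _PMD_PAGE_MASK) == _PMD_PAGE_8M;
        }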
      
      For the 512K hugepage size, a pgd entry has the following format:
      [<hugepte address>0101]. The allocated hugepte table contains 8
      entries pointing to 512K huge ptes in 4k pages mode, and 64 entries
      in 16k pages mode.
      
      For 8M in 16k mode, a pgd entry has the following format:
      [<hugepte address>1101]. The allocated hugepte table contains 8
      entries pointing to 8M huge ptes.
      
      For 8M in 4k mode, multiple pgd entries point to the same hugepte
      address, and each pgd entry has the following format:
      [<hugepte address>1101]. The allocated hugepte table has only one
      entry.
      
      For the time being, we do not support the CPU15 ERRATA workaround
      when HUGETLB is selected.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3, for the generic bits)
      Signed-off-by: Scott Wood <oss@buserror.net>
  4. 01 Aug 2016, 1 commit
  5. 05 Mar 2016, 1 commit
  6. 28 Oct 2015, 1 commit
    • powerpc/fsl-booke-64: Don't limit ppc64_rma_size to one TLB entry · eba5de8d
      Authored by Scott Wood
      This is required for kdump to work when loaded at an address that
      does not fall within the first TLB entry -- which can easily happen
      because while the lower limit is enforced via reserved memory, which
      doesn't affect how much is mapped, the upper limit is enforced via a
      different mechanism that does.  Thus, more TLB entries are needed than
      would normally be used, as the total memory to be mapped might not be a
      power of two.
      Signed-off-by: Scott Wood <scottwood@freescale.com>
  7. 23 Oct 2015, 1 commit
    • powerpc/85xx: Load all early TLB entries at once · d9e1831a
      Authored by Scott Wood
      Use an AS=1 trampoline TLB entry to allow all normal TLB1 entries to
      be loaded at once.  This avoids the need to keep the translation that
      code is executing from in the same TLB entry in the final TLB
      configuration as during early boot, which in turn is helpful for
      relocatable kernels (e.g. kdump) where the kernel is not running from
      what would be the first TLB entry.
      
      On e6500, we limit map_mem_in_cams() to the primary hwthread of a
      core (the boot cpu is always considered primary, as a kdump kernel
      can be entered on any cpu).  Each TLB only needs to be set up once,
      and when we do, we don't want another thread to be running when we
      create a temporary trampoline TLB1 entry.
      Signed-off-by: Scott Wood <scottwood@freescale.com>
  8. 27 May 2015, 1 commit
  9. 04 Feb 2015, 1 commit
  10. 31 Jan 2015, 1 commit
  11. 13 Aug 2014, 1 commit
  12. 23 May 2014, 1 commit
    • powerpc/fsl-booke64: Set vmemmap_psize to 4K · e57eeae4
      Authored by Scott Wood
      The only way Freescale booke chips support mappings larger than 4K
      is via TLB1.  The only way we support (direct) TLB1 entries is via
      hugetlb, which is not what map_kernel_page() does when given a large
      page size.
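
      A sketch of the fix; treat the exact location in the 64-bit book3e
      setup path as an assumption:

        #ifdef CONFIG_SPARSEMEM_VMEMMAP
                /* FSL TLB0 only does 4K; larger sizes would need TLB1/hugetlb. */
                mmu_vmemmap_psize = MMU_PAGE_4K;
        #endif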
      
      Without this, a kernel with CONFIG_SPARSEMEM_VMEMMAP enabled crashes on
      boot with messages such as:
      
      PID hash table entries: 4096 (order: 3, 32768 bytes)
      Sorting __ex_table...
      BUG: Bad page state in process swapper  pfn:00a2f
      page:8000040000023a48 count:0 mapcount:0 mapping:0000040000ffce48 index:0x40000ffbe50
      page flags: 0x40000ffda40(active|arch_1|private|private_2|head|tail|swapcache|mappedtodisk|reclaim|swapbacked|unevictable|mlocked)
      page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
      bad because of flags:
      page flags: 0x311840(active|private|private_2|swapcache|unevictable|mlocked)
      Modules linked in:
      CPU: 0 PID: 0 Comm: swapper Not tainted 3.15.0-rc1-00003-g7fa250c #299
      Call Trace:
      [c00000000098ba20] [c000000000008b3c] .show_stack+0x7c/0x1cc (unreliable)
      [c00000000098baf0] [c00000000060aa50] .dump_stack+0x88/0xb4
      [c00000000098bb70] [c0000000000c0468] .bad_page+0x144/0x1a0
      [c00000000098bc10] [c0000000000c0628] .free_pages_prepare+0x164/0x17c
      [c00000000098bcc0] [c0000000000c24cc] .free_hot_cold_page+0x48/0x214
      [c00000000098bd60] [c00000000086c318] .free_all_bootmem+0x1fc/0x354
      [c00000000098be70] [c00000000085da84] .mem_init+0xac/0xdc
      [c00000000098bef0] [c0000000008547b0] .start_kernel+0x21c/0x4d4
      [c00000000098bf90] [c000000000000448] .start_here_common+0x20/0x58
      Signed-off-by: Scott Wood <scottwood@freescale.com>
  13. 20 Mar 2014, 1 commit
  14. 18 Jan 2014, 1 commit
  15. 10 Jan 2014, 1 commit
    • powerpc/e6500: TLB miss handler with hardware tablewalk support · 28efc35f
      Authored by Scott Wood
      There are a few things that make the existing hw tablewalk handlers
      unsuitable for e6500:
      
       - Indirect entries go in TLB1 (though the resulting direct entries go in
         TLB0).
      
       - It has threads, but no "tlbsrx." -- so we need a spinlock and
         a normal "tlbsx".  Because we need this lock, hardware tablewalk
         is mandatory on e6500 unless we want to add spinlock+tlbsx to
         the normal bolted TLB miss handler.
      
       - TLB1 has no HES (nor next-victim hint) so we need software round robin
         (TODO: integrate this round robin data with hugetlb/KVM)
      
       - The existing tablewalk handlers map half of a page table at a time,
         because IBM hardware has a fixed 1MiB indirect page size.  e6500
         has variable size indirect entries, with a minimum of 2MiB.
         So we can't do the half-page indirect mapping, and even if we
         could it would be less efficient than mapping the full page.
      
       - Like on e5500, the linear mapping is bolted, so we don't need the
         overhead of supporting nested tlb misses.
      
      Note that hardware tablewalk does not work in rev1 of e6500.
      We do not expect to support e6500 rev1 in mainline Linux.
      Signed-off-by: Scott Wood <scottwood@freescale.com>
      Cc: Mihai Caraman <mihai.caraman@freescale.com>
  16. 02 Dec 2013, 1 commit
  17. 23 Nov 2013, 1 commit
  18. 01 Jul 2013, 1 commit
    • powerpc: Delete __cpuinit usage from all users · 061d19f2
      Authored by Paul Gortmaker
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      This removes all the powerpc uses of the __cpuinit macros.  There
      are no __CPUINIT users in assembly files in powerpc.
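
      The conversion itself is mechanical; a representative hunk (the
      function name is made up for illustration) looks like:

        -static int __cpuinit ppc_cpu_notify(struct notifier_block *self,
        -                                    unsigned long action, void *hcpu)
        +static int ppc_cpu_notify(struct notifier_block *self,
        +                          unsigned long action, void *hcpu)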
      
      [1] https://lkml.org/lkml/2013/5/20/589
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Josh Boyer <jwboyer@gmail.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Kumar Gala <galak@kernel.crashing.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  19. 06 Mar 2013, 1 commit
    • powerpc/fsl-booke: Support detection of page sizes on e6500 · 1b291873
      Authored by Kumar Gala
      The e6500 core used on T4240 and B4860 SoCs from FSL implements MMUv2 of
      the Power Book-E Architecture.  However there are some minor differences
      between it and other Book-E implementations.
      
      Add support for parsing SPRN_TLB1PS to detect the variable page
      sizes supported, as sketched below.  In the future this should be
      expanded for more page sizes supported on e6500 as well as other
      MMU features.
      
      This patch is based on code from Scott Wood.
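
      A sketch of the parsing, assuming the TLB1PS layout in which bit n
      set means a 2^(n+10)-byte page size is supported:

        u32 tlb1ps = mfspr(SPRN_TLB1PS);
        int psize;

        for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
                struct mmu_psize_def *def = &mmu_psize_defs[psize];

                if (!def->shift)
                        continue;       /* size not defined for this MMU */
                if (tlb1ps & (1U << (def->shift - 10)))
                        def->flags |= MMU_PAGE_SIZE_DIRECT;
        }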
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
  20. 09 Dec 2011, 1 commit
    • memblock: s/memblock_analyze()/memblock_allow_resize()/ and update users · 1aadc056
      Authored by Tejun Heo
      The only function of memblock_analyze() is now allowing resize of
      memblock region arrays.  Rename it to memblock_allow_resize() and
      update its users.
      
      * The following users remain the same other than renaming.
      
        arm/mm/init.c::arm_memblock_init()
        microblaze/kernel/prom.c::early_init_devtree()
        powerpc/kernel/prom.c::early_init_devtree()
        openrisc/kernel/prom.c::early_init_devtree()
        sh/mm/init.c::paging_init()
        sparc/mm/init_64.c::paging_init()
        unicore32/mm/init.c::uc32_memblock_init()
      
      * In the following users, analyze was used to update total size which
        is no longer necessary.
      
        powerpc/kernel/machine_kexec.c::reserve_crashkernel()
        powerpc/kernel/prom.c::early_init_devtree()
        powerpc/mm/init_32.c::MMU_init()
        powerpc/mm/tlb_nohash.c::__early_init_mmu()  
        powerpc/platforms/ps3/mm.c::ps3_mm_add_memory()
        powerpc/platforms/embedded6xx/wii.c::wii_memory_fixups()
        sh/kernel/machine_kexec.c::reserve_crashkernel()
      
      * x86/kernel/e820.c::memblock_x86_fill() was directly setting
        memblock_can_resize before populating memblock and calling analyze
        afterwards.  It now calls memblock_allow_resize() before it starts
        populating, as sketched below.
      
      memblock_can_resize is now static inside memblock.c.
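
      A minimal sketch of the new convention (base and size are
      placeholders):

        /* Allow the region arrays to be reallocated, then populate. */
        memblock_allow_resize();
        memblock_add(base, size);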
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
  21. 07 Dec 2011, 1 commit
  22. 01 Nov 2011, 1 commit
    • powerpc: include export.h for files using EXPORT_SYMBOL/THIS_MODULE · 93087948
      Authored by Paul Gortmaker
      Fix failures in powerpc associated with the previously allowed
      implicit module.h presence that now lead to things like this:
      
      arch/powerpc/mm/mmu_context_hash32.c:76:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
      arch/powerpc/mm/tlb_hash32.c:48:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL'
      arch/powerpc/kernel/pci_32.c:51:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
      arch/powerpc/kernel/iomap.c:36:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL'
      arch/powerpc/platforms/44x/canyonlands.c:126:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL'
      arch/powerpc/kvm/44x.c:168:59: error: 'THIS_MODULE' undeclared (first use in this function)
      
      [with several contributions from Stephen Rothwell <sfr@canb.auug.org.au>]
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
  23. 12 Oct 2011, 1 commit
    • powerpc/fsl-booke: Fix setup_initial_memory_limit to not blindly map · 1dc91c3e
      Authored by Kumar Gala
      On FSL Book-E devices we support multiple large TLB sizes and so we
      can get into situations in which the initial 1G TLB size is too big
      and we're asked for a size that is not mappable by a single entry
      (like 512M).  The single entry is important because when we bring up
      secondary cores they need to ensure any data structure they need to
      access (e.g. PACA or stack) is always mapped.

      So we really need to determine what size will actually be mapped by
      the first TLB entry to ensure we limit early memory references to
      that region.  We refactor the map_mem_in_cams() code to provide a
      helper function that we can use to determine the size of the first
      TLB entry while taking size and alignment constraints into account.
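
      A simplified sketch of such a helper; it omits the clamp to the
      MMU's maximum supported entry size, and the power-of-4 rounding is
      an e500-style assumption:

        static unsigned long calc_cam_sz(unsigned long ram, unsigned long virt,
                                         phys_addr_t phys)
        {
                /* Largest power-of-4 size that fits in the remaining RAM... */
                unsigned int camsize = __ilog2(ram) & ~1U;
                /* ...and that virt/phys are sufficiently aligned for. */
                unsigned int align = __ffs(virt | phys) & ~1U;

                if (camsize > align)
                        camsize = align;

                return 1UL << camsize;
        }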
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
  24. 20 Sep 2011, 1 commit
    • powerpc: Hugetlb for BookE · 41151e77
      Authored by Becky Bruce
      Enable hugepages on Freescale BookE processors.  This allows the kernel to
      use huge TLB entries to map pages, which can greatly reduce the number of
      TLB misses and the amount of TLB thrashing experienced by applications with
      large memory footprints.  Care should be taken when using this on FSL
      processors, as the number of large TLB entries supported by the core is low
      (16-64) on current processors.
      
      The supported set of hugepage sizes includes 4m, 16m, 64m, 256m, and
      1g.  Page sizes larger than the max zone size are called "gigantic"
      pages and must be allocated on the command line (and cannot be
      deallocated).
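
      For example, two 1g gigantic pages can be reserved at boot with the
      standard hugetlb kernel parameters:

        hugepagesz=1g hugepages=2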
      
      This is currently only fully implemented for Freescale 32-bit BookE
      processors, but there is some infrastructure in the code for
      64-bit BookE.
      Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  25. 12 Jul 2011, 1 commit
  26. 08 Jul 2011, 1 commit
  27. 29 Jun 2011, 1 commit
    • powerpc/book3e-64: use a separate TLB handler when linear map is bolted · f67f4ef5
      Authored by Scott Wood
      On MMUs such as FSL where we can guarantee the entire linear mapping is
      bolted, we don't need to worry about linear TLB misses.  If on top of
      that we do a full table walk, we get rid of all recursive TLB faults, and
      can dispense with some state saving.  This gains a few percent on
      TLB-miss-heavy workloads, and around 50% on a benchmark that had a high
      rate of virtual page table faults under the normal handler.
      
      While touching the EX_TLB layout, remove EX_TLB_MMUCR0, EX_TLB_SRR0, and
      EX_TLB_SRR1 as they're not used.
      
      [BenH: Fixed build with 64K pages (wsp config)]
      Signed-off-by: Scott Wood <scottwood@freescale.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  28. 17 Jun 2011, 1 commit
  29. 25 May 2011, 2 commits
    • mm, powerpc: move the RCU page-table freeing into generic code · 26723911
      Authored by Peter Zijlstra
      In case other architectures require RCU freed page-tables to implement
      gup_fast() and software filled hashes and similar things, provide the
      means to do so by moving the logic into generic code.
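
      A hedged sketch of what an arch opting in (via HAVE_RCU_TABLE_FREE)
      might do in its page-table-free hook; the callback shown is
      illustrative:

        /* Queue the page-table page instead of freeing it directly, so a
         * grace period separates the free from concurrent gup_fast(). */
        static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
                                          unsigned long address)
        {
                tlb_remove_table(tlb, pte);
        }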
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Requested-by: David Miller <davem@davemloft.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • powerpc: mmu_gather rework · d6bf29b4
      Authored by Peter Zijlstra
      Fix up powerpc to the new mmu_gather stuff.
      
      PPC has an extra batching queue to RCU free the actual pagetable
      allocations; use the ARCH extensions for that for now.
      
      For the ppc64_tlb_batch, which tracks the vaddrs to unhash from the
      hardware hash-table, keep using per-cpu arrays but flush on context switch
      and use a TLF bit to track the lazy_mmu state.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  30. 18 Nov 2010, 1 commit
  31. 14 Oct 2010, 2 commits
    • powerpc/fsl-booke64: Use TLB CAMs to cover linear mapping on FSL 64-bit chips · 55fd766b
      Authored by Kumar Gala
      Freescale parts typically have a TLB array for large mappings that
      we can bolt the linear mapping into.  We reuse the code that already
      exists on PPC32 to set up the linear mapping on the 64-bit side so
      that it is covered by bolted TLB entries, and we use a quarter of
      the variable-size TLB array for this purpose, as sketched below.

      Additionally, we limit the amount of memory to what we can cover via
      bolted entries so we don't get secondary faults in the TLB miss
      handlers.  We should fix this limitation in the future.
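
      A sketch of the setup, assuming the shared map_mem_in_cams() helper
      and the TLB1CFG entry-count field:

        /* Bolt the linear mapping using a quarter of the TLB1 entries. */
        num_cams = (mfspr(SPRN_TLB1CFG) & TLBnCFG_N_ENTRY) / 4;
        linear_map_top = map_mem_in_cams(linear_map_top, num_cams);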
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
    • powerpc/fsl-booke: Add support for FSL Arch v1.0 MMU in setup_page_sizes · 988cf86d
      Authored by Kumar Gala
      Update setup_page_sizes() to support an FSL-style MMU v1.0
      implementation.  On such a processor, we don't have the TLB0PS or
      EPTCFG registers (and access to these registers may cause
      exceptions), so we need to parse the older format of TLBnCFG for
      page size support.  Additionally, since this is an FSL
      implementation, assume we have 2 TLB arrays and that the second
      array contains the variable-size pages, as sketched below.
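
      A sketch of the old-format parsing, assuming TLBnCFG MINSIZE/MAXSIZE
      encode page sizes as 4^n KB on the v1.0 MMU:

        u32 tlb1cfg = mfspr(SPRN_TLB1CFG);
        unsigned int min_pg = (tlb1cfg & TLBnCFG_MINSIZE) >> TLBnCFG_MINSIZE_SHIFT;
        unsigned int max_pg = (tlb1cfg & TLBnCFG_MAXSIZE) >> TLBnCFG_MAXSIZE_SHIFT;
        int psize;

        for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
                struct mmu_psize_def *def = &mmu_psize_defs[psize];
                unsigned int shift;

                if (!def->shift)
                        continue;
                shift = (def->shift - 10) >> 1;   /* bytes -> 4^n KB */
                if (shift >= min_pg && shift <= max_pg)
                        def->flags |= MMU_PAGE_SIZE_DIRECT;
        }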
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
  32. 05 Aug 2010, 2 commits
    • memblock: Remove rmo_size, bury it in arch/powerpc where it belongs · cd3db0c4
      Authored by Benjamin Herrenschmidt
      The RMA (RMO is a misnomer) is a concept specific to ppc64 (in fact
      server ppc64, though I hijack it on embedded ppc64 for similar
      purposes) and represents the area of memory that can be accessed in
      real mode (i.e. with the MMU off), or, on embedded, from the
      exception vectors (which are bolted in the TLB), which pretty much
      boils down to the same thing.
      
      We take that out of the generic MEMBLOCK data structure and move it into
      arch/powerpc where it belongs, renaming it to "RMA" while at it.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • memblock: Introduce default allocation limit and use it to replace explicit ones · e63075a3
      Authored by Benjamin Herrenschmidt
      This introduces memblock.current_limit, which is used to limit
      allocations from memblock_alloc() or
      memblock_alloc_base(..., MEMBLOCK_ALLOC_ACCESSIBLE).
      
      The old MEMBLOCK_ALLOC_ANYWHERE changes value from 0 to ~(u64)0 and can still
      be used with memblock_alloc_base() to allocate really anywhere.
      
      It is -no-longer- cropped to MEMBLOCK_REAL_LIMIT which disappears.
      
      Note to archs: I'm leaving the default limit at MEMBLOCK_ALLOC_ANYWHERE.
      I strongly recommend that you set an appropriate limit during boot in
      order to guarantee that a memblock_alloc() at any time results in
      something that is accessible with a simple __va(), as sketched below.
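
      A minimal usage sketch; the powerpc expression shown is one example
      of an "accessible" limit:

        /* Early in boot, once the always-mapped window is known: */
        memblock_set_current_limit(memstart_addr + ppc64_rma_size);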
      
      The reason is that a subsequent patch will introduce the ability for
      the region arrays to resize themselves by reallocation.  The MEMBLOCK
      core will honor the current limit when performing those allocations.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  33. 14 Jul 2010, 3 commits
  34. 19 Feb 2010, 1 commit
  35. 20 Aug 2009, 1 commit