1. 19 Mar, 2010 1 commit
  2. 21 Feb, 2010 1 commit
    • R
      MM: Pass a PTE pointer to update_mmu_cache() rather than the PTE itself · 4b3073e1
      Russell King committed
      On VIVT ARM, when we have multiple shared mappings of the same file
      in the same MM, we need to ensure that we have coherency across all
      copies.  We do this via make_coherent() by making the pages
      uncacheable.
      
      This used to work fine, until we allowed highmem with highpte - we
      now have a page table which is mapped as required, and is not available
      for modification via update_mmu_cache().
      
      Ralf Baechle suggested getting rid of the PTE value passed to
      update_mmu_cache():
      
        On MIPS update_mmu_cache() calls __update_tlb() which walks pagetables
        to construct a pointer to the pte again.  Passing a pte_t * is much
        more elegant.  Maybe we might even replace the pte argument with the
        pte_t?
      
      Ben Herrenschmidt would also like the pte pointer for PowerPC:
      
        Passing the ptep in there is exactly what I want.  I want that
        -instead- of the PTE value, because I have issue on some ppc cases,
        for I$/D$ coherency, where set_pte_at() may decide to mask out the
        _PAGE_EXEC.
      
      So, pass in the mapped page table pointer into update_mmu_cache(), and
      remove the PTE value, updating all implementations and call sites to
      suit.
      
      Includes a fix from Stephen Rothwell:
      
        sparc: fix fallout from update_mmu_cache API change
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      4b3073e1
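A minimal user-space sketch of the point made in the PowerPC quote above, assuming a toy `pte_t` and hypothetical `demo_*` helpers (none of these are the kernel's real definitions): if the code that installs a PTE masks out an exec bit, a hook that is handed the originally requested value sees stale bits, while a hook that is handed the pointer sees what was actually installed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for illustration only, not the kernel's definitions. */
typedef uint32_t pte_t;
#define DEMO_PAGE_EXEC 0x4u   /* hypothetical bit position */

/* Installs a PTE but strips the exec bit, as set_pte_at() may decide to do. */
static void demo_set_pte_at(pte_t *ptep, pte_t pte)
{
    *ptep = pte & ~DEMO_PAGE_EXEC;
}

/* Old-style hook: receives a copy of the value the caller asked for. */
static bool demo_hook_by_value(pte_t pte)
{
    return (pte & DEMO_PAGE_EXEC) != 0;
}

/* New-style hook: reads the entry that is actually in the page table. */
static bool demo_hook_by_pointer(const pte_t *ptep)
{
    return (*ptep & DEMO_PAGE_EXEC) != 0;
}
```

With a requested value that has the exec bit set, the by-value hook still reports exec while the by-pointer hook reports the masked entry, which is the mismatch the API change is meant to avoid.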
  3. 30 Oct, 2009 1 commit
    • D
      powerpc/mm: Bring hugepage PTE accessor functions back into sync with normal accessors · 0895ecda
      David Gibson committed
      The hugepage arch code provides a number of hook functions/macros
      which mirror the functionality of various normal page pte access
      functions.  Various changes in the normal page accessors (in
      particular BenH's recent changes to the handling of lazy icache
      flushing and PAGE_EXEC) have caused the hugepage versions to get out
      of sync with the originals.  In some cases, this is a bug, at least on
      some MMU types.
      
      One of the reasons that some hooks were not identical to the normal
      page versions is that the fact we're dealing with a hugepage needed
      to be passed down to use the correct dcache-icache flush function.
      This patch makes the main flush_dcache_icache_page() function hugepage
      aware (by checking for the PageCompound flag).  That in turn means we
      can make set_huge_pte_at() just a call to set_pte_at() bringing it
      back into sync.  As a bonus, this lets us remove the
      hash_huge_page_do_lazy_icache() function, replacing it with a call to
      the hash_page_do_lazy_icache() function it was based on.
      
      Some other hugepage pte access hooks - huge_ptep_get_and_clear() and
      huge_ptep_clear_flush() - are not so easily unified, but this patch at
      least brings them back into sync with the current versions of the
      corresponding normal page functions.
      Signed-off-by: David Gibson <dwg@au1.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      0895ecda
  4. 23 Sep, 2009 1 commit
    • K
      walk system ram range · 908eedc6
      KAMEZAWA Hiroyuki committed
      Originally, walk_memory_resource() was introduced to traverse all memory
      of "System RAM" for detecting memory hotplug/unplug ranges.  For that,
      the flags IORESOURCE_MEM | IORESOURCE_BUSY were used, and this was
      enough for memory hotplug.
      
      But when used for another purpose, /proc/kcore, this may include some
      firmware areas marked as IORESOURCE_BUSY | IORESOURCE_MEM.  This patch
      makes the check strict so that it finds only busy "System RAM".
      
      Note: PPC64 keeps its own walk_memory_resource(), which walks through
      ppc64's lmb information.  Because the old kclist_add() is called per lmb,
      this patch makes no difference in behavior there.
      
      This patch also removes the CONFIG_MEMORY_HOTPLUG check from this
      function.  Because pfn_valid() only shows "there is a memmap or not" and
      cannot be used for "there is physical memory or not", this function is
      generally useful for scanning physical memory ranges.
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: WANG Cong <xiyou.wangcong@gmail.com>
      Cc: Américo Wang <xiyou.wangcong@gmail.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Roland Dreier <rolandd@cisco.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      908eedc6
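The difference between the loose and the strict check can be sketched in user space as follows. This assumes the strict check is done by matching the resource name "System RAM" in addition to the flags; the struct, the flag values, and the `demo_*` names are stand-ins for illustration, not the kernel's own definitions.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in flag values and struct, for illustration only. */
#define DEMO_IORESOURCE_MEM  0x00000200UL
#define DEMO_IORESOURCE_BUSY 0x80000000UL

struct demo_resource {
    const char *name;
    unsigned long flags;
};

/* Loose check: any busy memory resource, which can include firmware areas. */
static int demo_is_busy_mem(const struct demo_resource *r)
{
    unsigned long want = DEMO_IORESOURCE_MEM | DEMO_IORESOURCE_BUSY;
    return (r->flags & want) == want;
}

/* Strict check: busy memory that is also named "System RAM". */
static int demo_is_busy_system_ram(const struct demo_resource *r)
{
    return demo_is_busy_mem(r) && strcmp(r->name, "System RAM") == 0;
}

/* Counts the entries in a table that pass the given predicate. */
static size_t demo_count(const struct demo_resource *tab, size_t n,
                         int (*pred)(const struct demo_resource *))
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (pred(&tab[i]))
            hits++;
    return hits;
}
```

A table holding one "System RAM" entry and one busy firmware entry passes the loose predicate twice but the strict predicate only once.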
  5. 22 Sep, 2009 1 commit
  6. 27 May, 2009 2 commits
    • B
      powerpc: Fix up dma_alloc_coherent() on platforms without cache coherency. · 8b31e49d
      Benjamin Herrenschmidt committed
      The implementation we just revived has issues, such as using a
      Kconfig-defined virtual address area in kernel space that nothing
      actually carves out (and thus will overlap whatever is there),
      or having some dependencies on being self contained in a single
      PTE page which adds unnecessary constraints on the kernel virtual
      address space.
      
      This fixes it by using more classic PTE accessors and automatically
      locating the area for consistent memory, carving an appropriate hole
      in the kernel virtual address space, leaving only the size of that
      area as a Kconfig option. It also brings some dma-mask related fixes
      from the ARM implementation which was almost identical initially but
      grew its own fixes.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      8b31e49d
    • B
      powerpc: Minor cleanups of kernel virt address space definitions · f637a49e
      Benjamin Herrenschmidt committed
      Make FIXADDR_TOP a compile time constant and cleanup a
      couple of definitions relative to the layout of the kernel
      address space on ppc32. We also print out that layout at
      boot time for debugging purposes.
      
      This is a pre-requisite for properly fixing non-coherent
      DMA allocations.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      f637a49e
  7. 15 May, 2009 1 commit
  8. 11 Feb, 2009 1 commit
    • B
      powerpc/mm: Rework I$/D$ coherency (v3) · 8d30c14c
      Benjamin Herrenschmidt committed
      This patch reworks the way we do I and D cache coherency on PowerPC.
      
      The "old" way was split in 3 different parts depending on the processor type:
      
         - Hash with per-page exec support (64-bit and >= POWER4 only) does it
      at hashing time, by preventing exec on unclean pages and cleaning pages
      on exec faults.
      
         - Everything without per-page exec support (32-bit hash, 8xx, and
      64-bit < POWER4) does it for all pages going to user space in update_mmu_cache().
      
         - Embedded with per-page exec support does it from do_page_fault() on
      exec faults, in a way similar to what the hash code does.
      
      That leads to confusion, and bugs. For example, the method using update_mmu_cache()
      is racy on SMP where another processor can see the new PTE and hash it in before
      we have cleaned the cache, and then blow up trying to execute. This is hard to hit but
      I think it has bitten us in the past.
      
      Also, it's inefficient for embedded where we always end up having to do at least
      one more page fault.
      
      This reworks the whole thing by moving the cache sync into two main call sites,
      though we keep different behaviours depending on the HW capability. The call
      sites are set_pte_at() which is now made out of line, and ptep_set_access_flags()
      which joins the former in pgtable.c
      
      The base idea for Embedded with per-page exec support, is that we now do the
      flush at set_pte_at() time when coming from an exec fault, which allows us
      to avoid the double fault problem completely (we can even improve the situation
      more by implementing TLB preload in update_mmu_cache() but that's for later).
      
      If for some reason we didn't do it there and we try to execute, we'll hit
      the page fault, which will do a minor fault, which will hit ptep_set_access_flags()
      to do things like update _PAGE_ACCESSED or _PAGE_DIRTY if needed; we just make
      this path also perform the I/D cache sync for exec faults now. This second path
      is the catch-all for things that weren't cleaned at set_pte_at() time.
      
      For CPUs without per-page exec support, we always do the sync at set_pte_at(),
      thus guaranteeing that when the PTE is visible to other processors, the cache
      is clean.
      
      For the 64-bit hash with per-page exec support case, we keep the old mechanism
      for now. I'll look into changing it later, once I've reworked a bit how we
      use _PAGE_EXEC.
      
      This is also a first step for adding _PAGE_EXEC support for embedded platforms.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      8d30c14c
  9. 07 Jan, 2009 1 commit
    • G
      mm: show node to memory section relationship with symlinks in sysfs · c04fc586
      Gary Hade committed
      Show node to memory section relationship with symlinks in sysfs
      
      Add /sys/devices/system/node/nodeX/memoryY symlinks for all
      the memory sections located on nodeX.  For example:
      /sys/devices/system/node/node1/memory135 -> ../../memory/memory135
      indicates that memory section 135 resides on node1.
      
      Also revises documentation to cover this change as well as updating
      Documentation/ABI/testing/sysfs-devices-memory to include descriptions
      of memory hotremove files 'phys_device', 'phys_index', and 'state'
      that were previously not described there.
      
      In addition to it always being a good policy to provide users with
      the maximum possible amount of physical location information for
      resources that can be hot-added and/or hot-removed, the following
      are some (but likely not all) of the user benefits provided by
      this change.
      Immediate:
        - Provides information needed to determine the specific node
          on which a defective DIMM is located.  This will reduce system
          downtime when the node or defective DIMM is swapped out.
        - Prevents unintended onlining of a memory section that was
          previously offlined due to a defective DIMM.  This could happen
          during node hot-add when the user or node hot-add assist script
          onlines _all_ offlined sections due to user or script inability
          to identify the specific memory sections located on the hot-added
          node.  The consequences of reintroducing the defective memory
          could be ugly.
        - Provides information needed to vary the amount and distribution
          of memory on specific nodes for testing or debugging purposes.
      Future:
        - Will provide information needed to identify the memory
          sections that need to be offlined prior to physical removal
          of a specific node.
      
      Symlink creation during boot was tested on 2-node x86_64, 2-node
      ppc64, and 2-node ia64 systems.  Symlink creation during physical
      memory hot-add tested on a 2-node x86_64 system.
      Signed-off-by: Gary Hade <garyhade@us.ibm.com>
      Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c04fc586
  10. 21 Dec, 2008 2 commits
    • B
      powerpc/mm: Rework usage of _PAGE_COHERENT/NO_CACHE/GUARDED · 64b3d0e8
      Benjamin Herrenschmidt committed
      Currently, we never set _PAGE_COHERENT in the PTEs, we just OR it in
      in the hash code based on some CPU feature bit.  We also manipulate
      _PAGE_NO_CACHE and _PAGE_GUARDED by hand in all sorts of places.
      
      This changes the logic so that instead, the PTE now contains
      _PAGE_COHERENT for all normal RAM pages that have I = 0 on platforms
      that need it.  The hash code clears it if the feature bit is not set.
      
      It also adds some clean accessors to set up various valid combinations
      of access flags and changes various bits of code to use them instead.
      
      This should help having the PTE actually containing the bit
      combinations that we really want.
      
      I also removed _PAGE_GUARDED from _PAGE_BASE on 44x and instead
      set it explicitly from the TLB miss.  I will ultimately remove it
      completely as it appears that it might not be needed after all
      but in the meantime, having it in the TLB miss makes things a
      lot easier.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: Kumar Gala <galak@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      64b3d0e8
    • B
      powerpc/mm: Add SMP support to no-hash TLB handling · f048aace
      Benjamin Herrenschmidt committed
      This commit moves the whole no-hash TLB handling out of line into a
      new tlb_nohash.c file, and implements some basic SMP support using
      IPIs and/or broadcast tlbivax instructions.
      
      Note that I'm using local invalidations for D->I cache coherency.
      
      At worst, if another processor is trying to execute the same and
      has the old entry in its TLB, it will just take a fault and re-do
      the TLB flush locally (it won't re-do the cache flush in any case).
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: Kumar Gala <galak@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      f048aace
  11. 20 Oct, 2008 1 commit
    • B
      mm: cleanup to make remove_memory() arch-neutral · 71088785
      Badari Pulavarty committed
      There is nothing architecture specific about remove_memory().
      The remove_memory() function is common to all architectures which support
      hotplug memory remove.  Instead of duplicating it in every architecture,
      collapse them into one arch-neutral function.
      
      [akpm@linux-foundation.org: fix the export]
      Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Gary Hade <garyhade@us.ibm.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      71088785
  12. 07 Oct, 2008 1 commit
    • R
      powerpc: Avoid integer overflow in page_is_ram() · a880e762
      Roland Dreier committed
      Commit 8b150478 ("ppc: make phys_mem_access_prot() work with pfns
      instead of addresses") fixed page_is_ram() in arch/ppc to avoid overflow
      for addresses above 4G on 32-bit kernels.  However arch/powerpc's
      page_is_ram() is missing the same fix -- it computes a physical address
      by doing pfn << PAGE_SHIFT, which overflows if pfn corresponds to a page
      above 4G.
      
      In particular this causes pages above 4G to be mapped with the wrong
      caching attribute; for example many ppc440-based SoCs have PCI space
      above 4G, and mmap()ing MMIO space may end up with a mapping that has
      caching enabled.
      
      Fix this by working with the pfn and avoiding the conversion to
      physical address that causes the overflow.  This patch compares the
      pfn to max_pfn, which is a semantic change from the old code -- that
      code compared the physical address to high_memory, which corresponds
      to max_low_pfn.  However, I think that was another bug, since
      highmem pages are still RAM.
      Reported-by: vb <vb@vsbe.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      a880e762
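The overflow described in this commit can be reproduced in user space with fixed-width types. The sketch below assumes 4 KiB pages and uses hypothetical `demo_*` names; it is not the kernel's page_is_ram(), only the same arithmetic in miniature.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12   /* 4 KiB pages, assumed for the example */

/* Buggy shape: the shift is done in a 32-bit type, so the high bits are lost. */
static uint32_t demo_pfn_to_phys32(uint32_t pfn)
{
    return pfn << DEMO_PAGE_SHIFT;
}

/* Fixed shape: compare page frame numbers and never form the address. */
static int demo_page_is_ram(uint32_t pfn, uint32_t max_pfn)
{
    return pfn < max_pfn;
}
```

The first page above 4G has pfn 0x100000; shifting it by 12 in 32 bits wraps to 0, so an address-based comparison would wrongly treat it as low RAM, while the pfn comparison against max_pfn does not.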
  13. 04 Aug, 2008 1 commit
  14. 27 Jul, 2008 1 commit
  15. 10 Jul, 2008 1 commit
  16. 03 Jul, 2008 1 commit
  17. 09 Jun, 2008 1 commit
  18. 29 Apr, 2008 1 commit
  19. 28 Apr, 2008 1 commit
  20. 24 Apr, 2008 2 commits
    • K
      [POWERPC] Port fixmap from x86 and use for kmap_atomic · 2c419bde
      Kumar Gala committed
      The fixmap code from x86 allows us to have compile time virtual addresses
      that we change the physical addresses of at run time.
      
      This is useful for applications like kmap_atomic, PCI config that is done
      via direct memory map, kexec/kdump.
      
      We got rid of CONFIG_HIGHMEM_START as we can now determine a more optimal
      location for PKMAP_BASE based on where the fixmap addresses start and
      working back from there.
      
      Additionally, the kmap code in asm-powerpc/highmem.h always had debug
      enabled.  Moved to using CONFIG_DEBUG_HIGHMEM to determine if we should
      have the extra debug checking.
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      2c419bde
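The "compile time virtual addresses" idea can be sketched as an enum of slots plus a macro that turns a slot index into an address below a fixed top. The top address, the slot names, and the `DEMO_*` macros here are assumptions for illustration; only the downward-growing index-to-address arithmetic mirrors the fixmap scheme.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT  12
#define DEMO_FIXADDR_TOP 0xfffff000u   /* assumed top of the fixmap area */

/* Hypothetical slot indices; real kernels define theirs per architecture. */
enum demo_fixed_addresses {
    DEMO_FIX_HOLE,
    DEMO_FIX_KMAP_BEGIN,
    DEMO_FIX_KMAP_END = DEMO_FIX_KMAP_BEGIN + 3,
    DEMO_END_OF_FIXED_ADDRESSES
};

/* Slots grow downward from the top: index 0 is the highest page. */
#define DEMO_FIX_TO_VIRT(idx) \
    (DEMO_FIXADDR_TOP - ((uint32_t)(idx) << DEMO_PAGE_SHIFT))

/* Inverse mapping from an address inside a fixmap page back to its index. */
#define DEMO_VIRT_TO_FIX(va) \
    ((DEMO_FIXADDR_TOP - ((uint32_t)(va) & ~0xfffu)) >> DEMO_PAGE_SHIFT)
```

Because every term is a constant, the compiler can fold `DEMO_FIX_TO_VIRT(DEMO_FIX_KMAP_BEGIN)` into a literal address; only the physical page behind that address changes at run time.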
    • K
      [POWERPC] 85xx: Add support for relocatable kernel (and booting at non-zero) · 37dd2bad
      Kumar Gala committed
      Added support to allow an 85xx kernel to be run from a non-zero physical
      address (useful for cooperative asymmetric multiprocessing situations and
      kdump).  The support can be configured at compile time by setting
      CONFIG_PAGE_OFFSET, CONFIG_KERNEL_START, and CONFIG_PHYSICAL_START as
      desired.
      
      Alternatively, the kernel build can set CONFIG_RELOCATABLE.  Setting this
      config option causes the kernel to determine at runtime the physical
      addresses of CONFIG_PAGE_OFFSET and CONFIG_KERNEL_START.  If
      CONFIG_RELOCATABLE is set, then CONFIG_PHYSICAL_START has no meaning.
      However, CONFIG_PHYSICAL_START will always be used to set the LOAD program
      header physical address field in the resulting ELF image.
      
      Currently we are limited to running at a physical address that is a
      multiple of 256M.  This is due to how we map TLBs to cover
      lowmem.  This should be fixed to allow 64M or maybe even 16M alignment
      in the future.  It is considered an error to try and run a kernel at a
      non-aligned physical address.
      
      All the magic for this support is accomplished by proper initialization
      of the kernel memory subsystem and use of ARCH_PFN_OFFSET.
      
      The use of ARCH_PFN_OFFSET only affects normal memory and not IO mappings.
      ioremap uses map_page and isn't affected by ARCH_PFN_OFFSET.
      
      /dev/mem continues to allow access to any physical address in the system
      regardless of how CONFIG_PHYSICAL_START is set.
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      37dd2bad
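The 256M alignment rule and the role of a page-frame offset can be sketched with two small helpers. The `demo_*` names are hypothetical, and the offset helper only illustrates the idea behind ARCH_PFN_OFFSET (the pfn of the first RAM page), not the kernel's actual macro.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ALIGN_256M (UINT64_C(256) << 20)   /* 256 MiB */

/* True if the physical start address is a multiple of 256M. */
static bool demo_phys_start_ok(uint64_t phys_start)
{
    return (phys_start & (DEMO_ALIGN_256M - 1)) == 0;
}

/* Page frame number of the first RAM page for a given physical start. */
static uint64_t demo_pfn_offset(uint64_t phys_start, unsigned page_shift)
{
    return phys_start >> page_shift;
}
```

Under these assumptions a kernel started at 0x10000000 passes the check and has a pfn offset of 0x10000 with 4 KiB pages, while a start at 64M (0x04000000) is rejected.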
  21. 17 Apr, 2008 1 commit
  22. 01 Apr, 2008 2 commits
  23. 14 Feb, 2008 1 commit
  24. 08 Feb, 2008 3 commits
  25. 06 Feb, 2008 1 commit
  26. 24 Jan, 2008 1 commit
    • K
      [POWERPC] Fix handling of memreserve if the range lands in highmem · f98eeb4e
      Kumar Gala committed
      There were several issues if a memreserve range existed and happened
      to be in highmem:
      
      * The bootmem allocator is only aware of lowmem so calling
        reserve_bootmem with a highmem address would cause a BUG_ON
      * All highmem pages were provided to the buddy allocator
      
      Added an lmb_is_reserved() API that we now use to determine if a highmem
      page should continue to be PageReserved or provided to the buddy
      allocator.
      
      Also, we incorrectly reported the number of pages reserved since all
      highmem pages are initially marked reserved and we clear the
      PageReserved flag as we "free" up the highmem pages.
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
      f98eeb4e
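The decision this commit describes, keeping a highmem page reserved when it falls inside a memreserve range and freeing it otherwise, can be sketched in user space. The region struct and the `demo_*` functions are hypothetical stand-ins; `demo_is_reserved()` only plays the role that lmb_is_reserved() plays in the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_region { uint64_t base, size; };

/* Stand-in for lmb_is_reserved(): is addr inside any reserved region? */
static bool demo_is_reserved(const struct demo_region *res, size_t n, uint64_t addr)
{
    for (size_t i = 0; i < n; i++)
        if (addr >= res[i].base && addr - res[i].base < res[i].size)
            return true;
    return false;
}

/*
 * Walks highmem page frames [start_pfn, end_pfn) and returns how many would
 * be handed to the buddy allocator; the rest are counted as still reserved.
 */
static size_t demo_free_highmem(const struct demo_region *res, size_t n,
                                uint64_t start_pfn, uint64_t end_pfn,
                                unsigned page_shift, size_t *still_reserved)
{
    size_t freed = 0;
    *still_reserved = 0;
    for (uint64_t pfn = start_pfn; pfn < end_pfn; pfn++) {
        if (demo_is_reserved(res, n, pfn << page_shift))
            (*still_reserved)++;
        else
            freed++;
    }
    return freed;
}
```

Counting the two outcomes separately also gives a reserved-page total that matches what was actually kept back, which is the accounting problem the last paragraph of the message mentions.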
  27. 20 Nov, 2007 1 commit
  28. 17 Oct, 2007 1 commit
  29. 17 Aug, 2007 1 commit
  30. 10 Jul, 2007 1 commit
  31. 14 Jun, 2007 1 commit
  32. 22 May, 2007 1 commit
  33. 09 May, 2007 1 commit
  34. 02 May, 2007 1 commit